gcp_vertex_ai_embeddings

beta

Available in: Cloud, Self-Managed

Generates vector embeddings to represent a text string, using the Vertex AI API.

# Configuration fields, showing default values
label: ""
gcp_vertex_ai_embeddings:
  project: "" # No default (required)
  credentials_json: "" # No default (optional)
  location: us-central1
  model: text-embedding-004 # No default (required)
  task_type: RETRIEVAL_DOCUMENT
  text: "" # No default (optional)
  output_dimensions: 0 # No default (optional)

This processor sends text strings to the Vertex AI API, which generates vector embeddings for them. By default, the processor submits the entire payload of each message as a string. To submit something else, set the `text` field.

For more information, see the Vertex AI documentation.
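As a minimal sketch of how the processor might be placed in a pipeline, the following configuration sets only fields documented on this page. The project ID `my-gcp-project` is a placeholder, and the surrounding `pipeline.processors` layout is an assumption about how your configuration is structured.

```yaml
pipeline:
  processors:
    - gcp_vertex_ai_embeddings:
        project: my-gcp-project # placeholder project ID (assumption)
        location: us-central1 # the documented default
        model: text-embedding-004
        task_type: RETRIEVAL_DOCUMENT # the documented default
        # text is omitted, so the entire message payload is submitted
```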

Fields

`credentials_json`

Set your Google Service Account Credentials as JSON.

This field contains sensitive information that usually shouldn’t be added to a configuration directly. For more information, see Manage Secrets before adding it to your configuration.

Type: string
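One way to keep the credentials out of the configuration file is to reference them indirectly, as in the sketch below. It assumes your deployment resolves `${...}` environment variable references in configuration files; the variable name `GCP_CREDENTIALS_JSON` is hypothetical. Check Manage Secrets for the mechanism that applies to your environment.

```yaml
gcp_vertex_ai_embeddings:
  project: my-gcp-project # placeholder project ID (assumption)
  # Hypothetical environment variable holding the service account JSON
  credentials_json: "${GCP_CREDENTIALS_JSON}"
  model: text-embedding-004
```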

`location`

The location of the Vertex AI large language model (LLM) that you want to use.

Type: string

Default: us-central1

`model`

The name of the LLM to use. For a full list of models, see the Vertex AI Model Garden.

Type: string

# Examples:
model: text-embedding-004
model: text-multilingual-embedding-002

`output_dimensions`

The maximum length (number of dimensions) of a generated vector embedding. If this value is set, generated embeddings are truncated to this size.

Type: int
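For instance, a configuration that asks for shorter embeddings might look like the sketch below. The value `256` is purely illustrative, and whether a given model accepts a particular size is something to confirm in the Vertex AI documentation.

```yaml
gcp_vertex_ai_embeddings:
  project: my-gcp-project # placeholder project ID (assumption)
  model: text-embedding-004
  output_dimensions: 256 # illustrative value; embeddings are truncated to this size
```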

`project`

The ID of your Google Cloud project.

Type: string

`task_type`

Use the following options to optimize embeddings that the model generates for specific use cases.

Type: string

Default: RETRIEVAL_DOCUMENT

Option	Summary
`CLASSIFICATION`	optimize for classifying texts according to preset labels
`CLUSTERING`	optimize for clustering texts based on their similarities
`FACT_VERIFICATION`	optimize for queries that prove or disprove a fact, such as "apples grow underground"
`QUESTION_ANSWERING`	optimize for searches phrased as proper questions, such as "Why is the sky blue?"
`RETRIEVAL_DOCUMENT`	optimize for documents that will be searched (also known as a corpus)
`RETRIEVAL_QUERY`	optimize for queries such as "What is the best fish recipe?" or "best restaurant in Chicago"
`SEMANTIC_SIMILARITY`	optimize for text similarity
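The retrieval options are typically used as a pair: one configuration embeds the corpus, and another embeds the search queries run against it. The sketch below shows two separate configuration fragments under that assumption. The labels and the project ID are hypothetical, and using the same model on both sides is a general practice for comparable embeddings rather than something this page states.

```yaml
# Fragment for the pipeline that indexes documents
label: embed_documents # hypothetical label
gcp_vertex_ai_embeddings:
  project: my-gcp-project # placeholder project ID (assumption)
  model: text-embedding-004
  task_type: RETRIEVAL_DOCUMENT
---
# Fragment for the pipeline that handles search queries
label: embed_queries # hypothetical label
gcp_vertex_ai_embeddings:
  project: my-gcp-project # placeholder project ID (assumption)
  model: text-embedding-004 # same model as the indexing side
  task_type: RETRIEVAL_QUERY
```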

`text`

The text you want to generate vector embeddings for. By default, the processor submits the entire payload as a string. This field supports interpolation functions.

Type: string
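As an illustration of interpolation, the sketch below embeds a single field of a structured message instead of the whole payload. It assumes that messages are JSON objects with a `description` field (a hypothetical name) and that the `json()` interpolation function is available in your version.

```yaml
gcp_vertex_ai_embeddings:
  project: my-gcp-project # placeholder project ID (assumption)
  model: text-embedding-004
  # Embed only the hypothetical "description" field of each JSON message
  text: '${! json("description") }'
```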
