
LLM Store Embeddings

The LLM Store Embeddings task stores the embeddings produced by the LLM Generate Embeddings task in a vector database. The stored embeddings can later be retrieved by the LLM Get Embeddings task for fast lookup of related data.

When configuring the task, you specify the vector database provider, index, namespace, and embedding model details, so that the stored embeddings are organized and accessible for future retrieval operations.

Prerequisites

Task parameters

Configure these parameters for the LLM Store Embeddings task.

Parameter | Description | Required/Optional
inputParameters.vectorDB | The vector database to store the data. Note: If you haven’t configured the vector database on your Orkes Conductor cluster, navigate to the Integrations tab and configure your required provider. | Required.
inputParameters.index | The index in your vector database where the text or data will be stored. The terminology of the index field varies depending on the integration: for Weaviate, it indicates the collection name; for other integrations, it denotes the index name. | Required.
inputParameters.namespace | Namespaces are separate isolated environments within the database to manage and organize vector data effectively. Enter the namespace the task will utilize. The usage and terminology of the namespace field vary depending on the integration: for Pinecone, the namespace field is applicable; for Weaviate, it is not applicable; for MongoDB, it is referred to as “Collection”; for Postgres, it is referred to as “Table”. | Required.
inputParameters.id | An arbitrary vector ID to identify the vector in the database. | Optional.
inputParameters.embeddingModelProvider | The LLM provider for generating the embeddings. Note: If you haven’t configured your AI/LLM provider on your Orkes console, navigate to the Integrations tab and configure your required provider. | Required.
inputParameters.embeddingModel | The embedding model provided by the selected LLM provider to generate the embeddings. | Required.
inputParameters.embeddings | The vector representation of the input text, generated by an embedding model. This value is used to store vectors in the vector database or to perform similarity search. | Required.
inputParameters.metadata | A map of key-value pairs associated with the embeddings. Metadata is stored alongside the vectors and can include additional context, such as the original text or identifiers, to enrich retrieval results. | Optional.
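
To illustrate how the index and namespace fields map onto a specific integration, the sketch below targets a Postgres vector database, where the namespace is the table name, and uses metadata to keep the original text alongside the vector. The integration names (my-postgres-vectordb, openAI), the index and table names, and the workflow input names are hypothetical placeholders, and text-embedding-3-small is only an assumed embedding model; substitute the values configured on your cluster.

```json
{
  "name": "llm_store_embeddings",
  "taskReferenceName": "store_doc_embeddings_ref",
  "inputParameters": {
    "vectorDB": "my-postgres-vectordb",
    "index": "doc_index",
    "namespace": "doc_embeddings",
    "id": "${workflow.input.docId}",
    "embeddingModelProvider": "openAI",
    "embeddingModel": "text-embedding-3-small",
    "embeddings": "${llm_generate_embeddings_ref.output}",
    "metadata": {
      "text": "${workflow.input.text}",
      "docId": "${workflow.input.docId}"
    }
  },
  "type": "LLM_STORE_EMBEDDINGS"
}
```

Here the optional id is taken from the workflow input so that the same document can be identified again later, and the metadata map carries the source text for use in retrieval results.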

The following are generic configuration parameters that can be applied to the task and are not specific to the LLM Store Embeddings task.

Caching parameters

You can cache the task outputs using the following parameters. Refer to Caching Task Outputs for a full guide.

Parameter | Description | Required/Optional
cacheConfig.ttlInSecond | The time to live in seconds, which is the duration for the output to be cached. | Required if using cacheConfig.
cacheConfig.key | The cache key is a unique identifier for the cached output and must be constructed exclusively from the task’s input parameters. It can be a string concatenation that contains the task’s input keys, such as ${uri}-${method} or re_${uri}_${method}. | Required if using cacheConfig.
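
A minimal sketch of how the caching parameters might be attached to this task, assuming cacheConfig sits at the top level of the task configuration alongside inputParameters. The key is built only from the task's own input keys (index, namespace, and id here), and the one-hour TTL is an arbitrary illustrative value.

```json
{
  "name": "llm_store_embeddings",
  "taskReferenceName": "llm_store_embeddings_ref",
  "inputParameters": {
    "vectorDB": "Pinecone",
    "index": "doc",
    "namespace": "docs",
    "id": "${workflow.input.id}",
    "embeddingModelProvider": "openAI",
    "embeddingModel": "text-embedding-3-small",
    "embeddings": "${llm_generate_embeddings_ref.output}"
  },
  "cacheConfig": {
    "ttlInSecond": 3600,
    "key": "${index}-${namespace}-${id}"
  },
  "type": "LLM_STORE_EMBEDDINGS"
}
```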

Other generic parameters

Here are other parameters for configuring the task behavior.

Parameter | Description | Required/Optional
optional | Whether the task is optional. If set to true, any task failure is ignored, and the workflow continues with the task status updated to COMPLETED_WITH_ERRORS. However, the task must reach a terminal state. If the task remains incomplete, the workflow waits until it reaches a terminal state before proceeding. | Optional.
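
As a sketch of the optional flag, the configuration below marks the task as optional so that a failed write to the vector database does not fail the workflow. It assumes optional is set at the top level of the task configuration; the input values are placeholders.

```json
{
  "name": "llm_store_embeddings",
  "taskReferenceName": "llm_store_embeddings_ref",
  "inputParameters": {
    "vectorDB": "Pinecone",
    "index": "doc",
    "namespace": "docs",
    "embeddingModelProvider": "openAI",
    "embeddingModel": "text-embedding-3-small",
    "embeddings": "${llm_generate_embeddings_ref.output}"
  },
  "optional": true,
  "type": "LLM_STORE_EMBEDDINGS"
}
```

With this setting, a failure in the task would leave it in COMPLETED_WITH_ERRORS while the rest of the workflow continues, so downstream tasks should not assume the embeddings were stored.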

Task configuration

This is the task configuration for an LLM Store Embeddings task.

{
  "name": "llm_store_embeddings",
  "taskReferenceName": "llm_store_embeddings_ref",
  "inputParameters": {
    "vectorDB": "Pinecone",
    "index": "doc",
    "namespace": "docs",
    "id": "${workflow.input.id}",
    "embeddingModelProvider": "openAI",
    "embeddingModel": "text-embedding-3-small",
    "embeddings": "${llm_generate_embeddings_ref.output}",
    "metadata": {
      "SomeKey": "Some-value"
    }
  },
  "type": "LLM_STORE_EMBEDDINGS"
}

Task output

The LLM Store Embeddings task returns no output. It only stores the embeddings in the specified vector database.

Adding an LLM Store Embeddings task in UI

To add an LLM Store Embeddings task:

  1. In your workflow, select the (+) icon and add an LLM Store Embeddings task.
  2. In Vector Database Configuration, select the Vector database, Index, and Namespace to store the embeddings.
  3. (Optional) In Vector ID, enter an arbitrary ID to identify the vector in the database.
  4. In Embedding Model, select the Embedding model provider and Embedding model to generate the embeddings.
  5. In Embeddings, provide the embedding vectors to store. This is typically the output of an LLM Generate Embeddings task.
  6. (Optional) In Metadata, add key-value pairs to store additional context with the embeddings, such as the original text or document identifiers.

[Screenshot: LLM Store Embeddings task in the UI]

Examples

Here are some examples for using the LLM Store Embeddings task.

Using an LLM Store Embeddings task in a workflow
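
A minimal sketch of such a workflow, assuming a two-step pipeline in which an LLM Generate Embeddings task produces the vector and an LLM Store Embeddings task writes it to Pinecone. The workflow name, the input names (id, text), and the model are hypothetical, and the llmProvider, model, and text inputs of the LLM Generate Embeddings task are assumptions not described on this page; check that task's own reference before relying on them.

```json
{
  "name": "store_embeddings_example",
  "description": "Generates embeddings for input text and stores them in a vector database",
  "version": 1,
  "schemaVersion": 2,
  "inputParameters": ["id", "text"],
  "tasks": [
    {
      "name": "llm_generate_embeddings",
      "taskReferenceName": "llm_generate_embeddings_ref",
      "inputParameters": {
        "llmProvider": "openAI",
        "model": "text-embedding-3-small",
        "text": "${workflow.input.text}"
      },
      "type": "LLM_GENERATE_EMBEDDINGS"
    },
    {
      "name": "llm_store_embeddings",
      "taskReferenceName": "llm_store_embeddings_ref",
      "inputParameters": {
        "vectorDB": "Pinecone",
        "index": "doc",
        "namespace": "docs",
        "id": "${workflow.input.id}",
        "embeddingModelProvider": "openAI",
        "embeddingModel": "text-embedding-3-small",
        "embeddings": "${llm_generate_embeddings_ref.output}",
        "metadata": {
          "text": "${workflow.input.text}"
        }
      },
      "type": "LLM_STORE_EMBEDDINGS"
    }
  ]
}
```

The same provider and model are used in both tasks so that the stored vectors match the model recorded with them. The embeddings reference follows the task configuration shown earlier on this page.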