Create a Retrieval Augmented Generation (RAG) pipeline
type: "io.kestra.plugin.ai.rag.ChatCompletion"

Examples

Chat with your data using Retrieval Augmented Generation (RAG). This flow indexes documents and then uses the RAG ChatCompletion task to query them with natural-language prompts. It contrasts prompting the LLM with and without RAG: the RAG task retrieves embeddings stored in the KV Store and returns a response grounded in your data rather than hallucinating. WARNING: the Kestra KV embedding store is for quick prototyping only, as it stores the embedding vectors in Kestra's KV store and loads them all into memory.
id: rag
namespace: company.ai

tasks:
  - id: ingest
    type: io.kestra.plugin.ai.rag.IngestDocument
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      modelName: gemini-embedding-exp-03-07
      apiKey: "{{ kv('GEMINI_API_KEY') }}"
    embeddings:
      type: io.kestra.plugin.ai.embeddings.KestraKVStore
    drop: true
    fromExternalURLs:
      - https://raw.githubusercontent.com/kestra-io/docs/refs/heads/main/content/blogs/release-0-24.md

  - id: parallel
    type: io.kestra.plugin.core.flow.Parallel
    tasks:
      - id: chat_without_rag
        type: io.kestra.plugin.ai.completion.ChatCompletion
        provider:
          type: io.kestra.plugin.ai.provider.GoogleGemini
        messages:
          - type: USER
            content: Which features were released in Kestra 0.24?

      - id: chat_with_rag
        type: io.kestra.plugin.ai.rag.ChatCompletion
        chatProvider:
          type: io.kestra.plugin.ai.provider.GoogleGemini
        embeddingProvider:
          type: io.kestra.plugin.ai.provider.GoogleGemini
          modelName: gemini-embedding-exp-03-07
        embeddings:
          type: io.kestra.plugin.ai.embeddings.KestraKVStore
        systemMessage: You are a helpful assistant that can answer questions about Kestra.
        prompt: Which features were released in Kestra 0.24?

pluginDefaults:
  - type: io.kestra.plugin.ai.provider.GoogleGemini
    values:
      apiKey: "{{ kv('GEMINI_API_KEY') }}"
      modelName: gemini-2.5-flash

RAG chat with a web search content retriever (answers grounded in search results)
id: rag_with_websearch_content_retriever
namespace: company.ai

tasks:
  - id: chat_with_rag_and_websearch_content_retriever
    type: io.kestra.plugin.ai.rag.ChatCompletion
    chatProvider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      modelName: gemini-2.5-flash
      apiKey: "{{ kv('GEMINI_API_KEY') }}"
    contentRetrievers:
      - type: io.kestra.plugin.ai.retriever.TavilyWebSearch
        apiKey: "{{ kv('TAVILY_API_KEY') }}"
    systemMessage: You are a helpful assistant that can answer questions about Kestra.
    prompt: What is the latest release of Kestra?

Store chat memory as a Kestra KV pair
id: chat_with_memory
namespace: company.ai

inputs:
  - id: first
    type: STRING
    defaults: Hello, my name is John and I'm from Paris

  - id: second
    type: STRING
    defaults: What's my name and where do I live?

tasks:
  - id: first
    type: io.kestra.plugin.ai.rag.ChatCompletion
    chatProvider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
    embeddingProvider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      modelName: gemini-embedding-exp-03-07
    embeddings:
      type: io.kestra.plugin.ai.embeddings.KestraKVStore
    memory:
      type: io.kestra.plugin.ai.memory.KestraKVStore
      ttl: PT1M
    systemMessage: You are a helpful assistant, answer concisely
    prompt: "{{inputs.first}}"

  - id: second
    type: io.kestra.plugin.ai.rag.ChatCompletion
    chatProvider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
    embeddingProvider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      modelName: gemini-embedding-exp-03-07
    embeddings:
      type: io.kestra.plugin.ai.embeddings.KestraKVStore
    memory:
      type: io.kestra.plugin.ai.memory.KestraKVStore
    systemMessage: You are a helpful assistant, answer concisely
    prompt: "{{inputs.second}}"

pluginDefaults:
  - type: io.kestra.plugin.ai.provider.GoogleGemini
    values:
      apiKey: "{{ kv('GEMINI_API_KEY') }}"
      modelName: gemini-2.5-flash

Classify recent Kestra releases into MINOR or PATCH using a JSON schema. Note: not all LLMs support structured outputs, or they may not support them when combined with tools like web search. This example uses Mistral, which supports structured output with content retrievers.
id: chat_with_structured_output
namespace: company.ai

tasks:
  - id: categorize_releases
    type: io.kestra.plugin.ai.rag.ChatCompletion
    chatProvider:
      type: io.kestra.plugin.ai.provider.MistralAI
      apiKey: "{{ kv('MISTRAL_API_KEY') }}"
      modelName: open-mistral-7b
    contentRetrievers:
      - type: io.kestra.plugin.ai.retriever.TavilyWebSearch
        apiKey: "{{ kv('TAVILY_API_KEY') }}"
        maxResults: 8
    chatConfiguration:
      responseFormat:
        type: JSON
        jsonSchema:
          type: object
          required: ["releases"]
          properties:
            releases:
              type: array
              minItems: 1
              items:
                type: object
                additionalProperties: false
                required: ["version", "date", "semver"]
                properties:
                  version:
                    type: string
                    description: "Release tag, e.g., 0.24.0"
                  date:
                    type: string
                    description: "Release date"
                  semver:
                    type: string
                    enum: ["MINOR", "PATCH"]
                  summary:
                    type: string
                    description: "Short plain-text summary (optional)"
    systemMessage: |
      You are a release analyst. Use the Tavily web retriever to find recent Kestra releases.
      Determine each release's SemVer category:
      - MINOR: new features, no major breaking changes (y in x.Y.z)
      - PATCH: bug fixes/patches only (z in x.y.Z)
      Return ONLY valid JSON matching the schema. No prose, no extra keys.
    prompt: |
      Find most recent Kestra releases (within the last ~6 months).
      Output their version, release date, semver category, and a one-line summary.

Properties
chatProvider *Required Non-dynamic
One of: AmazonBedrock, Anthropic, AzureOpenAI, DashScope, DeepSeek, GoogleGemini, GoogleVertexAI, HuggingFace, LocalAI, MistralAI, OciGenAI, Ollama, OpenAI, OpenRouter, WorkersAI, ZhiPuAI
Chat model provider
prompt *Requiredstring
User prompt
The user input for this run. May be templated from flow inputs.
chatConfiguration Non-dynamic ChatConfiguration
Default: {}
Chat configuration
contentRetrieverConfiguration Non-dynamic ChatCompletion-ContentRetrieverConfiguration
Default:
{
  "maxResults": 3,
  "minScore": 0
}
Content retriever configuration
contentRetrievers
One of: GoogleCustomWebSearch, SqlDatabaseRetriever, TavilyWebSearch
Additional content retrievers
Some content retrievers, such as WebSearch, can also be used as tools. When configured as content retrievers, however, they are always called, whereas tools are invoked only when the LLM decides to use them.
embeddingProvider Non-dynamic
One of: AmazonBedrock, Anthropic, AzureOpenAI, DashScope, DeepSeek, GoogleGemini, GoogleVertexAI, HuggingFace, LocalAI, MistralAI, OciGenAI, Ollama, OpenAI, OpenRouter, WorkersAI, ZhiPuAI
Embedding model provider
Optional. If not set, the embedding model is created from chatProvider. Ensure the chosen chat provider supports embeddings.
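For instance, when the chat provider also supports embeddings, the task can rely on `chatProvider` for both roles. A minimal sketch (model name and key reference are placeholders):

```yaml
- id: chat_with_rag
  type: io.kestra.plugin.ai.rag.ChatCompletion
  chatProvider:
    type: io.kestra.plugin.ai.provider.GoogleGemini
    apiKey: "{{ kv('GEMINI_API_KEY') }}"
    modelName: gemini-2.5-flash
  # embeddingProvider omitted: the embedding model is derived from chatProvider
  embeddings:
    type: io.kestra.plugin.ai.embeddings.KestraKVStore
  prompt: Which features were released in Kestra 0.24?
```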
embeddings Non-dynamic
One of: Chroma, Elasticsearch, KestraKVStore, MariaDB, Milvus, MongoDBAtlas, PGVector, Pinecone, Qdrant, Redis, Tablestore, Weaviate
Embedding store
Optional when at least one entry is provided in contentRetrievers.
memory Non-dynamic
One of: KestraKVStore, PostgreSQL, Redis
Chat memory
Stores conversation history and injects it into context on subsequent runs.
systemMessage string
System message
Instruction that sets the assistant's role, tone, and constraints for this task.
tools Non-dynamic
One of: A2AAgent, AIAgent, CodeExecution, DockerMcpClient, GoogleCustomWebSearch, KestraFlow, KestraTask, SseMcpClient, StdioMcpClient, StreamableHttpMcpClient, TavilyWebSearch
Optional tools the LLM may call to augment its response
Outputs
finishReason string
Possible values: STOP, LENGTH, TOOL_EXECUTION, CONTENT_FILTER, OTHER
Finish reason
jsonOutput object
LLM output for JSON response format
The result of the LLM completion for response format of type JSON, null otherwise.
outputFiles object
URIs of the generated files in Kestra's internal storage
requestDuration integer
Request duration in milliseconds
textOutput string
LLM output for TEXT response format
The result of the LLM completion for response format of type TEXT (default), null otherwise.
thinking string
Model's Thinking Output
Contains the model's internal reasoning or 'thinking' text, if the model supports it and 'returnThinking' is enabled. This may include intermediate reasoning steps, such as chain-of-thought explanations. Null if thinking is not supported, not enabled, or not returned by the model.
tokenUsage TokenUsage
Token usage
Metrics
input.token.count counter (unit: token)
Large Language Model (LLM) input token count
output.token.count counter (unit: token)
Large Language Model (LLM) output token count
total.token.count counter (unit: token)
Large Language Model (LLM) total token count
Definitions
MongoDB Atlas Embedding Store
collectionName *Requiredstring
The collection name
host *Requiredstring
The host
indexName *Requiredstring
The index name
scheme *Requiredstring
The scheme (e.g., mongodb+srv)
type *Requiredobject
createIndex booleanstring
Create the index
database string
The database
metadataFieldNames array
The metadata field names
options object
The connection string options
password string
The password
username string
The username
MariaDB Embedding Store
createTable *Requiredbooleanstring
Whether to create the table if it doesn't exist
databaseUrl *Requiredstring
Database URL of the MariaDB database (e.g., jdbc:mariadb://host:port/dbname)
fieldName *Requiredstring
Name of the column used as the unique ID in the database
password *Requiredstring
The password
tableName *Requiredstring
Name of the table where embeddings will be stored
type *Requiredobject
username *Requiredstring
The username
columnDefinitions array
Metadata Column Definitions
List of SQL column definitions for metadata fields (e.g., 'text TEXT', 'source TEXT'). Required only when using COLUMN_PER_KEY storage mode.
indexes array
Metadata Index Definitions
List of SQL index definitions for metadata columns (e.g., 'INDEX idx_text (text)'). Used only with COLUMN_PER_KEY storage mode.
metadataStorageMode string
Metadata Storage Mode
Determines how metadata is stored: - COLUMN_PER_KEY: Use individual columns for each metadata field (requires columnDefinitions and indexes). - COMBINED_JSON (default): Store metadata as a JSON object in a single column. If columnDefinitions and indexes are provided, COLUMN_PER_KEY must be used.
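Putting the properties above together, a COLUMN_PER_KEY configuration could look like the following sketch (the type path is assumed from the io.kestra.plugin.ai.embeddings naming used elsewhere on this page; URL, credentials, and column definitions are placeholders):

```yaml
embeddings:
  type: io.kestra.plugin.ai.embeddings.MariaDB
  databaseUrl: "jdbc:mariadb://db.example.com:3306/vectors"
  username: "{{ kv('MARIADB_USER') }}"
  password: "{{ kv('MARIADB_PASSWORD') }}"
  tableName: kestra_embeddings
  fieldName: id
  createTable: true
  metadataStorageMode: COLUMN_PER_KEY
  columnDefinitions:          # required with COLUMN_PER_KEY
    - "text TEXT"
    - "source TEXT"
  indexes:
    - "INDEX idx_source (source)"
```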
Chroma Embedding Store
baseUrl *Requiredstring
The database base URL
collectionName *Requiredstring
The collection name
type *Requiredobject
ZhiPu AI Model Provider
apiKey *Requiredstring
API Key
modelName *Requiredstring
Model name
type *Requiredobject
baseUrl string
Default: https://open.bigmodel.cn/
API base URL
The base URL for the ZhiPu API (defaults to https://open.bigmodel.cn/)
caPem string
CA PEM certificate content
CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.
clientPem string
Client PEM certificate content
PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.
maxRetries integerstring
Maximum number of retry attempts per request
maxToken integerstring
The maximum number of tokens returned by this request
stops array
Stop sequences: the model stops generating as soon as the output would contain one of the specified strings or token IDs
Redis Embedding Store
host *Requiredstring
The database server host
port *Requiredintegerstring
The database server port
type *Requiredobject
indexName string
Default: embedding-index
The index name
Call a Kestra runnable task as a tool
io.kestra.plugin.ai.domain.AIOutput-ToolExecution
requestArguments object
requestId string
requestName string
result string
io.kestra.plugin.ai.domain.AIOutput-AIResponse
completion string
Generated text completion
The result of the text completion
finishReason string
Possible values: STOP, LENGTH, TOOL_EXECUTION, CONTENT_FILTER, OTHER
Finish reason
id string
Response identifier
requestDuration integer
Request duration in milliseconds
tokenUsage TokenUsage
Token usage
io.kestra.plugin.ai.domain.ChatConfiguration-ResponseFormat
jsonSchema object
JSON Schema (used when type = JSON)
Provide a JSON Schema describing the expected structure of the response. In Kestra flows, define the schema in YAML (it is still a JSON Schema object). Example (YAML):
responseFormat:
  type: JSON
  jsonSchema:
    type: object
    required: ["category", "priority"]
    properties:
      category:
        type: string
        enum: ["ACCOUNT", "BILLING", "TECHNICAL", "GENERAL"]
      priority:
        type: string
        enum: ["LOW", "MEDIUM", "HIGH"]
Note: Provider support for strict schema enforcement varies. If unsupported, guide the model about the expected output structure via the prompt and validate downstream.
jsonSchemaDescription string
Schema description (optional)
Natural-language description of the schema to help the model produce the right fields. Example: "Classify a customer ticket into category and priority."
type string
Default: TEXT. Possible values: TEXT, JSON
Response format type
Specifies how the LLM should return output. Allowed values:
- TEXT (default): free-form natural language.
- JSON: structured output validated against a JSON Schema.
Google Custom Search web tool
apiKey *Requiredstring
API key
csi *Requiredstring
Custom search engine ID (cx)
type *Requiredobject
Code execution tool using Judge0
apiKey *Requiredstring
RapidAPI key for Judge0
You can obtain it from the RapidAPI website.
type *Requiredobject
OpenAI Model Provider
apiKey *Requiredstring
API Key
modelName *Requiredstring
Model name
type *Requiredobject
baseUrl string
Default: https://api.openai.com/v1
API base URL
caPem string
CA PEM certificate content
CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.
clientPem string
Client PEM certificate content
PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.
Web search content retriever for Google Custom Search
apiKey *Requiredstring
API key
csi *Requiredstring
Custom search engine ID (cx)
type *Requiredobject
maxResults integerstring
Default: 3
Maximum number of results
com.alicloud.openservices.tablestore.model.search.vector.VectorOptions
dataType string
dimension integer
metricType string
Possible values: EUCLIDEAN, COSINE, DOT_PRODUCT

Elasticsearch Embedding Store
connection *RequiredElasticsearch-ElasticsearchConnection
indexName *Requiredstring
The name of the index to store embeddings
type *Requiredobject
io.kestra.plugin.ai.rag.ChatCompletion-ContentRetrieverConfiguration
maxResults integer
Default: 3
Maximum results to return from the embedding store
minScore number
Default: 0
Minimum similarity score (0-1 inclusive). Only results with score ≥ minScore are returned.
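Both retrieval knobs can be tightened on the RAG task itself, for example:

```yaml
contentRetrieverConfiguration:
  maxResults: 5   # return up to 5 matches instead of the default 3
  minScore: 0.7   # keep only matches with similarity score >= 0.7
```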
io.kestra.plugin.ai.domain.AIOutput-AIResponse-ToolExecutionRequest
arguments object
Tool request arguments
id string
Tool execution request identifier
name string
Tool name
Qdrant Embedding Store
apiKey *Requiredstring
The API key
collectionName *Requiredstring
The collection name
host *Requiredstring
The database server host
port *Requiredintegerstring
The database server port
type *Requiredobject
Google VertexAI Model Provider
endpoint *Requiredstring
Endpoint URL
location *Requiredstring
Project location
modelName *Requiredstring
Model name
project *Requiredstring
Project ID
type *Requiredobject
baseUrl string
Base URL
Custom base URL to override the default endpoint (useful for local tests, WireMock, or enterprise gateways).
caPem string
CA PEM certificate content
CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.
clientPem string
Client PEM certificate content
PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.
OciGenAI Model Provider
compartmentId *Requiredstring
OCID of OCI Compartment with the model
modelName *Requiredstring
Model name
region *Requiredstring
OCI Region to connect the client to
type *Requiredobject
authProvider string
OCI SDK Authentication provider
baseUrl string
Base URL
Custom base URL to override the default endpoint (useful for local tests, WireMock, or enterprise gateways).
caPem string
CA PEM certificate content
CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.
clientPem string
Client PEM certificate content
PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.
Call a remote AI agent via the A2A protocol.
description *Requiredstring
Agent description
The description will be used to instruct the LLM what the tool is doing.
serverUrl *Requiredstring
Server URL
The URL of the remote agent A2A server
type *Requiredobject
name string
Default: tool
Agent name
It must be set to a different value than the default in case you want to have multiple agents used as tools in the same task.
WebSearch content retriever for Tavily Search
apiKey *Requiredstring
API Key
type *Requiredobject
maxResults integerstring
Default: 3
Maximum number of results to return
Chat Memory backed by Redis
host *Requiredstring
Redis host
The hostname of your Redis server (e.g., localhost or redis-server)
type *Requiredobject
drop string
Default: NEVER. Possible values: NEVER, BEFORE_TASKRUN, AFTER_TASKRUN
Drop memory: never, before, or after the agent's task run
By default, the memory ID is the value of the system.correlationId label, meaning that the same memory will be used by all tasks of the flow and its subflows.
If you want to remove the memory eagerly (before expiration), you can set drop: AFTER_TASKRUN to erase the memory after the taskrun.
You can also set drop: BEFORE_TASKRUN to drop the memory before the taskrun.
memoryId string
Default: {{ labels.system.correlationId }}
Memory ID - defaults to the value of the system.correlationId label. This means that a memory is valid for the entire flow execution including its subflows.
messages integerstring
Default: 10
Maximum number of messages to keep in memory. If memory is full, the oldest messages are removed in FIFO order. The last system message is always kept.
port integerstring
Default: 6379
Redis port
The port of your Redis server
ttl string
Default: PT1H (duration)
Memory duration - defaults to 1h
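As a sketch, a Redis-backed memory that is erased right after the task run (the type path mirrors the io.kestra.plugin.ai.memory naming used in the examples above; the host is a placeholder):

```yaml
memory:
  type: io.kestra.plugin.ai.memory.Redis
  host: redis.example.com
  port: 6379
  messages: 20          # keep at most 20 messages (FIFO eviction)
  ttl: PT30M
  drop: AFTER_TASKRUN   # erase the memory once this task run finishes
```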
Milvus Embedding Store
token *Requiredstring
Token
Milvus auth token. Required if authentication is enabled; omit for local deployments without auth.
type *Requiredobject
autoFlushOnDelete booleanstring
Auto flush on delete
If true, flush after delete operations.
autoFlushOnInsert booleanstring
Auto flush on insert
If true, flush after insert operations. Setting it to false can improve throughput.
collectionName string
Collection name
Target collection. Created automatically if it does not exist. Default: "default".
consistencyLevel string
Consistency level
Read/write consistency level. Common values include STRONG, BOUNDED, or EVENTUALLY (depends on client/version).
databaseName string
Database name
Logical database to use. If not provided, the default database is used.
host string
Host
Milvus host name (used when uri is not set). Default: "localhost".
idFieldName string
ID field name
Field name for document IDs. Default depends on collection schema.
indexType string
Index type
Vector index type (e.g., IVF_FLAT, IVF_SQ8, HNSW). Depends on Milvus deployment and dataset.
metadataFieldName string
Metadata field name
Field name for metadata. Default depends on collection schema.
metricType string
Metric type
Similarity metric (e.g., L2, IP, COSINE). Should match the embedding provider’s expected metric.
password string
Password
Required when authentication/TLS is enabled. See https://milvus.io/docs/authenticate.md
port integerstring
Port
Milvus port (used when uri is not set). Typical: 19530 (gRPC) or 9091 (HTTP). Default: 19530.
retrieveEmbeddingsOnSearch booleanstring
Retrieve embeddings on search
If true, return stored embeddings along with matches. Default: false.
textFieldName string
Text field name
Field name for original text. Default depends on collection schema.
uri string
URI
Connection URI. Use either uri OR host/port (not both).
Examples:
- gRPC (typical): "milvus://host:19530"
- HTTP: "http://host:9091"
username string
Username
Required when authentication/TLS is enabled. See https://milvus.io/docs/authenticate.md
vectorFieldName string
Vector field name
Field name for the embedding vector. Must match the index definition and embedding dimensionality.
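A minimal Milvus store sketch (the type path follows this page's io.kestra.plugin.ai.embeddings naming; URI, token reference, and collection name are placeholders):

```yaml
embeddings:
  type: io.kestra.plugin.ai.embeddings.Milvus
  uri: "milvus://milvus.example.com:19530"   # use uri OR host/port, never both
  token: "{{ kv('MILVUS_TOKEN') }}"
  collectionName: kestra_docs
  metricType: COSINE   # should match the embedding model's expected metric
  indexType: HNSW
```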
Anthropic AI Model Provider
apiKey *Requiredstring
API Key
modelName *Requiredstring
Model name
type *Requiredobject
baseUrl string
Base URL
Custom base URL to override the default endpoint (useful for local tests, WireMock, or enterprise gateways).
caPem string
CA PEM certificate content
CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.
clientPem string
Client PEM certificate content
PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.
maxTokens integerstring
Maximum Tokens
Specifies the maximum number of tokens that the model is allowed to generate in its response.
WebSearch tool for Tavily Search
apiKey *Requiredstring
Tavily API Key - you can obtain one from the Tavily website
type *Requiredobject
In-memory Chat Memory that stores its data as Kestra KV pairs
type *Requiredobject
drop string
Default: NEVER. Possible values: NEVER, BEFORE_TASKRUN, AFTER_TASKRUN
Drop memory: never, before, or after the agent's task run
By default, the memory ID is the value of the system.correlationId label, meaning that the same memory will be used by all tasks of the flow and its subflows.
If you want to remove the memory eagerly (before expiration), you can set drop: AFTER_TASKRUN to erase the memory after the taskrun.
You can also set drop: BEFORE_TASKRUN to drop the memory before the taskrun.
memoryId string
Default: {{ labels.system.correlationId }}
Memory ID - defaults to the value of the system.correlationId label. This means that a memory is valid for the entire flow execution including its subflows.
messages integerstring
Default: 10
Maximum number of messages to keep in memory. If memory is full, the oldest messages are removed in FIFO order. The last system message is always kept.
ttl string
Default: PT1H (duration)
Memory duration - defaults to 1h
io.kestra.plugin.ai.embeddings.Elasticsearch-ElasticsearchConnection
hosts *Requiredarray
Minimum: 1 item
List of HTTP Elasticsearch servers
Each must be a URI like https://example.com:9200 with scheme and port
basicAuth Elasticsearch-ElasticsearchConnection-BasicAuth
Basic authorization configuration
headers array
List of HTTP headers to be sent with every request
Each item is a key: value string, e.g., Authorization: Token XYZ
pathPrefix string
Path prefix for all HTTP requests
If set to /my/path, each client request becomes /my/path/ + endpoint. Useful when Elasticsearch is behind a proxy providing a base path; do not use otherwise.
strictDeprecationMode booleanstring
Treat responses with deprecation warnings as failures
trustAllSsl booleanstring
Trust all SSL CA certificates
Use this if the server uses a self-signed SSL certificate
LocalAI Model Provider
baseUrl *Requiredstring
API base URL
modelName *Requiredstring
Model name
type *Requiredobject
caPem string
CA PEM certificate content
CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.
clientPem string
Client PEM certificate content
PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.
PGVector Embedding Store
database *Requiredstring
The database name
host *Requiredstring
The database server host
password *Requiredstring
The database password
port *Requiredintegerstring
The database server port
table *Requiredstring
The table to store embeddings in
type *Requiredobject
user *Requiredstring
The database user
useIndex booleanstring
Default: false
Whether to use an IVFFlat index
An IVFFlat index divides vectors into lists, and then searches a subset of those lists closest to the query vector. It has faster build times and uses less memory than HNSW but has lower query performance (in terms of speed-recall tradeoff).
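A minimal PGVector store sketch (the type path follows this page's io.kestra.plugin.ai.embeddings naming; host, credentials, and table name are placeholders):

```yaml
embeddings:
  type: io.kestra.plugin.ai.embeddings.PGVector
  host: pg.example.com
  port: 5432
  database: vectors
  user: "{{ kv('PG_USER') }}"
  password: "{{ kv('PG_PASSWORD') }}"
  table: kestra_embeddings
  useIndex: true   # IVFFlat: faster builds, less memory than HNSW, lower recall
```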
com.alicloud.openservices.tablestore.model.search.FieldSchema
analyzer string
Possible values: SingleWord, MaxWord, MinWord, Split, Fuzzy
analyzerParameter AnalyzerParameter
dateFormats array
enableHighlighting boolean
enableSortAndAgg boolean
fieldName string
fieldType string
Possible values: LONG, DOUBLE, BOOLEAN, KEYWORD, TEXT, NESTED, GEO_POINT, DATE, VECTOR, FUZZY_KEYWORD, IP, JSON, UNKNOWN
index boolean
indexOptions string
Possible values: DOCS, FREQS, POSITIONS, OFFSETS
isArray boolean
jsonType string
Possible values: FLATTEN, NESTED
sourceFieldNames array
store boolean
vectorOptions VectorOptions
Mistral AI Model Provider
apiKey *Requiredstring
API Key
modelName *Requiredstring
Model name
type *Requiredobject
baseUrl string
Base URL
Custom base URL to override the default endpoint (useful for local tests, WireMock, or enterprise gateways).
caPem string
CA PEM certificate content
CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.
clientPem string
Client PEM certificate content
PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.
Model Context Protocol (MCP) Stdio client tool
command *Requiredarray
MCP client command, as a list of command parts
type *Requiredobject
env object
Environment variables
logEvents booleanstring
Default: false
Log events
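For instance, a Stdio MCP tool attached to the task might be declared like this sketch (the type path and the MCP server command are assumptions for illustration):

```yaml
tools:
  - type: io.kestra.plugin.ai.tool.StdioMcpClient
    command: ["uvx", "mcp-server-fetch"]   # hypothetical MCP server command
    env:
      LOG_LEVEL: "debug"
    logEvents: true
```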
Call a Kestra flow as a tool
type *Requiredobject
description string
Description of the flow if not already provided inside the flow itself
Use it only if you define the flow in the tool definition. The LLM needs a tool description to identify whether to call it. If the flow has a description, the tool will use it. Otherwise, the description property must be explicitly defined.
flowId string
Flow ID of the flow that should be called
inheritLabels booleanstring
Default: false
Whether the flow should inherit labels from the execution that triggered it
By default, labels are not inherited. If you set this option to true, the flow execution will inherit all labels from the agent's execution.
Any labels passed by the LLM will override those defined here.
inputs object
Input values that should be passed to flow's execution
Any inputs passed by the LLM will override those defined here.
labels arrayobject
Labels that should be added to the flow's execution
Any labels passed by the LLM will override those defined here.
namespace string
Namespace of the flow that should be called
revision integerstring
Revision of the flow that should be called
scheduleDate string
Format: date-time
Schedule the flow execution at a later date
If the LLM sets a scheduleDate, it will override the one defined here.
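A sketch of a flow exposed as a tool (the type path is assumed from this page's naming; the namespace, flow, and inputs are placeholders):

```yaml
tools:
  - type: io.kestra.plugin.ai.tool.KestraFlow   # type path assumed
    namespace: company.ai
    flowId: send_report                           # placeholder flow
    description: Send the weekly report to Slack  # required if the flow has no description
    inputs:
      channel: "#reports"
    inheritLabels: true
```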
io.kestra.plugin.ai.embeddings.Elasticsearch-ElasticsearchConnection-BasicAuth
password string
Basic authorization password
username string
Basic authorization username
Model Context Protocol (MCP) SSE client tool
type *Requiredobject
url *Requiredstring
URL of the MCP server
headers object
Custom headers
Useful, for example, for adding authentication tokens via the Authorization header.
logRequests booleanstring
Default: false
Log requests
logResponses booleanstring
Default: false
Log responses
timeout string
Format: duration
Connection timeout duration
Chat Memory backed by PostgreSQL
database *Requiredstring
Database name
The name of the PostgreSQL database
host *Requiredstring
PostgreSQL host
The hostname of your PostgreSQL server
password *Requiredstring
Database password
The password to connect to PostgreSQL
type *Requiredobject
user *Requiredstring
Database user
The username to connect to PostgreSQL
drop string
Default: NEVER. Possible values: NEVER, BEFORE_TASKRUN, AFTER_TASKRUN
Drop memory: never, before, or after the agent's task run
By default, the memory ID is the value of the system.correlationId label, meaning that the same memory will be used by all tasks of the flow and its subflows.
If you want to remove the memory eagerly (before expiration), you can set drop: AFTER_TASKRUN to erase the memory after the taskrun.
You can also set drop: BEFORE_TASKRUN to drop the memory before the taskrun.
memoryId string
Default: {{ labels.system.correlationId }}
Memory ID - defaults to the value of the system.correlationId label. This means that a memory is valid for the entire flow execution including its subflows.
messages integerstring
Default: 10
Maximum number of messages to keep in memory. If memory is full, the oldest messages are removed in FIFO order. The last system message is always kept.
port integerstring
Default: 5432
PostgreSQL port
The port of your PostgreSQL server
tableName string
Default: chat_memory
Table name
The name of the table used to store chat memory. Defaults to 'chat_memory'.
ttl string
Default: PT1H (duration)
Memory duration - defaults to 1h
Deepseek Model Provider
apiKey *Requiredstring
API Key
modelName *Requiredstring
Model name
type *Requiredobject
baseUrl string
Default: https://api.deepseek.com/v1
API base URL
caPem string
CA PEM certificate content
CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.
clientPem string
Client PEM certificate content
PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.
Pinecone Embedding Store
apiKey *Requiredstring
The API key
cloud *Requiredstring
The cloud provider
index *Requiredstring
The index
region *Requiredstring
The cloud provider region
type *Requiredobject
namespace string
The namespace (default will be used if not provided)
Call an AI Agent as a tool
description *Requiredstring
Agent description
The description will be used to instruct the LLM what the tool is doing.
provider *Required
One of: AmazonBedrock, Anthropic, AzureOpenAI, DashScope, DeepSeek, GoogleGemini, GoogleVertexAI, HuggingFace, LocalAI, MistralAI, OciGenAI, Ollama, OpenAI, OpenRouter, WorkersAI, ZhiPuAI
Language model provider
type *Requiredobject
configuration ChatConfiguration
Default: {}
Language model configuration
contentRetrievers
One of: GoogleCustomWebSearch, SqlDatabaseRetriever, TavilyWebSearch
Content retrievers
Some content retrievers, like WebSearch, can also be used as tools. However, when configured as content retrievers, they will always be used, whereas tools are only invoked when the LLM decides to use them.
maxSequentialToolsInvocations integerstring
Maximum sequential tools invocations
name string
Default: tool
Agent name
It must be set to a different value than the default in case you want to have multiple agents used as tools in the same task.
systemMessage string
System message
The system message for the language model
tools
One of: A2AAgent, AIAgent, CodeExecution, DockerMcpClient, GoogleCustomWebSearch, KestraFlow, KestraTask, SseMcpClient, StdioMcpClient, StreamableHttpMcpClient, TavilyWebSearch
Tools that the LLM may use to augment its response
OpenRouter Model Provider
apiKey *Requiredstring
API Key
modelName *Requiredstring
Model name
type *Requiredobject
baseUrl string
Base URL
Custom base URL to override the default endpoint (useful for local tests, WireMock, or enterprise gateways).
caPem string
CA PEM certificate content
CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.
clientPem string
Client PEM certificate content
PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.
com.alicloud.openservices.tablestore.model.search.analysis.AnalyzerParameter
Model Context Protocol (MCP) Docker client tool
image *Requiredstring
Container image
type *Requiredobject
apiVersion string
API version
binds array
Volume binds
command array
MCP client command, as a list of command parts
dockerCertPath string
Docker certificate path
dockerConfig string
Docker configuration
dockerContext string
Docker context
dockerHost string
Docker host
dockerTlsVerify booleanstring
Whether Docker should verify TLS certificates
env object
Environment variables
logEvents booleanstring
falseWhether to log events
registryEmail string
Container registry email
registryPassword string
Container registry password
registryUrl string
Container registry URL
registryUsername string
Container registry username
Ollama Model Provider
endpoint *Requiredstring
Model endpoint
modelName *Requiredstring
Model name
type *Requiredobject
baseUrl string
Base URL
Custom base URL to override the default endpoint (useful for local tests, WireMock, or enterprise gateways).
caPem string
CA PEM certificate content
CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.
clientPem string
Client PEM certificate content
PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.
SQL Database content retriever using LangChain4j experimental SqlDatabaseContentRetriever. ⚠ IMPORTANT: the database user should have READ-ONLY permissions.
databaseType *Requiredstring
Possible values: POSTGRESQL, MYSQL, H2
Type of database to connect to (PostgreSQL, MySQL, or H2)
Determines the default JDBC driver and connection format.
password *Requiredstring
Database password
provider *RequiredAmazonBedrockAnthropicAzureOpenAIDashScopeDeepSeekGoogleGeminiGoogleVertexAIHuggingFaceLocalAIMistralAIOciGenAIOllamaOpenAIOpenRouterWorkersAIZhiPuAI
Language model provider
type *Requiredobject
username *Requiredstring
Database username
configuration ChatConfiguration
{}Language model configuration
driver string
Optional JDBC driver class name – automatically resolved if not provided.
jdbcUrl string
JDBC connection URL to the target database
maxPoolSize integerstring
2Maximum number of database connections in the pool
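A hypothetical sketch of this retriever inside a RAG chat task, using the properties listed above. The retriever's type identifier and the `contentRetrievers` property name on the task are assumptions; the JDBC URL, database, and prompt are placeholders. Note the warning above: the database user should have READ-ONLY permissions.

```yaml
- id: chat_with_sql_retriever
  type: io.kestra.plugin.ai.rag.ChatCompletion
  chatProvider:
    type: io.kestra.plugin.ai.provider.GoogleGemini
    modelName: gemini-2.5-flash
    apiKey: "{{ kv('GEMINI_API_KEY') }}"
  contentRetrievers:
    - type: io.kestra.plugin.ai.retriever.SqlDatabaseContentRetriever  # assumed type name
      databaseType: POSTGRESQL
      jdbcUrl: jdbc:postgresql://localhost:5432/analytics
      username: readonly_user               # use a READ-ONLY database user
      password: "{{ kv('DB_PASSWORD') }}"
      maxPoolSize: 2                        # the documented default
      provider:
        type: io.kestra.plugin.ai.provider.GoogleGemini
        modelName: gemini-2.5-flash
        apiKey: "{{ kv('GEMINI_API_KEY') }}"
  prompt: How many orders were placed last week?
```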
io.kestra.plugin.ai.domain.AIOutput-ContentSource
content string
Extracted text segment
A snippet of text relevant to the user's query, typically a sentence, paragraph, or other discrete unit of text.
metadata object
Source metadata
Key-value pairs providing context about the origin of the content, such as URLs, document titles, or other relevant attributes.
io.kestra.plugin.ai.domain.ChatConfiguration
logRequests boolean | string
Log LLM requests
If true, prompts and configuration sent to the LLM will be logged at INFO level.
logResponses boolean | string
Log LLM responses
If true, raw responses from the LLM will be logged at INFO level.
maxToken integer | string
Maximum number of tokens the model can generate in the completion (response). This limits the length of the output.
responseFormat ChatConfiguration-ResponseFormat
Response format
Defines the expected output format. Default is plain text.
Some providers allow requesting JSON or schema-constrained outputs, but support varies and may be incompatible with tool use.
When using a JSON schema, the output will be returned under the key jsonOutput.
returnThinking boolean | string
Return Thinking
Controls whether to return the model's internal reasoning or 'thinking' text, if available. When enabled, the reasoning content is extracted from the response and made available in the AiMessage object. It does not trigger the thinking process itself; it only affects whether the output is parsed and returned.
seed integer | string
Seed
Optional random seed for reproducibility. Provide a positive integer (e.g., 42, 1234). Using the same seed with identical settings produces repeatable outputs.
temperature number | string
Temperature
Controls randomness in generation. Typical range is 0.0–1.0. Lower values (e.g., 0.2) make outputs more focused and deterministic, while higher values (e.g., 0.7–1.0) increase creativity and variability.
thinkingBudgetTokens integer | string
Thinking Token Budget
Maximum number of tokens allocated as a budget for internal reasoning, such as generating intermediate thoughts or chain-of-thought sequences, allowing the model to perform multi-step reasoning before producing the final output.
thinkingEnabled boolean | string
Enable Thinking
Enables internal reasoning ('thinking') in supported language models, allowing the model to perform intermediate reasoning steps before producing a final output. This is useful for complex tasks such as multi-step problem solving or decision making, but it may increase token usage and response time, and applies only to compatible models.
topK integer | string
Top-K
Limits sampling to the top K most likely tokens at each step. Typical values are between 20 and 100. Smaller values reduce randomness; larger values allow more diverse outputs.
topP number | string
Top-P (nucleus sampling)
Selects from the smallest set of tokens whose cumulative probability is ≤ topP. Typical values are 0.8–0.95. Lower values make the output more focused, higher values increase diversity.
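Putting the properties above together, a ChatConfiguration block as it might appear under a chat task's `configuration` property (property names are taken from this reference; the values are illustrative):

```yaml
configuration:
  temperature: 0.2          # focused, mostly deterministic output
  topP: 0.9                 # nucleus sampling cutoff
  topK: 40                  # sample from the 40 most likely tokens
  maxToken: 1024            # cap the completion length
  seed: 42                  # repeatable outputs with identical settings
  thinkingEnabled: true     # enable internal reasoning on supported models
  thinkingBudgetTokens: 512 # budget for chain-of-thought tokens
  returnThinking: true      # surface the reasoning text in the output
  logRequests: true         # log prompts at INFO level
  logResponses: false
```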
io.kestra.plugin.ai.domain.TokenUsage
inputTokenCount integer
outputTokenCount integer
totalTokenCount integer
WorkersAI Model Provider
accountId *Required string
Account Identifier
Unique identifier assigned to an account
apiKey *Required string
API Key
modelName *Required string
Model name
type *Required object
baseUrl string
Base URL
Custom base URL to override the default endpoint (useful for local tests, WireMock, or enterprise gateways).
caPem string
CA PEM certificate content
CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.
clientPem string
Client PEM certificate content
PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.
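A minimal WorkersAI provider block using the properties above; the type identifier follows the naming pattern of the other providers in this reference, and the model name is a placeholder.

```yaml
provider:
  type: io.kestra.plugin.ai.provider.WorkersAI    # assumed type name
  accountId: "{{ kv('CF_ACCOUNT_ID') }}"          # required: account identifier
  apiKey: "{{ kv('CF_API_TOKEN') }}"              # required
  modelName: "@cf/meta/llama-3.1-8b-instruct"     # illustrative model name
```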
Azure OpenAI Model Provider
endpoint *Required string
API endpoint
The Azure OpenAI endpoint in the format: https://{resource}.openai.azure.com/
modelName *Required string
Model name
type *Required object
apiKey string
API Key
baseUrl string
Base URL
Custom base URL to override the default endpoint (useful for local tests, WireMock, or enterprise gateways).
caPem string
CA PEM certificate content
CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.
clientId string
Client ID
clientPem string
Client PEM certificate content
PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.
clientSecret string
Client secret
serviceVersion string
API version
tenantId string
Tenant ID
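An Azure OpenAI provider block using the properties above; the type identifier follows the naming pattern of the other providers in this reference, and the endpoint, model, and version values are placeholders.

```yaml
provider:
  type: io.kestra.plugin.ai.provider.AzureOpenAI    # assumed type name
  endpoint: https://my-resource.openai.azure.com/   # required
  modelName: gpt-4o-mini                            # required: the deployment's model
  apiKey: "{{ kv('AZURE_OPENAI_API_KEY') }}"
  serviceVersion: 2024-02-01                        # optional API version (illustrative)
  # Alternatively, authenticate with a service principal via
  # tenantId, clientId, and clientSecret.
```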
Tablestore Embedding Store
accessKeyId *Required string
Access Key ID
The access key ID used for authentication with the database.
accessKeySecret *Required string
Access Key Secret
The access key secret used for authentication with the database.
endpoint *Required string
Endpoint URL
The base URL for the Tablestore database endpoint.
instanceName *Required string
Instance Name
The name of the Tablestore database instance.
type *Required object
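A Tablestore embedding store block using the properties above; the type identifier and the endpoint value are assumptions.

```yaml
embeddings:
  type: io.kestra.plugin.ai.embeddings.Tablestore   # assumed type name
  endpoint: https://my-instance.cn-hangzhou.ots.aliyuncs.com  # illustrative endpoint
  instanceName: my-instance
  accessKeyId: "{{ kv('TABLESTORE_ACCESS_KEY_ID') }}"
  accessKeySecret: "{{ kv('TABLESTORE_ACCESS_KEY_SECRET') }}"
```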
Google Gemini Model Provider
apiKey *Required string
API Key
modelName *Required string
Model name
type *Required object
baseUrl string
Base URL
Custom base URL to override the default endpoint (useful for local tests, WireMock, or enterprise gateways).
caPem string
CA PEM certificate content
CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.
clientPem string
Client PEM certificate content
PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.
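A Google Gemini provider block, as used in the RAG example flow earlier in this documentation:

```yaml
provider:
  type: io.kestra.plugin.ai.provider.GoogleGemini
  modelName: gemini-2.5-flash
  apiKey: "{{ kv('GEMINI_API_KEY') }}"
```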
In-memory embedding store that stores data as Kestra KV pairs
type *Required object
kvName string
The name of the KV pair to use (default: {{flow.id}}-embedding-store)
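An embedding store block for the KV store, as used in the RAG example flow earlier in this documentation. Remember the warning above: this store loads all embedding vectors into memory and is intended for quick prototyping only.

```yaml
embeddings:
  type: io.kestra.plugin.ai.embeddings.KestraKVStore
  kvName: my-embedding-store   # optional; defaults to {{flow.id}}-embedding-store
```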
Model Context Protocol (MCP) SSE client tool
sseUrl *Required string
SSE URL of the MCP server
type *Required object
headers object
Custom headers
Useful, for example, to add authentication tokens via the Authorization header.
logRequests boolean | string
Log requests (default: false)
logResponses boolean | string
Log responses (default: false)
timeout string (duration)
Connection timeout duration
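An SSE MCP client tool block using the properties above; the type identifier is an assumption based on the Docker MCP client naming, and the URL and token are placeholders.

```yaml
tools:
  - type: io.kestra.plugin.ai.tool.SseMcpClient   # assumed type name
    sseUrl: https://mcp.example.com/sse           # required: SSE URL of the MCP server
    headers:
      Authorization: "Bearer {{ kv('MCP_TOKEN') }}"
    timeout: PT30S        # ISO-8601 duration
    logRequests: true     # defaults to false
```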
Weaviate Embedding Store
apiKey *Required string
API key
Weaviate API key. Omit for local deployments without auth.
host *Required string
Host
Cluster host name without protocol, e.g., "abc123.weaviate.network".
type *Required object
avoidDups boolean | string
Avoid duplicates
If true (default), a hash-based ID is derived from each text segment to prevent duplicates. If false, a random ID is used.
consistencyLevel string
Possible values: ONE, QUORUM, ALL
Consistency level
Write consistency: ONE, QUORUM (default), or ALL.
grpcPort integer | string
gRPC port
Port for gRPC if enabled (e.g., 50051).
metadataFieldName string
Metadata field name
Field used to store metadata. Defaults to "_metadata" if not set.
metadataKeys array
Metadata keys
The list of metadata keys to store - if not provided, it will default to an empty list.
objectClass string
Object class
Weaviate class to store objects in (must start with an uppercase letter). Defaults to "Default" if not set.
port integer | string
Port
Optional port (e.g., 443 for https, 80 for http). Leave unset to use provider defaults.
scheme string
Scheme
Cluster scheme: "https" (recommended) or "http".
securedGrpc boolean | string
Secure gRPC
Whether the gRPC connection is secured (TLS).
useGrpcForInserts boolean | string
Use gRPC for batch inserts
If true, use gRPC for batch inserts. HTTP remains required for search operations.
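A Weaviate embedding store block using the properties above; the type identifier follows the pattern of the KestraKVStore entry and is an assumption, as are the host and class names.

```yaml
embeddings:
  type: io.kestra.plugin.ai.embeddings.Weaviate   # assumed type name
  scheme: https
  host: abc123.weaviate.network    # host name without protocol
  apiKey: "{{ kv('WEAVIATE_API_KEY') }}"
  objectClass: KestraDocs          # must start with an uppercase letter
  consistencyLevel: QUORUM         # the documented default
  avoidDups: true                  # hash-based IDs prevent duplicate segments
```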
DashScope (Qwen) Model Provider from Alibaba Cloud
apiKey *Required string
API Key
modelName *Required string
Model name
type *Required object
baseUrl string
API base URL (default: https://dashscope-intl.aliyuncs.com/api/v1)
For models in the China (Beijing) region, replace the URL with https://dashscope.aliyuncs.com/api/v1;
otherwise use the Singapore region URL https://dashscope-intl.aliyuncs.com/api/v1.
The default value is computed based on the system timezone.
caPem string
CA PEM certificate content
CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.
clientPem string
Client PEM certificate content
PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.
enableSearch boolean | string
Whether the model uses Internet search results for reference when generating text
maxTokens integer | string
The maximum number of tokens returned by this request
repetitionPenalty number | string
Penalty for repetition in a continuous sequence during model generation
Increasing repetition_penalty reduces repetition in model generation;
1.0 means no penalty. Value range: (0, +inf)
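A DashScope provider block using the properties above; the type identifier follows the naming pattern of the other providers in this reference, and the model name is a placeholder.

```yaml
provider:
  type: io.kestra.plugin.ai.provider.DashScope   # assumed type name
  apiKey: "{{ kv('DASHSCOPE_API_KEY') }}"
  modelName: qwen-plus                           # illustrative model name
  baseUrl: https://dashscope-intl.aliyuncs.com/api/v1   # Singapore region
  enableSearch: true         # ground generation in Internet search results
  repetitionPenalty: 1.1     # values above 1.0 discourage repetition
```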
Amazon Bedrock Model Provider
accessKeyId *Required string
AWS Access Key ID
modelName *Required string
Model name
secretAccessKey *Required string
AWS Secret Access Key
type *Required object
baseUrl string
Base URL
Custom base URL to override the default endpoint (useful for local tests, WireMock, or enterprise gateways).
caPem string
CA PEM certificate content
CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.
clientPem string
Client PEM certificate content
PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.
modelType string
Amazon Bedrock Embedding Model Type (default: COHERE; possible values: COHERE, TITAN)
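An Amazon Bedrock provider block using the properties above; the type identifier follows the naming pattern of the other providers in this reference, and the model ID is a placeholder.

```yaml
provider:
  type: io.kestra.plugin.ai.provider.AmazonBedrock   # assumed type name
  accessKeyId: "{{ kv('AWS_ACCESS_KEY_ID') }}"
  secretAccessKey: "{{ kv('AWS_SECRET_ACCESS_KEY') }}"
  modelName: amazon.titan-embed-text-v2:0   # illustrative model ID
  modelType: TITAN                          # embedding model type: COHERE or TITAN
```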
HuggingFace Model Provider
apiKey *Required string
API Key
modelName *Required string
Model name
type *Required object
baseUrl string
API base URL (default: https://router.huggingface.co/v1)
caPem string
CA PEM certificate content
CA certificate as text, used to verify SSL/TLS connections when using custom endpoints.
clientPem string
Client PEM certificate content
PEM client certificate as text, used to authenticate the connection to enterprise AI endpoints.
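A HuggingFace provider block using the properties above; the type identifier follows the naming pattern of the other providers in this reference, and the model name is a placeholder.

```yaml
provider:
  type: io.kestra.plugin.ai.provider.HuggingFace   # assumed type name
  apiKey: "{{ kv('HF_TOKEN') }}"
  modelName: meta-llama/Llama-3.1-8B-Instruct      # illustrative model name
  # baseUrl defaults to https://router.huggingface.co/v1
```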