Tasks in Workflows
Tasks are the fundamental building blocks of Conductor workflows. Each task represents a discrete unit of work that can be executed within a larger business process. Understanding how tasks work is essential to building effective workflows in Conductor.
When building workflows, you can use the built-in system tasks and operators provided by Conductor, and also write your own custom worker tasks to implement custom logic.
Built-in tasks
Conductor provides built-in system tasks and operators that let you build workflows without writing custom workers. These are managed by Conductor and run within its JVM (Java Virtual Machine), allowing you to get started quickly.
- System tasks perform units of work, such as calling APIs, transforming data, or interacting with external services. System tasks include AI tasks that can be used to build AI-powered and agentic applications.
- Operators control workflow execution, such as branching, looping, and parallel execution.
Operators
Use these operators as control structures for managing the execution flow in your Conductor workflows.
| Use Case | Task to Use |
|---|---|
| Conditional flow | Switch |
| Looping flow | Do While |
| Parallel flows | Fork/Join, Dynamic Fork |
| Jumps or state changes in flow | Terminate Workflow, Terminate Task, Start Workflow, Sub Workflow |
| State querying | Get Workflow |
| Waits in flow | Wait |
| Dynamic tasks in flow | Dynamic |
| For assigning variables | Set Variable |
System tasks
In most cases, you can use an existing Conductor system task instead of creating a custom worker from scratch. These include tasks for data transformation, user journeys, and LLM chaining.
| Use Case | Task to Use |
|---|---|
| Publish or consume events | Event |
| Call an API or HTTP endpoint | HTTP |
| Poll an API or HTTP endpoint | HTTP Poll |
| Execute JavaScript scripts | Inline |
| Clean or transform JSON data | JSON JQ Transform |
| Evaluate business rules defined in spreadsheets | Business Rule |
| Send email using SendGrid integration | SendGrid |
| Pause the current workflow for an incoming webhook signal | Wait for Webhook |
| Modify SQL databases | JDBC |
| Create or update secrets in your Conductor cluster | Update Secret |
| Get authorized using a signed JWT | Get Signed JWT |
| Update the status of another ongoing task | Update Task |
| Invoke remote endpoints in gRPC services | gRPC |
| Query data from Conductor Search API or Metrics | Query Processor |
| Send alerts to Opsgenie | Opsgenie |
| Generate text from an LLM based on a defined prompt | Text Complete |
| Generate text from an LLM based on a user query and additional system/assistant instructions | Chat Complete |
| Generate text embeddings | Generate Embeddings |
| Store text embeddings in a vector database | Store Embeddings |
| Retrieve data from a vector database | Get Embeddings |
| Chunk, generate, and store text embeddings in a vector database | Index Document |
| Retrieve text or JSON content from a URL | Get Document |
| Generate and store text embeddings in a vector database | Index Text |
| Retrieve data from a vector database based on a search query | Search Index |
| Divide text into chunks based on the document type | Chunk Text |
| Retrieve files from a specific location | List Files |
| Parse and chunk documents from various storage locations | Parse Document |
Custom worker tasks
A Worker task can be used to implement custom logic beyond Conductor's built-in tasks. Use custom worker tasks when you need to:
- Integrate with proprietary internal systems or APIs not covered by built-in tasks
- Implement complex business logic specific to your domain
- Execute operations that require access to resources outside Conductor's environment
- Perform computationally intensive processing that benefits from dedicated workers
Worker tasks can be written in any language for which a Conductor SDK is available, such as Python, Java, JavaScript, C#, Go, and Clojure. Unlike a built-in task, a Worker task requires setting up a worker outside the Conductor environment that polls for and executes the task. Your worker application continuously polls Conductor for tasks to execute, processes them, and reports the results back.
Before a Worker task can be added to a workflow, it must be registered as a task definition in Conductor. This allows Conductor to apply retries, timeouts, rate limits, and RBAC controls.
Refer to Writing Workers for Conductor Workflows for more information.
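For illustration, here is a minimal sketch of how a registered Worker task could be referenced in a workflow definition's tasks array. The task name calculate-fx and its input parameters are assumptions for this example; the SIMPLE type marks it as a custom worker task:
{
  "name": "calculate-fx",
  "taskReferenceName": "calculate_fx_ref",
  "type": "SIMPLE",
  "inputParameters": {
    "amount": "${workflow.input.amount}",
    "targetCurrency": "${workflow.input.targetCurrency}"
  }
}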
Task definitions and task configurations
When working with tasks in Conductor, it's important to understand the distinction between task definitions and task configurations.
Task definition
A task definition specifies a task’s general implementation details, such as expected input and output keys, and failure-handling configurations, including rate limits, retries, and timeouts. This definition applies to all instances of the task across workflows.
All task definitions are stored as JSON. These parameters can be updated in real time without needing to redeploy your application.
Before a Worker task or a Human task can be added to a workflow, it must be registered as a task definition in Conductor. This allows Conductor to apply retries, timeouts, rate limits, and RBAC controls. You can also create task definitions for other system tasks to configure task-specific retry, timeout, rate limit, and other settings.
Task definitions can be registered via the Conductor UI or through SDK/API. Once registered, they can be referenced and used in different workflows.
Example
Here is an example task definition JSON:
{
"createTime": 1721901586970,
"updateTime": 1725926875230,
"createdBy": "user@acme.com",
"updatedBy": "user@acme.com",
"name": "calculate-fx",
"description": "Calculates currency exchange",
"retryCount": 0,
"timeoutSeconds": 3600,
"inputKeys": [],
"outputKeys": [],
"timeoutPolicy": "TIME_OUT_WF",
"retryLogic": "EXPONENTIAL_BACKOFF",
"retryDelaySeconds": 30,
"responseTimeoutSeconds": 600,
"concurrentExecLimit": 20,
"inputTemplate": {},
"rateLimitPerFrequency": 10,
"rateLimitFrequencyInSeconds": 1,
"ownerEmail": "user@acme.com",
"pollTimeoutSeconds": 3600,
"backoffScaleFactor": 1,
"enforceSchema": false
}
Task configuration
A task configuration is part of the workflow definition. It specifies workflow-specific implementation details, such as the task reference name, task type, and task input parameters.
Although each task type has its own configuration, all tasks share several common parameters.
- For all tasks, the configuration specifies the input parameters for the task.
- For custom worker tasks, the configuration contains a reference to a registered worker task definition.
- For system tasks and operators, the configuration includes parameters that control the task behavior. For example, the configuration for an HTTP task specifies the endpoint URL and the payload template, which will be used during task execution.
Refer to the Task Reference to learn more about the task configuration for each task type.
Common configuration parameters
The task configurations appear in the tasks array of the workflow definition JSON. For example:
{
"name": "WorkflowDefinition",
"description": "Workflow definition",
"version": 1,
"tasks": [], // The task configuration appears here
"inputParameters": [],
"outputParameters": {},
"schemaVersion": 2,
"restartable": true,
"workflowStatusListenerEnabled": false,
"ownerEmail": "john.doe@acme.com",
"timeoutPolicy": "ALERT_ONLY",
"timeoutSeconds": 0,
"failureWorkflow": ""
}
Each task configuration JSON object may contain the following parameters (an example follows the table):
| Parameter | Description | Required/Optional |
|---|---|---|
| name | Name of the task. The default value is the same as the task type. The name can be changed to something descriptive, like “getUsers”. To use a given task definition, the task name here must match the task definition name (case-sensitive). Note: It is recommended to use alphanumeric characters for task names. While special characters are allowed for backward compatibility, they are not fully supported and may cause unexpected behavior. | Required. |
| taskReferenceName | Reference name for the task. Must be a unique value in a given workflow. | Required. |
| type | The task type. For example, HTTP, SIMPLE. | Required. |
| inputParameters | Map of the task’s input parameters. | Depends on the task type. |
| optional | Whether the task is optional. If set to true, any task failure is ignored, and the workflow continues with the task status updated to COMPLETED_WITH_ERRORS. However, the task must reach a terminal state. If the task remains incomplete, the workflow waits until it reaches a terminal state before proceeding. | Optional. |
| asyncComplete | Whether the task is completed asynchronously. The default value is false. If set to false, the task is marked as completed once it finishes executing. If set to true, the task remains IN_PROGRESS until it is completed by an external event or API call. | Optional. |
| startDelay | The time in seconds to wait before making the task available for worker polling. The default value is 0. | Optional. |
| onStateChange | Configuration for publishing an event when the task status changes. | Optional. |
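To illustrate how these common parameters combine with type-specific settings, here is a sketch of an HTTP task configuration (the endpoint URL and names are placeholders, not part of any real workflow):
{
  "name": "get_users",
  "taskReferenceName": "get_users_ref",
  "type": "HTTP",
  "inputParameters": {
    "http_request": {
      "uri": "https://example.com/api/users",
      "method": "GET"
    }
  },
  "optional": false,
  "startDelay": 0
}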
Dealing with data
Conductor provides several mechanisms to pass, validate, and protect data as it moves between tasks. This section covers how to pass data using dynamic references and input templates, mask sensitive values, and enforce schema validation for inputs and outputs.
Passing data between tasks
Data can be passed from one task to another using dynamic references, which are formatted as JSONPath expressions. Refer to Wiring Parameters to learn more.
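For example, a task's input parameters can pull values from the workflow input or from a preceding task's output. The reference name get_users_ref and the output keys here are hypothetical:
"inputParameters": {
  "userId": "${workflow.input.userId}",
  "users": "${get_users_ref.output.response}"
}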
Passing data using task input templates
Use the task input templates in a task definition to apply default parameters to all instances of the task. Refer to Using Task Input Templates to learn more.
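As a sketch, a task definition's inputTemplate can carry default values that apply whenever the task is used, with inputs supplied in the task configuration expected to take precedence over these defaults (the keys below are hypothetical):
"inputTemplate": {
  "baseCurrency": "USD",
  "rateProviderUrl": "https://example.com/rates"
}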
Masking data in tasks
Masking parameters protects sensitive data from exposure in workflows by ensuring that sensitive values are hidden and not displayed in workflow definitions or executions. Refer to Masking Parameters to learn more.
Input/output schema validation
Create schemas to define and enforce the payload structure of workflow or task inputs/outputs. Refer to Input/Output Schema Validations to learn more.
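As a rough sketch, a task definition could reference a registered schema and turn on enforcement via the enforceSchema flag shown in the earlier example; treat the exact field names and schema shape below as assumptions to verify against the linked guide:
"inputSchema": {
  "name": "calculate-fx-input",
  "version": 1,
  "type": "JSON"
},
"enforceSchema": true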
Task reuse
Since task workers typically perform a unit of work as part of a larger workflow, Conductor’s infrastructure is built to enable task reusability out of the box. Once a task is defined in Conductor, it can be reused numerous times:
- In the same workflow, using different task reference names.
- Across various workflows.
When reusing tasks, it's important to consider how a multi-tenant system behaves. By default, all work assigned to a worker is placed in the same task queue, so a noisy neighbor that consumes most of the task queue capacity can slow down polling for your worker. You can address this by:
- Scaling up the number of workers to handle the task load
- Using task-to-domain to route the task load into separate queues, providing isolation between different users or projects (see the sketch below)
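As a sketch, the domain mapping is supplied when starting the workflow, and the corresponding workers poll using the same domain. The workflow name, task name, and domain below are hypothetical:
{
  "name": "MyWorkflow",
  "version": 1,
  "input": {},
  "taskToDomain": {
    "calculate-fx": "team-fx"
  }
}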
Task lifecycle
Understanding the lifecycle of a task helps you debug and monitor your workflows effectively. Learn more in Task State Transitions.
Common pitfalls to avoid
- Don't reuse the same taskReferenceName within a workflow; each task instance must have a unique reference name.
- Always configure appropriate timeouts and retries in task definitions to prevent workflows from hanging indefinitely.
- Remember that task definitions are shared across workflows; changing the retry logic affects all workflows that use that task.
- Consider the impact on shared task queues when deploying high-volume workflows.
Learn how to configure input/output parameters to be used in tasks.
📄️ Wiring Parameters
Learn how to configure variable task inputs and create the right expressions to dynamically reference and pass data between tasks.
📄️ Masking Parameters
Learn to securely pass sensitive data in Conductor by masking parameters, ensuring privacy and preventing unauthorized access to confidential data.
📄️ Using Task Input Templates
Learn how to configure task input templates and use them in workflow definitions.
📄️ Caching Task Outputs
Learn how to cache task outputs for quick access.