Skip to main content

Fork Task

"type" : "FORK_JOIN"

Introduction

A Fork operation lets you run a specified list of other tasks or sub workflows in parallel. A fork task is followed by a join operation that waits on the forked tasks or sub workflows to finish. The JOIN task also collects outputs from each of the forked tasks or sub workflows.

a simple fork diagram

Use Cases

FORK_JOIN tasks are typically used when a list of tasks can be run in parallel. E.g In a notification workflow, there could be multiple ways of sending notifications, i,e e-mail, SMS, HTTP etc.. These notifications are not dependent on each other, and so they can be run in parallel. In such cases, you can create 3 sub-lists of forked tasks for each of these operations.

Configuration

A FORK_JOIN task, has a forkTasks attribute that expects an array. Each array, is a sub-list of tasks. Each of these sub-lists and then invoked in parallel. The tasks defined within each sublist can be sequential or any other way as desired.

A FORK_JOIN task has to be followed by a JOIN operation. The JOIN operator specifies which of the forked tasks to joinOn (wait for completion) before moving to the next stage in the workflow.

[
{
"name": "fork_join",
"taskReferenceName": "my_fork_join_ref",
"type": "FORK_JOIN",
"forkTasks": [
[
{
"name": "process_notification_payload",
"taskReferenceName": "process_notification_payload_email",
"type": "SIMPLE"
},
{
"name": "email_notification",
"taskReferenceName": "email_notification_ref",
"type": "SIMPLE"
}
],
[
{
"name": "process_notification_payload",
"taskReferenceName": "process_notification_payload_sms",
"type": "SIMPLE"
},
{
"name": "sms_notification",
"taskReferenceName": "sms_notification_ref",
"type": "SIMPLE"
}
],
[
{
"name": "process_notification_payload",
"taskReferenceName": "process_notification_payload_http",
"type": "SIMPLE"
},
{
"name": "http_notification",
"taskReferenceName": "http_notification_ref",
"type": "SIMPLE"
}
]
]
},
{
"name": "notification_join",
"taskReferenceName": "notification_join_ref",
"type": "JOIN",
"joinOn": [
"email_notification_ref",
"sms_notification_ref"
]
}
]

Here is how the output of notification_join will look like. The output is a map, where the keys are the names of task references that were being joinOn. The corresponding values are the outputs of those tasks.


{
"email_notification_ref": {
"email_sent_at": "2021-11-06T07:37:17+0000",
"email_sent_to": "test@example.com"
},
"sms_notification_ref": {
"sms_sent_at": "2021-11-06T07:37:17+0129",
"sms_sent_to": "+1-xxx-xxx-xxxx"
}
}

Input Configuration

AttributeDescription
nameTask Name. A unique name that is descriptive of the task function
taskReferenceNameTask Reference Name. A unique reference to this task. There can be multiple references of a task within the same workflow definition
typeTask Type. In this case, FORK_JOIN
inputParametersThe input parameters that will be supplied to this task
forkTasksA list of a list of tasks. Each of the outer list will be invoked in parallel. The inner list can be a graph of other tasks and sub-workflows

Output Configuration

This is the output configuration of the JOIN task that is used in conjunction with the FORK_JOIN task. The output of the JOIN task is a map, where the keys are the names of the task reference names where were being joinOn and the keys are the corresponding outputs of those tasks.

AttributeDescription
task_ref_name_1A task reference name that was being joinOn. The value is the output of that task
task_ref_name_2A task reference name that was being joinOn. The value is the output of that task
......
task_ref_name_NA task reference name that was being joinOn. The value is the output of that task

Example

Imagine a workflow that sends 3 notifications: email, SMS and HTTP. Since none of these steps are dependant on the others, they can be run in parallel with a fork.

The diagram will appear as:

fork diagram

Here's the JSON definition for the workflow:

[
{
"name": "fork_join",
"taskReferenceName": "my_fork_join_ref",
"type": "FORK_JOIN",
"forkTasks": [
[
{
"name": "process_notification_payload",
"taskReferenceName": "process_notification_payload_email",
"type": "SIMPLE"
},
{
"name": "email_notification",
"taskReferenceName": "email_notification_ref",
"type": "SIMPLE"
}
],
[
{
"name": "process_notification_payload",
"taskReferenceName": "process_notification_payload_sms",
"type": "SIMPLE"
},
{
"name": "sms_notification",
"taskReferenceName": "sms_notification_ref",
"type": "SIMPLE"
}
],
[
{
"name": "process_notification_payload",
"taskReferenceName": "process_notification_payload_http",
"type": "SIMPLE"
},
{
"name": "http_notification",
"taskReferenceName": "http_notification_ref",
"type": "SIMPLE"
}
]
]
},
{
"name": "notification_join",
"taskReferenceName": "notification_join_ref",
"type": "JOIN",
"joinOn": [
"email_notification_ref",
"sms_notification_ref"
]
}
]

Note: There are three parallel 'tines' to this fork, but only two of the outputs are required for the JOIN to continue. The diagram does draw an arrow from http_notification_ref to the notification_join, but it is not required for the workflow to continue.

Here is how the output of notification_join will look like. The output is a map, where the keys are the names of task references that were being joinOn. The corresponding values are the outputs of those tasks.


{
"email_notification_ref": {
"email_sent_at": "2021-11-06T07:37:17+0000",
"email_sent_to": "test@example.com"
},
"sms_notification_ref": {
"sms_sent_at": "2021-11-06T07:37:17+0129",
"sms_sent_to": "+1-xxx-xxx-xxxx"
}
}

See JOIN for more details on the JOIN aspect of the FORK.

Codelab Examples