Orkes Conductor provides a visual representation of workflows and their executions in the Conductor UI, which helps you spot and resolve issues quickly.
Searching/Viewing Workflow Executions
All recent workflow executions are listed on the WORKFLOWS > Executions page. By default, this view is filtered by the user's permissions, i.e., users can view executions of only the workflows they are permitted to access.
You can filter this page by workflow name, workflow ID, status, time period, or number of past days. Click an execution to view its details.
You can also search for workflow names using partial values with wildcard (*) support. For example, to find workflow names containing "test", search for test*, and the page displays all workflows with "test" in their name.
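The same kind of filter can also be expressed outside the UI. The following is a minimal sketch, assuming your cluster exposes the open-source Conductor search endpoint (`/api/workflow/search`) with `freeText`, `query`, `start`, and `size` parameters; the base URL, the helper name `build_search_url`, and the query syntax are assumptions to verify against your Conductor version. The sketch only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

def build_search_url(base_url, name_pattern="*", status=None, start=0, size=15):
    """Build a workflow-search URL (assumed endpoint: /api/workflow/search)."""
    params = {"start": start, "size": size, "freeText": name_pattern}
    if status:
        # Assumed query syntax; check it against your Conductor version.
        params["query"] = f"status IN ({status})"
    return f"{base_url.rstrip('/')}/api/workflow/search?{urlencode(params)}"

# Hypothetical host name, used only for illustration.
url = build_search_url("https://conductor.example.com", "test*", status="FAILED")
print(url)
```

Building the URL separately from sending it makes it easy to inspect the filter before running it against a real cluster.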
A sample execution looks like this:
The page consists of the following sub-tabs:
- Diagram - Shows the visual representation of the workflow. The diagram also reflects the state of a workflow that has not completed or has failed.
- Task List - Includes the details of the tasks within the workflow.
- Timeline - A timeline showing the time taken by different tasks for execution.
- Summary - Includes the workflow details such as workflow ID, status, version, start/end time, and duration.
- Workflow Input/Output - Shows the list of the inputs and outputs of the workflow.
- JSON - Includes the complete JSON of the workflow.
Each of these tabs gives the details that can help debug workflow issues.
The Diagram tab on the workflow execution page shows the workflow diagram. Successful tasks appear in green, failed tasks appear in red, and tasks completed with errors appear in orange.
A task ends up with the status "Completed with Errors" only when it is marked as optional: true in the workflow definition. The default value of this setting is false, so it must be set explicitly for the workflow to continue when the task fails.
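The rule above can be sketched in a few lines. This is an illustration under assumptions, not the Conductor implementation: the task name and reference name are hypothetical, the status strings are assumed to match those used in the execution JSON, and the color mapping simply restates the description above.

```python
# Hypothetical task entry from a workflow definition; only "optional" matters here.
optional_task = {
    "name": "send_notification",               # hypothetical task name
    "taskReferenceName": "send_notification_ref",
    "type": "SIMPLE",
    "optional": True,                          # the default is False
}

# Diagram colors as described above, keyed by (assumed) task status strings.
STATUS_COLORS = {
    "COMPLETED": "green",
    "FAILED": "red",
    "COMPLETED_WITH_ERRORS": "orange",
}

def status_after_failure(task):
    """Status a failing task ends up with, following the optional rule above."""
    return "COMPLETED_WITH_ERRORS" if task.get("optional", False) else "FAILED"

def diagram_color(status):
    """Return the diagram color for a status, or None if not covered above."""
    return STATUS_COLORS.get(status)

print(diagram_color(status_after_failure(optional_task)))
```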
Clicking on the failed task gives the failure details. The following fields are helpful in debugging:
| Field | Description |
| --- | --- |
| Summary > Reason for Incompletion | Displays the reason the task did not complete. It can capture details such as exceptions thrown by the worker and task-related errors such as a failure to invoke an HTTP endpoint. |
| Summary > Worker | Shows the worker instance that polled the task. This helps you locate the worker's logs when Conductor has not captured them. |
| Input | Verify that the task inputs were computed and provided correctly. |
| Output | Verify the task's output. If the output is supplied as the next task's input, it can be checked here. |
| Logs | Shows the task's logs, if the task supplies them. |
| Attempt | If the task was retried, shows all attempts and their details. |
Here is a screen grab of the fields referred to above.
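The same fields can be read from the execution JSON shown on the JSON tab. The sketch below assumes the task field names used by open-source Conductor (`reasonForIncompletion`, `workerId`, `inputData`, `outputData`, `retryCount`, `referenceTaskName`); the sample execution is hypothetical and trimmed to those fields, so check the names against your own execution JSON.

```python
# Statuses treated as failures here; extend the set if your executions use others.
FAILED_STATUSES = {"FAILED", "FAILED_WITH_TERMINAL_ERROR"}

def failed_task_details(execution):
    """Return the debugging fields for each failed task in an execution dict."""
    details = []
    for task in execution.get("tasks", []):
        if task.get("status") not in FAILED_STATUSES:
            continue
        details.append({
            "task": task.get("referenceTaskName"),
            "reason": task.get("reasonForIncompletion"),
            "worker": task.get("workerId"),
            "input": task.get("inputData", {}),
            "output": task.get("outputData", {}),
            "attempt": task.get("retryCount", 0),
        })
    return details

# Hypothetical execution JSON, trimmed to the fields used above.
sample_execution = {
    "workflowId": "example-id",
    "status": "FAILED",
    "tasks": [
        {"referenceTaskName": "fetch_ref", "status": "COMPLETED"},
        {"referenceTaskName": "call_api_ref", "status": "FAILED",
         "reasonForIncompletion": "Failed to invoke HTTP endpoint",
         "workerId": "worker-host-1",
         "inputData": {"uri": "https://example.com"},
         "retryCount": 1},
    ],
}

for item in failed_task_details(sample_execution):
    print(item["task"], "->", item["reason"])
```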
Recovering From Failures
Once the issues behind a workflow execution failure are resolved, we might want to retry the failed workflow. The Actions button in the top-right corner of the execution page provides the following options:
| Action | Description |
| --- | --- |
| Terminate | Terminates the workflow and changes its status to TERMINATED. |
| Restart with Current Definitions | Restarts the workflow from the beginning using the same version of the workflow definition that originally ran this execution. This is useful if the workflow definition has changed and we want to keep this instance on the original version. |
| Restart with Latest Definitions | Restarts the workflow from the beginning using the latest workflow definition. If we have changed the definition, this option re-runs the flow with the latest version. |
| Retry - From failed task | Retries the workflow from the failed task. |
| Re-run Workflow | Opens the Run Workflow page, where we can re-run the workflow. When re-running, we can change the workflow input parameters and the task-to-domain mapping. |
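For scripted recovery, most of these actions have counterparts in the workflow REST API. The mapping below is a sketch that assumes the open-source Conductor endpoints (`DELETE /api/workflow/{id}`, `POST /api/workflow/{id}/restart` with `useLatestDefinitions`, and `POST /api/workflow/{id}/retry`); the action keys and the helper name are hypothetical, and the paths should be confirmed against your cluster's API reference. No request is sent.

```python
def recovery_request(action, workflow_id):
    """Return (HTTP method, path) for a recovery action (assumed endpoints)."""
    base = f"/api/workflow/{workflow_id}"
    requests_by_action = {
        "terminate": ("DELETE", base),
        "restart_current": ("POST", f"{base}/restart?useLatestDefinitions=false"),
        "restart_latest": ("POST", f"{base}/restart?useLatestDefinitions=true"),
        "retry": ("POST", f"{base}/retry"),
    }
    if action not in requests_by_action:
        raise ValueError(f"unknown recovery action: {action}")
    return requests_by_action[action]

# Hypothetical workflow ID, used only for illustration.
method, path = recovery_request("retry", "example-workflow-id")
print(method, path)
```

The Re-run Workflow option is left out because it opens the Run Workflow page in the UI rather than acting on the existing execution directly.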
Orkes Conductor has native retry and error handling capabilities, allowing your task to be retried automatically for transient failures.
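As an illustration of automatic retries, the sketch below computes the delay before each retry from a task definition. It assumes the task definition fields `retryCount`, `retryLogic`, and `retryDelaySeconds` from open-source Conductor, and it assumes that exponential backoff doubles the delay on each retry starting from `retryDelaySeconds`; the exact schedule on your cluster may differ, and the task name is hypothetical.

```python
def retry_delays(task_def):
    """Return the assumed delay in seconds before each retry of a task."""
    count = task_def.get("retryCount", 0)
    delay = task_def.get("retryDelaySeconds", 0)
    logic = task_def.get("retryLogic", "FIXED")
    if logic == "FIXED":
        return [delay] * count
    if logic == "EXPONENTIAL_BACKOFF":
        # Assumption: the delay doubles with every retry, starting at `delay`.
        return [delay * 2 ** attempt for attempt in range(count)]
    raise ValueError(f"retry logic not covered by this sketch: {logic}")

# Hypothetical task definition, trimmed to the retry settings.
task_def = {
    "name": "call_api",
    "retryCount": 3,
    "retryLogic": "EXPONENTIAL_BACKOFF",
    "retryDelaySeconds": 2,
}

print(retry_delays(task_def))  # prints [2, 4, 8]
```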