Metrics and Observability

The Orkes Conductor Dashboard gives a quick overview of the metrics and alerts on your Conductor console. It provides a centralized, intuitive interface for tracking the behavior and performance of tasks and workflows, which helps when troubleshooting errors.

Orkes Conductor uses Prometheus to record a rich set of metrics, which are available automatically in your deployment. For dedicated clusters, the metrics can also be pushed to Grafana or Datadog on request.

Accessing Dashboard from Conductor Console

This document uses a sample dashboard built with Prometheus and Grafana.

  1. To access your dashboard, navigate to Metrics in your Conductor cluster. If you cannot see this option in your Conductor cluster, please reach out to our team.

[Figure: Accessing dashboard from Conductor Console]

  2. This opens the Conductor dashboard, which is built with Grafana. A sample dashboard looks like this:

[Figure: Sample Dashboard]

Conductor Metrics

The server publishes the following metrics. You can use these metrics to configure alerts for your workflows and tasks.
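As an illustration of alerting on these metrics, the Python sketch below checks an instant-query result against a threshold. It assumes the response follows the standard Prometheus HTTP API format returned by `/api/v1/query`. The function name `workflows_over_threshold`, the threshold, the workflow names, and the sample payload are all hypothetical and not part of Orkes Conductor.

```python
import json


def workflows_over_threshold(response_text, threshold):
    """Return {workflowName: value} for every series above the threshold."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")

    breaches = {}
    for series in payload["data"]["result"]:
        name = series["metric"].get("workflowName", "<unknown>")
        # Instant-vector samples have the form [unix_timestamp, "value-as-string"].
        value = float(series["value"][1])
        if value > threshold:
            breaches[name] = value
    return breaches


# Hypothetical response for an instant query on workflow_running.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "workflow_running",
                        "workflowName": "order_fulfillment"},
             "value": [1700000000, "250"]},
            {"metric": {"__name__": "workflow_running",
                        "workflowName": "send_email"},
             "value": [1700000000, "12"]},
        ],
    },
})

print(workflows_over_threshold(SAMPLE_RESPONSE, threshold=100))
```

In practice, a check like this would usually live in Grafana or Prometheus alert rules rather than in a script. The sketch only shows which fields of the query result an alert condition depends on.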

Workflow and Task Metrics

| Metric | Prometheus Name | Purpose | Tags |
|---|---|---|---|
| Workflow Latencies (Name and Percentile) | workflow_completed_seconds | Timer indicating the time taken for completing the workflows. | workflowName, quantile |
| Workflow completion/sec | workflow_completed_seconds_count | Counter indicating the number of workflows completed per second. | workflowName |
| Workflow failures/sec | workflow_completed_seconds_count (add the filter "FAILED" to get the failed list) | Counter indicating the number of workflows failed per second. | workflowName |
| No. of workflows currently running | workflow_running | Gauge for the number of running workflows. | workflowName |
| Workflow Start Rate/sec | workflow_start_request_seconds_count | Counter for no. of workflows started. | workflowName |
| Total no. of workflows started in the time period | workflow_start_request_seconds_count | Counter for no. of workflows started in the time period. | workflowName |
| Workflow Search Latency Percentile | http_server_requests_seconds | Indicates the latency values for the search operation in workflows. | quantile |
| Task Latencies (Name and Percentile) | task_completed_seconds | Timer for completing the tasks. | taskType, quantile |
| Task completion/sec | task_completed_seconds_count | Counter indicating the number of completed tasks per second. | taskType |
| Task failures/sec | task_completed_seconds_count | Counter indicating the number of failed tasks per second. | taskType |
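The per-second panels above are derived from counters, so a query has to take a rate over a time window. The Python sketch below builds PromQL expressions of that kind using the standard `rate()` and `increase()` functions. The helper names, the 5-minute and 1-hour windows, and the workflow name `my_workflow` are illustrative. The label name `status` used for the "FAILED" filter is an assumption, so check the labels on your own series before relying on it. These are not necessarily the queries the Orkes dashboards use.

```python
def selector(metric, **labels):
    """Render a PromQL selector such as metric{a="b",c="d"}."""
    if not labels:
        return metric
    pairs = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return f"{metric}{{{pairs}}}"


def per_second_rate(metric, window="5m", **labels):
    """Per-second rate of a counter, averaged over the window."""
    return f"sum(rate({selector(metric, **labels)}[{window}]))"


def total_in_period(metric, period="1h", **labels):
    """Total increase of a counter over the period."""
    return f"sum(increase({selector(metric, **labels)}[{period}]))"


# Workflow completion/sec for one (hypothetical) workflow.
completion_rate = per_second_rate(
    "workflow_completed_seconds_count", workflowName="my_workflow")

# Workflow failures/sec; the "status" label name is an assumption.
failure_rate = per_second_rate(
    "workflow_completed_seconds_count",
    workflowName="my_workflow", status="FAILED")

# Total no. of workflows started in the time period.
started_total = total_in_period("workflow_start_request_seconds_count")

for query in (completion_rate, failure_rate, started_total):
    print(query)
```

The resulting strings can be pasted into a Grafana panel or the Prometheus expression browser, for example when building an alert on the failure rate.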