Orkes logo image
Product
Platform
Orkes Platform thumbnail
Orkes Platform
Orkes Agentic Workflows
Orkes Conductor Vs Conductor OSS thumbnail
Orkes vs. Conductor OSS
Orkes Cloud
How Orkes Powers Boat Thumbnail
How Orkes Powers BOAT
Try enterprise Orkes Cloud for free
Enjoy a free 14-day trial with all enterprise features
Start for free
Capabilities
Microservices Workflow Orchestration icon
Microservices Workflow Orchestration
Enable faster development cycles, easier maintenance, and improved user experiences.
Realtime API Orchestration icon
Realtime API Orchestration
Enable faster development cycles, easier maintenance, and improved user experiences.
Event Driven Architecture icon
Event Driven Architecture
Create durable workflows that promote modularity, flexibility, and responsiveness.
Human Workflow Orchestration icon
Human Workflow Orchestration
Seamlessly insert humans in the loop of complex workflows.
Process orchestration icon
Process Orchestration
Visualize end-to-end business processes, connect people, processes and systems, and monitor performance to resolve issues in real-time
Use Cases
By Industry
Financial Services icon
Financial Services
Secure and comprehensive workflow orchestration for financial services
Media and Entertainment icon
Media and Entertainment
Enterprise grade workflow orchestration for your media pipelines
Telecommunications icon
Telecommunications
Future proof your workflow management with workflow orchestration
Healthcare icon
Healthcare
Revolutionize and expedite patient care with workflow orchestration for healthcare
Shipping and logistics icon
Shipping and Logistics
Reinforce your inventory management with durable execution and long running workflows
Software icon
Software
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean leo mauris, laoreet interdum sodales a, mollis nec enim.
Docs
Developers
Learn
Blog
Explore our blog for insights into the latest trends in workflow orchestration, real-world use cases, and updates on how our solutions are transforming industries.
Read blogs
Check out our latest blog:
Conductor CLI Guide: Register, Run, Retry, and Recover Durable Workflows Without Leaving Your Terminal šŸ’»
Customers
Discover how leading companies are using Orkes to accelerate development, streamline operations, and achieve remarkable results.
Read case studies
Our latest case study:
Twilio Case Study Thumbnail
Orkes Academy New!
Master workflow orchestration with hands-on labs, structured learning paths, and certification. Build production-ready workflows from fundamentals to Agentic AI.
Explore courses
Featured course:
Orkes Academy Thumbnail
Events icon
Events
Videos icons
Videos
In the news icon
In the News
Whitepapers icon
Whitepapers
About us icon
About Us
Pricing
Get a demo
Signup
Slack FaviconDiscourse Logo icon
Get a demo
Signup
Slack FaviconDiscourse Logo icon
Orkes logo image

Company

Platform
Careers
HIRING!
Partners
About Us
Legal Hub
Security

Product

Cloud
Platform
Support

Community

Docs
Blogs
Events

Use Cases

Microservices Workflow Orchestration
Realtime API Orchestration
Event Driven Architecture
Agentic Workflows
Human Workflow Orchestration
Process Orchestration

Compare

Orkes vsĀ Camunda
Orkes vsĀ BPMN
Orkes vsĀ LangChain
Orkes vsĀ Temporal
Twitter or X Socials linkLinkedIn Socials linkYouTube Socials linkSlack Socials linkGithub Socials linkFacebook iconInstagram iconTik Tok icon
Ā© 2026 Orkes. All Rights Reserved.
Back to Blogs

Table of Contents

Share on:Share on LinkedInShare on FacebookShare on Twitter
Worker Code Illustration

Get Started for Free with Dev Edition

Signup
Back to Blogs
SOLUTIONS

Data Processing with Conductor

Orkes Team
Developer Relations
Last updated: May 12, 2022
May 12, 2022
5 min read

Related Blogs

How to Build a Simple, Modular, and AI-Powered Fraud Detection Workflow

Jul 8, 2025

How to Build a Simple, Modular, and AI-Powered Fraud Detection Workflow

Automating Serialization/Deserialization Tests with Orkes Conductor and LLMs

May 29, 2025

Automating Serialization/Deserialization Tests with Orkes Conductor and LLMs

Automating Insurance Claims Processing with AI and Conductor

Apr 24, 2025

Automating Insurance Claims Processing with AI and Conductor

Ready to Build Something Amazing?

Join thousands of developers building the future with Orkes.

Start for free

Data processing and data workflows are amongst the most critical processes for many companies today. Many hours are spent collecting, parsing and analyzing data sets. There is no need for these to repetitive processes to be manual - we can automate them. In this post, we'll build a Conductor workflow that handles ETL (Extraction, Transformation and Loading) data for a mission critical process here at Orkes.

As a member of the Developer Relations team here at Orkes, we use a tool called Orbit to better understand our users and our community. Orbit has a number of great integrations that allow for easy connections into platforms like Slack, Discord, Twitter and GitHub. By adding API keys from the Orkes implementations of these social media platforms, the integrations automatically update the community data from these platforms into Orbit.

This is great, but it does not solve all of our needs. Orkes Cloud is based on top of Conductor, and we'd like to also understand who is interacting and using that GitHub repository. However, since Conductor is owned by Netflix, our team is unable to leverage the automated Orbit integration.

However, our API keys do allow us to extract the data from GitHub, and our Orbit API key can allow us to upload the extracted data into our data collection. We could do this manually, but why not build a COnductor workflow to do this process for us automatically?

In this post, I'll give a high level view of the automation required for Extraction of the data from Github, Transformation the data to a form that Orbit can accept, and then to Load the data into our Orbit collection.

The steps for our workflow

We'll go into the nitty gritty details of all the tasks in our next post, sp let's look at what our Conductor workflow does to Extract, Transform and Load our data from Github into Orbit at a high level:

  1. There are APIs from Github that allow us to extract the "stargazers" (those who have starred the Conductor repository) and contributors to Netflix Conductor. The API allows us to extract 100 users at a time, so we will leverage the DO/WHILE loop to get as many users as we need. This requires an API key from Github.

  2. The data from GitHub is much more information that what we need to upload to Orbit. We can use JQ transform and INLINE JavaScript tasks to transform the data extracted from GitHub into the JSON format required for Orbit.

  3. Upload the data into Orbit. This requires an API key from Orbit.

There are a bunch of additional steps to make it all work, and the resulting workflow looks like:

orbit workflow

Running the workflow

The workflow input has a number of parameters that are required for the workflow to run:

json
{
  "gh_account": "netflix",
  "gh_repo": "conductor",
  "star_offset": 3800,
  "gh_token": "<github_api_key>",
  "orbit_workspace": "oss-stats",
  "activity_name": "starredConductor",
  "orbit_apikey": "<orbit_api_key>"
}
  • gh_account and gh_repo are the account and repository to extract the data from - in this case netflix/conductor.
  • There is no need to ping Github and extract all of the users who have starred Conductor. The first 3800 "stargazers" will be unchanged from one run to the next, so we can begin our pagination of the Github data at 3800 (and this number can be updated over time as the number of stars grow).
  • gh_token GitHub limits the number of API calls without a token, so I created a token to allow more frequent requests.
  • orbit_workspace This is the workspace in orbit where the data will be uploaded.
  • activity_name We are creating an activity in Orbit for the action taken. Since this is a list of people who have starred Conductor, starredConductor seems a good name,
  • orbit_apikey we also need an api key to upload into our Orbit account.

Scheduling workflows

This workflow is amazing and saves loads of time, but the workflow must be run in order for the data to appear in our Orbit data set.

This is where the new Orkes workflow scheduler comes in. By scheduling this workflow to run every 24 hours, we can ensure that every new Star and Github fork for the Conductor repository is being accounted for, and that the data in Orbit is accurate and up to date (within 24 hours). There is an additional task to filter the GH data to only stars that occurred in the last 24 hours.

The new Scheduler tool makes it easy to set up a recurring schedule for a workflow. In this case, it is run daily at midnight GMT.

Extending the workflow to other repositories

There are several other open source workflow orchestration platforms, and we wanted to do some analysis on who has starred more that one of these competing platforms. Using this workflow as a Sub workflow we can query multiple repositories at once:

orbit workflow

By adding additional terms to the input, we can aggregate all of the stars from all three repos, and collect them in another orbit instance. Scheduling this workflow ensures that this additional dataset is regularly updated and data from all three repositories are in sync.

Conclusion

With a Conductor workflow we are able securely perform an ETL process: Extracting data from GitHub, Transforming the data via JQ transforms and then Loading the data into Orbit.


Conductor is an enterprise-grade orchestration platform for process automation, API and microservices orchestration, agentic workflows, and more. Check out the full set of features, try it yourself using our Developer Edition sandbox, or get a demo of Orkes Cloud, a fully managed and hosted Conductor service.