Why WFCommons
Workflow research is evolving: heterogeneous nodes (CPU/GPU), data-intensive AI pipelines, and autonomous “agentic” control loops demand datasets and benchmarks that reflect reality.
HPC + data at scale
Study scheduling, throughput, and robustness for complex workflows with thousands of tasks and diverse I/O footprints—across clusters, clouds, and supercomputers.
AI-ready workflow data
Build training and evaluation corpora for anomaly detection, runtime prediction, and resource-aware optimization by working from a consistent, validated schema.
Agentic workflows & autonomy
Test LLM/agent planners that adapt DAGs on the fly (replanning, retries, provenance-aware decisions) using realistic workflow “digital twins” derived from production traces.
Ecosystem components
WFCommons is organized as interoperable building blocks—each useful on its own, stronger together.
WfFormat — a common schema
A JSON specification for representing workflow execution instances and synthetic workflows, enabling tools and simulators to consume data consistently.
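To make the idea of a shared schema concrete, here is a minimal sketch of how a tool might consume a workflow description stored as JSON. The document layout and field names (`tasks`, `id`, `parents`, `runtime_s`) are illustrative assumptions and do not reproduce the actual WfFormat schema; consult the WfFormat specification for the real fields.

```python
import json

# Hypothetical, simplified document: the field names below are illustrative
# assumptions and do not reproduce the actual WfFormat schema.
EXAMPLE = """
{
  "name": "toy-workflow",
  "tasks": [
    {"id": "split", "parents": [], "runtime_s": 4.0},
    {"id": "align_1", "parents": ["split"], "runtime_s": 12.5},
    {"id": "align_2", "parents": ["split"], "runtime_s": 11.0},
    {"id": "merge", "parents": ["align_1", "align_2"], "runtime_s": 3.2}
  ]
}
"""


def load_workflow(text):
    """Parse the document and index tasks by id, rejecting unknown parents."""
    doc = json.loads(text)
    tasks = {t["id"]: t for t in doc["tasks"]}
    for task in tasks.values():
        for parent in task["parents"]:
            if parent not in tasks:
                raise ValueError(f"{task['id']} references unknown parent {parent}")
    return tasks


def topological_order(tasks):
    """Return task ids so every parent precedes its children (Kahn's algorithm)."""
    remaining = {tid: set(t["parents"]) for tid, t in tasks.items()}
    order = []
    ready = sorted(tid for tid, deps in remaining.items() if not deps)
    while ready:
        tid = ready.pop(0)
        order.append(tid)
        del remaining[tid]
        # Release any task whose last outstanding parent just finished.
        for other, deps in sorted(remaining.items()):
            if tid in deps:
                deps.discard(tid)
                if not deps:
                    ready.append(other)
    if remaining:
        raise ValueError("dependency cycle detected")
    return order


tasks = load_workflow(EXAMPLE)
print(topological_order(tasks))
```

Because every producer and consumer agrees on one layout, the same loader can feed an analysis script, a simulator, or a generator without per-tool converters.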
WfInstances — open workflow executions
A curated collection of real workflow runs (instances), shared in WfFormat so they can be analyzed, simulated, or used to derive generators and benchmarks.
WfChef — derive recipes automatically
Analyze real instances to discover recurring dependency patterns and statistical distributions, producing “recipes” that capture workflow structure and behavior.
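As a rough illustration of the statistical half of that idea, the sketch below groups observed task runtimes by category and summarizes each group. This is not WfChef's algorithm, which also discovers structural patterns in the task graph; the `summarize` helper and the input layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarize(observations):
    """Group (category, runtime_s) pairs and compute per-category statistics."""
    by_category = defaultdict(list)
    for category, runtime in observations:
        by_category[category].append(runtime)
    return {
        category: {
            "count": len(values),
            "mean": mean(values),
            "stdev": pstdev(values),  # population standard deviation
        }
        for category, values in by_category.items()
    }


# Hypothetical observations pooled from several workflow runs.
observed = [("align", 10.0), ("align", 14.0), ("merge", 3.0)]
print(summarize(observed))
```

A summary like this is one possible ingredient of a recipe: it records how tasks of each kind tend to behave, so that a generator can later sample plausible values.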
WfGen / WfBench / WfSim — generate & evaluate
Generate synthetic workflows from recipes, produce benchmark specs for repeatable experiments, and run simulations in compatible frameworks.
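The generation step can be pictured with a small, self-contained sketch: given per-category runtime statistics, build a fork-join workflow of a requested width and sample a runtime for each task. The `RECIPE` values, the `generate_fork_join` helper, and the task layout are assumptions made for illustration and are not the WfGen API.

```python
import random

# Hypothetical recipe: per-category runtime statistics in seconds.
RECIPE = {
    "split": {"mean": 4.0, "stdev": 0.5},
    "align": {"mean": 12.0, "stdev": 2.0},
    "merge": {"mean": 3.0, "stdev": 0.3},
}


def generate_fork_join(recipe, width, seed=0):
    """Build a split -> N parallel tasks -> merge workflow with sampled runtimes."""
    rng = random.Random(seed)  # seeded so the same inputs give the same workflow

    def sample(category):
        stats = recipe[category]
        # Clamp to a small positive floor so runtimes are never zero or negative.
        return max(0.1, rng.gauss(stats["mean"], stats["stdev"]))

    tasks = [{"id": "split", "parents": [], "runtime_s": sample("split")}]
    workers = [f"align_{i}" for i in range(1, width + 1)]
    for worker_id in workers:
        tasks.append({"id": worker_id, "parents": ["split"], "runtime_s": sample("align")})
    tasks.append({"id": "merge", "parents": workers, "runtime_s": sample("merge")})
    return tasks


for task in generate_fork_join(RECIPE, width=3, seed=42):
    print(task["id"], task["parents"], round(task["runtime_s"], 2))
```

Fixing the seed is what makes an experiment repeatable: two researchers who share the recipe, the width, and the seed obtain the same synthetic workflow.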
Cite WFCommons
Use these references in papers, reports, and benchmark artifacts.
Primary paper
Future Generation Computer Systems (FGCS), 2022.
@article{WFCommons,
title = {WFCommons: A Framework for Enabling Scientific Workflow Research and Development},
author = {Coleman, Taina and Casanova, Henri and Pottier, Loic and Kaushik, Manav and Deelman, Ewa and Ferreira da Silva, Rafael},
journal = {Future Generation Computer Systems},
volume = {128},
pages = {16--27},
year = {2022},
doi = {10.1016/j.future.2021.09.043}
}
Docs & repositories
These are the canonical entry points for users and reviewers.
Research Outcomes Enabled by WFCommons
WFCommons has supported the work reported in 54 research articles, including studies by our own team and by other researchers in the workflows community.