Step Context provides a way to share workflow-level data with steps executing on remote Celery workers. Unlike the workflow context, which is process-local, Step Context is serialized and passed to workers, making it accessible during distributed step execution.
from pyworkflow import workflow, step, StepContext, get_step_context, set_step_context

class OrderContext(StepContext):
    workspace_id: str = ""
    user_id: str = ""
    order_id: str = ""

@workflow(context_class=OrderContext)
async def process_order(order_id: str, user_id: str):
    # Initialize context in workflow
    ctx = OrderContext(order_id=order_id, user_id=user_id)
    await set_step_context(ctx)

    # Steps can read the context
    result = await validate_order()
    return result

@step()
async def validate_order():
    ctx = get_step_context()  # Read-only access
    print(f"Validating order {ctx.order_id} for user {ctx.user_id}")
    return {"valid": True}
When steps execute in parallel on different workers, allowing them to modify shared context would cause race conditions:
Worker A: read context → modify → save ─┐
                                         ├─> Lost update!
Worker B: read context → modify → save ─┘
By making context read-only in steps, PyWorkflow follows the same pattern as Temporal and Prefect: activities and tasks are stateless, and state mutations happen through return values in the workflow.
If you need to update context based on step results, do it in the workflow code after the step returns.
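For example, a workflow can fold a step's return value back into the context before later steps run. The sketch below reuses the set_step_context API shown above; the check_inventory step, the fulfill_order workflow, and the order_status field are illustrative assumptions, not part of the library.

from pyworkflow import workflow, step, StepContext, set_step_context

class FulfillmentContext(StepContext):
    order_id: str = ""
    order_status: str = ""  # Hypothetical field updated from a step result

@step()
async def check_inventory():
    # Steps stay stateless and report outcomes via return values
    return {"status": "in_stock"}

@workflow(context_class=FulfillmentContext)
async def fulfill_order(order_id: str):
    ctx = FulfillmentContext(order_id=order_id)
    await set_step_context(ctx)

    result = await check_inventory()

    # Mutate the context in workflow code, after the step returns,
    # then persist it so later steps see the update
    ctx.order_status = result["status"]
    await set_step_context(ctx)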
Store only essential cross-cutting data like IDs, user info, and configuration. Don't use the context as a data store; pass large data as step arguments instead.
# Good - small, essential data
class GoodContext(StepContext):
    workspace_id: str
    user_id: str
    request_id: str

# Bad - too much data
class BadContext(StepContext):
    workspace_id: str
    user_data: dict          # Could be large
    all_orders: list[dict]   # Definitely too large
Use context for cross-cutting concerns
Step Context is ideal for data needed by many steps: auth info, workspace IDs, correlation IDs, feature flags.
class RequestContext(StepContext):
    workspace_id: str
    user_id: str
    correlation_id: str  # For distributed tracing
    feature_flags: dict[str, bool] = {}
Initialize context early
Set up context at the beginning of your workflow before calling any steps.
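A minimal sketch of this ordering, reusing the RequestContext class above and the validate_order step from the first example; the handle_request workflow name is an assumption for illustration.

@workflow(context_class=RequestContext)
async def handle_request(workspace_id: str, user_id: str, correlation_id: str):
    # Set the context first, before any step is called
    ctx = RequestContext(
        workspace_id=workspace_id,
        user_id=user_id,
        correlation_id=correlation_id,
    )
    await set_step_context(ctx)

    # Every step from this point on can call get_step_context()
    return await validate_order()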
Context is persisted to storage. Use secret managers or environment variables for sensitive data.
# Bad - secrets in context
class BadContext(StepContext):
    api_key: str  # Don't do this!

# Good - reference to secret, not the secret itself
class GoodContext(StepContext):
    secret_name: str  # Reference to secret in vault

@step()
async def call_api():
    ctx = get_step_context()
    api_key = await secret_manager.get(ctx.secret_name)