Error Handling
Runsight provides three mechanisms for handling errors in workflows: error_route for redirecting execution after a block failure, on_error for controlling sub-workflow failure behavior, and retry_config for automatic retry with backoff.
error_route
Section titled “error_route”The error_route field on any block redirects execution to a specific block when an error occurs, instead of failing the entire workflow.
blocks: risky_step: type: linear soul_ref: analyst error_route: fallback_step fallback_step: type: linear soul_ref: writerHow error_route works
Section titled “How error_route works”When a block with error_route raises an exception:
- The exception is caught by the workflow’s main execution loop.
- A
BlockResultis created for the failed block withexit_handle: "error"and the error details stored in metadata (error_type,error_message,block_id). - The error info is also written to
shared_memoryunder__error__{block_id}. - The execution queue is cleared and replaced with the error route target.
- Execution continues from the error route block.
The error route block can read the error details from shared memory:
blocks: risky_step: type: linear soul_ref: analyst error_route: handle_error handle_error: type: linear soul_ref: error_handlerThe handle_error block’s soul can access __error__risky_step in shared memory, which contains {"type": "...", "message": "..."}.
Soft-error routing
Section titled “Soft-error routing”Error routing also activates for soft errors --- blocks that complete normally but produce a BlockResult with exit_handle: "error". This happens when a workflow block uses on_error: "catch" and the child workflow fails. The parent block completes (no exception raised), but its result signals an error. If the parent block has an error_route, execution is redirected.
on_error (workflow blocks)
Section titled “on_error (workflow blocks)”The on_error field is specific to workflow blocks (type: "workflow"). It controls what happens when a child workflow fails.
| Value | Behavior |
|---|---|
"raise" | Default. The child’s exception propagates to the parent. The parent block fails, and normal error handling applies (error_route if set, otherwise the parent workflow fails). |
"catch" | The exception is swallowed. The parent block completes with exit_handle: "error" and a BlockResult containing the child error details. Execution continues from the parent. |
blocks: optional_enrichment: type: workflow workflow_ref: enrichment-pipeline on_error: catch error_route: skip_enrichment inputs: topic: "shared_memory.topic" outputs: "shared_memory.enriched": summary skip_enrichment: type: linear soul_ref: writerIn this example:
- If
enrichment-pipelinefails, the exception is caught (on_error: "catch"). - The
optional_enrichmentblock completes withexit_handle: "error". - Because
error_routeis set, execution redirects toskip_enrichment. - The workflow continues instead of failing.
Soft error detection in catch mode
Section titled “Soft error detection in catch mode”When on_error: "catch" is set, the workflow block also detects soft errors in the child’s results. If any child block produced a BlockResult with exit_handle: "error" (even without raising an exception), the parent block treats this as a failure and returns an error BlockResult. This prevents silently swallowing errors that were caught by error_route within the child workflow.
retry_config
Section titled “retry_config”The retry_config field adds automatic retry with configurable backoff to any block.
blocks: flaky_api_call: type: linear soul_ref: analyst retry_config: max_attempts: 3 backoff: exponential backoff_base_seconds: 2.0 non_retryable_errors: - AuthenticationErrorRetryConfig fields
Section titled “RetryConfig fields”| Field | Type | Default | Constraints | Description |
|---|---|---|---|---|
max_attempts | int | 3 | 1 -- 20 | Total attempts (1 = no retry) |
backoff | "fixed" | "exponential" | "fixed" | --- | Backoff strategy between attempts |
backoff_base_seconds | float | 1.0 | 0.1 -- 60.0 | Base delay between retries |
non_retryable_errors | list[str]? | None | --- | Exception type names that should not be retried |
Backoff strategies
Section titled “Backoff strategies”Fixed backoff (backoff: "fixed"): Waits backoff_base_seconds between every retry attempt.
Attempt 1 → fail → wait 1.0s → Attempt 2 → fail → wait 1.0s → Attempt 3Exponential backoff (backoff: "exponential"): Doubles the delay each attempt using backoff_base_seconds * 2^(attempt-1).
Attempt 1 → fail → wait 2.0s → Attempt 2 → fail → wait 4.0s → Attempt 3Retry behavior
Section titled “Retry behavior”- Each retry starts from the original pre-retry state. Failed-attempt messages are never carried over to the next attempt.
- On success after retries, the block result includes retry metadata in
shared_memoryunder__retry__{block_id}:
{ "attempt": 2, "max_attempts": 3, "last_error": "Connection timeout", "last_error_type": "TimeoutError", "total_retries": 1}- If an error’s type name matches an entry in
non_retryable_errors, the error is raised immediately without further retry attempts. KeyboardInterruptandSystemExitare never retried.
Error propagation lifecycle
Section titled “Error propagation lifecycle”Here is how errors flow through the execution engine:
-
Block raises an exception during
execute_block(). -
Retry check: If
retry_configis set andmax_attempts > 1, the block is retried according to the backoff strategy. Non-retryable errors skip this step. -
Error route check: If
error_routeis set on the block, the exception is caught, aBlockResultwith error metadata is written, and execution redirects to the error route target. -
Workflow-level propagation: If no error_route is set, the exception propagates up to
Workflow.run(). If the workflow is a child (running inside aworkflowblock), the parent’son_errordetermines what happens next. -
Terminal state: If the exception reaches the top-level workflow, the run is marked as
failedwith the error message and traceback stored on theRunrecord.
Combining error_route with retry_config
Section titled “Combining error_route with retry_config”You can use both on the same block. Retries are attempted first. If all retry attempts are exhausted and the block still fails, the error_route takes over:
blocks: api_call: type: linear soul_ref: analyst retry_config: max_attempts: 3 backoff: exponential backoff_base_seconds: 1.0 error_route: handle_api_failure handle_api_failure: type: linear soul_ref: error_handlerBlock timeout
Section titled “Block timeout”Every block has a timeout_seconds field (default: 300, range: 1 -- 3600) separate from the budget limits.max_duration_seconds. If the block execution exceeds this timeout, a BudgetKilledException is raised with limit_kind="timeout". This exception can be caught by error_route or retry_config like any other error.
See Budget & Limits for the full budget enforcement model.