Understanding Pipeline States
This guide helps you understand the different states your data pipelines can be in and how to interpret the status information displayed in your DataStori dashboard.
Table of Contents
- Overview
- State Hierarchy
- Pipeline States
- Task States
- Data Quality Test States
- Understanding State Transitions
Overview
DataStori tracks pipeline execution using a hierarchical state system with three levels:
- Pipeline Level - Overall status of your entire pipeline execution
- Task Level - Status of individual steps within your pipeline (download, transform, test)
- Test Level - Results of data quality checks
Understanding these states helps you monitor your pipeline executions and quickly identify any issues.
State Hierarchy
Your pipeline execution follows this hierarchy:
Pipeline Execution
├── Pipeline Status (RUNNING → COMPLETED/FAILED)
├── Download Task (RUNNING → RETRYING → COMPLETED/FAILED)
├── Transform Task (RUNNING → RETRYING → COMPLETED/FAILED)
└── Test Results (PASSED/FAILED/WARNING)
Understanding Overall vs. Task Status
The dashboard displays both pipeline-level and task-level statuses. Here's how to interpret them:
| Status Level | What It Shows | Interpretation |
|---|---|---|
| Pipeline Status | Overall execution status | The main status of your entire pipeline run |
| Task Status | Individual step progress | Status of specific tasks within the pipeline |
Key Points:
- ✅ The pipeline status is the primary indicator of your execution's overall state
- ✅ Task statuses show you the progress of individual steps
- ✅ Multiple tasks can show as COMPLETED while the pipeline is still RUNNING
- ✅ Only when the pipeline status shows COMPLETED or FAILED is the entire execution finished
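If you track these statuses in your own scripts, the three levels can be modeled as separate sets of values. The following is a minimal sketch, not DataStori's implementation: the class and function names are hypothetical, and only the state names come from this guide.

```python
from enum import Enum


# Hypothetical models of the three status levels; only the state names
# themselves are taken from this guide.
class PipelineState(Enum):
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


class TaskState(Enum):
    RUNNING = "RUNNING"
    RETRYING = "RETRYING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


class QualityTestState(Enum):
    PASSED = "PASSED"
    WARNING = "WARNING"
    FAILED = "FAILED"


def is_execution_finished(pipeline_state: PipelineState) -> bool:
    """The execution is finished only when the pipeline itself is
    COMPLETED or FAILED, regardless of individual task states."""
    return pipeline_state in (PipelineState.COMPLETED, PipelineState.FAILED)
```

For example, is_execution_finished(PipelineState.RUNNING) returns False even if every task that has run so far is COMPLETED.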
Pipeline States
Pipeline states represent the overall status of your data pipeline execution. This is the primary status you'll see on your dashboard.
RUNNING
What it means: Your pipeline has started and is currently processing data.
- Pipeline tasks are being executed one after another (for example: download, transform, test)
- Data is actively being moved from source to destination
- The pipeline is working through its configured steps
COMPLETED
What it means: Your pipeline finished successfully without any errors.
- All tasks completed successfully
- Data has been transferred and processed
- The pipeline run is finished
- You'll see a completion timestamp
What to do next: Review your destination to verify the data arrived as expected.
FAILED
What it means: Your pipeline encountered an error and stopped executing.
- An error occurred during execution
- The pipeline could not complete its tasks
- An error message will be provided to help diagnose the issue
- You'll see a failure timestamp
What to do next: Check the error message and review the troubleshooting guide or contact support if you need assistance.
Pipeline State Flow
Your pipeline follows this progression:
Pipeline Starts
↓
RUNNING
↓
├─→ COMPLETED ✓ (Success)
│
└─→ FAILED ✗ (Error occurred)
Task States
Each pipeline consists of multiple tasks (such as downloading data, transforming it, and running quality tests). Task states show you the progress of these individual steps.
RUNNING
What it means: A specific task within your pipeline is currently executing.
- The task is actively processing
- This is a normal part of pipeline execution
- You'll see different tasks move through this state as the pipeline progresses
Example tasks: Download, Transform, Data Quality Tests
RETRYING
What it means: A task encountered an issue, but DataStori is automatically retrying it for you.
- Temporary issues (like network hiccups) can occur
- DataStori automatically retries failed tasks to ensure data reliability
- You'll see which retry attempt is in progress
- Retries happen after a brief delay
What to do: No action needed - the system is handling it automatically. If all retries are exhausted, you'll be notified.
COMPLETED
What it means: An individual task within your pipeline finished successfully.
- The specific task completed its work
- The pipeline moves on to the next task
- Note: Individual tasks completing doesn't mean the entire pipeline is done
- The pipeline status will show COMPLETED when all tasks finish
FAILED
What it means: A task failed after all retry attempts were exhausted.
- The task couldn't complete despite multiple attempts
- The entire pipeline will stop and show a FAILED status
- An error message will explain what went wrong
What to do: Check the error details and refer to our troubleshooting guide or contact support.
Task State Flow
Individual tasks follow this progression:
Task Starts
↓
RUNNING
↓
├─→ COMPLETED ✓ (Success on first try)
│
└─→ Temporary Issue
↓
RETRYING (automatic retry)
↓
├─→ COMPLETED ✓ (Success on retry)
│
└─→ FAILED ✗ (All retries exhausted)
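The flow above can also be read as a small transition table. The sketch below is an illustration under stated assumptions: the table and the function name are hypothetical, and it follows the diagram literally, so a task only reaches FAILED after at least one RETRYING step.

```python
# Hypothetical transition table derived from the diagram in this guide.
# RETRYING may repeat (one entry per retry attempt).
ALLOWED_TASK_TRANSITIONS = {
    "RUNNING": {"COMPLETED", "RETRYING"},
    "RETRYING": {"RETRYING", "COMPLETED", "FAILED"},
    "COMPLETED": set(),  # terminal state
    "FAILED": set(),     # terminal state
}


def is_valid_task_history(states):
    """Return True if a sequence of task states follows the diagram."""
    if not states or states[0] != "RUNNING":
        return False
    for current, following in zip(states, states[1:]):
        if following not in ALLOWED_TASK_TRANSITIONS.get(current, set()):
            return False
    return True
```

A check like this can be useful when reviewing exported execution history: a sequence such as RUNNING, RETRYING, COMPLETED is expected, while anything that continues after COMPLETED or FAILED is not.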
Understanding Automatic Retries
DataStori automatically retries failed tasks to handle temporary issues:
- Network interruptions: Brief connectivity issues
- Source system delays: Temporary unavailability of your data source
- Rate limiting: When your source system asks us to slow down
Retries help ensure your data pipelines are resilient and reliable without requiring your intervention.
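To make the retry behavior concrete, here is a minimal sketch of a retry loop that records the states a task passes through. It is not DataStori's code: TransientError and run_task_with_retries are hypothetical names, the default of 3 retries mirrors the default mentioned in the FAQ of this guide, and the delay value is a placeholder because the actual delay is not documented here.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for temporary issues such as a network hiccup."""


def run_task_with_retries(task, max_retries=3, delay_seconds=1.0):
    """Run task() and return the list of states it passed through.

    max_retries=3 mirrors the documented default; delay_seconds is an
    assumed placeholder value.
    """
    history = ["RUNNING"]
    retries_used = 0
    while True:
        try:
            task()
        except TransientError:
            if retries_used >= max_retries:
                # All retries exhausted: the task (and pipeline) fails.
                history.append("FAILED")
                return history
            retries_used += 1
            time.sleep(delay_seconds)  # brief delay before the next attempt
            history.append("RETRYING")
            continue
        history.append("COMPLETED")
        return history
```

A task that fails twice with a temporary error and then succeeds produces the history RUNNING, RETRYING, RETRYING, COMPLETED, which matches what the dashboard shows for a recovered task.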
Data Quality Test States
DataStori can run automated data quality tests on your data to ensure it meets your standards. These tests produce their own status indicators.
PASSED
What it means: Your data passed the quality check.
- The data meets your configured quality standards
- No issues were detected
- Your pipeline will continue normally
Common tests that pass:
- No null values found in required fields
- All records have unique identifiers
- Data is fresh and up-to-date
WARNING
What it means: Your data passed the test, but something needs attention.
- The data is usable but approaching a concerning threshold
- Your pipeline will continue running
- You may want to investigate to prevent future failures
Example: Data is 23 hours old when your freshness limit is set to 24 hours, so the test still passes but the data is close to being considered stale.
What to do: Review the warning details and consider if action is needed to prevent future issues.
FAILED
What it means: Your data failed a quality check.
- The data doesn't meet your configured quality standards
- An issue was detected that requires attention
- Depending on your configuration, the pipeline may stop
Common test failures:
- Null values found in required fields
- Duplicate records detected when uniqueness is required
- Data is too old (stale)
What to do: Review the error details to understand what quality issue was found, then check your source data or adjust your pipeline configuration.
Understanding Data Quality Tests
DataStori supports several types of automated data quality checks:
- Null Checks: Ensures critical fields contain data
- Uniqueness Checks: Verifies no duplicate records exist
- Freshness Checks: Confirms your data is recent and up-to-date
You can configure these tests based on your data requirements and business needs.
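The three check types can be illustrated with a short sketch. This shows the idea only and is not how DataStori evaluates tests internally: the function names, the row format (a list of dictionaries), and the separate warning and failure thresholds for freshness are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def null_check(rows, field):
    """FAILED if any row is missing a value for the required field."""
    missing = sum(1 for row in rows if row.get(field) is None)
    return "FAILED" if missing else "PASSED"


def uniqueness_check(rows, field):
    """FAILED if the same value appears in more than one row."""
    values = [row.get(field) for row in rows]
    return "PASSED" if len(values) == len(set(values)) else "FAILED"


def freshness_check(latest_record_time, now, warn_after, fail_after):
    """Compare data age against assumed warning and failure thresholds."""
    age = now - latest_record_time
    if age > fail_after:
        return "FAILED"
    if age > warn_after:
        return "WARNING"
    return "PASSED"


rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "c@example.com"},
]
now = datetime(2024, 1, 2, 12, 0, 0)
results = {
    "null": null_check(rows, "email"),
    "unique": uniqueness_check(rows, "id"),
    "fresh": freshness_check(now - timedelta(hours=23), now,
                             warn_after=timedelta(hours=12),
                             fail_after=timedelta(hours=24)),
}
```

With this sample data, the null check and the uniqueness check both fail (one missing email, one repeated id), and the freshness check returns WARNING because the data is older than the warning threshold but younger than the failure threshold.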
Understanding State Transitions
Typical Successful Pipeline Execution
Here's what you'll see when a pipeline runs successfully:
Pipeline Starts
↓
Pipeline: RUNNING
↓
Download Task: RUNNING
↓
Download Task: COMPLETED ✓
↓
Transform Task: RUNNING
↓
Transform Task: COMPLETED ✓
↓
Test Task: RUNNING
├─→ Null Check: PASSED ✓
├─→ Uniqueness Check: PASSED ✓
└─→ Freshness Check: PASSED ✓
↓
Test Task: COMPLETED ✓
↓
Pipeline: COMPLETED ✓
Pipeline Execution with Retries
Sometimes tasks need to retry. Here's what that looks like:
Task Starts
↓
Task: RUNNING (First Attempt)
↓
Temporary Issue Encountered
↓
Task: RETRYING (Attempt 2)
↓
├─→ Success → Task: COMPLETED ✓
│ Pipeline continues...
│
└─→ Issue Persists
↓
Task: RETRYING (Attempt 3)
↓
├─→ Success → Task: COMPLETED ✓
│ Pipeline continues...
│
└─→ All Retries Exhausted
↓
Task: FAILED ✗
↓
Pipeline: FAILED ✗
Reading Your Pipeline Status
When viewing your pipeline execution, remember:
- Pipeline Status = Overall Status: This tells you if your entire pipeline succeeded or failed
- Task Statuses = Progress Indicators: These show you which steps are complete and which are in progress
- Test Results = Data Quality: These tell you if your data meets your quality standards
Example: You might see:
- Pipeline: RUNNING (overall status - still in progress)
- Download: COMPLETED ✓ (step 1 done)
- Transform: RUNNING (step 2 in progress)
- Tests: Not started yet
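The relationship between task statuses and the overall status can be sketched in a few lines. DataStori computes the pipeline status for you; the hypothetical function below only illustrates the rule described in this guide, and it assumes a task that has not started is represented as None.

```python
from typing import Dict, Optional


def derive_pipeline_status(task_states: Dict[str, Optional[str]]) -> str:
    """Illustrative rule: any FAILED task fails the pipeline, the pipeline
    is COMPLETED only when every task is COMPLETED, otherwise it is RUNNING.
    A value of None means the task has not started yet (an assumption)."""
    states = list(task_states.values())
    if any(state == "FAILED" for state in states):
        return "FAILED"
    if states and all(state == "COMPLETED" for state in states):
        return "COMPLETED"
    return "RUNNING"


example = {"Download": "COMPLETED", "Transform": "RUNNING", "Tests": None}
overall = derive_pipeline_status(example)
```

For the example above, overall is RUNNING: one task is done, one is in progress, and one has not started.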
Common Scenarios
Scenario 1: Successful Pipeline Run
What you'll see:
- Pipeline status changes to RUNNING
- Each task shows RUNNING then COMPLETED as it finishes
- Data quality tests show PASSED results
- Pipeline status changes to COMPLETED
What this means: Your data was successfully transferred from source to destination, meeting all quality standards.
Scenario 2: Temporary Network Issue with Recovery
What you'll see:
- Pipeline status: RUNNING
- Download task: RUNNING
- Download task: RETRYING (first retry)
- Download task: RETRYING (second retry)
- Download task: COMPLETED ✓
- Remaining tasks complete normally
- Pipeline status: COMPLETED ✓
What this means: A temporary network issue occurred, but DataStori automatically recovered by retrying. No action needed from you.
Scenario 3: Data Quality Issue Detected
What you'll see:
- Pipeline status: RUNNING
- Download and transform tasks: COMPLETED
- Test task: RUNNING
- Null check test: FAILED (with error details)
- Pipeline status: FAILED
What this means: Your data was downloaded and transformed, but a quality check found an issue (e.g., missing required values). In this scenario the pipeline is configured to stop when a test fails, so the pipeline status is FAILED. Review the error details to understand what needs to be fixed in your source data or pipeline configuration.
Scenario 4: Freshness Warning
What you'll see:
- Pipeline completes successfully
- Most tests: PASSED
- Freshness check: WARNING (data is older than preferred but within acceptable limits)
What this means: Your data pipeline worked, but your data might be getting stale. Consider increasing sync frequency or checking your source data update schedule.
Monitoring Your Pipelines
Where to Find State Information
Your pipeline states are visible in the DataStori dashboard:
- Pipeline List View: Shows the current status of all your pipelines
- Pipeline Detail View: Shows detailed status for each task and test within a pipeline execution
- Execution History: View past pipeline runs and their final states
What to Watch For
Green Indicators (All Good):
- Pipeline: COMPLETED
- All tasks: COMPLETED
- All tests: PASSED
Yellow Indicators (Attention Recommended):
- Task: RETRYING (system is handling it, but worth monitoring)
- Test: WARNING (pipeline continues, but investigate the warning)
Red Indicators (Action Needed):
- Pipeline: FAILED
- Task: FAILED (after all retries)
- Test: FAILED
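If you export statuses and want to triage them in a script, the three groups above can be expressed as a lookup table. This is a sketch with hypothetical names; states this guide does not assign a color to (such as RUNNING) are mapped to "none" as an assumption.

```python
# Hypothetical lookup built from the green / yellow / red lists above.
INDICATORS = {
    ("pipeline", "COMPLETED"): "green",
    ("task", "COMPLETED"): "green",
    ("test", "PASSED"): "green",
    ("task", "RETRYING"): "yellow",
    ("test", "WARNING"): "yellow",
    ("pipeline", "FAILED"): "red",
    ("task", "FAILED"): "red",
    ("test", "FAILED"): "red",
}


def indicator(level, status):
    """Return the indicator color, or "none" for unlisted combinations."""
    return INDICATORS.get((level, status), "none")


def items_needing_attention(statuses):
    """Keep only the (level, status) pairs that are yellow or red."""
    flagged = []
    for level, status in statuses:
        color = indicator(level, status)
        if color in ("yellow", "red"):
            flagged.append((level, status, color))
    return flagged
```

Calling items_needing_attention on a run with a retrying task and a failed test returns just those two entries, which keeps the focus on what needs review.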
Timestamps and Duration
Each state includes timing information:
- Start time: When the pipeline or task began
- End time: When it completed or failed
- Duration: How long it took to execute
This helps you:
- Understand pipeline performance
- Identify bottlenecks
- Plan appropriate schedule frequencies
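As an illustration of how start and end times translate into durations and bottlenecks, here is a small sketch. The function names and the input format (a dictionary of task name to start and end datetimes) are assumptions for the example, not a DataStori export format.

```python
from datetime import datetime


def task_durations(tasks):
    """Map each task name to its duration in seconds (end minus start)."""
    return {
        name: (end - start).total_seconds()
        for name, (start, end) in tasks.items()
    }


def slowest_task(tasks):
    """Return the name of the task with the longest duration."""
    durations = task_durations(tasks)
    return max(durations, key=durations.get)


tasks = {
    "Download": (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 5)),
    "Transform": (datetime(2024, 1, 1, 9, 5), datetime(2024, 1, 1, 9, 25)),
    "Tests": (datetime(2024, 1, 1, 9, 25), datetime(2024, 1, 1, 9, 27)),
}
bottleneck = slowest_task(tasks)
```

In this sample run the transform step takes 20 minutes out of 27, so it is the bottleneck and the first place to look when tuning the schedule.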
Best Practices
Monitoring Your Pipelines
Check regularly: Review your pipeline statuses periodically to catch issues early.
Set up notifications: Consider setting up alerts for failed pipelines so you can respond quickly.
Review warnings: Don't ignore WARNING states - they often indicate issues that will become failures if not addressed.
Understanding Failures
When a pipeline fails:
- Read the error message: Error messages provide specific details about what went wrong
- Check the failed task: Identify which step in the pipeline encountered the issue
- Review timing: Check when the failure occurred - this can provide context
- Look for patterns: If a pipeline fails repeatedly at the same step, it indicates a consistent issue
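Looking for patterns can be as simple as counting which task failed in past runs. The sketch below assumes a hypothetical list of execution records with "status" and "failed_task" keys; it is not tied to any DataStori export format.

```python
from collections import Counter


def failure_counts_by_task(executions):
    """Count how often each task was the one that failed."""
    return Counter(
        run["failed_task"]
        for run in executions
        if run["status"] == "FAILED"
    )


history = [
    {"status": "COMPLETED", "failed_task": None},
    {"status": "FAILED", "failed_task": "Download"},
    {"status": "FAILED", "failed_task": "Download"},
    {"status": "FAILED", "failed_task": "Transform"},
]
counts = failure_counts_by_task(history)
most_frequent = counts.most_common(1)[0]
```

Here the download step accounts for two of the three failures, which points toward the source system or network rather than the transformation logic.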
Working with Retries
Automatic retries are normal: Seeing a task in RETRYING state is expected for handling temporary issues.
Persistent retries indicate problems: If you frequently see tasks requiring multiple retries, investigate the root cause:
- Source system reliability
- Network connectivity
- Data volume or complexity
When to Contact Support
Reach out to DataStori support if:
- Pipelines fail repeatedly with the same error
- Error messages are unclear or unhelpful
- You see unexpected state behavior
- Retries seem excessive or never resolve
- You need help optimizing pipeline performance
Frequently Asked Questions
Why does my pipeline show RUNNING for a long time?
This is normal for pipelines processing large amounts of data. Check the individual task states to see progress. If a specific task seems stuck, contact support.
Can I cancel a running pipeline?
Yes, you can stop a running pipeline from the dashboard. The pipeline state will update to reflect the cancellation.
What happens to my data if a pipeline fails?
DataStori ensures data consistency. If a pipeline fails:
- Partial data may be written to your destination (depending on where the failure occurred)
- You can safely rerun the pipeline
- Deduplication logic ensures no data is duplicated when rerunning
How long does DataStori retry failed tasks?
DataStori automatically retries failed tasks up to 3 times by default, with a delay between attempts. If all retries are exhausted, the pipeline fails and you'll be notified.
What's the difference between a WARNING and a FAILED test?
- WARNING: Your data passed the test but is approaching a concerning threshold. The pipeline continues running.
- FAILED: Your data didn't meet the quality standards. Depending on your configuration, the pipeline may stop.
Can I see historical pipeline states?
Yes, the dashboard maintains a history of your pipeline executions, including their states, timing, and any errors encountered.
Summary
DataStori provides comprehensive status tracking for your data pipelines through a three-level state system:
- Pipeline States: Overall execution status (RUNNING → COMPLETED/FAILED)
- Task States: Individual step progress (RUNNING → RETRYING → COMPLETED/FAILED)
- Test States: Data quality results (PASSED/FAILED/WARNING)
Key Takeaways
✅ Pipeline status is your primary indicator - it tells you if your entire data pipeline succeeded or failed
✅ Task states show you the progress of individual steps and where issues occur
✅ Automatic retries handle temporary issues without your intervention
✅ Data quality tests ensure your data meets your standards before it's marked as complete
✅ Comprehensive tracking gives you full visibility into every aspect of your pipeline execution
Next Steps
- Monitor your pipelines: Regularly check your pipeline statuses in the dashboard
- Configure data quality tests: Set up tests to ensure your data meets your requirements
- Set up alerts: Get notified when pipelines fail so you can respond quickly
- Review performance: Use timing information to optimize your pipeline schedules
For additional support, refer to our other documentation or contact the DataStori support team.