Last updated

Pipelines and Automation: Building Smart Workflows 🔄

Learn how to create and manage pipelines in Dataloop - your key to automating workflows and data processing.

Getting Started with Pipelines 🚀

1. Creating a Pipeline

# Create a new pipeline
pipeline = project.pipelines.create(name='My-First-Pipeline')

# Print pipeline details
print(pipeline)

2. Getting Existing Pipelines

# Get pipeline by ID
pipeline = project.pipelines.get(pipeline_id='<pipeline_id>')

# List all pipelines in project
project.pipelines.list()

3. Pipeline Execution

# Execute pipeline
pipeline_execution = project.pipelines.execute(
    pipeline='pipeline_entity',
    execution_input={'item': 'item_id'}
)

# Alternative execution method
pipeline_execution = pipeline.pipeline_executions.create(
    pipeline_id='pipeline_id',
    execution_input={'item': 'item_id'}
)

Pipeline Management 📋

1. Basic Operations

# Delete a pipeline
is_deleted = project.pipelines.delete(pipeline_id='<pipeline_id>')

# Open pipeline in web UI
project.pipelines.open_in_web(pipeline_id='<pipeline_id>')

# Pause pipeline
project.pipelines.pause(pipeline='pipeline_entity')

# Reset pipeline
project.pipelines.reset(pipeline='pipeline_entity')

2. Pipeline Monitoring

# Get pipeline statistics
project.pipelines.stats(pipeline='pipeline_entity')

# Get pipeline execution object
pipeline_executions = pipeline.pipeline_executions.get(pipeline_id='pipeline_id')

# List project pipeline executions
pipeline.pipeline_executions.list()

Best Practices 👑

1. Pipeline Organization

  • Use clear, descriptive pipeline names
  • Document pipeline purpose and configuration
  • Keep track of pipeline versions
  • Monitor pipeline executions

2. Error Prevention

# Validate pipeline before operations
try:
    pipeline = project.pipelines.get(pipeline_id='pipeline_id')
    # Proceed with operations
except dl.exceptions.NotFound:
    print("Pipeline not found!")

3. Resource Management

  • Monitor pipeline statistics regularly
  • Clean up unused pipelines
  • Document pipeline configurations
  • Test pipelines before production use

Pro Tips 💡

  1. Pipeline Design

    • Plan pipeline flow before creation
    • Use meaningful node names
    • Document pipeline inputs and outputs
    • Consider error handling at each step
  2. Execution Management

    • Monitor pipeline executions
    • Handle execution errors gracefully
    • Keep track of execution statistics
    • Document common issues and solutions
  3. Pipeline Maintenance

    • Regularly check pipeline status
    • Update pipeline configurations as needed
    • Monitor resource usage
    • Document pipeline changes

Ready to explore integrations and APIs? Let's move on to the next chapter! 🚀