Deploy and run pipelines

This page describes the basics of deploying and running pipelines in Cloud Data Fusion.

Deploy pipelines

After you finish designing and debugging a data pipeline and are satisfied with the data you see in Preview, you're ready to deploy the pipeline.

When you deploy the pipeline, the Cloud Data Fusion Studio creates the workflow and corresponding Apache Spark jobs in the background.

Run pipelines

After you deploy a pipeline, you can run it in the following ways (a programmatic sketch follows the list):

  • To run a pipeline on demand, open a deployed pipeline and click Run.
  • To schedule the pipeline to run at a certain time, open a deployed pipeline and click Schedule.
  • To trigger the pipeline when another pipeline completes, open a deployed pipeline and click Incoming triggers.
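
You can also start a deployed pipeline outside the Studio through the REST API that Cloud Data Fusion exposes (the CDAP API). The following Python sketch is a minimal illustration, not a definitive recipe: the endpoint URL, namespace, and pipeline name are placeholder assumptions, and the access token is expected to come from a command such as `gcloud auth print-access-token`.

```python
# A minimal sketch of starting a deployed pipeline through the REST API that
# Cloud Data Fusion exposes (the CDAP API). The endpoint, namespace, and
# pipeline name below are placeholder assumptions.
import requests

CDAP_ENDPOINT = "https://example-dot-usw1.datafusion.googleusercontent.com/api"  # assumed instance apiEndpoint
ACCESS_TOKEN = "ya29...."  # e.g. output of `gcloud auth print-access-token`
NAMESPACE = "default"
PIPELINE = "my-pipeline"   # hypothetical deployed pipeline name

# Batch pipelines deploy as a workflow named DataPipelineWorkflow, so the
# start request targets that program.
url = (f"{CDAP_ENDPOINT}/v3/namespaces/{NAMESPACE}/apps/{PIPELINE}"
       f"/workflows/DataPipelineWorkflow/start")
resp = requests.post(url, headers={"Authorization": f"Bearer {ACCESS_TOKEN}"})
resp.raise_for_status()
print("Run requested:", resp.status_code)
```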

The Pipeline Studio saves a pipeline's history each time it runs. You can toggle between different runtime versions of the pipeline.

If the pipeline has macros, set the runtime arguments for each macro. You can also review and change the pipeline configuration before running the deployed pipeline.

As the pipeline runs, you can watch the status change through the phases of the run, such as Provisioning, Starting, Running, and Succeeded. You can also stop the pipeline at any time.
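
If you start the pipeline through the API instead of the Studio, runtime arguments for macros can be passed as the JSON body of the same start call. The sketch below also polls run status; it reuses the placeholder endpoint, token, and names from the previous sketch, and the macro name `input.path` is hypothetical. Note that the statuses the API returns (for example, RUNNING or COMPLETED) are spelled differently from the Studio labels.

```python
# A sketch of supplying runtime arguments for macros and polling run status.
# Endpoint, token, pipeline name, and macro name are assumptions.
import requests

CDAP_ENDPOINT = "https://example-dot-usw1.datafusion.googleusercontent.com/api"  # assumed
ACCESS_TOKEN = "ya29...."  # e.g. from `gcloud auth print-access-token`
HEADERS = {"Authorization": f"Bearer {ACCESS_TOKEN}"}
BASE = (f"{CDAP_ENDPOINT}/v3/namespaces/default/apps/my-pipeline"
        f"/workflows/DataPipelineWorkflow")

# Runtime arguments for macros go in the JSON body of the start request.
runtime_args = {"input.path": "gs://my-bucket/input/"}  # hypothetical macro value
requests.post(f"{BASE}/start", headers=HEADERS, json=runtime_args).raise_for_status()

# Each run record carries a status; the API reports values such as
# STARTING, RUNNING, and COMPLETED (the Studio shows friendlier labels).
runs = requests.get(f"{BASE}/runs", headers=HEADERS).json()
if runs:
    latest = runs[0]
    print(latest["runid"], latest["status"])
    # To stop a run that is in flight:
    # requests.post(f"{BASE}/runs/{latest['runid']}/stop", headers=HEADERS)
```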

If you enable instrumentation, you can explore the metrics generated by the pipeline by clicking Properties on any node in your pipeline, such as a source, transformation, or sink.
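
The same metrics are also reachable programmatically through the CDAP metrics query endpoint. The following is a hedged sketch: the metric name (records out of a stage assumed to be named File) is an illustration only, since actual metric names depend on the plugins in your pipeline.

```python
# A hedged sketch of reading pipeline metrics through the CDAP metrics query
# endpoint. Endpoint, token, pipeline name, and metric name are assumptions.
import requests

CDAP_ENDPOINT = "https://example-dot-usw1.datafusion.googleusercontent.com/api"  # assumed
ACCESS_TOKEN = "ya29...."  # e.g. from `gcloud auth print-access-token`

params = [
    ("tag", "namespace:default"),
    ("tag", "app:my-pipeline"),           # hypothetical pipeline name
    ("metric", "user.File.records.out"),  # hypothetical stage metric
    ("aggregate", "true"),
]
resp = requests.post(f"{CDAP_ENDPOINT}/v3/metrics/query",
                     headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
                     params=params)
resp.raise_for_status()
print(resp.json())
```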

For more information about the pipeline runs, click Summary.

What's next