A pipeline is an HPE ML Data Management primitive responsible for reading data from a specified source, such as an HPE ML Data Management repo, transforming it according to the pipeline specification, and writing the result to an output repo.
Pipelines subscribe to a branch in one or more input repositories, and every time the branch has a new commit, the pipeline executes a job that runs user code to completion and writes the results to a commit in the output repository.
Pipelines are defined declaratively in a JSON or YAML file (the pipeline specification), which must, at a minimum, include the transform parameters. Pipelines can be chained together to form a directed acyclic graph (DAG).
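A minimal specification might look like the following sketch in YAML. The pipeline, repo, and image names here are illustrative, not taken from the text above; the spec declares the input repo the pipeline subscribes to and the transform to run on each new commit.

```yaml
# Hypothetical minimal pipeline spec; "word-count", "text-data",
# and the image are placeholder names for illustration.
pipeline:
  name: word-count
input:
  pfs:
    repo: text-data          # input repo whose branch the pipeline subscribes to
    glob: "/*"               # process each top-level file as a separate datum
transform:
  image: alpine:3.19         # container image the user code runs in
  cmd: ["sh", "-c", "wc -w /pfs/text-data/* > /pfs/out/counts.txt"]
```

Input data is mounted read-only under `/pfs/<repo>`, and anything the transform writes to `/pfs/out` becomes the commit in the pipeline's output repo. Chaining pipelines into a DAG is then a matter of naming one pipeline's output repo as another pipeline's input.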