Transform PPS

Set the name of the Docker image that your jobs use.

December 4, 2023

Spec

This is a top-level attribute of the pipeline spec.

{
    "pipeline": {...},
    "transform": {
        "image": string,
        "cmd": [ string ],
        "datumBatching": bool,
        "errCmd": [ string ],
        "env": {
            string: string
        },
        "secrets": [ {
            "name": string,
            "mountPath": string
        },
        {
            "name": string,
            "envVar": string,
            "key": string
        } ],
        "imagePullSecrets": [ string ],
        "stdin": [ string ],
        "errStdin": [ string ],
        "acceptReturnCode": [ int ],
        "debug": bool,
        "user": string,
        "workingDir": string,
        "dockerfile": string,
        "memoryVolume": bool
    },
    ...
}
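
As a rough illustration of how the schema above might be filled in, the fragment below sets only a few of the transform fields. The pipeline name, image tag, environment variable, and shell lines are hypothetical placeholders, not values from this page, and the other top-level attributes of a complete pipeline spec are left out.

```json
{
  "pipeline": {
    "name": "example-pipeline"
  },
  "transform": {
    "image": "ubuntu:22.04",
    "cmd": [ "bash" ],
    "stdin": [
      "set -euo pipefail",
      "echo \"processing started\""
    ],
    "env": {
      "LOG_LEVEL": "info"
    },
    "workingDir": "/tmp"
  }
}
```

Here cmd starts a shell and stdin supplies the lines that the shell reads, which is one way to run a short script without building it into the image.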

Attributes

| Attribute | Description |
| --- | --- |
| cmd | Passes a command to the Docker run invocation. |
| datumBatching | Enables you to call your user code once for a batch of datums instead of once per datum. |
| stdin | Passes an array of lines to your command on stdin. |
| errCmd | Passes a command executed on failed datums. |
| errStdin | Passes an array of lines to your error command on stdin. |
| env | Enables a key-value map of environment variables that HPE ML Data Management injects into the container. |
| secrets | Passes an array of secrets to embed sensitive data. |
| imagePullSecrets | Passes an array of secrets that are mounted before the containers are created. |
| acceptReturnCode | Passes an array of return codes that are considered acceptable when your Docker command exits. |
| debug | Enables debug logging for the pipeline. |
| user | Sets the user that your code runs as. |
| workingDir | Sets the directory that your command runs from. |
| memoryVolume | Sets pachyderm-worker’s emptyDir.Medium to Memory, allowing Kubernetes to mount a memory-backed volume (tmpfs). |
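
The following sketch combines the error-handling and secret attributes in one transform. All names and values (image, script path, secret names, mount path, environment variable names, key, and the return code 3) are assumptions made for illustration. It follows the two secret shapes shown in the spec: one entry with a mountPath, and one entry with an envVar and key.

```json
{
  "transform": {
    "image": "example-registry/example-image:1.0",
    "cmd": [ "python3", "/app/main.py" ],
    "errCmd": [ "bash" ],
    "errStdin": [ "echo \"datum failed\" >&2" ],
    "acceptReturnCode": [ 3 ],
    "env": {
      "MODE": "batch"
    },
    "secrets": [
      {
        "name": "example-tls",
        "mountPath": "/etc/example-tls"
      },
      {
        "name": "example-db",
        "envVar": "DB_PASSWORD",
        "key": "password"
      }
    ]
  }
}
```

In this sketch, an exit code of 3 from the main command would be treated as acceptable, and the error command would run only for datums that fail.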

Behavior

💡 **Using a private registry?**

You can use imagePullSecrets to mount a secret that contains your registry credentials.

{
  "pipeline": {
    "name": "pipeline-a"
  },
  "description": "...",
  "transform": {
    "cmd": [ "python3", "/example.py" ],
    "image": "<private container registry>/image:1.0",
    "imagePullSecrets": [ "k8s-secret-with-creds" ]
  },
  ...
}
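
The secret named in imagePullSecrets has to exist in Kubernetes before the pipeline is created. A minimal sketch of such a secret, written as a JSON manifest and assuming the standard kubernetes.io/dockerconfigjson secret type, could look like the following. The data value is a placeholder for your own base64-encoded Docker config, and the secret is assumed to live in the same namespace as the pipeline workers.

```json
{
  "apiVersion": "v1",
  "kind": "Secret",
  "metadata": {
    "name": "k8s-secret-with-creds"
  },
  "type": "kubernetes.io/dockerconfigjson",
  "data": {
    ".dockerconfigjson": "<base64-encoded Docker config JSON>"
  }
}
```

The metadata name matches the string listed in imagePullSecrets in the pipeline spec above.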

When to Use

You must always use the transform attribute when making a pipeline.