Changelog

2.9.0

minor

Jupyter Mount Extension Now FUSE-less

  • Enhancement: The Jupyter Mount Extension is now FUSE-less: it no longer requires FUSE to be installed on the host machine.
  • Enhancement: The Helm chart now uses Go CDK to handle blob storage configuration, so you can set up storage with a storage URL and pass in supported query parameters.
  • Security: Starting with the next minor release (about 4 months from 02/14/2024), HPE ML Data Management will no longer support Postgres versions older than 15. We recommend upgrading to at least Postgres 15 to ensure compatibility. If you encounter any issues, please file a support ticket.
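
To make the storage URL idea concrete, here is a minimal Python sketch that assembles a URL of the form scheme://bucket?key=value. The helper name build_storage_url, the bucket name, and the region parameter are illustrative assumptions; which schemes and query parameters the Helm chart accepts is not stated in this entry, so check the chart documentation before relying on any of them.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_storage_url(scheme: str, bucket: str, **params: str) -> str:
    """Return a URL of the form scheme://bucket?key=value&... (hypothetical helper)."""
    query = urlencode(sorted(params.items()))
    return f"{scheme}://{bucket}" + (f"?{query}" if query else "")


# Hypothetical bucket and parameter values, for illustration only.
url = build_storage_url("s3", "my-bucket", region="us-west-2")
print(url)

# Splitting the URL again shows the three parts a URL-based config carries.
parts = urlsplit(url)
print(parts.scheme, parts.netloc, parse_qs(parts.query))
```

Keeping the whole storage configuration in one string means a single value can be passed around instead of several provider-specific fields.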

2.8.4

patch

General Updates

  • Enhancement: The Console UI now has UX improvements that better surface the health of your project by providing a quick, searchable dropdown of pipeline and job errors. Users can now also search and filter against their previous jobs.

2.8.3

patch

Pipeline Orchestration Features in Console

  • Feature: You can now start, stop, and duplicate pipelines in Console.

  • Feature: You can now set project defaults that are passed down to all pipeline specs within a given project. These defaults provide a consistent experience for your data scientists and help manage your cluster. You can manage defaults via the PachCTL CLI or within Console.

  • Enhancement: The “Set Active Project” dialogue in the Console UI has been removed since the action must be performed via the CLI.

2.8.0

minor

Pipeline CRUD in Console & More

  • Feature: You can now create and manage pipelines in Console! To showcase this, we’ve added Console steps to all of our tutorials.
  • Feature: You can now set global defaults for your cluster that are passed down to all pipeline specs. These defaults provide a consistent experience for your data scientists and help manage your cluster. You can manage defaults via the PachCTL CLI or within Console.
  • Beta: You can now try out a beta version of our Unified Deployment experience with Determined.
  • Update: Branch triggers now require the trigger branch to exist before you add a --trigger setting to the target branch.
  • Enhancement: All pipeline specification references have been standardized to use camelCase format; use this format going forward when creating pipeline specifications.
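
For existing specifications written in snake_case, the following Python sketch shows one way to rename keys to camelCase. It is a generic helper, not part of PachCTL or the SDK: to_camel and camelize_keys are hypothetical names, and the sample fields are only examples.

```python
from typing import Any


def to_camel(name: str) -> str:
    """Convert a snake_case key such as 'parallelism_spec' to 'parallelismSpec'."""
    head, *rest = name.split("_")
    return head + "".join(word[:1].upper() + word[1:] for word in rest)


def camelize_keys(value: Any) -> Any:
    """Recursively rename dictionary keys; leave all values untouched."""
    if isinstance(value, dict):
        return {to_camel(k): camelize_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [camelize_keys(item) for item in value]
    return value


# Illustrative spec fragment; the field names are examples only.
spec = {
    "pipeline": {"name": "edges"},
    "parallelism_spec": {"constant": 2},
    "input": {"pfs": {"repo": "images", "glob": "/*"}},
}
print(camelize_keys(spec))
```

Note that this sketch renames every key it finds, including keys of user-defined maps (for example, environment variable names), so review the result before using it.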

2.7.5

patch

General Updates

  • Fix: Fixed a bug where the pipeline status did not display in the Console DAG when a Global ID was applied.
  • Fix: Fixed a bug where input repos were not being listed in a Pipeline’s details sidebar.
  • Enhancement: You can now add commit messages for file uploads in the Console UI.
  • Security: Added security enhancements to prevent HTTP/2 Stream Cancellation Attacks.

2.7.4

patch

Expanded Filetype Preview Support

  • Enhancement: Console has expanded download preview support for more filetypes, such as HTML/XML. Users can also now click “View Raw” to preview any unsupported filetypes. The following table lists all supported filetypes:

    | Type | Extensions | Preview |
    | --- | --- | --- |
    | markdown | .markdn, .markdown, .md, .mdown, .mdwn, .mkdn, .mkdown, .mkd | preview and view source |
    | html | .html, .htm | preview and view source |
    | xml | .xml, .xsl | preview and view source |
    | code | .yaml, .yml, .json, .jsonl, .py, .js, .jsx, .ts, .tsx, .cjs, .mjs, .c, .cpp, .java, .php, .rs, .sql, .go, .sh, .jl, .rb, Dockerfile, .css | view source |
    | text | .text, .txt, .textpb | view source |
    | csv | .csv, .tsv, .tab | preview and view source |
    | image | .jpeg, .jpg, .jfif, .jif, .jpe, .pjpg, .png, .apng, .gif, .avif, .avifs, .webp, .bmp, .ico, .tiff (Safari only), .tif (Safari only) | preview |
    | svg | .svg, .svgz | preview and view source |
    | video | .mpg, .mpeg, .mpe, .m1v, .m2v, .mpa, .mp4, .mp4v, .mpg4, .avi, .wmv, .mov, .qt, .rm, .ra, .ram, .webm | preview based on your browser/OS |
    | audio | .mp3, .m2a, .m3a, .mp2, .mp2a, .mpga, .wav, .ogg, .oga, .spx | preview based on your browser/OS |
  • Enhancement: Timestamps in Console now use 0-23 hour notation.
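
As a quick illustration of the notation change, this Python sketch formats the same moment with a 12-hour clock and with a 0-23 hour clock. The exact timestamp format Console uses is not stated in this entry; the format strings below are assumptions chosen only to show the difference.

```python
from datetime import datetime

moment = datetime(2023, 10, 5, 15, 4, 0)  # arbitrary example time

twelve_hour = moment.strftime("%I:%M %p")    # 12-hour clock with an AM/PM marker
twenty_four_hour = moment.strftime("%H:%M")  # 0-23 hour clock, no marker needed

print(twelve_hour)
print(twenty_four_hour)
```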

2.7.3

patch

Console DAG Improvements

  • Enhancement: The Egress node now displays as much of the URL as possible; you can find the full URL in the Spec tab.
  • Feature: A new case for the Node component has been created to display connected projects and repos. These nodes are clickable and will route you to the appropriate resource.

2.7.2

patch

General Bug Fixes

  • Fix: The Community Edition of Console now shows the pipeline count across projects in the banner.
  • Fix: Previously, when a user’s active context was still set to a deleted project, the terminal returned an error; it now returns a warning instead.
  • Enhancement: The Pachyderm SDK now catches serialization errors and converts them into human-readable errors.

2.7.1

patch

General Bug Fixes

  • Fix: Resolved an issue with the pachyderm-sdk’s debug.dump method where it was using the incorrect argument type.
  • Fix: Full timestamps in pachctl have been standardized for consistency.
  • Fix: Resolved some issues in the Console log viewer UI.
  • Enhancement: Tooltips have been added to Console for file table actions (Download, Delete).

2.7.0

minor

New Pachyderm SDK for Python

  • Feature: The new Pachyderm SDK is now available. Check out the reference documentation, install guide, and example starter project.
  • Feature: Console now has a runtime visualization for jobs in your pipeline.
  • Feature: The documentation site now has a chatbot to help you find what you’re looking for. This feature is in beta, so please let us know if you have any feedback through our Slack community.
  • Feature: HPE ML Data Management’s Helm chart now has a section for preflight checks, allowing you to easily validate whether the upgrade/migrations will be successful. This section can be found at pachd.preflightchecks. Simply set enabled: true and set the image.tag to the new version you want to upgrade to. If the created pod named pachyderm-preflight-check shows a status of Completed, you are ready to perform the upgrade. See the Upgrade steps for more information.
  • Enhancement: Console’s scalability has been improved to handle more concurrent users (50+) and power users who have many pipelines.
  • Enhancement: Console’s DAG visualization has been upgraded to include more information about the state of your pipelines.
  • Enhancement: The JupyterLab Pipeline Specification Extension now supports GPUs.
  • Refactor: The functionality of the Branch Cron Trigger has been refactored to work more intuitively. Previously, cron triggers functioned more like rate limiters; now, they enable you to set up a scheduled recurring event on a repo branch that evaluates and fires the trigger. When a Cron Trigger fires, but no new data has been added, there are no new downstream commits or jobs. See our Cron glossary entry for more information on crons in HPE ML Data Management.
  • Deprecation: The original Python SDK (python-pachyderm) will be deprecated in 9 months (May 2024). We recommend that you start trying out the new Pachyderm SDK (pachyderm-sdk) and begin planning your transition.
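
The refactored cron trigger behavior described above can be pictured with a toy Python model: the trigger is evaluated at every scheduled tick, but a downstream commit is created only when the source branch has new data. This is a simplified sketch under stated assumptions, not the product's implementation; the class name CronTriggerSim and the commit IDs are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CronTriggerSim:
    """Toy model of a scheduled trigger; not the product's implementation."""

    last_propagated: Optional[str] = None
    downstream_commits: List[str] = field(default_factory=list)

    def fire(self, source_head: Optional[str]) -> bool:
        """Evaluate the trigger at a scheduled tick.

        Returns True if a downstream commit was created.
        """
        if source_head is None or source_head == self.last_propagated:
            return False  # no new data since the last firing: nothing happens
        self.last_propagated = source_head
        self.downstream_commits.append(source_head)
        return True


trigger = CronTriggerSim()
# Head commit of the source branch at each scheduled tick (made-up IDs).
heads_at_ticks = ["c1", "c1", "c2", "c2", "c2", "c3"]
results = [trigger.fire(head) for head in heads_at_ticks]
print(results)
print(trigger.downstream_commits)
```

In this model, six ticks produce only three downstream commits, one for each distinct head of the source branch.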

2.6.9

patch

Preflight Checks

  • Feature: The preflight check features for 2.7.0 have been backported to 2.6.9. HPE ML Data Management’s Helm chart now has a section for preflight checks, allowing you to easily validate whether the upgrade/migrations will be successful via a dry run. This section can be found at pachd.preflightchecks. Simply set enabled: true and set the image.tag to the new version you want to upgrade to. If the created pod named pachyderm-preflight-check shows a status of Completed, you are ready to perform the upgrade. See the Upgrade steps for more information.
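
As a rough sketch of the settings described above, the Python snippet below flattens a nested values dictionary into --set arguments of the kind accepted by helm upgrade. The key pachd.preflightchecks.enabled is copied from the entry above; placing the tag at pachd.image.tag, the target version 2.7.0, and the helper name to_set_flags are assumptions made for illustration, so confirm the key paths against the chart's values.yaml.

```python
from typing import Any, Dict, List


def to_set_flags(values: Dict[str, Any], prefix: str = "") -> List[str]:
    """Flatten nested values into Helm-style --set key=value arguments."""
    flags: List[str] = []
    for key, value in values.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flags.extend(to_set_flags(value, path))
        elif isinstance(value, bool):
            # YAML booleans are lowercase.
            flags.extend(["--set", f"{path}={str(value).lower()}"])
        else:
            flags.extend(["--set", f"{path}={value}"])
    return flags


# Assumed layout: the exact location of image.tag is not confirmed by this entry.
values = {
    "pachd": {
        "preflightchecks": {"enabled": True},
        "image": {"tag": "2.7.0"},  # example target version
    }
}
print(" ".join(to_set_flags(values)))
```

The printed arguments can be appended to the upgrade command you already use; editing values.yaml directly achieves the same result.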

2.6.8

patch

New PachD Probe Settings

  • Enhancement: GPU support has been added to the Jupyter Pipeline Extension.
  • Enhancement: When you create a service pipeline via the Jupyter Extension, the IP address of the created load balancer is now shown.
  • Fix: Fixed an issue with branch triggers where filtering by source branch was not working as expected.

2.6.7

patch

New PachD Probe Settings

  • Enhancement: You can now create service pipelines via the Jupyter Extension.
  • Enhancement: The logs viewer in Console has been improved to be more performant.

2.6.6

patch

New PachD Probe Settings

  • Enhancement: The DAG view in Console has been refreshed to have a new look and feel.
  • Refactor: Cron Triggers have been refactored to perform as expected: you can set up a scheduled recurring event on a repo branch that evaluates and fires the trigger. When a Cron Trigger fires, but no new data has been added, there are no new downstream commits or jobs.
  • Fix: Fixed an issue in Console for users using Safari where the scrollbar wasn’t working as expected.

2.6.5

patch

New PachD Probe Settings

  • Enhancement: Additional PachD Probe settings (readinessProbe, livenessProbe, and startupProbe) have been added to the Helm Chart values.yaml file for added flexibility. If not specified, the default values are used.
  • Enhancement: General improvements to the upgrade process from 2.4.x to 2.6.5.

2.6.4

patch

General Enhancements

  • Enhancement: Provenance migration for 2.6 now skips migrating any missing provenance.
  • Enhancement: The Jupyter Pipeline Extension’s entrypoint has been set to run unbuffered so that output is immediately written to stdout.
  • Enhancement: A returned_for field has been added to roles. This field indicates the resources the role is returned for, even if the role wasn’t initially bound to those specific resources.

2.6.3

patch

Bugfix for 2.6 Upgrade Error

  • Fix: Resolved an issue where some users upgrading from 2.5.0 to 2.6.x would experience the following error: duplicate key value violates unique constraint 'commit_totals_pkey'.

2.6.2

patch

JupyterLab Mount Extension Projects Feature Improvements

  • Enhancement: Projects without repos are now listed within the JupyterLab Mount Extension UI.
  • Enhancement: A Project field has been added to the JupyterLab Pipeline Extension. Previously, you would define the project using the Pipeline Name field (for example, project/pipeline). The default value for the Project field is the ‘default’ project.

2.6.1

patch

General Enhancements & Bug Fixes

  • Enhancement: Improved handling of symlinks with nullptr check.
  • Enhancement: You can now set the name of your connection when using pachctl connect.
  • Fix: Resolved an issue where pachctl list commit was showing inconsistent sizes when used with or without a branch.
  • Fix: Resolved an issue that caused the file browser in Pachyderm’s PPS extension to unexpectedly jump back to the top level while inspecting notebook outputs within the /pfs/out directory.

2.6.0

minor

Datum Batching, JupyterLab Pipeline Extension, & Projects RBAC updates

  • Feature: Datum Batching is now available. Datum Batching is a performance optimization process that enables processing multiple datums sequentially.
  • Feature: The JupyterLab Pipeline Extension (PPS Extension) is now available, allowing users to push notebook code directly into a pipeline to create and run it. This feature is in Alpha, so we encourage you to share your feedback with us as you use it.
  • Enhancement: New RBAC roles have been added to Projects: ProjectViewerRole, ProjectWriterRole, ProjectOwnerRole, and ProjectCreatorRole. See the documentation for details on each role.
  • Enhancement: The Console UI has undergone some substantial improvements, including a revamped file browser and more detailed information about pipeline and job performance.
  • Enhancement: The Documentation site has undergone a substantial information architecture overhaul, making it easier to find the information you need. Content is now stored in top-level folders that follow the natural progression of learning about and using HPE ML Data Management.

2.5.8

patch

General Updates

2.5.7

patch

DB Call Retries

  • Enhancement: We now wrap database calls in retries to catch connection flakiness.
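
The general technique of wrapping a call in retries can be sketched in Python as follows. This is not the project's implementation: the function name with_retries, the choice of ConnectionError as the transient error, and the backoff parameters are all assumptions made for illustration.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on transient errors with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff with a little jitter between attempts.
            sleep(base_delay * 2 ** (attempt - 1) * (1 + random.random()))
    raise ValueError("attempts must be at least 1")


calls = {"count": 0}


def flaky_query() -> str:
    """Stand-in for a database call that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("connection reset")
    return "ok"


# Pass a no-op sleep so the example runs instantly.
print(with_retries(flaky_query, sleep=lambda _: None))
```

Only errors listed in retry_on are retried; anything else propagates immediately, which keeps genuine query bugs from being masked as flakiness.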

2.5.6

patch

Get File URL Updates

  • Fix: Corrected “Get File URL” functionality by removing leading slashes from the output path and restoring support for output path prefixes, which was available prior to 2.5.0.
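
The path handling described in this fix can be illustrated with a small Python sketch that strips leading slashes from a file path and joins it to an optional prefix. The function name output_object_path and the sample paths are hypothetical; the actual “Get File URL” logic is not shown in this changelog.

```python
import posixpath


def output_object_path(file_path: str, prefix: str = "") -> str:
    """Join an optional prefix and a file path without leading slashes."""
    cleaned = file_path.lstrip("/")
    if not prefix:
        return cleaned
    return posixpath.join(prefix.strip("/"), cleaned)


# Made-up paths, for illustration only.
print(output_object_path("/dir/file.txt"))
print(output_object_path("/dir/file.txt", prefix="exports/run1/"))
```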

2.5.5

patch

Auth Migration and S3 Input Fixes

  • Fix: Resolved an issue with “Get File URL” when authentication is enabled by modifying its functionality to operate on file sets instead of commits, leveraging the capability-based authentication of the file set API.
  • Fix: Resolved an issue where input files from S3 were being downloaded by the worker’s storage container, causing worker pods to be evicted due to disk pressure. Input files from S3 are no longer downloaded.
  • Enhancement: Previously, if a user turned off authentication before upgrading, the auth_tokens table would not get migrated. Now, if a user turns off authentication before upgrading, the auth_tokens table will still get migrated.

For releases older than 2.6.0, check out the full release changelog on GitHub.