Learn about the concept of a glob pattern, which is a string of characters that specifies a set of filenames or paths in a file system.
November 29, 2023
A glob pattern is a string of characters that specifies a set of filenames or paths in a file system. The term “glob” is short for “global,” and refers to the fact that a glob pattern can match multiple filenames or paths at once. In HPE ML Data Management, you use glob patterns to define how your input data is divided into datums, which are then spread across HPE ML Data Management workers for distributed computing.
|Glob Pattern|Datum created|
|---|---|
|`/`|HPE ML Data Management denotes the whole repository as a single datum and sends all input data to a single worker node to be processed together.|
|`/*`|HPE ML Data Management defines each top-level file or directory in the input repo as a separate datum. For example, if you have a repository with ten files and no directory structure, HPE ML Data Management identifies each file as a single datum and processes them independently.|
|`/*/*`|HPE ML Data Management processes each file or directory in each subdirectory as a separate datum.|
|`/**`|HPE ML Data Management processes each file in all directories and subdirectories as a separate datum.|
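To make the table concrete, the sketch below models the four cases, taking the patterns to be `/`, `/*`, `/*/*`, and `/**`. It is a simplified illustration, not the matcher HPE ML Data Management actually uses; the function name `group_into_datums` and the sample paths are hypothetical.

```python
from collections import defaultdict

# Hypothetical input repo contents, used only for illustration.
EXAMPLE_PATHS = [
    "/a.csv",
    "/images/cat.png",
    "/images/dog.png",
    "/logs/2023/jan.txt",
]

# Directory depth at which each fixed-depth pattern splits the repo.
_DEPTH = {"/*": 1, "/*/*": 2}


def group_into_datums(paths, glob):
    """Group file paths into datums under a simplified model of the table.

    Returns a dict mapping each datum's matched path to the files it holds.
    Only the four patterns from the table are supported.
    """
    ordered = sorted(paths)
    if glob == "/":
        # The whole repository is a single datum.
        return {"/": ordered}
    if glob == "/**":
        # Every file, at any depth, is its own datum.
        return {p: [p] for p in ordered}
    if glob not in _DEPTH:
        raise ValueError(f"unsupported pattern in this sketch: {glob!r}")

    depth = _DEPTH[glob]
    datums = defaultdict(list)
    for path in ordered:
        parts = path.strip("/").split("/")
        # A path shallower than the pattern cannot match it.
        if len(parts) >= depth:
            key = "/" + "/".join(parts[:depth])
            datums[key].append(path)
    return dict(datums)


if __name__ == "__main__":
    for pattern in ("/", "/*", "/*/*", "/**"):
        datums = group_into_datums(EXAMPLE_PATHS, pattern)
        print(pattern, "->", len(datums), "datum(s):", list(datums))
```

In this model, a directory matched by `/*` or `/*/*` becomes one datum that carries every file beneath it, which is why coarser patterns yield fewer, larger units of work.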
Glob patterns can also use other special characters, such as the question mark (`?`) to match a single character, or brackets (`[...]`) to match a set of characters.
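The behavior of the question mark and brackets can be tried out with Python's standard `fnmatch` module, as in the minimal sketch below. This uses Python's shell-style matcher rather than the one in HPE ML Data Management, so edge cases may differ; the helper `match_names` and the file names are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, used only for illustration.
NAMES = ["data1.csv", "data2.csv", "data3.csv", "data10.csv"]


def match_names(names, pattern):
    """Return the names that match a shell-style pattern, case-sensitively."""
    return [name for name in names if fnmatchcase(name, pattern)]


if __name__ == "__main__":
    # "?" matches exactly one character, so "data10.csv" is excluded.
    print(match_names(NAMES, "data?.csv"))
    # "[12]" matches one character from the listed set.
    print(match_names(NAMES, "data[12].csv"))
    # "[1-3]" matches one character from the given range.
    print(match_names(NAMES, "data[1-3].csv"))
```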