Announcing Comet Artifacts
Introducing Comet Artifacts
Comet Artifacts is a new set of tools that gives ML teams a convenient way to log, version, and browse data from all parts of their experimentation pipelines.
Machine learning typically involves experimenting with different models, hyperparameters, and dataset versions.
In addition to the metrics and parameters that are being measured and tested, machine learning also involves keeping track of the inputs and outputs produced by an experiment. An experiment run can produce all sorts of interesting output data. These data artifacts can be files containing model predictions, model weights, and much more.
Often, the outputs of one experiment become the inputs of another. Without the right structure or a single source of truth, these dependencies quickly become hard to track.
We built Comet Artifacts to solve these specific challenges.
What Are Artifacts?
An Artifact is a versioned object, where each version is an immutable snapshot of files and assets, arranged in a folder-like logical structure. Each snapshot can be identified and described using metadata, a version number, tags, and aliases. A version also records which experiment produced it and which experiments consumed it.
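To make the definition above concrete, here is a minimal in-memory sketch of the idea. It is a toy model for illustration only, not Comet's implementation or API: the class names, the integer version numbers, the automatic "latest" alias, and the experiment keys are all assumptions made for this example.

```python
import hashlib
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Dict, List, Mapping, Set, Union


@dataclass(frozen=True)
class ArtifactVersion:
    """One immutable snapshot: logical path -> content digest (toy model)."""
    version: int
    files: Mapping[str, str]
    produced_by: str  # hypothetical key of the experiment that logged it
    metadata: Mapping[str, str] = field(default_factory=dict)

    def __post_init__(self):
        # Wrap the mappings in read-only views so the snapshot cannot be edited.
        object.__setattr__(self, "files", MappingProxyType(dict(self.files)))
        object.__setattr__(self, "metadata", MappingProxyType(dict(self.metadata)))


@dataclass
class Artifact:
    """A named, typed series of snapshots plus alias and consumer records."""
    name: str
    artifact_type: str
    versions: List[ArtifactVersion] = field(default_factory=list)
    aliases: Dict[str, int] = field(default_factory=dict)         # alias -> version
    consumers: Dict[int, Set[str]] = field(default_factory=dict)  # version -> experiments

    def log_version(self, files: Dict[str, bytes], produced_by: str) -> ArtifactVersion:
        # Store digests rather than raw bytes; each call creates a new version.
        digests = {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}
        snapshot = ArtifactVersion(len(self.versions) + 1, digests, produced_by)
        self.versions.append(snapshot)
        self.aliases["latest"] = snapshot.version  # assumed convention for this sketch
        return snapshot

    def get(self, version_or_alias: Union[int, str], consumed_by: str) -> ArtifactVersion:
        # Resolve an alias to a version number, then record who consumed it.
        number = self.aliases.get(version_or_alias, version_or_alias)
        snapshot = self.versions[number - 1]
        self.consumers.setdefault(number, set()).add(consumed_by)
        return snapshot
```

In this sketch the snapshot itself never changes after it is logged; only the surrounding bookkeeping (aliases and the list of consumers) is updated over time.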
This means that with Artifacts, you can structure your experiments as multi-stage pipelines or DAGs (Directed Acyclic Graphs), and ensure centralized, managed and versioned access to any of the intermediate data produced in the process.
Specifically, Artifacts enable you and your team to:
- Reuse data produced by intermediate or exploratory steps in experimentation pipelines, and allow it to be tracked, versioned, consumed, and analyzed in a managed way.
- Track and reproduce complex multi-experiment scenarios, where the output of one experiment is used as the input to another.
- Iterate on datasets over time, track which model used which version of the dataset, and schedule model re-training.
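The points above rely on knowing which experiment produced and consumed each artifact version. As a rough sketch of how such records turn a set of experiments into a DAG, the example below derives a run order and finds the experiments that would need re-running when an artifact changes. The experiment keys, the "name:version" strings, and the layout of the records are hypothetical; this is not how Comet stores or exposes lineage.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical lineage records: what each experiment consumed and produced.
runs = {
    "exp-preprocess": {"consumes": [], "produces": ["clean-data:1"]},
    "exp-train": {"consumes": ["clean-data:1"], "produces": ["model-weights:1"]},
    "exp-evaluate": {
        "consumes": ["clean-data:1", "model-weights:1"],
        "produces": ["predictions:1"],
    },
}


def build_dependencies(runs):
    """Map each experiment to the experiments whose outputs it consumed."""
    producer_of = {
        artifact: experiment
        for experiment, io in runs.items()
        for artifact in io["produces"]
    }
    return {
        experiment: {producer_of[a] for a in io["consumes"] if a in producer_of}
        for experiment, io in runs.items()
    }


def execution_order(runs):
    """Return the experiments ordered so that producers come before consumers."""
    return list(TopologicalSorter(build_dependencies(runs)).static_order())


def stale_experiments(artifact, runs):
    """Return every experiment that depends, directly or not, on `artifact`."""
    stale, frontier = set(), [artifact]
    while frontier:
        current = frontier.pop()
        for experiment, io in runs.items():
            if current in io["consumes"] and experiment not in stale:
                stale.add(experiment)
                # Outputs of a stale experiment are stale too, so follow them.
                frontier.extend(io["produces"])
    return stale
```

With these records, a new version of the cleaned dataset would flag both the training and the evaluation experiments for a re-run, while a change to the model weights alone would flag only the evaluation.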
Getting Started with Artifacts
Assuming Artifact has been imported from comet_ml and experiment is an existing Experiment, it takes only 3 lines of code to register an Artifact of any size in Comet:
    artifact = Artifact("artifact-name", "dataset")
    artifact.add("path/to/my/file.csv")
    experiment.log_artifact(artifact)
And then just 2 lines of code to download and use a logged Artifact in an Experiment:
    logged_artifact = experiment.get_artifact("artifact-name")
    local_artifact = logged_artifact.download()
For a deeper dive into working with Artifacts, check out these additional resources: