Using DataChain Commands

DataChain is a command-line tool for wrangling unstructured AI data at scale. Use datachain -h to list all available commands.

Typical DataChain Workflow

  1. Authentication with Studio

  2. Job Management

  3. Maintenance

    • Clean up temporary tables, failed versions, and outdated checkpoints with datachain gc
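
The two commands named on this page, datachain -h and datachain gc, can be tried from a shell as in the sketch below. The run_datachain wrapper is a hypothetical helper added for illustration and is not part of DataChain. It only checks that the datachain executable is on PATH before calling it. No extra flags are shown, since none are documented here.

```shell
# Hypothetical helper (not part of DataChain): forward arguments to the
# datachain CLI only when the executable is actually available on PATH.
run_datachain() {
    if command -v datachain >/dev/null 2>&1; then
        datachain "$@"
    else
        echo "datachain not found on PATH" >&2
        return 127
    fi
}

# List all available commands.
run_datachain -h || echo "skipping: datachain is not installed" >&2

# Maintenance step: clean up temporary tables, failed versions, and
# outdated checkpoints. Left commented out so it is run deliberately.
# run_datachain gc
```

Uncomment the last line once you are ready to run the cleanup.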