# Snakemake Pipeline Setup

This guide covers setting up and configuring the Snakemake workflow for automated analysis pipelines.
## Installing Pipeline Dependencies

Install the optional pipeline dependencies.

Using uv:

```bash
uv add "neurodent[pipeline]"
```

Using pip:

```bash
pip install "neurodent[pipeline]"
```

> **Note:** The `pipeline` extra includes Snakemake and related dependencies needed for running the automated analysis workflow. If you only need the core NeuRodent library for Python-based analysis, the basic installation is sufficient.
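After installing, you can confirm the extra's dependencies are importable from the current environment. The `has_pipeline_deps` helper below is a hypothetical sketch, not part of NeuRodent:

```python
import importlib.util


def has_pipeline_deps(module: str = "snakemake") -> bool:
    """Return True if the named dependency can be imported
    in the current environment."""
    return importlib.util.find_spec(module) is not None


if __name__ == "__main__":
    status = "installed" if has_pipeline_deps() else "missing"
    print(f"snakemake: {status}")
```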
## SLURM Cluster Configuration
If you’re running the Snakemake workflow on a SLURM cluster, the setup depends on your Snakemake version.
### Snakemake 7.x (Python 3.10)

Use a Snakemake SLURM profile generated with cookiecutter.

**Recommended log path setting:** To place SLURM job logs alongside Snakemake logs (making debugging easier), update your profile's `CookieCutter.py`:

```python
# ~/.config/snakemake/your-profile/CookieCutter.py
def get_cluster_logpath() -> str:
    return "logs/%r/slurm_%j"  # puts slurm_{jobid}.{out,err} in logs/<rule>/
```

Run the workflow with:

```bash
uv run snakemake --profile your-profile
```
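To illustrate what the template above produces: `%r` expands to the rule name and `%j` to the SLURM job id (per the comment in the profile snippet). The `expand_logpath` helper below is a hypothetical sketch of that expansion, not actual profile code:

```python
def expand_logpath(template: str, rule: str, jobid: int) -> str:
    """Expand the %r (rule name) and %j (SLURM job id) placeholders
    used in the profile's log path template."""
    return template.replace("%r", rule).replace("%j", str(jobid))


# A job for a hypothetical "preprocess" rule with SLURM job id 12345
print(expand_logpath("logs/%r/slurm_%j", "preprocess", 12345))
# logs/preprocess/slurm_12345
```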
### Snakemake 8+ (Python 3.11+)

Snakemake 8 uses a native SLURM executor plugin instead of cookiecutter profiles.

Install the plugin:

```bash
pip install snakemake-executor-plugin-slurm
```

Run the workflow with the `--executor` flag:

```bash
uv run snakemake --executor slurm --default-resources --jobs 30
```
The native plugin automatically:

- Deletes SLURM log files for successful jobs (reduces clutter)
- Preserves logs for failed jobs for 10 days
- Supports `--slurm-logdir` to customize the log location

To customize the log directory, add to your profile:

```yaml
# ~/.config/snakemake/your-profile/config.yaml
executor: slurm
slurm-logdir: "logs/slurm"
```

See the plugin documentation for full configuration options.
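With the native executor, per-rule SLURM resources are declared in the Snakefile via the standard `resources` keyword. A sketch under stated assumptions — the rule name, paths, and partition below are placeholders, and the plugin documentation lists the full set of recognized resource names:

```python
# Snakefile fragment (rule name, paths, and partition are hypothetical)
rule preprocess:
    input:
        "data/{animal}.edf"
    output:
        "results/{animal}/preprocessed.h5"
    resources:
        mem_mb=8000,            # memory request forwarded to SLURM
        runtime=120,            # walltime in minutes
        cpus_per_task=4,
        slurm_partition="short",
    script:
        "scripts/preprocess.py"
```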
## Local Configuration Overrides

You can override any setting from `config/config.yaml` using a local configuration file. This is useful for adjusting analysis parameters or file paths for your specific environment without modifying the main configuration file (which is tracked by git).
To use local overrides:

1. Create a file named `config/config.local.yaml`.
2. Add the specific configuration keys you wish to override. You do not need to copy the entire configuration file; Snakemake performs a "deep merge", so only the keys you specify are updated.

For example, to change the analysis sampling rate while keeping all other settings:

```yaml
# config/config.local.yaml
analysis:
  sampling_rate: 2000
```

The `config/config.local.yaml` file is included in `.gitignore` and will not be pushed to the repository.
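To make the "deep merge" behavior concrete, here is a minimal sketch of the merge semantics (illustrative only, not the actual merge code the workflow runs; the `window_sec` key is hypothetical): nested keys from the local file replace their counterparts, while sibling keys you did not touch survive.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into base: nested dicts are merged
    key by key; scalar values from override win."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# config.yaml (hypothetical keys) combined with the local override above
base = {"analysis": {"sampling_rate": 1000, "window_sec": 4}}
local = {"analysis": {"sampling_rate": 2000}}
print(deep_merge(base, local))
# {'analysis': {'sampling_rate': 2000, 'window_sec': 4}}
```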
## Testing with a Subset of Animals

When testing the pipeline, you may want to run only a small number of animals instead of the full dataset. Use the `truncate_animals` setting under `samples` to limit processing to the first N animals in the samples file:

```yaml
# config/config.local.yaml
samples:
  truncate_animals: 2  # only process the first 2 animals
```

Set `truncate_animals` to `null` (the default) to process all animals.
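The setting's semantics can be sketched as a simple slice. The `select_animals` helper and animal IDs below are hypothetical, for illustration only:

```python
from typing import Optional


def select_animals(animals: list[str], truncate: Optional[int]) -> list[str]:
    """Apply truncate_animals: keep the first N entries, or all of
    them when the setting is null (None)."""
    return animals if truncate is None else animals[:truncate]


animals = ["rat01", "rat02", "rat03", "rat04"]
print(select_animals(animals, 2))     # ['rat01', 'rat02']
print(select_animals(animals, None))  # all four animals
```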
> **Tip:** Combine this with a fast dataset such as `mini_real` for quick smoke-testing:
>
> ```bash
> NEURODENT_DATASET=mini_real uv run snakemake --cores all
> ```
## Running the Pipeline

### Basic Usage

```bash
# Dry run to see what would be executed
uv run snakemake --dry-run

# Run pipeline locally (for testing)
uv run snakemake --cores all

# Run on SLURM cluster
uv run snakemake --profile your-profile
```
### Useful Commands

```bash
# Generate workflow visualization
uv run snakemake --rulegraph | dot -Tpng > workflow.png

# Clean results (be careful!)
uv run snakemake --delete-all-output

# Unlock workflow (if interrupted)
uv run snakemake --unlock

# Force re-run a specific rule
uv run snakemake --forcerun rule_name
```
See also: Dataset Configuration for selecting and configuring different datasets.