Testing & Development
=====================

This guide covers testing, documentation building, and code style for NeuRodent.

Running Tests
-------------

Run the test suite with pytest:

.. code-block:: bash

   make test
   # or: uv run pytest

Run with coverage:

.. code-block:: bash

   uv run pytest --cov=neurodent

Run integration tests only:

.. code-block:: bash

   uv run pytest tests/integration/ -v -m integration

Pipeline Testing Strategy
-------------------------

The Snakemake pipeline is validated at three levels, all integrated into CI/CD via GitHub Actions:

1. **Component-level integration tests (pytest)**

   These tests exercise the core pipeline building blocks (``FileDiscoverer``, ``AnimalOrganizer``, WAR generation, plotters, and FDSAR) against a tiny synthetic NWB dataset generated on the fly. They run as part of the normal ``uv run pytest`` invocation and are the fastest feedback loop.

2. **Snakemake DAG dry-run (pytest + subprocess)**

   ``TestSnakemakeDryRun`` in ``tests/integration/test_snakemake_flow.py`` invokes ``snakemake --dryrun`` as a subprocess for both the ``example`` and ``mini_real`` datasets. This validates the Snakefile, config files, sample JSONs, wildcard resolution, and rule definitions without processing any data.

3. **Snakemake real run (GitHub Actions, mini_real)**

   The ``test-build-docs.yml`` workflow runs a real full-pipeline ``snakemake`` execution on the committed mini_real dataset. This validates end-to-end processing (data loading, discovery, analysis, filtering, figures) on every push/PR. The ``example`` dataset uses a DAG dry-run, since its synthetic NWB data is generated at test-fixture time rather than committed to the repo.

.. code-block:: bash

   # Quick: validate DAG only (seconds)
   NEURODENT_DATASET=example uv run snakemake --dryrun

   # Real run: execute full pipeline on committed mini data
   NEURODENT_DATASET=mini_real uv run snakemake --cores all

   # Component integration tests (seconds)
   uv run pytest tests/integration/ -v -m integration

Example Dataset for Pipeline Testing
------------------------------------

The repository includes infrastructure for testing the Snakemake pipeline without processing full production-scale recordings.

**Programmatic synthetic data (CI-friendly)**

A ``create_synthetic_dataset()`` helper in ``tests/integration/generate.py`` builds a tiny directory tree with 8-channel NWB files that can be read back by SpikeInterface. The ``example_dataset`` pytest fixture in ``tests/conftest.py`` wraps this for convenient use:

.. code-block:: python

   def test_my_pipeline_step(example_dataset):
       data_root = example_dataset["data_root"]
       samples_config = example_dataset["samples_config"]
       # ...

Integration tests in ``tests/integration/test_snakemake_flow.py`` demonstrate discovery, filtering, and config-alias injection against this data.

**Mini real dataset (committed bin/csv recordings)**

Small real recordings are committed under ``.tests/integration/data/`` and exercised by the pytest integration tests in ``TestMiniRealDataset``. The dataset uses ``{animal}/{index}`` placeholders with paired ``.bin`` / ``.csv`` files:

.. code-block:: bash

   # Run mini-real integration tests
   uv run pytest tests/integration/ -v -k TestMiniRealDataset

The full Snakemake run (via ``.github/workflows/snakemake-test.yml``) also exercises this dataset end-to-end. See ``config/datasets/mini_real.yaml`` and ``config/samples_mini_real.json`` for the corresponding configuration.

Building Documentation
----------------------

Build docs locally:

.. code-block:: bash

   make docs
   # or: cd docs && uv run sphinx-build -b html . _build/html

Build with live reload (auto-refresh on changes):

.. code-block:: bash

   make docs-live
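To make the synthetic-dataset idea above concrete, here is a minimal toy stand-in for what ``create_synthetic_dataset()`` does at a structural level. The helper name ``make_toy_dataset`` and the exact file layout are illustrative assumptions, not the real helper's code; the real helper writes valid 8-channel NWB files readable by SpikeInterface, which this sketch does not.

```python
import json
import tempfile
from pathlib import Path


def make_toy_dataset(root: Path, animals=("A1", "A2")) -> dict:
    """Toy stand-in (hypothetical): one placeholder recording per
    animal plus a samples config mapping animal IDs to relative
    file paths. The real helper writes actual NWB data."""
    samples = {}
    for animal in animals:
        animal_dir = root / animal
        animal_dir.mkdir(parents=True, exist_ok=True)
        rec = animal_dir / "recording_0.nwb"
        rec.write_bytes(b"")  # placeholder, not a valid NWB file
        samples[animal] = [str(rec.relative_to(root))]
    (root / "samples.json").write_text(json.dumps(samples, indent=2))
    return samples


with tempfile.TemporaryDirectory() as tmp:
    samples = make_toy_dataset(Path(tmp))
    print(sorted(samples))  # → ['A1', 'A2']
```

The point of the pattern is that tests receive both a data root and a matching samples config from one fixture, so discovery and config-injection logic can be exercised without any committed data.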
Validation Scripts
------------------

NeuRodent includes validation scripts to verify correctness and performance on real data. These scripts are located in the ``scripts/`` directory and should be run on the cluster.

Dask vs Serial Spike Detection Validation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To verify that dask and serial multiprocess modes produce identical spike detection results:

.. code-block:: bash

   # Request 10 cores for dask processing
   srun -c 10 --pty bash
   cd /mnt/isilon/marsh_single_unit/YY_PyEEG/neurodent_Joseph_devtree

   # Run validation (default: 30 minutes of data)
   uv run python scripts/validate_dask_serial_spike_consistency.py \
       --recording-folder "/path/to/real/recording" \
       --verbose

This script validates exact spike indices (sample-level precision) and measures performance. Run it after modifying ``src/neurodent/core/frequency_domain_spike_detection.py``.

**When to run:**

- After modifying the spike detection algorithm
- Before releasing major versions
- When investigating spike count discrepancies
- Periodically (e.g., quarterly) as a sanity check

**Expected output:**

- ✅ All channels match perfectly at sample-level precision
- Performance comparison (dask should be 1.5-2x faster)
- Exit code 0 if results are identical, 1 if they differ

For full usage, run ``python scripts/validate_dask_serial_spike_consistency.py --help``.

Writing Validation Scripts
~~~~~~~~~~~~~~~~~~~~~~~~~~

When creating validation scripts for the neurodent project:

**Structure:**

- Location: ``/scripts/`` directory
- CLI interface via ``argparse`` with ``--help`` output
- Comprehensive module docstring (30-40 lines) with usage examples
- Exit code 0 for success, 1 for failure
- Use ``uv run python scripts/<script_name>.py`` in documentation

**Docstring requirements:**

- Brief description of what the script validates
- Important warnings (e.g., "Run on cluster only")
- Installation/setup requirements
- Usage examples with real data paths (or placeholders)
- Expected output description

**Code style:**

- Follow project style (ruff/PEP 8)
- Helper functions with Google-style docstrings
- Structured output with visual separators
- Informative error messages

**Documentation:**

- Add a section to ``/docs/contributing/testing.rst``
- Update CLAUDE.md if establishing new patterns
- Reference in README.md if user-facing

**Example structure:**

.. code-block:: python

   #!/usr/bin/env python3
   """
   Brief description.

   Detailed explanation.

   IMPORTANT:
       - Requirements/warnings

   Usage:
       uv run python scripts/script_name.py [options]

   Example:
       uv run python scripts/script_name.py --input /path/to/data
   """
   import argparse
   import sys


   def helper_function():
       """Docstring."""
       pass


   def main():
       parser = argparse.ArgumentParser(description="...")
       args = parser.parse_args()
       # ... perform the validation and set `success`
       success = True
       sys.exit(0 if success else 1)


   if __name__ == "__main__":
       main()

Code Style
----------

We follow standard Python conventions:

- **Formatting**: Use `ruff <https://docs.astral.sh/ruff/>`_ for formatting and linting
- **Style**: Follow PEP 8 guidelines
- **Types**: Add type hints where appropriate
- **Docstrings**: Use NumPy style
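The sample-level consistency check at the heart of the dask-vs-serial validation can be pictured with a short sketch. The function name ``spike_indices_match`` and the dict-of-arrays result shape are assumptions for illustration, not the script's actual internals; the principle (exact equality of spike sample indices on every channel) is what the script enforces.

```python
import numpy as np


def spike_indices_match(serial: dict, dask: dict) -> bool:
    """Illustrative check: both modes must report byte-identical
    spike sample indices on every channel (hypothetical shape)."""
    if serial.keys() != dask.keys():
        return False
    return all(np.array_equal(serial[ch], dask[ch]) for ch in serial)


serial_result = {"ch0": np.array([10, 250, 999]), "ch1": np.array([42])}
dask_result = {"ch0": np.array([10, 250, 999]), "ch1": np.array([42])}
print(spike_indices_match(serial_result, dask_result))  # → True

# A single off-by-one sample counts as a mismatch:
dask_result["ch1"] = np.array([43])
print(spike_indices_match(serial_result, dask_result))  # → False
```

Exact equality (rather than a tolerance) is the right bar here because both modes run the same algorithm on the same data; any divergence indicates a parallelization bug, not numerical noise.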
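As an illustration of the typing and NumPy-docstring conventions above, here is a small hypothetical helper (not part of the NeuRodent API) written in the expected style:

```python
import numpy as np


def band_mask(freqs: np.ndarray, low: float, high: float) -> np.ndarray:
    """Select frequency bins inside a band.

    Parameters
    ----------
    freqs : np.ndarray
        Frequency bin centers in Hz.
    low, high : float
        Inclusive band edges in Hz.

    Returns
    -------
    np.ndarray
        Boolean mask, ``True`` where ``low <= freqs <= high``.
    """
    return (freqs >= low) & (freqs <= high)


print(band_mask(np.array([1.0, 5.0, 50.0]), 2.0, 10.0).tolist())
# → [False, True, False]
```

The NumPy docstring sections (``Parameters``, ``Returns``) are what Sphinx renders in the API docs, so keeping them consistent matters more than docstring length.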