AnimalOrganizer#
- class neurodent.visualization.AnimalOrganizer(pattern: str | list[str], animal_id: str | None = None, skip_sessions: list[str] = [], truncate: bool | int = False, assume_from_number: bool = False, lro_kwargs: dict = {}, normalize_session: Callable[[str], str] | None = None)[source]#
Bases: AnimalFeatureParser
Organizes and analyzes recording data from a single animal across multiple sessions.
AnimalOrganizer uses flexible pattern-based file discovery to locate recording files, groups them by session, and creates LongRecordingOrganizer instances for each session.
- Parameters:
pattern (str | list[str]) – File pattern(s) for discovering recording files.
- Single pattern: "/path/{animal}/{session}/{index}.rhd"
- Multiple patterns: ["/path/{animal}/{session}/data.bin", "/path/{animal}/{session}/meta.csv"]
- Placeholders:
{animal}: Animal ID (e.g., "A10")
{session}: Session identifier (e.g., "2025-01-24" or "day1")
{index}: File index within a session (e.g., "1", "2", "3")
Examples
"/data/{animal}/{session}/{index}.rhd"
"/data/{animal}-{session}-{index}.edf"
"/data/{session}/*/{animal}-{index}.rhd"
"/data/**/{animal}-{session}-{index}.rhd"
"/data/{animal}/{index}.edf" (no session – will use "unknown")
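To picture how the placeholders map onto concrete paths, here is a minimal sketch using plain str.format (the actual discovery logic inside AnimalOrganizer turns these placeholders into glob-style wildcards; the substitution below is only illustrative):

```python
# Illustrative only: substitute concrete values into a pattern template.
# AnimalOrganizer performs the real discovery via glob-style matching.
pattern = "/data/{animal}/{session}/{index}.rhd"

path = pattern.format(animal="A10", session="2025-01-24", index="1")
print(path)  # /data/A10/2025-01-24/1.rhd
```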
animal_id (str | None, optional) – Animal ID to filter discovered files. If provided, only files matching this animal ID will be included.
skip_sessions (list[str], optional) – Glob patterns for sessions to exclude. Uses fnmatch-style wildcards (*, ?, [seq]), e.g. ["*bad*", "corrupted_*"]. Defaults to [].
truncate (bool | int, optional) – If True, truncate to the first 10 sessions. If an integer, truncate to the first n sessions. Defaults to False.
assume_from_number (bool, optional) – Whether to parse channel names as numbers (used for analysis, not discovery). Defaults to False.
lro_kwargs (dict, optional) – Keyword arguments passed to each LongRecordingOrganizer instance. Common options include 'mode', 'extract_func', 'manual_datetimes'. Defaults to {}.
normalize_session (callable | None, optional) – A function that transforms session keys before grouping. For example, to merge split-day folders like "2023-01-15", "2023-01-15(1)", "2023-01-15(2)" into one session, pass lambda s: re.sub(r"\(\d+\)$", "", s). Defaults to None (no normalization).
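The skip_sessions globs and the normalize_session callable can be previewed with the standard library alone. A small sketch (the session names here are hypothetical):

```python
import fnmatch
import re

sessions = ["2023-01-15", "2023-01-15(1)", "2023-01-15(2)", "bad_run"]

# skip_sessions uses fnmatch-style wildcards to drop sessions
kept = [s for s in sessions
        if not any(fnmatch.fnmatch(s, pat) for pat in ["*bad*"])]

# normalize_session merges split-day folders into one session key
normalize = lambda s: re.sub(r"\(\d+\)$", "", s)
print(sorted({normalize(s) for s in kept}))  # ['2023-01-15']
```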
- Variables:
pattern (str | list[str]) – The file pattern(s) used for discovery.
animal_id (str | None) – The ID of the animal being analyzed.
unique_animaldays (list[str]) – List of unique session identifiers (format: "{animal}_{session}").
animaldays (list[str]) – Alias for unique_animaldays.
genotype (str) – Genotype of the animal (from ANIMAL_METADATA if available).
sex (str) – Sex of the animal (from ANIMAL_METADATA if available).
long_recordings (list[LongRecordingOrganizer]) – LRO instances, one per session.
long_analyzers (list[LongRecordingAnalyzer]) – Analysis instances, one per session.
features_df (pd.DataFrame) – Aggregated feature DataFrame across all sessions.
features_avg_df (pd.DataFrame) – Average features across sessions.
- __init__(pattern: str | list[str], animal_id: str | None = None, skip_sessions: list[str] = [], truncate: bool | int = False, assume_from_number: bool = False, lro_kwargs: dict = {}, normalize_session: Callable[[str], str] | None = None) None[source]#
- get_timeline_summary()[source]#
Get timeline summary as a DataFrame for user inspection and debugging.
- convert_colbins_to_rowbins(overwrite=False, multiprocess_mode: Literal['dask', 'serial'] = 'serial')[source]#
- compute_bad_channels(lof_threshold: float | None = None, force_recompute: bool = False, lof_chunk_duration_s: float = 60)[source]#
Compute bad channels using LOF analysis for all recordings.
- Parameters:
lof_threshold (float | None, optional) – Threshold for determining bad channels from LOF scores. If None, only computes/loads scores without setting bad_channel_names.
force_recompute (bool) – Whether to recompute LOF scores even if they exist.
lof_chunk_duration_s (float) – Duration in seconds of each chunk used for the pairwise-distance computation in LOF. Defaults to 60.
- apply_lof_threshold(lof_threshold: float)[source]#
Apply threshold to existing LOF scores to determine bad channels for all recordings.
- Parameters:
lof_threshold (float) – Threshold for determining bad channels.
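The thresholding step itself is simple to picture. A hedged sketch with made-up scores (the real method operates on each recording's stored LOF scores; the values below are hypothetical):

```python
# Hypothetical per-channel LOF scores; higher means more anomalous.
lof_scores = {"Ch0": 1.1, "Ch1": 3.7, "Ch2": 0.9, "Ch3": 2.4}
lof_threshold = 2.0

# Channels whose score exceeds the threshold are flagged as bad.
bad_channel_names = [ch for ch, score in lof_scores.items()
                     if score > lof_threshold]
print(bad_channel_names)  # ['Ch1', 'Ch3']
```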
- get_all_lof_scores() dict[source]#
Get LOF scores for all recordings.
- Returns:
Dictionary mapping animal days to LOF score dictionaries.
- Return type:
dict
- compute_windowed_analysis(features: list[str], exclude: list[str] = [], window_s=5, multiprocess_mode: Literal['dask', 'serial'] = 'serial', suppress_short_interval_error=False, apply_notch_filter=True, chunk_duration_s: float | None = 3600, **kwargs) WindowAnalysisResult[source]#
Computes windowed analysis of animal recordings. The data is divided into windows (time bins), then features are extracted from each window. The result is formatted into a DataFrame and wrapped in a WindowAnalysisResult object.
- Parameters:
features (list[str]) – List of features to compute. See the individual compute_...() functions for output formats.
exclude (list[str], optional) – List of features to ignore. Overrides the features parameter. Defaults to [].
window_s (int, optional) – Length of each window in seconds. Note that some features break with very short window times. Defaults to 5.
multiprocess_mode (Literal["dask", "serial"], optional) – Processing mode. Defaults to "serial".
suppress_short_interval_error (bool, optional) – If True, suppress the ValueError for short intervals between timestamps in the resulting WindowAnalysisResult. Useful for aggregated WARs. Defaults to False.
apply_notch_filter (bool, optional) – Whether to apply notch filtering to remove line noise. Uses constants.LINE_FREQ. Defaults to True.
chunk_duration_s (float | None, optional) – Duration in seconds of data to hold in memory at once during the Dask processing path. Internally converted to a number of fragments via int(chunk_duration_s / window_s). When None, all fragments are loaded into a single NumPy array before being written to the intermediate zarr store – the original behavior, which maximizes throughput but requires enough RAM to hold the entire recording at once. When set to a positive value, only the corresponding number of fragments are buffered at a time, streaming them to zarr incrementally; use a small value (e.g. 250) on memory-constrained machines and a larger value (e.g. 2500+) on high-memory nodes for maximum throughput. Only has an effect when multiprocess_mode="dask". Defaults to 3600.
- Raises:
AttributeError – Raised if a feature's compute_...() function was not implemented.
- Returns:
A WindowAnalysisResult object containing extracted features for all recordings.
- Return type:
WindowAnalysisResult
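The relationship between chunk_duration_s and the number of buffered fragments is just the division stated above; for example, with the defaults:

```python
window_s = 5
chunk_duration_s = 3600  # the default (1 hour)

# Number of windowed fragments held in memory at once on the Dask path,
# per the documented conversion int(chunk_duration_s / window_s).
fragments_per_chunk = int(chunk_duration_s / window_s)
print(fragments_per_chunk)  # 720
```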
- compute_frequency_domain_spike_analysis(detection_params: dict | None = None, chunk_duration_s: float = 3600, multiprocess_mode: Literal['dask', 'serial'] = 'serial')[source]#
Compute frequency-domain spike detection on all long recordings.
- Parameters:
detection_params (dict, optional) – Detection parameters. Uses defaults if None.
chunk_duration_s (float) – Duration in seconds of each processing chunk. Defaults to 3600 (1 hour). The full recording is always analyzed; this parameter controls peak RAM by processing in overlapping chunks. None loads the full recording at once (fastest).
multiprocess_mode (Literal["dask", "serial"]) – Processing mode.
- Returns:
Results for each recording session
- Return type:
- Raises:
ImportError – If SpikeInterface is not available
- classmethod from_lros(lros: list[LongRecordingOrganizer], animal_id: str, genotype: str = 'Unknown', sex: str = 'Unknown', assume_from_number: bool = False) AnimalOrganizer[source]#
Create an AnimalOrganizer from an existing list of LongRecordingOrganizer objects.
This factory method bypasses the normal folder discovery logic and creates an AnimalOrganizer directly from pre-existing LROs. If multiple LROs share the same date, they will be automatically merged into a single LRO per unique date, matching the behavior of the normal __init__ path.
- Parameters:
lros (list[LongRecordingOrganizer]) – List of LRO instances to wrap.
animal_id (str) – Animal identifier for this organizer.
genotype (str, optional) – Genotype string. Defaults to "Unknown".
sex (str, optional) – Sex string (e.g. "Male", "Female"). Defaults to "Unknown".
assume_from_number (bool, optional) – Whether to assume channel aliases from numbers. Defaults to False.
- Returns:
A new AnimalOrganizer instance wrapping the provided LROs (with duplicates merged).
- Return type:
AnimalOrganizer
- Raises:
ValueError – If lros is empty, channel names are inconsistent, or LROs with the same date cannot be merged due to incompatible metadata.
Note
Multiple LROs with the same date will be automatically merged in temporal order (sorted by median timestamp). This ensures proper handling of multi-session recordings consolidated via generate_wars.py.
Example
>>> # After splitting a multi-animal recording across multiple sessions
>>> all_lros = []
>>> for session_ao in session_aos:
...     splits = session_ao.split({"AnimalA": ["Ch0", "Ch1"]})
...     all_lros.append(splits["AnimalA"])
>>> # from_lros automatically merges LROs with same date
>>> child_ao = AnimalOrganizer.from_lros(all_lros, animal_id="AnimalA")
- split(groups: dict[str, list[str]], persist_base: str | Path | None = None, format: Literal['zarr', 'binary'] = 'zarr') dict[str, AnimalOrganizer][source]#
Split this multi-animal AnimalOrganizer into per-animal AnimalOrganizers.
For each group (animal), this method:
1. Iterates over all LROs in this AnimalOrganizer
2. Calls LRO.split() on each to extract the specified channels
3. Optionally persists each split LRO to disk
4. Creates a new AnimalOrganizer for each group
This enables processing of joint-animal recordings where multiple animals are recorded on different channels of the same files.
- Parameters:
groups (dict[str, list[str]]) – Dictionary mapping group names (animal IDs) to lists of channel names. Example: {"AnimalA": ["Ch0", "Ch1", "Ch2", "Ch3"], "AnimalB": ["Ch4", "Ch5", "Ch6", "Ch7"]}
persist_base (str | Path, optional) – Base directory for persisting split recordings. If None, LROs remain in memory. Structure:
persist_base/
- AnimalA/
day1.zarr
day2.zarr
- AnimalB/
…
format (Literal["zarr", "binary"], optional) – Format for persisted recordings. Defaults to "zarr".
- Returns:
Dictionary mapping group names to new AnimalOrganizer instances.
- Return type:
dict[str, AnimalOrganizer]
- Raises:
ValueError – If requested channels are not found in recordings.
Example
>>> ao = AnimalOrganizer("/path/to/joint_data", "combined")
>>> splits = ao.split(
...     groups={"MouseA": ["Ch0", "Ch1"], "MouseB": ["Ch2", "Ch3"]},
...     persist_base="/output/split_data",
... )
>>> war_a = splits["MouseA"].compute_windowed_analysis(["all"])
>>> war_b = splits["MouseB"].compute_windowed_analysis(["all"])