Utilities#

Utility functions for the core module.

Functions#

neurodent.core.utils.convert_units_to_multiplier(current_units: str, target_units: str = 'µV') → float[source]#

Convert between different voltage units and return the multiplication factor.

This function calculates the conversion factor needed to transform values from one voltage unit to another (e.g., from mV to µV).

Parameters:
  • current_units (str) – The current unit of the values. Must be one of: ‘µV’, ‘mV’, ‘V’, ‘nV’.

  • target_units (str, optional) – The target unit to convert to. Defaults to ‘µV’. Must be one of: ‘µV’, ‘mV’, ‘V’, ‘nV’.

Returns:

The multiplication factor to convert from current_units to target_units.

To convert values, multiply your data by this factor.

Return type:

float

Raises:

AssertionError – If current_units or target_units are not supported.

Examples

>>> convert_units_to_multiplier("mV", "µV")
1000.0
>>> convert_units_to_multiplier("V", "mV")
1000.0
>>> convert_units_to_multiplier("µV", "V")
1e-06
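
The returned factor is simply the ratio of the two units expressed in volts. A minimal sketch of the idea (the lookup table and function body here are illustrative, not the package's actual implementation):

```python
# Each supported unit expressed in volts (documented unit set).
UNIT_TO_VOLTS = {"nV": 1e-9, "µV": 1e-6, "mV": 1e-3, "V": 1.0}

def convert_units_to_multiplier(current_units: str, target_units: str = "µV") -> float:
    # Mirrors the documented AssertionError on unsupported units.
    assert current_units in UNIT_TO_VOLTS, f"Unsupported unit: {current_units}"
    assert target_units in UNIT_TO_VOLTS, f"Unsupported unit: {target_units}"
    return UNIT_TO_VOLTS[current_units] / UNIT_TO_VOLTS[target_units]
```

Multiplying data by this factor rescales it, e.g. `data_uv = data_mv * convert_units_to_multiplier("mV")`.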
neurodent.core.utils.extract_mne_unit_info(raw_info: dict) → tuple[str | None, float | None][source]#

Extract unit information from MNE Raw info object.

Parameters:

raw_info (dict) – MNE Raw.info object containing channel information

Returns:

(unit_name, mult_to_uV) where unit_name is the consistent unit across all channels and mult_to_uV is the conversion factor to µV.

Return type:

tuple[str | None, float | None]

Raises:

ValueError – If channel units are inconsistent across channels

neurodent.core.utils.is_day(dt: datetime, sunrise=6, sunset=18)[source]#

Check if a datetime object is during the day.

Parameters:
  • dt (datetime) – Datetime object to check

  • sunrise (int, optional) – Sunrise hour (0-23). Defaults to 6.

  • sunset (int, optional) – Sunset hour (0-23). Defaults to 18.

Returns:

True if the datetime is during the day, False otherwise

Return type:

bool

Raises:

TypeError – If dt is not a datetime object
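
The check reduces to a comparison on the hour; a minimal sketch, assuming the day interval is inclusive of sunrise and exclusive of sunset (the actual boundary semantics are not documented):

```python
from datetime import datetime

def is_day(dt: datetime, sunrise: int = 6, sunset: int = 18) -> bool:
    # Assumed semantics: [sunrise, sunset) counts as day.
    if not isinstance(dt, datetime):
        raise TypeError("dt must be a datetime object")
    return sunrise <= dt.hour < sunset
```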

neurodent.core.utils.convert_colpath_to_rowpath(rowdir_path: str | Path, col_path: str | Path, gzip: bool = True, aspath: bool = True) → str | Path[source]#

Convert a ColMajor file path to its corresponding RowMajor file path.

This function transforms file paths from column-major format to row-major format, which is used when converting between different data storage layouts in NeuRodent.

Parameters:
  • rowdir_path (str | Path) – Directory path where the RowMajor file should be located.

  • col_path (str | Path) – Path to the ColMajor file to be converted. Must contain ‘ColMajor’ in the path.

  • gzip (bool, optional) – If True, append ‘.npy.gz’ extension. If False, append ‘.bin’. Defaults to True.

  • aspath (bool, optional) – If True, return as Path object. If False, return as string. Defaults to True.

Returns:

The converted RowMajor file path, either as string or Path object based on aspath parameter.

Return type:

str | Path

Raises:

ValueError – If ‘ColMajor’ is not found in col_path.

Examples

>>> convert_colpath_to_rowpath("/data/row/", "/data/col/file_ColMajor_001.bin")
PosixPath('/data/row/file_RowMajor_001.npy.gz')
>>> convert_colpath_to_rowpath("/data/row/", "/data/col/file_ColMajor_001.bin", gzip=False)
PosixPath('/data/row/file_RowMajor_001.bin')
>>> convert_colpath_to_rowpath("/data/row/", "/data/col/file_ColMajor_001.bin", aspath=False)
'/data/row/file_RowMajor_001.npy.gz'
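
The transformation can be sketched from the documented examples (an illustrative reimplementation, not the package's exact code):

```python
from pathlib import Path

def convert_colpath_to_rowpath(rowdir_path, col_path, gzip=True, aspath=True):
    col_name = Path(col_path).name
    if "ColMajor" not in col_name:
        raise ValueError(f"'ColMajor' not found in {col_path}")
    # Swap the layout marker, drop the old extension, append the target one.
    stem = col_name.replace("ColMajor", "RowMajor").rsplit(".", 1)[0]
    out = Path(rowdir_path) / (stem + (".npy.gz" if gzip else ".bin"))
    return out if aspath else str(out)
```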
neurodent.core.utils.filepath_to_index(filepath) → int[source]#

Extract the index number from a filepath.

This function extracts the last number found in a filepath after removing common suffixes and file extensions. For example, from “/path/to/data_ColMajor_001.bin” it returns 1.

Parameters:

filepath (str | Path) – Path to the file to extract index from.

Returns:

The extracted index number, or 0 if no number is found in the filename.

Return type:

int

Examples

>>> filepath_to_index("/path/to/data_ColMajor_001.bin")
1
>>> filepath_to_index("/path/to/data_2023_015_ColMajor.bin")
15
>>> filepath_to_index("/path/to/data_Meta_010.json")
10
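
The suffix-stripping and last-number extraction can be sketched as follows (the list of suffixes to remove is an assumption based on the examples):

```python
import re
from pathlib import Path

def filepath_to_index(filepath) -> int:
    # Remove common suffixes and extensions, then take the last number found.
    stem = Path(filepath).name
    for token in (".npy.gz", ".bin", ".json", "_ColMajor", "_RowMajor", "_Meta"):
        stem = stem.replace(token, "")
    numbers = re.findall(r"\d+", stem)
    return int(numbers[-1]) if numbers else 0
```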
neurodent.core.utils.parse_truncate(truncate: int | bool) → int[source]#

Parse the truncate parameter to determine how many characters to truncate.

If truncate is a boolean, returns 10 if True and 0 if False. If truncate is an integer, returns that integer value directly.

Parameters:

truncate (int | bool) – If bool, True=10 chars and False=0 chars. If int, specifies exact number of chars.

Returns:

Number of characters to truncate (0 means no truncation)

Return type:

int

Raises:

ValueError – If truncate is not a boolean or integer
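
A sketch of the documented dispatch (illustrative; note that bool must be tested before int, since bool is a subclass of int in Python):

```python
def parse_truncate(truncate) -> int:
    # bool first: isinstance(True, int) is also True in Python.
    if isinstance(truncate, bool):
        return 10 if truncate else 0
    if isinstance(truncate, int):
        return truncate
    raise ValueError(f"truncate must be bool or int, got {type(truncate).__name__}")
```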

neurodent.core.utils.get_feature_label(feature_name: str) → str[source]#

Convert a feature column name to a human-readable label.

Handles:

  • Base features: "rms" -> "RMS"

  • Banded features: "logpsdband_delta" -> "Log Band Power (Delta)"

  • Baseline-subtracted: "logrms_nobase" -> "Log(RMS) - Baseline"

Parameters:

feature_name – Column name (e.g., “logpsdband_delta_nobase”)

Returns:

Human-readable label. Falls back to the original name if not found.

Examples

>>> get_feature_label("logpsdband_delta")
'Log Band Power (Delta)'
>>> get_feature_label("alphadelta")
'Alpha/Delta Ratio'
>>> get_feature_label("logrms_nobase")
'Log(RMS) - Baseline'
neurodent.core.utils.nanaverage(A: ndarray, weights: ndarray, axis: int = -1) → ndarray[source]#

Compute weighted average of an array, ignoring NaN values.

This function computes a weighted average along the specified axis while properly handling NaN values by masking them out of the calculation.

Parameters:
  • A (np.ndarray) – Input array containing the values to average.

  • weights (np.ndarray) – Array of weights corresponding to the values in A. Must be broadcastable with A along the specified axis.

  • axis (int, optional) – Axis along which to compute the average. Defaults to -1 (last axis).

Returns:

Weighted average with NaN values properly handled. If all values along an axis are NaN, the result will be NaN for that position.

Return type:

np.ndarray

Examples

>>> import numpy as np
>>> A = np.array([[1.0, 2.0, np.nan], [4.0, np.nan, 6.0]])
>>> weights = np.array([1, 2, 1])
>>> nanaverage(A, weights, axis=1)
array([1.66666667, 5.        ])

Note

Be careful with zero or negative weights as they may produce unexpected results. The function uses numpy’s masked array functionality for robust NaN handling.
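
Since the note mentions numpy's masked arrays, the behavior can be sketched with numpy.ma (illustrative, not necessarily the exact implementation):

```python
import numpy as np

def nanaverage(A: np.ndarray, weights: np.ndarray, axis: int = -1) -> np.ndarray:
    # Mask NaNs so they contribute neither to the weighted sum
    # nor to the normalizing weight total.
    masked = np.ma.masked_invalid(A)
    avg = np.ma.average(masked, axis=axis, weights=weights)
    # Positions that were fully masked come back as NaN.
    return np.ma.filled(avg, np.nan)
```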

neurodent.core.utils.parse_path_to_animalday(filepath: str | Path, animal_param: tuple[int, str] | str | list[str] = (0, None), day_sep: str | None = None, mode: Literal['nest', 'concat', 'base', 'noday'] = 'concat', **day_parse_kwargs)[source]#

DEPRECATED: Use FileDiscoverer with pattern-based discovery instead.

Parses the filename of a binfolder to get the animalday identifier (animal id, genotype, and day).

Parameters:
  • filepath (str | Path) – Filepath of the binfolder.

  • animal_param (tuple[int, str] | str | list[str], optional) –

    Parameter specifying how to parse the animal ID:

    • tuple[int, str]: (index, separator) for simple split and index

    • str: regex pattern to extract ID

    • list[str]: list of possible animal IDs to match against

  • day_sep (str, optional) – Separator for day in filename. Defaults to None.

  • mode (Literal['nest', 'concat', 'base', 'noday'], optional) –

    Mode to parse the filename. Defaults to ‘concat’.

    • ’nest’: Extracts genotype/animal from parent directory name and date from filename. Example: “/WT_A10/recording_2023-04-01.*”

    • ’concat’: Extracts all info from filename, expects genotype_animal_date format. Example: “/WT_A10_2023-04-01.*”

    • ’base’: Same as concat

    • ’noday’: Extracts only genotype and animal ID, uses default date. Example: “/WT_A10_recording.*”

  • **day_parse_kwargs – Additional keyword arguments to pass to parse_str_to_day function. Common options include parse_params dict for dateutil.parser.parse.

Returns:

Dictionary with keys “animal”, “genotype”, “day”, and “animalday” (concatenated).

Example: {“animal”: “A10”, “genotype”: “WT”, “day”: “Apr-01-2023”, “animalday”: “A10 WT Apr-01-2023”}

Return type:

dict[str, str]

Raises:
  • ValueError – If mode is invalid or required components cannot be extracted

  • TypeError – If filepath is not str or Path

neurodent.core.utils.parse_str_to_genotype(string: str, strict_matching: bool = False) → str[source]#

Parses the filename of a binfolder to get the genotype.

Parameters:
  • string (str) – String to parse.

  • strict_matching (bool, optional) – If True, ensures the input matches exactly one genotype. If False, allows overlapping matches and uses longest. Defaults to False for backward compatibility.

Returns:

Genotype.

Return type:

str

Raises:

ValueError – When string cannot be parsed or contains ambiguous matches in strict mode.

Examples

>>> parse_str_to_genotype("WT_A10_data")
'WT'
>>> parse_str_to_genotype("WT_KO_comparison", strict_matching=True)  # Would raise error
ValueError: Ambiguous match...
>>> parse_str_to_genotype("WT_KO_comparison", strict_matching=False)  # Uses longest match
'WT'  # or 'KO' depending on which alias is longer
neurodent.core.utils.parse_str_to_animal(string: str, animal_param: tuple[int, str] | str | list[str] = (0, None)) → str[source]#

DEPRECATED: Use FileDiscoverer with {animal} placeholder in pattern instead.

Parses the filename of a binfolder to get the animal id.

Parameters:
  • string (str) – String to parse.

  • animal_param – Parameter specifying how to parse the animal ID:

    • tuple[int, str]: (index, separator) for simple split and index. Not recommended for inconsistent naming conventions.

    • str: regex pattern to extract ID. Most general use case. If multiple matches are found, returns the first match.

    • list[str]: list of possible animal IDs to match against. Returns first match in list order, case-sensitive, ignoring empty strings.

Returns:

Animal id.

Return type:

str

Examples

# Tuple format: (index, separator)
>>> parse_str_to_animal("WT_A10_2023-01-01_data.bin", (1, "_"))
'A10'
>>> parse_str_to_animal("A10_WT_recording.bin", (0, "_"))
'A10'

# Regex pattern format
>>> parse_str_to_animal("WT_A10_2023-01-01_data.bin", r"A\d+")
'A10'
>>> parse_str_to_animal("subject_123_data.bin", r"\d+")
'123'

# List format: possible IDs to match
>>> parse_str_to_animal("WT_A10_2023-01-01_data.bin", ["A10", "A11", "A12"])
'A10'
>>> parse_str_to_animal("WT_A10_data.bin", ["B15", "C20"])  # No match
ValueError: No matching ID found in WT_A10_data.bin from possible IDs: ['B15', 'C20']

neurodent.core.utils.parse_str_to_day(string: str, sep: str | None = None, parse_params: dict | None = None, parse_mode: Literal['full', 'split', 'window', 'all'] = 'split', date_patterns: list[tuple[str, str]] | None = None) → datetime[source]#

DEPRECATED: Use FileDiscoverer with {session} placeholder in pattern instead.

Parses the filename of a binfolder to get the day.

Parameters:
  • string (str) – String to parse.

  • sep (str, optional) – Separator to split string by. If None, split by whitespace. Defaults to None.

  • parse_params (dict, optional) – Parameters to pass to dateutil.parser.parse. Defaults to {‘fuzzy’:True}.

  • parse_mode (Literal["full", "split", "window", "all"], optional) –

    Mode for parsing the string. Defaults to "split".

    • "full": Try parsing the entire cleaned string only

    • "split": Try parsing individual tokens only

    • "window": Try parsing sliding windows of tokens (2-4 tokens) only

    • "all": Use all three approaches in the order "full", "split", "window"

  • date_patterns (list[tuple[str, str]], optional) – List of (regex_pattern, strptime_format) tuples to try before falling back to token-based parsing. This allows users to specify exact formats to handle ambiguous cases like MM/DD/YYYY vs DD/MM/YYYY. Only used in “split” and “all” modes. Defaults to None (no regex patterns).

Returns:

Datetime object corresponding to the day of the binfolder.

Return type:

datetime

Raises:
  • ValueError – If no valid date token is found in the string.

  • TypeError – If date_patterns is not a list of tuples.

Examples

>>> # Handle ambiguous date formats with explicit patterns
>>> patterns = [(r'(19\d{2}|20\d{2})-(\d{1,2})-(\d{1,2})', '%Y-%m-%d')]
>>> parse_str_to_day('2001_2023-07-04_data', date_patterns=patterns)
datetime.datetime(2023, 7, 4, 0, 0)
>>> # European format pattern
>>> patterns = [(r'(\d{1,2})/(\d{1,2})/(19\d{2}|20\d{2})', '%d/%m/%Y')]
>>> parse_str_to_day('04/07/2023_data', date_patterns=patterns)
datetime.datetime(2023, 7, 4, 0, 0)  # July 4th, not April 7th

Note

When date_patterns is provided, users have full control over date interpretation. Without date_patterns, the function falls back to token-based parsing which may be ambiguous for formats like MM/DD/YYYY vs DD/MM/YYYY.
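
How explicit date_patterns resolve ambiguity can be sketched with a hypothetical helper (parse_with_date_patterns is not part of the API; it only illustrates the documented regex-then-strptime step):

```python
import re
from datetime import datetime

def parse_with_date_patterns(string: str, date_patterns) -> datetime:
    # Try each (regex, strptime-format) pair in order; the first regex hit
    # is parsed with its paired format, so the caller controls interpretation.
    for pattern, fmt in date_patterns:
        match = re.search(pattern, string)
        if match:
            return datetime.strptime(match.group(0), fmt)
    raise ValueError(f"No date pattern matched: {string}")
```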

neurodent.core.utils.parse_chname_to_abbrev(channel_name: str, assume_from_number=False, strict_matching=True) → str[source]#

Parses the channel name to get the abbreviation.

Parameters:
  • channel_name (str) – Name of the channel.

  • assume_from_number (bool, optional) – If True, assume the abbreviation based on the last number in the channel name when normal parsing fails. Defaults to False.

  • strict_matching (bool, optional) – If True, ensures the input matches exactly one L/R alias and one channel alias. If False, allows multiple matches and uses longest. Defaults to True.

Returns:

Abbreviation of the channel name.

Return type:

str

Raises:
  • ValueError – When channel_name cannot be parsed or contains ambiguous matches in strict mode.

  • KeyError – When assume_from_number=True but the detected number is not a valid channel ID.

Examples

>>> parse_chname_to_abbrev("left Aud")
'LAud'
>>> parse_chname_to_abbrev("Right VIS")
'RVis'
>>> parse_chname_to_abbrev("channel_9", assume_from_number=True)
'LAud'
>>> parse_chname_to_abbrev("LRAud", strict_matching=False)  # Would work in non-strict mode
'LAud'  # Uses longest L/R match
neurodent.core.utils.abbreviate_channel_names(names: list[str], strict_matching: bool = True, assume_from_number: bool = False) → list[str][source]#

Abbreviate a list of channel names, falling back to raw names for unparseable entries.

Parameters:
  • names – List of channel name strings to abbreviate.

  • strict_matching – Passed to parse_chname_to_abbrev.

  • assume_from_number – Passed to parse_chname_to_abbrev.

Returns:

List of abbreviated channel names (same length as input).

neurodent.core.utils.normalize_value_from_aliases(value: str, alias_dict: dict[str, list[str]]) → str | None[source]#

Normalize a value to its canonical form using an alias dictionary.

Unlike _get_key_from_match_values() which uses substring matching for parsing values embedded in filenames, this function performs exact matching for normalizing standalone configuration values.

Parameters:
  • value – The raw value to normalize (e.g., "M", "female").

  • alias_dict – Dictionary of {canonical_key: [aliases]}.

Returns:

The canonical key if value matches any alias, or None if no match.
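
A sketch of the exact-match lookup (illustrative; whether the canonical key itself counts as a match is an assumption):

```python
def normalize_value_from_aliases(value, alias_dict):
    # Exact matching only -- no substring search, unlike filename parsing.
    for canonical, aliases in alias_dict.items():
        if value == canonical or value in aliases:
            return canonical
    return None
```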

neurodent.core.utils.set_temp_directory(path: str | Path) → None[source]#

Set the temporary directory for NeuRodent operations.

This function configures the temporary directory used by NeuRodent for intermediate files and operations. The directory will be created if it doesn’t exist.

Parameters:

path (str | Path) – Path to the temporary directory. Will be created if it doesn’t exist.

Examples

>>> set_temp_directory("/tmp/neurodent_temp")
>>> set_temp_directory(Path.home() / "neurodent_workspace" / "temp")

Note

This function modifies the TMPDIR environment variable, which affects the behavior of other temporary file operations in the process.

neurodent.core.utils.get_temp_directory() → Path[source]#

Get the current temporary directory used by NeuRodent.

Returns:

Path object representing the current temporary directory.

Return type:

Path

Examples

>>> temp_dir = get_temp_directory()
>>> print(f"Current temp directory: {temp_dir}")
Current temp directory: /tmp/neurodent_temp

Raises:

KeyError – If TMPDIR environment variable is not set.

neurodent.core.utils.cache_fragments_to_zarr(np_fragments: ndarray, n_fragments: int, tmpdir: str | None = None, chunk_size: int | None = None) → tuple[str, zarr.Array][source]#

Cache numpy fragments array to zarr format for efficient memory management.

This function converts a numpy array of recording fragments to a zarr array stored in a temporary location. This allows better memory management and garbage collection by avoiding keeping large numpy arrays in memory for extended periods.

Parameters:
  • np_fragments (np.ndarray) – Numpy array of shape (n_fragments, n_samples, n_channels) containing the recording fragments to cache.

  • n_fragments (int) – Number of fragments to cache (allows for subset caching).

  • tmpdir (str, optional) – Directory path for temporary zarr storage. If None, uses get_temp_directory(). Defaults to None.

  • chunk_size (int, optional) – Number of fragments per zarr chunk along the first axis. Controls the read/write granularity when accessing the zarr array. Smaller values reduce memory overhead per chunk; larger values improve sequential throughput. When None, defaults to min(100, n_fragments).

Returns:

A tuple containing:
  • str: Path to the temporary zarr file

  • zarr.Array: The zarr array object for accessing cached data

Return type:

tuple[str, zarr.Array]

Raises:

ImportError – If zarr is not available

neurodent.core.utils.stream_fragments_to_zarr(get_fragment_fn: Callable[[int], ndarray], n_fragments: int, fragment_shape: tuple, fragment_dtype: dtype, chunk_size: int, tmpdir: str | None = None) → str[source]#

Stream recording fragments to a zarr store in memory-bounded batches.

Unlike cache_fragments_to_zarr(), this function never holds more than chunk_size fragments in RAM at once. It calls get_fragment_fn one batch at a time, writes each batch to the zarr store, and immediately frees the batch buffer — so peak RAM is proportional to chunk_size rather than n_fragments.

Parameters:
  • get_fragment_fn (Callable[[int], np.ndarray]) – A callable that accepts a fragment index (0-based) and returns the corresponding fragment as a NumPy array of shape fragment_shape.

  • n_fragments (int) – Total number of fragments to stream.

  • fragment_shape (tuple) – Shape of a single fragment (e.g. (n_samples, n_channels)).

  • fragment_dtype (np.dtype) – Data-type of the fragment arrays.

  • chunk_size (int) – Number of fragments to buffer per batch. Must be >= 1. Larger values improve sequential write throughput; smaller values reduce peak RAM.

  • tmpdir (str, optional) – Directory for the temporary zarr file. If None, uses get_temp_directory().

Returns:

Path to the temporary zarr file on disk.

Return type:

str

Raises:
  • ValueError – If chunk_size < 1.

  • ImportError – If zarr is not available.

neurodent.core.utils.chunked_channel_distance_matrix(get_traces_fn: Callable[[int, int], ndarray], n_channels: int, n_samples: int, chunk_samples: int) → ndarray[source]#

Compute pairwise Euclidean distance matrix between channels in chunks.

Instead of loading the full (n_samples, n_channels) trace matrix at once, this function reads chunk_samples frames at a time and accumulates squared distances using the identity

||c_i - c_j||^2 = ||c_i||^2 + ||c_j||^2 - 2 * c_i · c_j

so that peak RAM is proportional to chunk_samples * n_channels rather than n_samples * n_channels.

Parameters:
  • get_traces_fn – fn(start_frame, end_frame) -> np.ndarray with shape (frames, n_channels). Typically recording.get_traces(start_frame=..., end_frame=..., return_scaled=True).

  • n_channels – Number of channels.

  • n_samples – Total number of samples in the recording.

  • chunk_samples – Number of samples to read per chunk.

Returns:

Symmetric (n_channels, n_channels) Euclidean distance matrix.

Return type:

np.ndarray
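
The chunked accumulation described above can be sketched as follows (an illustrative implementation of the stated identity):

```python
import numpy as np

def chunked_channel_distance_matrix(get_traces_fn, n_channels, n_samples, chunk_samples):
    # Accumulate per-channel squared norms and the Gram matrix chunk by chunk.
    sq_norms = np.zeros(n_channels)
    gram = np.zeros((n_channels, n_channels))
    for start in range(0, n_samples, chunk_samples):
        chunk = get_traces_fn(start, min(start + chunk_samples, n_samples))
        sq_norms += (chunk ** 2).sum(axis=0)
        gram += chunk.T @ chunk
    # ||c_i - c_j||^2 = ||c_i||^2 + ||c_j||^2 - 2 c_i . c_j; clip tiny negatives
    # introduced by floating-point round-off before the square root.
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * gram
    return np.sqrt(np.maximum(sq_dist, 0.0))
```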

neurodent.core.utils.get_file_stem(filepath: str | Path) → str[source]#

Get the true stem for files, handling double extensions like .npy.gz.
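
A sketch of the double-extension handling, assuming a known set of extensions to peel (the actual extension list is an assumption):

```python
from pathlib import Path

# Assumed extension set for this sketch.
KNOWN_EXTENSIONS = {".gz", ".npy", ".bin", ".json", ".csv"}

def get_file_stem(filepath) -> str:
    # Peel suffixes one at a time so "data.npy.gz" -> "data".
    p = Path(filepath)
    while p.suffix.lower() in KNOWN_EXTENSIONS:
        p = p.with_suffix("")
    return p.name
```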

neurodent.core.utils.nanmean_series_of_np(x: Series, axis: int = 0) → ndarray[source]#

Efficiently compute NaN-aware mean of a pandas Series containing numpy arrays.

This function is optimized for computing the mean across a Series where each element is a numpy array. It uses different strategies based on the size of the Series for optimal performance.

Parameters:
  • x (pd.Series) – Series containing numpy arrays as elements.

  • axis (int, optional) – Axis along which to compute the mean. Defaults to 0.

    • axis=0: Mean across the Series elements (most common)

    • axis=1: Mean within each array element

Returns:

Array containing the computed means with NaN values properly handled.

Return type:

np.ndarray

Examples

>>> import pandas as pd
>>> import numpy as np
>>> # Create a Series of numpy arrays
>>> arrays = [np.array([1.0, 2.0, np.nan]),
...           np.array([4.0, np.nan, 6.0]),
...           np.array([7.0, 8.0, 9.0])]
>>> series = pd.Series(arrays)
>>> nanmean_series_of_np(series)
array([4. , 5. , 7.5])
Performance Notes:
  • For Series with more than 1000 elements containing numpy arrays, uses np.stack() for better performance

  • Falls back to list conversion for smaller Series or mixed types

  • Handles shape mismatches gracefully by falling back to the slower method
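
For the common case the computation amounts to stacking and a NaN-aware mean; a minimal sketch that ignores the size-based strategy switching described in the performance notes:

```python
import numpy as np
import pandas as pd

def nanmean_series_of_np(x: pd.Series, axis: int = 0) -> np.ndarray:
    # Stack the element arrays into a single 2-D array, then take a
    # NaN-aware mean along the requested axis.
    stacked = np.stack(x.to_list())
    return np.nanmean(stacked, axis=axis)
```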

neurodent.core.utils.log_transform(rec: ndarray, **kwargs) → ndarray[source]#

Log transform the signal

Parameters:

rec (np.ndarray) – The signal to log transform.

Returns:

ln(rec + 1)

Return type:

np.ndarray
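
The transform is ln(rec + 1); a one-line sketch using numpy's log1p (illustrative):

```python
import numpy as np

def log_transform(rec: np.ndarray, **kwargs) -> np.ndarray:
    # np.log1p computes ln(1 + x) and is numerically stable for small values.
    return np.log1p(rec)
```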

neurodent.core.utils.sort_dataframe_by_plot_order(df: DataFrame, df_sort_order: dict | None = None) → DataFrame[source]#

Sort DataFrame columns according to predefined orders.

Parameters:
  • df (pd.DataFrame) – DataFrame to sort

  • df_sort_order (dict, optional) – Dictionary mapping column names to the order of the values in the column.

Returns:

Sorted DataFrame

Return type:

pd.DataFrame

Raises:

ValueError – If df_sort_order is not a valid dictionary or contains invalid categories

class neurodent.core.utils.Natural_Neighbor[source]#

Bases: object

Natural Neighbor algorithm implementation for finding natural neighbors in a dataset.

This class implements the Natural Neighbor algorithm which finds mutual neighbors in a dataset by iteratively expanding the neighborhood radius until convergence.

load(filename)[source]#

Load dataset from a CSV file, separating attributes and classes.

Parameters:

filename (str) – Path to the CSV file containing the dataset

read(data: ndarray)[source]#

Load data directly from a numpy array.

Parameters:

data (np.ndarray) – Input data array

read_distance_matrix(distance_matrix: ndarray)[source]#

Load a precomputed distance matrix for neighbor search.

When a distance matrix is provided, algorithm() uses argsort-based neighbor lookup instead of a KDTree, avoiding the need to hold the raw high-dimensional data in memory.

Parameters:

distance_matrix (np.ndarray) – Symmetric (n, n) distance matrix.

asserts()[source]#

Initialize data structures for the algorithm.

Sets up the necessary data structures, including:

  • nan_edges as an empty set

  • knn, nan_num, and repeat dictionaries for each instance

count()[source]#

Count the number of instances that have no natural neighbors.

Returns:

Number of instances with zero natural neighbors

Return type:

int

findKNN(inst, r, tree)[source]#

Find the indices of the k nearest neighbors.

Parameters:
  • inst – Instance to find neighbors for

  • r (int) – Radius/parameter for neighbor search

  • tree – KDTree object for efficient neighbor search

Returns:

Array of neighbor indices (excluding the instance itself)

Return type:

np.ndarray

algorithm()[source]#

Execute the Natural Neighbor algorithm.

The algorithm iteratively expands the neighborhood radius until convergence, finding mutual neighbors between instances.

When a precomputed distance matrix is available (see read_distance_matrix()), neighbor lookup is performed via argsort instead of a KDTree, which avoids holding the raw high-dimensional data in memory.

Returns:

The final radius value when convergence is reached

Return type:

int

class neurodent.core.utils.TimestampMapper(file_end_datetimes: list[datetime], file_durations: list[float])[source]#

Bases: object

Map each fragment to its source file’s timestamp.

This class provides functionality to map data fragments back to their original file timestamps when data has been concatenated from multiple files with different recording times.

Variables:
  • file_end_datetimes (list[datetime]) – The end datetimes of each source file.

  • file_durations (list[float]) – The durations of each source file in seconds.

  • file_start_datetimes (list[datetime]) – Computed start datetimes of each file.

  • cumulative_durations (np.ndarray) – Cumulative sum of file durations.

Examples

>>> from datetime import datetime, timedelta
>>> # Set up files with known end times and durations
>>> end_times = [datetime(2023, 1, 1, 12, 0), datetime(2023, 1, 1, 13, 0)]
>>> durations = [3600.0, 1800.0]  # 1 hour, 30 minutes
>>> mapper = TimestampMapper(end_times, durations)
>>>
>>> # Get timestamp for fragment at index 2 with 60s fragments
>>> timestamp = mapper.get_fragment_timestamp(2, 60.0)
>>> print(timestamp)
2023-01-01 11:02:00
get_fragment_timestamp(fragment_idx: int, fragment_len_s: float) → datetime[source]#

Get the timestamp for a specific fragment based on its index and length.

Parameters:
  • fragment_idx (int) – The index of the fragment (0-based).

  • fragment_len_s (float) – The length of each fragment in seconds.

Returns:

The timestamp corresponding to the start of the specified fragment.

Return type:

datetime

Examples

>>> # Get timestamp for the 5th fragment (index 4) with 30-second fragments
>>> timestamp = mapper.get_fragment_timestamp(4, 30.0)
>>> # This returns the timestamp 2 minutes into the first file
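
The mapping arithmetic can be sketched as a standalone function (fragment_timestamp is a hypothetical name; the class internals may differ):

```python
from datetime import datetime, timedelta
import numpy as np

def fragment_timestamp(file_end_datetimes, file_durations, fragment_idx, fragment_len_s):
    # Start times are derived from end times minus durations.
    starts = [end - timedelta(seconds=d)
              for end, d in zip(file_end_datetimes, file_durations)]
    cumulative = np.cumsum(file_durations)
    offset_s = fragment_idx * fragment_len_s  # offset into the concatenation
    # Locate the source file, then the offset within that file.
    file_idx = int(np.searchsorted(cumulative, offset_s, side="right"))
    within_file = offset_s - (cumulative[file_idx - 1] if file_idx else 0.0)
    return starts[file_idx] + timedelta(seconds=within_file)
```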
neurodent.core.utils.validate_timestamps(timestamps: list[datetime], gap_threshold_seconds: float = 60) → list[datetime][source]#

Validate that timestamps are in chronological order and check for large gaps.

Parameters:
  • timestamps (list[datetime]) – List of timestamps to validate

  • gap_threshold_seconds (float, optional) – Threshold in seconds for warning about large gaps. Defaults to 60.

Returns:

The validated timestamps in chronological order

Return type:

list[datetime]

Raises:

ValueError – If no valid timestamps are provided

neurodent.core.utils.should_use_cached_file(cache_path: str | Path, source_paths: list[str | Path], use_cached: Literal['auto', 'always', 'never', 'error'] = 'auto') → bool[source]#

Determine whether to use a cached intermediate file based on caching policy and file timestamps.

Parameters:
  • cache_path – Path to the cached intermediate file

  • source_paths – List of source file paths that the cache depends on

  • use_cached – Caching policy:

    • "auto": Use cached if exists and newer than all sources (default)

    • "always": Always use cached if it exists

    • "never": Never use cached (always regenerate)

    • "error": Raise error if cached doesn't exist

Returns:

True if cached file should be used, False if it should be regenerated

Return type:

bool

Raises:
  • FileNotFoundError – When use_cached=”error” and cache doesn’t exist

  • ValueError – For invalid use_cached values
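
The "auto" policy's freshness test can be sketched with plain mtime comparisons (the helper name cache_is_fresh is illustrative, not part of the API):

```python
from pathlib import Path

def cache_is_fresh(cache_path, source_paths) -> bool:
    # "auto" policy: the cache is usable only if it exists and is at least
    # as new as every source it was derived from.
    cache = Path(cache_path)
    if not cache.exists():
        return False
    cache_mtime = cache.stat().st_mtime
    return all(cache_mtime >= Path(src).stat().st_mtime for src in source_paths)
```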

neurodent.core.utils.get_cache_status_message(cache_path: str | Path, use_cached: bool) → str[source]#

Generate a descriptive message about cache usage for logging.

neurodent.core.utils.should_use_cache_unified(cache_path: str | Path, source_paths: list[str | Path], cache_policy: Literal['auto', 'always', 'force_regenerate']) → bool[source]#

Unified cache decision logic for all intermediate files.

Parameters:
  • cache_path – Path to the cache file

  • source_paths – List of source file paths to check timestamps against

  • cache_policy – Caching policy:

    • "auto": Use cache if exists and newer than sources; regenerate with logging if missing/invalid

    • "always": Use cache if exists; raise error if missing/invalid

    • "force_regenerate": Always regenerate and overwrite existing cache

Returns:

True if cache should be used, False if should regenerate

Return type:

bool

Raises:

ValueError – If cache_policy is invalid

neurodent.core.utils.convert_intan_chname_mne(mne_obj)[source]#
neurodent.core.utils.slugify(value, allow_unicode=False)[source]#

Convert a string to a URL-friendly slug.

Converts to ASCII (unless allow_unicode is True), lowercases, removes non-alphanumeric characters (except hyphens and underscores), and converts spaces and repeated dashes to single dashes.

Drop-in replacement for django.utils.text.slugify using only the standard library.

Parameters:
  • value – The string to slugify.

  • allow_unicode – If True, keep Unicode characters instead of transliterating to ASCII.

Returns:

A URL-safe slug string.

Return type:

str
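
Since the docstring describes a stdlib-only, Django-compatible slugify, the behavior can be sketched as follows (an illustrative re-implementation, not necessarily the package's exact code):

```python
import re
import unicodedata

def slugify(value, allow_unicode=False):
    value = str(value)
    if allow_unicode:
        value = unicodedata.normalize("NFKC", value)
    else:
        # Transliterate to ASCII, dropping characters with no ASCII equivalent.
        value = (unicodedata.normalize("NFKD", value)
                 .encode("ascii", "ignore").decode("ascii"))
    # Lowercase, drop everything but word chars, whitespace, and hyphens,
    # then collapse runs of whitespace/hyphens into single hyphens.
    value = re.sub(r"[^\w\s-]", "", value.lower())
    return re.sub(r"[-\s]+", "-", value).strip("-_")
```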