LongRecordingOrganizer#
- class neurodent.core.LongRecordingOrganizer(base_folder_path, mode: Literal['bin', 'si', 'mne', None] = 'bin', truncate: bool | int = False, cache_policy: Literal['auto', 'always', 'force_regenerate'] = 'auto', multiprocess_mode: Literal['dask', 'serial'] = 'serial', extract_func: Callable[[...], BaseRecording] | Callable[[...], Raw] | None = None, input_type: Literal['folder', 'file', 'files'] = 'folder', file_pattern: str | None = None, manual_datetimes: datetime | list[datetime] | None = None, datetimes_are_start: bool = True, n_jobs: int = 1, **kwargs)[source]#
Bases: object
- Parameters:
mode (Literal['bin', 'si', 'mne', None])
truncate (bool | int)
cache_policy (Literal['auto', 'always', 'force_regenerate'])
multiprocess_mode (Literal['dask', 'serial'])
extract_func (Callable[[...], si.BaseRecording] | Callable[[...], Raw])
input_type (Literal['folder', 'file', 'files'])
file_pattern (str)
manual_datetimes (datetime | list[datetime])
datetimes_are_start (bool)
n_jobs (int)
- __init__(base_folder_path, mode: Literal['bin', 'si', 'mne', None] = 'bin', truncate: bool | int = False, cache_policy: Literal['auto', 'always', 'force_regenerate'] = 'auto', multiprocess_mode: Literal['dask', 'serial'] = 'serial', extract_func: Callable[[...], BaseRecording] | Callable[[...], Raw] | None = None, input_type: Literal['folder', 'file', 'files'] = 'folder', file_pattern: str | None = None, manual_datetimes: datetime | list[datetime] | None = None, datetimes_are_start: bool = True, n_jobs: int = 1, **kwargs)[source]#
Construct a long recording from binary files or EDF files.
- Parameters:
base_folder_path (str) – Path to the base folder containing the data files.
mode (Literal['bin', 'si', 'mne', None]) – Mode to load data in. Defaults to 'bin'.
truncate (Union[bool, int], optional) – If True, truncate data to the first 10 files. If an integer, truncate data to the first n files. Defaults to False.
cache_policy (Literal['auto', 'always', 'force_regenerate'], optional) – Caching policy for intermediate files. Defaults to 'auto'.
multiprocess_mode (Literal['dask', 'serial'], optional) – Processing mode for parallel operations. Defaults to 'serial'.
extract_func (Callable, optional) – Function to extract data when using 'si' or 'mne' mode. Required for those modes.
input_type (Literal['folder', 'file', 'files'], optional) – Type of input to load. Defaults to 'folder'.
file_pattern (str, optional) – Pattern to match files when using the 'file' or 'files' input type. Defaults to '*'.
manual_datetimes (datetime | list[datetime], optional) – Manual timestamps for the recording. For 'bin' mode: if a single datetime, used as the global start/end time; if a list, one timestamp per file. For 'si'/'mne' modes: if a single datetime, used as the start/end of the entire recording; if a list, one per input file.
datetimes_are_start (bool, optional) – If True, manual_datetimes are treated as start times. If False, treated as end times. Defaults to True.
n_jobs (int, optional) – Number of jobs for MNE resampling operations. Defaults to 1 for safety. Set to -1 for automatic parallel detection, or >1 for a specific job count.
**kwargs – Additional arguments passed to the data loading functions.
- Raises:
ValueError – If no data files are found, if the folder contains mixed file types, or if manual time parameters are invalid.
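Example usage (a minimal sketch; the folder path and start time are hypothetical):

from datetime import datetime
from neurodent.core import LongRecordingOrganizer

lro = LongRecordingOrganizer(
    "/data/animal01/session01",                   # hypothetical folder of binary data files
    mode="bin",
    manual_datetimes=datetime(2024, 1, 1, 9, 0),  # single datetime, treated as the global start time
    datetimes_are_start=True,
)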
- detect_and_load_data(mode: Literal['bin', 'si', 'mne', None] = 'bin', cache_policy: Literal['auto', 'always', 'force_regenerate'] = 'auto', multiprocess_mode: Literal['dask', 'serial'] = 'serial', extract_func: Callable[[...], BaseRecording] | Callable[[...], Raw] | None = None, input_type: Literal['folder', 'file', 'files'] = 'folder', file_pattern: str | None = None, **kwargs)[source]#
Load the recording based on the specified mode.
- Parameters:
mode (Literal['bin', 'si', 'mne', None])
cache_policy (Literal['auto', 'always', 'force_regenerate'])
multiprocess_mode (Literal['dask', 'serial'])
extract_func (Callable[[...], BaseRecording] | Callable[[...], Raw] | None)
input_type (Literal['folder', 'file', 'files'])
file_pattern (str | None)
- convert_colbins_rowbins_to_rec(overwrite_rowbins: bool = False, multiprocess_mode: Literal['dask', 'serial'] = 'serial', cache_policy: Literal['auto', 'always', 'force_regenerate'] = 'auto')[source]#
- Parameters:
overwrite_rowbins (bool)
multiprocess_mode (Literal['dask', 'serial'])
cache_policy (Literal['auto', 'always', 'force_regenerate'])
- convert_colbins_to_rowbins(overwrite=False, multiprocess_mode: Literal['dask', 'serial'] = 'serial')[source]#
Convert column-major binary files to row-major binary files, and save them in the rowbin_folder_path.
- Parameters:
overwrite (bool, optional) – If True, overwrite existing row-major binary files. Defaults to False.
multiprocess_mode (Literal['dask', 'serial'], optional) – If 'dask', use dask to convert the files in parallel. If 'serial', convert the files serially. Defaults to 'serial'.
- convert_rowbins_to_rec(multiprocess_mode: Literal['dask', 'serial'] = 'serial', cache_policy: Literal['auto', 'always', 'force_regenerate'] = 'auto')[source]#
Convert row-major binary files to SpikeInterface Recording structure.
- Parameters:
multiprocess_mode (Literal['dask', 'serial'], optional) – If 'dask', use dask to convert the files in parallel. If 'serial', convert the files serially. Defaults to 'serial'.
cache_policy (Literal['auto', 'always', 'force_regenerate'], optional) – Caching policy for intermediate files (default: 'auto'). 'auto': use cached files if they exist and are newer than the sources; regenerate with logging if missing or invalid. 'always': use cached files if they exist; raise an error if missing or invalid. 'force_regenerate': always regenerate files, overwriting any existing cache.
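The conversion can also be driven step by step on an existing organizer. A minimal sketch, continuing the hypothetical lro from the constructor example above:

# column-major binaries -> row-major binaries on disk
lro.convert_colbins_to_rowbins(overwrite=False, multiprocess_mode="serial")

# row-major binaries -> SpikeInterface Recording
lro.convert_rowbins_to_rec(multiprocess_mode="serial", cache_policy="auto")

# convert_colbins_rowbins_to_rec appears to chain both steps in a single call
lro.convert_colbins_rowbins_to_rec(overwrite_rowbins=False, cache_policy="auto")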
- convert_file_with_si_to_recording(extract_func: Callable[[...], BaseRecording], input_type: Literal['folder', 'file', 'files'] = 'folder', file_pattern: str = '*', cache_policy: Literal['auto', 'always', 'force_regenerate'] = 'auto', **kwargs)[source]#
Create a SpikeInterface Recording from a folder, a single file, or multiple files.
This is a thin wrapper around extract_func that discovers inputs under self.base_folder_path and builds a si.BaseRecording accordingly.
Modes:
folder: Passes self.base_folder_path directly to extract_func.
file: Uses glob with file_pattern relative to self.base_folder_path. If multiple matches are found, the first match is used and a warning is issued.
files: Uses Path.glob with file_pattern under self.base_folder_path, optionally truncates via self._truncate_file_list(...), sorts the files, applies extract_func to each file, and concatenates the resulting recordings via si.concatenate_recordings.
- Parameters:
extract_func (Callable[..., "si.BaseRecording"]) – Function that consumes a path (folder or file path) and returns a si.BaseRecording.
input_type (Literal['folder', 'file', 'files'], optional) – How to discover inputs. Defaults to 'folder'.
file_pattern (str, optional) – Glob pattern used when input_type is 'file' or 'files'. Defaults to '*'.
cache_policy (Literal['auto', 'always', 'force_regenerate'])
**kwargs – Additional keyword arguments forwarded to extract_func.
- Side Effects:
Sets self.LongRecording to the resulting recording and initializes self.meta based on that recording's properties.
- Raises:
ValueError – If no files are found for the given file_pattern or input_type is invalid.
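Example sketch of the 'files' mode with a user-supplied extractor. The per-file reader and glob pattern below are hypothetical; any SpikeInterface extractor that returns a si.BaseRecording for a single file would fit:

import spikeinterface.extractors as se

def load_one(path):
    # Hypothetical per-file reader; an Intan file format is assumed here.
    # Swap in whichever spikeinterface.extractors reader matches your data.
    return se.read_intan(path, stream_id="0")

lro.convert_file_with_si_to_recording(
    extract_func=load_one,
    input_type="files",
    file_pattern="*.rhd",   # hypothetical glob pattern
)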
- convert_file_with_mne_to_recording(extract_func: Callable[[...], Raw], input_type: Literal['folder', 'file', 'files'] = 'folder', file_pattern: str = '*', intermediate: Literal['edf', 'bin'] = 'edf', intermediate_name=None, cache_policy: Literal['auto', 'always', 'force_regenerate'] = 'auto', multiprocess_mode: Literal['dask', 'serial'] = 'serial', n_jobs: int | None = None, **kwargs)[source]#
Convert MNE-compatible files to SpikeInterface recording format with metadata caching.
- Parameters:
extract_func (Callable) – Function that takes a file path and returns an mne.io.Raw object.
input_type (Literal['folder', 'file', 'files']) – Type of input: 'folder', 'file', or 'files'.
file_pattern (str) – Glob pattern for file matching (default: '*').
intermediate (Literal['edf', 'bin']) – Intermediate format, 'edf' or 'bin' (default: 'edf').
intermediate_name (str, optional) – Custom name for the intermediate file.
cache_policy (Literal['auto', 'always', 'force_regenerate']) – Caching policy for intermediate and metadata files (default: 'auto'). 'auto': use cached files if both data and metadata exist and the cache is newer than the sources; regenerate with logging if missing or invalid. 'always': use cached files if both data and metadata exist; raise an error if missing or invalid. 'force_regenerate': always regenerate files, overwriting any existing cache.
multiprocess_mode (Literal['dask', 'serial']) – Processing mode, 'dask' or 'serial' (default: 'serial').
n_jobs (int, optional) – Number of jobs for MNE resampling. If None (default), uses the instance n_jobs value. Set to -1 for automatic parallel detection, or >1 for a specific job count.
**kwargs – Additional arguments passed to extract_func.
Note
Creates two cache files: data file (e.g., file.edf) and metadata sidecar (e.g., file.edf.meta.json). Both files must exist for cache to be used. Metadata preserves channel names, original sampling rates, and other DDFBinaryMetadata fields across cache hits.
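A sketch of loading EDF files through MNE on a hypothetical organizer; the glob pattern is illustrative, and mne.io.read_raw_edf is the standard MNE EDF reader:

import mne

lro.convert_file_with_mne_to_recording(
    extract_func=lambda path: mne.io.read_raw_edf(path, preload=True),
    input_type="files",
    file_pattern="*.edf",
    intermediate="edf",
    cache_policy="auto",
)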
- get_datetime_fragment(fragment_len_s, fragment_idx)[source]#
Get the datetime for a specific fragment using the timestamp mapper.
- Parameters:
fragment_len_s (float) – Length of each fragment in seconds.
fragment_idx (int) – Index of the fragment to get the datetime for.
- Returns:
The datetime corresponding to the start of the fragment
- Return type:
datetime
- Raises:
ValueError – If timestamp mapper is not initialized (only available in ‘bin’ mode)
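For example, with 4-second fragments (a sketch; assumes the organizer was loaded in 'bin' mode so the timestamp mapper exists):

# datetime at which the third 4 s fragment begins
t2 = lro.get_datetime_fragment(fragment_len_s=4.0, fragment_idx=2)
print(t2)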
- convert_to_mne() RawArray[source]#
Convert this LongRecording object to an MNE RawArray.
- Returns:
The converted MNE RawArray
- Return type:
mne.io.RawArray
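For example, to hand the data to MNE for inspection (a sketch; the plotting call is illustrative and needs a display):

raw = lro.convert_to_mne()
print(raw.info)   # channel names and sampling rate as seen by MNE
# raw.plot()      # interactive browsing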
- compute_bad_channels(lof_threshold: float | None = None, limit_memory: bool = True, force_recompute: bool = False)[source]#
Compute bad channels using LOF analysis with unified score storage.
- Parameters:
lof_threshold (float, optional) – Threshold for determining bad channels from LOF scores. If None, only computes/loads scores without setting bad_channel_names.
limit_memory (bool) – Whether to reduce memory usage by decimation and float16.
force_recompute (bool) – Whether to recompute LOF scores even if they already exist.
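For example (the threshold value is illustrative, not a recommended setting):

lro.compute_bad_channels(lof_threshold=1.5, limit_memory=True)
print(lro.bad_channel_names)   # channels whose LOF score exceeded the threshold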
- apply_lof_threshold(lof_threshold: float)[source]#
Apply threshold to existing LOF scores to determine bad channels.
- Parameters:
lof_threshold (float) – Threshold for determining bad channels.
- get_lof_scores() dict[source]#
Get LOF scores with channel names.
- Returns:
Dictionary mapping channel names to LOF scores.
- Return type:
dict
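The cached scores can be inspected and re-thresholded without recomputing (the threshold is illustrative):

scores = lro.get_lof_scores()            # {channel name: LOF score}
print(sorted(scores, key=scores.get))    # channels ordered from least to most outlying
lro.apply_lof_threshold(2.0)             # re-derive bad channels from the existing scores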
- finalize_file_timestamps()[source]#
Finalize file timestamps using manual times if provided, otherwise validate CSV times.
- merge(other_lro)[source]#
Merge another LRO into this one using si.concatenate_recordings.
This creates a new concatenated recording from this LRO and the other LRO. The other LRO should represent a later time period to maintain temporal order.
- Parameters:
other_lro (LongRecordingOrganizer) – The LRO to merge into this one.
- Raises:
ValueError – If the LROs are incompatible (different channels, sampling rates, etc.)
ImportError – If SpikeInterface is not available
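A sketch of merging two sessions; the folder paths are hypothetical, and the second organizer should cover the later time period:

lro_early = LongRecordingOrganizer("/data/animal01/day1", mode="bin")
lro_late = LongRecordingOrganizer("/data/animal01/day2", mode="bin")
lro_early.merge(lro_late)   # lro_early now spans both sessions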