Data Loading Tutorial#

This tutorial covers how to load EEG data from various formats supported by NeuRodent.

Overview#

NeuRodent supports multiple data formats commonly used in rodent EEG research:

  1. Binary files (.bin) - Custom binary format

  2. SpikeInterface recordings - Via the SpikeInterface library

  3. MNE objects - From the MNE-Python library

  4. Neuroscope (.dat, .eeg) and Neuralynx (.ncs)

  5. Open Ephys (.continuous)

  6. NWB files (.nwb) - Neurodata Without Borders format

The LongRecordingOrganizer class handles loading and organizing recordings from these formats.

Setup#

import sys
from pathlib import Path
import logging

import numpy as np
import matplotlib.pyplot as plt

from neurodent import core

# Set up logging
logging.basicConfig(
    format="%(asctime)s - %(levelname)s - %(message)s", 
    level=logging.INFO
)
logger = logging.getLogger()

1. Loading Binary Files#

Binary files are a common format for storing continuous EEG data. NeuRodent can load binary files with associated metadata.

# Example: Loading binary files
data_path = Path("/path/to/binary/data")
animal_id = "animal_001"

lro_bin = core.LongRecordingOrganizer(
    base_folder=data_path,
    animal_id=animal_id,
    mode="bin",  # Specify binary mode
)

print(f"Loaded {len(lro_bin.recordings)} binary recordings")
print(f"Sample rate: {lro_bin.recordings[0].get_sampling_frequency()} Hz")
print(f"Channels: {lro_bin.recordings[0].get_channel_ids()}")

Binary File Format Details#

NeuRodent supports two binary layouts (illustrated in the sketch below):

  1. Column-major (default): Each file contains data for one channel

  2. Row-major: Each file contains all channels, with samples interleaved

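To make the difference concrete, here is a small NumPy illustration (plain NumPy, not NeuRodent code) of how the same 3-channel, 4-sample recording is arranged in each layout:

# Illustration only: how the two layouts arrange the same data on disk
samples = np.arange(12, dtype=np.float32).reshape(3, 4)  # (channels, samples)

# Column-major: one file per channel, each holding that channel's samples
per_channel_files = [samples[ch] for ch in range(3)]
print(per_channel_files[0])  # [0. 1. 2. 3.]

# Row-major: a single file with samples interleaved across channels,
# i.e. ch0[t0], ch1[t0], ch2[t0], ch0[t1], ...
interleaved = samples.T.reshape(-1)
print(interleaved[:6])  # [0. 4. 8. 1. 5. 9.]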
You can convert between formats:

# Convert column-major to row-major format
col_path = Path("/path/to/colmajor/data")
row_path = Path("/path/to/rowmajor/output")

core.convert_ddfcolbin_to_ddfrowbin(
    ddf_colpath=col_path,
    ddf_rowpath=row_path,
    overwrite=False
)

# Convert to SpikeInterface format
core.convert_ddfrowbin_to_si(
    ddf_rowpath=row_path,
    overwrite=False
)
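As a quick sanity check, the converted data should be loadable with SpikeInterface directly. The output location below is a placeholder assumption; check the docstring of convert_ddfrowbin_to_si for where it actually writes:

# Optional sanity check (sketch): reload the converted recording with
# SpikeInterface. The output folder is an assumed placeholder, not a
# documented path.
import spikeinterface as si

si_output = Path("/path/to/rowmajor/output")  # placeholder location
recording_check = si.load_extractor(si_output)
print(recording_check)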

2. Loading SpikeInterface Recordings#

SpikeInterface is a popular Python library for extracellular electrophysiology data. NeuRodent can directly use SpikeInterface recordings:

# Example: Loading SpikeInterface recordings
si_data_path = Path("/path/to/spikeinterface/data")

lro_si = core.LongRecordingOrganizer(
    base_folder=si_data_path,
    animal_id=animal_id,
    mode="si",  # SpikeInterface mode
)

print(f"Loaded {len(lro_si.recordings)} SpikeInterface recordings")

3. Loading MNE Objects#

MNE-Python is a widely used library for MEG and EEG analysis. NeuRodent can work directly with MNE Raw objects:

import mne

# Load data using MNE first
mne_file = Path("/path/to/mne/data.fif")
raw_mne = mne.io.read_raw_fif(mne_file, preload=False)

# Create LongRecordingOrganizer with MNE object
lro_mne = core.LongRecordingOrganizer(
    base_folder=None,
    animal_id=animal_id,
    mode="mne",
    mne_objects=[raw_mne],  # Pass MNE objects directly
)

print(f"Loaded {len(lro_mne.recordings)} MNE recordings")

4. Loading NWB Files#

Neurodata Without Borders (NWB) is a standardized format for neurophysiology data:

# Example: Loading NWB files
nwb_path = Path("/path/to/nwb/file.nwb")

# First, load with SpikeInterface's NWB extractor
import spikeinterface.extractors as se

recording_nwb = se.read_nwb(nwb_path)

# Then use with LongRecordingOrganizer
lro_nwb = core.LongRecordingOrganizer(
    base_folder=None,
    animal_id=animal_id,
    mode="si",
    si_recordings=[recording_nwb],
)

print(f"Loaded NWB data with {len(lro_nwb.recordings)} recordings")

5. Loading Other Formats#

Neuroscope/Neuralynx#

For Neuroscope .dat or .eeg files (a Neuralynx sketch follows below):

# Load using SpikeInterface extractors
neuroscope_path = Path("/path/to/neuroscope/data.dat")
recording_neuroscope = se.read_neuroscope(neuroscope_path)

lro_neuroscope = core.LongRecordingOrganizer(
    base_folder=None,
    animal_id=animal_id,
    mode="si",
    si_recordings=[recording_neuroscope],
)
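Neuralynx sessions store their native .ncs files in a folder rather than a single file. A minimal sketch using SpikeInterface's Neuralynx reader, with a placeholder path:

# Load a Neuralynx session folder (.ncs files) via SpikeInterface
neuralynx_path = Path("/path/to/neuralynx/session")  # placeholder path
recording_neuralynx = se.read_neuralynx(neuralynx_path)

lro_neuralynx = core.LongRecordingOrganizer(
    base_folder=None,
    animal_id=animal_id,
    mode="si",
    si_recordings=[recording_neuralynx],
)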

Open Ephys#

For Open Ephys .continuous files:

# Load using SpikeInterface extractors
openephys_path = Path("/path/to/openephys/folder")
recording_openephys = se.read_openephys(openephys_path)

lro_openephys = core.LongRecordingOrganizer(
    base_folder=None,
    animal_id=animal_id,
    mode="si",
    si_recordings=[recording_openephys],
)

6. Inspecting Loaded Data#

Once data is loaded, you can inspect its properties:

# Access recordings
recording = lro_bin.recordings[0]

# Get basic properties
print(f"Sampling frequency: {recording.get_sampling_frequency()} Hz")
print(f"Number of channels: {recording.get_num_channels()}")
print(f"Channel IDs: {recording.get_channel_ids()}")
print(f"Duration: {recording.get_num_frames() / recording.get_sampling_frequency()} seconds")

# Get channel metadata
channel_locations = recording.get_channel_locations()
print(f"Channel locations: {channel_locations}")

7. Working with Multiple Recordings#

NeuRodent can handle multiple recordings from the same animal (e.g., different sessions or days):

# Example: Loading multiple recordings
data_folder = Path("/path/to/multi/session/data")

lro_multi = core.LongRecordingOrganizer(
    base_folder=data_folder,
    animal_id=animal_id,
    mode="bin",
)

print(f"Total recordings: {len(lro_multi.recordings)}")

# Iterate through recordings
for i, recording in enumerate(lro_multi.recordings):
    duration = recording.get_num_frames() / recording.get_sampling_frequency()
    print(f"Recording {i}: {duration:.1f} seconds")

8. Metadata and Time Information#

NeuRodent extracts metadata from filenames and paths, including timing information:

from neurodent.core import is_day

# Check if recording is during day or night
# (assuming timestamp in filename or metadata)
example_timestamp = "2023-12-15_14-30-00"  # Example: 2:30 PM

is_daytime = is_day(example_timestamp)
print(f"Recording at {example_timestamp} is {'day' if is_daytime else 'night'}time")

# Access metadata from recordings
for recording in lro_bin.recordings:
    metadata = recording.get_property("metadata") if "metadata" in recording.get_property_keys() else None
    if metadata is not None:
        print(f"Recording metadata: {metadata}")

9. Advanced: Custom Data Loading#

For custom formats, you can create SpikeInterface Recording objects and pass them to LongRecordingOrganizer:

import spikeinterface as si

# Example: Create a recording from numpy array
# (useful for custom formats or testing)
num_channels = 16
sampling_frequency = 1000  # Hz
duration = 60  # seconds
num_samples = int(sampling_frequency * duration)

# Generate random data with shape (num_samples, num_channels), as expected
# by NumpyRecording (replace with your actual data)
data = np.random.randn(num_samples, num_channels)

# Create SpikeInterface recording
recording_custom = si.NumpyRecording(
    traces_list=[data],
    sampling_frequency=sampling_frequency,
)

# Set channel IDs
channel_ids = [f"CH{i:02d}" for i in range(num_channels)]
recording_custom = recording_custom.rename_channels(
    new_channel_ids=channel_ids
)

# Use with LongRecordingOrganizer
lro_custom = core.LongRecordingOrganizer(
    base_folder=None,
    animal_id=animal_id,
    mode="si",
    si_recordings=[recording_custom],
)

print("Custom recording created successfully!")

Summary#

In this tutorial, you learned:

  1. How to load data from multiple formats (binary, SpikeInterface, MNE, NWB, etc.)

  2. How to inspect loaded data properties

  3. How to work with multiple recordings

  4. How to handle metadata and timing information

  5. How to create custom recordings for non-standard formats

Next Steps#