{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Data Loading Tutorial\n", "\n", "This tutorial covers how to load EEG data from various formats supported by NeuRodent.\n", "\n", "## Overview\n", "\n", "NeuRodent supports multiple data formats commonly used in rodent EEG research:\n", "\n", "1. **Binary files** (`.bin`) - Custom binary format\n", "2. **SpikeInterface recordings** - Via the SpikeInterface library\n", "3. **MNE objects** - From the MNE-Python library\n", "4. **Neuroscope/Neuralynx** (`.dat`, `.eeg`)\n", "5. **Open Ephys** (`.continuous`)\n", "6. **NWB files** (`.nwb`) - Neurodata Without Borders format\n", "\n", "The `LongRecordingOrganizer` class handles loading and organizing recordings from these formats." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import sys\n", "from pathlib import Path\n", "import logging\n", "\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "from neurodent import core\n", "\n", "# Set up logging\n", "logging.basicConfig(\n", " format=\"%(asctime)s - %(levelname)s - %(message)s\", \n", " level=logging.INFO\n", ")\n", "logger = logging.getLogger()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Loading Binary Files\n", "\n", "Binary files are a common format for storing continuous EEG data. NeuRodent can load binary files with associated metadata." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Example: Loading binary files\n", "data_path = Path(\"/path/to/binary/data\")\n", "animal_id = \"animal_001\"\n", "\n", "lro_bin = core.LongRecordingOrganizer(\n", " base_folder=data_path,\n", " animal_id=animal_id,\n", " mode=\"bin\", # Specify binary mode\n", ")\n", "\n", "print(f\"Loaded {len(lro_bin.recordings)} binary recordings\")\n", "print(f\"Sample rate: {lro_bin.recordings[0].get_sampling_frequency()} Hz\")\n", "print(f\"Channels: {lro_bin.recordings[0].get_channel_ids()}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Binary File Format Details\n", "\n", "NeuRodent supports two binary layouts:\n", "\n", "1. **Column-major** (default): Each file contains data for one channel\n", "2. **Row-major**: Each file contains all channels, with samples interleaved\n", "\n", "You can convert between formats:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Convert column-major to row-major format\n", "col_path = Path(\"/path/to/colmajor/data\")\n", "row_path = Path(\"/path/to/rowmajor/output\")\n", "\n", "core.convert_ddfcolbin_to_ddfrowbin(\n", " ddf_colpath=col_path,\n", " ddf_rowpath=row_path,\n", " overwrite=False\n", ")\n", "\n", "# Convert to SpikeInterface format\n", "core.convert_ddfrowbin_to_si(\n", " ddf_rowpath=row_path,\n", " overwrite=False\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Loading SpikeInterface Recordings\n", "\n", "SpikeInterface is a popular Python library for extracellular electrophysiology data. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Loading SpikeInterface Recordings\n", "\n", "SpikeInterface is a popular Python library for extracellular electrophysiology data. NeuRodent can directly use SpikeInterface recordings:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Example: Loading SpikeInterface recordings\n", "si_data_path = Path(\"/path/to/spikeinterface/data\")\n", "\n", "lro_si = core.LongRecordingOrganizer(\n", "    base_folder=si_data_path,\n", "    animal_id=animal_id,\n", "    mode=\"si\",  # SpikeInterface mode\n", ")\n", "\n", "print(f\"Loaded {len(lro_si.recordings)} SpikeInterface recordings\")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Loading MNE Objects\n", "\n", "MNE-Python is a widely used library for MEG and EEG analysis. NeuRodent can work with MNE Raw objects:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mne\n", "\n", "# Load data using MNE first\n", "mne_file = Path(\"/path/to/mne/data.fif\")\n", "raw_mne = mne.io.read_raw_fif(mne_file, preload=False)\n", "\n", "# Create LongRecordingOrganizer with MNE object\n", "lro_mne = core.LongRecordingOrganizer(\n", "    base_folder=None,\n", "    animal_id=animal_id,\n", "    mode=\"mne\",\n", "    mne_objects=[raw_mne],  # Pass MNE objects directly\n", ")\n", "\n", "print(f\"Loaded {len(lro_mne.recordings)} MNE recordings\")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 4. Loading NWB Files\n", "\n", "Neurodata Without Borders (NWB) is a standardized format for neurophysiology data:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import spikeinterface.extractors as se\n", "\n", "# Example: Loading NWB files\n", "nwb_path = Path(\"/path/to/nwb/file.nwb\")\n", "\n", "# First, load with SpikeInterface's NWB extractor\n", "recording_nwb = se.read_nwb(nwb_path)\n", "\n", "# Then use it with LongRecordingOrganizer\n", "lro_nwb = core.LongRecordingOrganizer(\n", "    base_folder=None,\n", "    animal_id=animal_id,\n", "    mode=\"si\",\n", "    si_recordings=[recording_nwb],\n", ")\n", "\n", "print(f\"Loaded NWB data with {len(lro_nwb.recordings)} recordings\")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 5. Loading Other Formats\n", "\n", "### Neuroscope/Neuralynx\n", "\n", "For Neuroscope `.dat` or `.eeg` files:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Load using SpikeInterface extractors\n", "neuroscope_path = Path(\"/path/to/neuroscope/data.dat\")\n", "recording_neuroscope = se.read_neuroscope(neuroscope_path)\n", "\n", "lro_neuroscope = core.LongRecordingOrganizer(\n", "    base_folder=None,\n", "    animal_id=animal_id,\n", "    mode=\"si\",\n", "    si_recordings=[recording_neuroscope],\n", ")" ] },
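{ "cell_type": "markdown", "metadata": {}, "source": [ "Neuralynx sessions (folders of `.ncs` files) can be loaded the same way through SpikeInterface's Neuralynx extractor. The sketch below uses a placeholder folder path; check the SpikeInterface documentation for the extractor arguments that match your Neuralynx data:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Load a Neuralynx session folder using SpikeInterface extractors\n", "neuralynx_path = Path(\"/path/to/neuralynx/folder\")\n", "recording_neuralynx = se.read_neuralynx(folder_path=neuralynx_path)\n", "\n", "lro_neuralynx = core.LongRecordingOrganizer(\n", "    base_folder=None,\n", "    animal_id=animal_id,\n", "    mode=\"si\",\n", "    si_recordings=[recording_neuralynx],\n", ")" ] },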
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Open Ephys\n", "\n", "For Open Ephys `.continuous` files:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Load using SpikeInterface extractors\n", "openephys_path = Path(\"/path/to/openephys/folder\")\n", "recording_openephys = se.read_openephys(openephys_path)\n", "\n", "lro_openephys = core.LongRecordingOrganizer(\n", "    base_folder=None,\n", "    animal_id=animal_id,\n", "    mode=\"si\",\n", "    si_recordings=[recording_openephys],\n", ")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 6. Inspecting Loaded Data\n", "\n", "Once data is loaded, you can inspect its properties:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Access recordings\n", "recording = lro_bin.recordings[0]\n", "\n", "# Get basic properties\n", "print(f\"Sampling frequency: {recording.get_sampling_frequency()} Hz\")\n", "print(f\"Number of channels: {recording.get_num_channels()}\")\n", "print(f\"Channel IDs: {recording.get_channel_ids()}\")\n", "print(f\"Duration: {recording.get_num_frames() / recording.get_sampling_frequency():.1f} seconds\")\n", "\n", "# Channel locations are only defined if a probe/geometry has been attached\n", "if \"location\" in recording.get_property_keys():\n", "    print(f\"Channel locations: {recording.get_channel_locations()}\")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 7. Working with Multiple Recordings\n", "\n", "NeuRodent can handle multiple recordings from the same animal (e.g., different sessions or days):" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Example: Loading multiple recordings\n", "data_folder = Path(\"/path/to/multi/session/data\")\n", "\n", "lro_multi = core.LongRecordingOrganizer(\n", "    base_folder=data_folder,\n", "    animal_id=animal_id,\n", "    mode=\"bin\",\n", ")\n", "\n", "print(f\"Total recordings: {len(lro_multi.recordings)}\")\n", "\n", "# Iterate through recordings\n", "for i, recording in enumerate(lro_multi.recordings):\n", "    duration = recording.get_num_frames() / recording.get_sampling_frequency()\n", "    print(f\"Recording {i}: {duration:.1f} seconds\")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 8. Metadata and Time Information\n", "\n", "NeuRodent extracts metadata from filenames and paths, including timing information:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from neurodent.core import is_day\n", "\n", "# Check whether a recording falls during the day or night\n", "# (assuming a timestamp in the filename or metadata)\n", "example_timestamp = \"2023-12-15_14-30-00\"  # Example: 2:30 PM\n", "\n", "is_daytime = is_day(example_timestamp)\n", "print(f\"Recording at {example_timestamp} is {'day' if is_daytime else 'night'}time\")\n", "\n", "# Access metadata from recordings\n", "for recording in lro_bin.recordings:\n", "    metadata = recording.get_property(\"metadata\") if \"metadata\" in recording.get_property_keys() else None\n", "    if metadata is not None:\n", "        print(f\"Recording metadata: {metadata}\")" ] },
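{ "cell_type": "markdown", "metadata": {}, "source": [ "As a rough illustration of what a day/night check involves, the cell below parses the timestamp format used above with the standard library and applies an assumed 06:00-18:00 light cycle. This is only a sketch; NeuRodent's own `is_day` may use different conventions and bounds." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from datetime import datetime, time\n", "\n", "# Parse the filename-style timestamp used above (illustration only)\n", "ts = datetime.strptime(\"2023-12-15_14-30-00\", \"%Y-%m-%d_%H-%M-%S\")\n", "\n", "# Assume a 06:00-18:00 light cycle; NeuRodent's is_day may use different bounds\n", "lights_on, lights_off = time(6, 0), time(18, 0)\n", "is_daytime_manual = lights_on <= ts.time() < lights_off\n", "\n", "print(f\"{ts.isoformat()} -> {'day' if is_daytime_manual else 'night'}\")" ] },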
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 9. Advanced: Custom Data Loading\n", "\n", "For custom formats, you can create SpikeInterface Recording objects and pass them to `LongRecordingOrganizer`:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import spikeinterface as si\n", "\n", "# Example: Create a recording from a NumPy array\n", "# (useful for custom formats or testing)\n", "num_channels = 16\n", "sampling_frequency = 1000  # Hz\n", "duration = 60  # seconds\n", "num_samples = int(sampling_frequency * duration)\n", "\n", "# Generate random data (replace with your actual data)\n", "# SpikeInterface expects traces with shape (num_samples, num_channels)\n", "data = np.random.randn(num_samples, num_channels)\n", "\n", "# Create a SpikeInterface recording with named channels\n", "channel_ids = [f\"CH{i:02d}\" for i in range(num_channels)]\n", "recording_custom = si.NumpyRecording(\n", "    traces_list=[data],\n", "    sampling_frequency=sampling_frequency,\n", "    channel_ids=channel_ids,\n", ")\n", "\n", "# Use with LongRecordingOrganizer\n", "lro_custom = core.LongRecordingOrganizer(\n", "    base_folder=None,\n", "    animal_id=animal_id,\n", "    mode=\"si\",\n", "    si_recordings=[recording_custom],\n", ")\n", "\n", "print(\"Custom recording created successfully!\")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": "## Summary\n\nIn this tutorial, you learned:\n\n1. How to load data from multiple formats (binary, SpikeInterface, MNE, NWB, etc.)\n2. How to inspect loaded data properties\n3. How to work with multiple recordings\n4. How to handle metadata and timing information\n5. How to create custom recordings for non-standard formats\n\n## Next Steps\n\n- **[Basic Usage Tutorial](basic_usage.ipynb)**: Complete workflow from loading to visualization\n- **[Windowed Analysis Tutorial](../tutorials/windowed_analysis.ipynb)**: Extract features from loaded data\n- **[Spike Analysis Tutorial](../tutorials/spike_analysis.ipynb)**: Work with spike-sorted data" } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.0" } }, "nbformat": 4, "nbformat_minor": 4 }