NeuroBridge
Overview
NeuroBridge is a modular research framework for neural representation learning. It is designed to study latent structure, temporal dynamics and behavioral information in neural population activity.
The framework connects data preparation, contrastive sampling, deep neural encoders, self-supervised objectives, latent-variable simulation, decoding probes and manifold visualization.
The goal is not only to generate embeddings, but to build a controlled research pipeline for testing what neural representations preserve: time, trial structure, labels, behavior, latent dynamics and cross-subject consistency.
Problem
Neural data are high-dimensional, noisy and temporally structured. A central problem is to understand whether low-dimensional embeddings capture meaningful structure rather than only producing visually appealing manifolds.
NeuroBridge addresses this by making the representation-learning pipeline explicit: data are windowed, metadata are tracked, distances and similarities are constructed, encoders are trained and embeddings are evaluated through decoding, simulation and visualization.
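The windowing and metadata-tracking step can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `make_windows`, the `window` and `stride` parameters and the metadata keys are all assumptions introduced for this example.

```python
import numpy as np

def make_windows(trials, labels, window=10, stride=5):
    """Cut trial-structured data into fixed-length windows and record metadata.

    trials: list of arrays, each of shape (n_steps, n_channels)
    labels: one label per trial
    (Hypothetical helper; names and defaults are assumptions.)
    """
    X, local_t, global_t, trial_id, y = [], [], [], [], []
    offset = 0  # samples seen in earlier trials, used for global time
    for k, (trial, label) in enumerate(zip(trials, labels)):
        n_steps = trial.shape[0]
        for start in range(0, n_steps - window + 1, stride):
            X.append(trial[start:start + window])
            local_t.append(start)            # time within the trial
            global_t.append(offset + start)  # time across the whole session
            trial_id.append(k)
            y.append(label)
        offset += n_steps
    meta = {
        "local_time": np.array(local_t),
        "global_time": np.array(global_t),
        "trial": np.array(trial_id),
        "label": np.array(y),
    }
    return np.stack(X), meta

rng = np.random.default_rng(0)
trials = [rng.normal(size=(50, 8)), rng.normal(size=(40, 8))]
X, meta = make_windows(trials, labels=[0, 1], window=10, stride=5)
print(X.shape)  # (16, 10, 8)
```

Windows never cross trial boundaries in this sketch, so each window carries exactly one trial identity and one label.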
Framework Architecture
The project is organized as a modular pipeline rather than as a single monolithic script.
- Data layer: loading, preprocessing and formatting of neural and EEG data.
- Simulation layer: latent trajectory generation, spike emission models and synthetic neural data.
- Sampling layer: temporal windows, trial-aware metadata, label distances and similarity matrices.
- Model layer: linear baselines, CEBRA-style adapters and deep neural encoders.
- Learning layer: self-supervised, contrastive and metric-learning objectives.
- Evaluation layer: decoding probes, latent recovery, consistency checks and model-selection utilities.
- Visualization layer: manifold plots, trajectory visualization, embedding diagnostics and video generation.
Technologies and Implementation Stack
- Python as the main programming language for the research framework.
- PyTorch for neural encoders, training loops, tensor operations and GPU-based deep learning experiments.
- NumPy / pandas for numerical computation, data handling and intermediate array manipulation.
- scikit-learn for baseline models, PCA, kNN decoding and model-selection utilities.
- CEBRA as an external reference method integrated through adapter-style workflows for neural representation learning.
- Matplotlib for manifold plots, diagnostics and trajectory visualization.
- YAML / configuration files for reproducible experiment setup and project-level parameter control.
Example Workflow
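A typical NeuroBridge workflow follows the layers described above: neural data are loaded and cut into temporal windows, metadata are tracked for each window, distance and similarity matrices are built from that metadata, an encoder is trained, and the resulting embeddings are evaluated through decoding probes, synthetic-data comparisons and visualization.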
Deep Learning and Model Layer
Deep learning enters NeuroBridge through neural encoders that map windowed neural activity into low-dimensional embeddings:
X_window → f_θ(X_window) → z
Here, X_window is a temporal window of neural activity, f_θ is a learned encoder and z is the resulting latent representation.
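As a rough illustration of this mapping, the PyTorch sketch below passes a batch of windows through a small temporal convolutional encoder. The class name `TemporalCNNEncoder`, the layer sizes and the choice of average pooling over time are assumptions made for the example, not a description of the encoders actually used in NeuroBridge.

```python
import torch
from torch import nn

class TemporalCNNEncoder(nn.Module):
    """Hypothetical encoder: (batch, time, channels) -> (batch, latent_dim)."""

    def __init__(self, n_channels, latent_dim=3, hidden=32, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2  # keeps the time dimension unchanged for odd kernels
        self.conv = nn.Sequential(
            nn.Conv1d(n_channels, hidden, kernel_size, padding=pad),
            nn.GELU(),
            nn.Conv1d(hidden, hidden, kernel_size, padding=pad),
            nn.GELU(),
        )
        self.head = nn.Linear(hidden, latent_dim)

    def forward(self, x_window):
        # Conv1d expects (batch, channels, time), so swap the last two axes.
        h = self.conv(x_window.transpose(1, 2))
        h = h.mean(dim=-1)  # average over time: one feature vector per window
        return self.head(h)

encoder = TemporalCNNEncoder(n_channels=8, latent_dim=3)
x_window = torch.randn(16, 10, 8)  # 16 windows, 10 time steps, 8 channels
z = encoder(x_window)
print(z.shape)  # torch.Size([16, 3])
```

An MLP baseline would replace the convolutional stack with linear layers applied to the flattened window, keeping the same input and output shapes.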
This makes NeuroBridge broader than a single contrastive-learning experiment: contrastive learning is one training paradigm inside a wider deep representation-learning framework.
Core Learning Paradigms
- Representation learning: learning compact embeddings of neural population activity.
- Self-supervised learning: using temporal, behavioral, trial-derived or label-derived structure as training signal.
- Contrastive learning: training embeddings so that similar neural windows are close and dissimilar windows are separated.
- Metric / similarity learning: constructing explicit distance and similarity matrices from metadata such as time, labels, trials or conditions.
- Neural time-series modeling: using temporal windows and sequence-aware encoders to capture local or long-range neural dynamics.
- Latent-variable modeling: modeling neural activity through hidden trajectories and emission processes.
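To make the contrastive item above concrete, the sketch below computes an InfoNCE-style loss in NumPy. InfoNCE is used here as one common contrastive objective; the source does not state which loss NeuroBridge uses, and the function name `info_nce` and the `temperature` value are assumptions.

```python
import numpy as np

def info_nce(z_ref, z_pos, temperature=0.1):
    """InfoNCE-style loss: z_pos[i] is the positive for z_ref[i],
    and the remaining rows of z_pos serve as negatives."""
    z_ref = z_ref / np.linalg.norm(z_ref, axis=1, keepdims=True)
    z_pos = z_pos / np.linalg.norm(z_pos, axis=1, keepdims=True)
    logits = z_ref @ z_pos.T / temperature               # scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))  # matched pairs sit on the diagonal

rng = np.random.default_rng(0)
z = rng.normal(size=(32, 3))
aligned = info_nce(z, z + 0.01 * rng.normal(size=z.shape))  # positives near references
unrelated = info_nce(z, rng.normal(size=z.shape))           # positives unrelated
print(aligned, unrelated)
```

The loss is low when each reference embedding is closer to its own positive than to the other samples in the batch, which is the behavior described in the contrastive-learning item.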
Model Families
The model layer is designed to compare simple baselines, deep neural encoders, contrastive objectives and latent-variable components within the same evaluation pipeline.
Current / Core Models
- PCA baseline: linear dimensionality reduction used as an interpretable reference model.
- CEBRA-style adapter: integration layer for comparing CEBRA-oriented embeddings with custom encoders and baseline dimensionality-reduction methods.
- Temporal CNN encoder: deep neural encoder for extracting local temporal patterns from windowed neural activity.
- MLP encoder: nonlinear baseline for flattened neural windows.
- Latent neural generator: generative component for simulating latent trajectories and spike observations.
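The latent neural generator can be illustrated with a small simulation: a latent trajectory is mapped to firing rates and then to spike counts. The circular trajectory, the exponential link, the Poisson emission and every parameter value below are assumptions chosen for this sketch; the source only states that latent-to-rate and rate-to-count transformations are used.

```python
import numpy as np

def simulate_spikes(n_steps=500, n_neurons=20, dt=0.01, seed=0):
    """Hypothetical generator: latent trajectory -> rates -> spike counts."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_steps) * dt
    # Latent trajectory: a point moving around the unit circle at 0.5 Hz (assumed).
    latent = np.stack([np.cos(2 * np.pi * 0.5 * t),
                       np.sin(2 * np.pi * 0.5 * t)], axis=1)
    # Latent-to-rate: random linear readout followed by an exponential link.
    W = rng.normal(scale=0.8, size=(2, n_neurons))
    b = np.log(5.0)  # baseline rate of 5 spikes per second
    rates = np.exp(latent @ W + b)
    # Rate-to-count: Poisson counts in each time bin of width dt.
    counts = rng.poisson(rates * dt)
    return latent, rates, counts

latent, rates, counts = simulate_spikes()
print(latent.shape, rates.shape, counts.shape)  # (500, 2) (500, 20) (500, 20)
```

Because `latent` is known exactly, embeddings learned from `counts` can later be compared against it.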
Experimental / Legacy Models
The broader research codebase also contains earlier or experimental components that informed the current NeuroBridge design.
- EEGNet-style architectures: compact convolutional models for EEG/BCI classification experiments.
- Shallow and deep convolutional networks: baseline architectures for neural signal classification.
- Recurrent models: LSTM-style components for temporal neural or EEG sequences.
- VAE-style models: experimental latent-variable models for nonlinear representation and generative modeling.
- Weighted contrastive encoders: earlier contrastive-learning experiments using pairwise weights and similarity structure.
Planned Extensions
- Transformer encoder: planned extension for longer-range temporal dependencies and attention-based sequence representation.
- GRU / recurrent encoder: lighter recurrent alternative for sequential neural activity.
- Probabilistic latent dynamics: extensions toward state-space, GPFA-like or LFADS-like latent temporal models.
- Deep generative models: extensions for nonlinear latent dynamics and neural data generation.
- Cross-subject alignment models: methods for comparing embeddings across subjects, sessions or interacting agents.
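For the cross-subject alignment item, one simple candidate is orthogonal Procrustes alignment, sketched below in NumPy. This is offered only as an illustration of what such a comparison could look like; the source lists alignment as planned and does not name a method. The function name `procrustes_align` is hypothetical, and the sketch assumes both embeddings have the same number of samples in matching order.

```python
import numpy as np

def procrustes_align(z_a, z_b):
    """Find the orthogonal matrix r minimizing ||a @ r - b||_F,
    where a and b are the mean-centered embeddings."""
    a = z_a - z_a.mean(axis=0)
    b = z_b - z_b.mean(axis=0)
    u, _, vt = np.linalg.svd(a.T @ b)
    r = u @ vt  # closed-form solution of the orthogonal Procrustes problem
    return a @ r, b, r

rng = np.random.default_rng(0)
z_a = rng.normal(size=(100, 3))
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
z_b = z_a @ q + np.array([1.0, -2.0, 0.5])    # rotated and shifted copy of z_a
aligned_a, centered_b, r = procrustes_align(z_a, z_b)

# Relative residual after alignment; close to zero for this noiseless toy case.
residual = np.linalg.norm(aligned_a - centered_b) / np.linalg.norm(centered_b)
print(residual)
```

On real data the residual after alignment could serve as one consistency score between subjects or sessions, alongside decoding-based comparisons.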
Implemented Elements
- Window-based construction of neural samples from trial-structured data.
- Metadata tracking for local time, global time, trial identity and labels.
- Pairwise temporal-distance and label-distance construction.
- Combination of multiple distance matrices through weighted aggregation.
- Conversion of distance matrices into similarity targets.
- CEBRA-oriented representation learning workflows.
- Latent trajectory simulation for synthetic neural data.
- Spike emission modeling through latent-to-rate and rate-to-count transformations.
- kNN-based decoding probes for behavioral or positional variables.
- Embedding and manifold visualization utilities.
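The distance and similarity items in the list above can be sketched as a short chain of NumPy functions. The max-normalization of each distance matrix, the normalized weights and the Gaussian kernel with its `bandwidth` are assumptions made for this example; the source only says that distances are combined by weighted aggregation and converted into similarity targets.

```python
import numpy as np

def pairwise_time_distance(t):
    """Absolute time difference between every pair of windows."""
    return np.abs(t[:, None] - t[None, :]).astype(float)

def pairwise_label_distance(y):
    """0 if two windows share a label, 1 otherwise."""
    return (y[:, None] != y[None, :]).astype(float)

def combine_distances(distances, weights):
    """Weighted sum of distance matrices, each rescaled to [0, 1] first."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    combined = np.zeros_like(distances[0], dtype=float)
    for d, w in zip(distances, weights):
        scale = d.max() if d.max() > 0 else 1.0
        combined += w * d / scale
    return combined

def distance_to_similarity(d, bandwidth=0.25):
    """Gaussian kernel: distance 0 maps to similarity 1."""
    return np.exp(-(d ** 2) / (2 * bandwidth ** 2))

t = np.array([0, 5, 10, 0, 5, 10])  # local time of each window
y = np.array([0, 0, 0, 1, 1, 1])    # label of each window
D = combine_distances(
    [pairwise_time_distance(t), pairwise_label_distance(y)],
    weights=[0.5, 0.5],
)
S = distance_to_similarity(D)
print(D[0, 1], D[0, 3])  # 0.25 0.5
```

Rescaling each matrix before aggregation keeps the weights interpretable, since time and label distances otherwise live on different scales.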
Evaluation and Diagnostics
NeuroBridge treats decoding and visualization as diagnostic tools rather than as isolated final outputs.
- Decoding probes: assess whether embeddings preserve behavioral variables such as position, direction or trial condition.
- Latent recovery: compare learned embeddings with known latent variables in synthetic simulations.
- Similarity diagnostics: inspect whether the learned geometry reflects time, labels, trials or conditions.
- Manifold visualization: plot trajectories, condition averages and low-dimensional embeddings.
- Cross-subject comparison: evaluate consistency or coupling between representations from different subjects.
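A decoding probe of the kind listed above can be sketched with a plain NumPy k-nearest-neighbor regressor. The toy embedding, the value of `k` and the shuffled control are assumptions for illustration; in practice a library implementation such as scikit-learn's `KNeighborsRegressor` could play the same role.

```python
import numpy as np

def knn_decode(z_train, y_train, z_test, k=5):
    """Predict a continuous variable as the mean over the k nearest training embeddings."""
    d = np.linalg.norm(z_test[:, None, :] - z_train[None, :, :], axis=-1)
    nearest = np.argsort(d, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
position = rng.uniform(0, 1, size=400)
# Toy embedding that encodes position along a half circle, plus noise.
z = np.stack([np.cos(np.pi * position), np.sin(np.pi * position)], axis=1)
z += 0.05 * rng.normal(size=z.shape)

train, test = np.arange(0, 300), np.arange(300, 400)
informative = r2_score(position[test],
                       knn_decode(z[train], position[train], z[test]))
# Control: permuting the training embeddings breaks their link to position.
control = r2_score(position[test],
                   knn_decode(rng.permutation(z[train]), position[train], z[test]))
print(informative, control)
```

The samples in this toy example are independent, so a simple index split is harmless. With real windowed recordings, overlapping or adjacent windows are strongly correlated, so train and test sets are better separated by trial or by contiguous time blocks.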
Methodological Note
The central idea of NeuroBridge is that representation learning should be tested through explicit structure: metadata, distances, similarities, decoding and controlled simulations.
This avoids treating embeddings as black-box visual objects. Instead, the framework asks what information the representation preserves, what geometry it induces and whether that geometry is scientifically meaningful.
Current Limitations
NeuroBridge is an active research framework and not yet a polished software package.
- Some modules are still experimental or under refactoring.
- Several model families are planned but not yet fully integrated.
- Evaluation protocols need to be standardized across datasets and tasks.
- Cross-subject and cross-session validation remains an active development area.
- Documentation and reproducible experiment configuration are still being consolidated.
Research Direction
The long-term direction is to turn NeuroBridge into a controlled framework for neural representation learning, synthetic data generation and cross-subject comparison.
- Develop stronger encoder families for neural time series.
- Compare contrastive, generative and probabilistic latent models.
- Use synthetic data to validate whether learned embeddings recover known structure.
- Evaluate embeddings through decoding, stability and cross-subject consistency.
- Connect neural manifold learning with broader questions in representation learning and complex systems.
Resources
Technical report in preparation.
Code available upon request.
Technical Context
- Schneider, Lee & Mathis (2023), CEBRA — relevant as a contrastive learning framework for behavioral and neural data.
- Pandarinath et al. (2018), LFADS — relevant to latent dynamical modeling of neural population activity.
- Yu et al. (2009), Gaussian-process factor analysis — relevant to probabilistic latent trajectories in neural data.
- van der Maaten & Hinton (2008), t-SNE, and McInnes et al. (2018), UMAP — relevant as nonlinear dimensionality-reduction references for visualizing high-dimensional structure.