Guoqi Li (李国齐): Spiking Neural Networks: From Small Networks to Large Models

guoqi.li@ia.ac.cn

Research background

Transformer (Google); OpenAI: GPT

Scaling law: is endless model scaling the correct approach to achieving AGI?

How can we find something beyond the scaling law that can sustainably drive today's AI systems toward a new stage of AGI?

Power issue for today's AI systems: power consumption grows exponentially as performance improves.

Human brain vs. current large models: the human brain, with far more parameters than any large model, consumes only about 20 W (a large model consumes ~300 kW). How can we borrow the brain's mechanisms?

Overreliance on the Transformer architecture:

  • advantages: 1. exceptional performance; 2. high parallelism

  • disadvantages: 1. compute grows quadratically with sequence length; 2. time and space complexity grow linearly with sequence length at inference; 3. challenges in handling ultra-long sequences (a rough cost comparison follows below)
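A rough cost comparison in my own notation (not from the slides), for sequence length N and model width d:

```latex
% Softmax attention: every query attends to every key
\text{time} = O(N^2 d), \qquad \text{attention-matrix memory} = O(N^2)
% Linear attention / recurrent form (cf. MetaLA and SpikingBrain later in the talk)
\text{time} = O(N d^2), \qquad \text{state memory} = O(d^2)\ \text{(independent of } N)
```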

How can neuroscience contribute to the foundational theories of next-generation AI? AI has been advancing significantly faster than neuroscience.

Brain-inspired large model architecture

  1. At present, large models have poor biological plausibility and fail to exploit the rich multi-scale dynamic features of brain networks, such as somatic dynamics, dendritic dynamics, and multi-scale memory functions.

  2. Large models do not yet reflect 0-1 spike communication, event-driven mechanisms, dynamic computing, or sparse addition, which are important foundations of the brain's low-power computation.

  3. Large models do not fully exploit neuron diversity and neural encoding diversity, resulting in insufficient generalization ability (e.g., continual learning, multi-task learning, zero-shot learning).

Neurons themselves have diverse structures, but how can this diversity be reflected in large models?

  • SNN (spiking neural network): the mainstream network in brain-inspired intelligence

  • ANN (artificial neural network): the mainstream network in deep learning

SNN = ANN + neuronal dynamics

Problems:

  • how to propose a suitable neuronal model?
  • how to build a network to solve real-world AI tasks?

LIF (leaky integrate-and-fire) neuron: simple and easy to use, with rich biological dynamics.
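A minimal discrete-time LIF update as an illustration (my own sketch; the decay constant, threshold, and hard-reset choice are assumed hyperparameters, not the speaker's exact model):

```python
import numpy as np

def lif_step(v, x, tau=2.0, v_th=1.0, v_reset=0.0):
    """One discrete-time LIF update.

    v: membrane potential from the previous step; x: input current.
    Returns (binary spike, updated membrane potential).
    """
    v = v + (x - v) / tau                    # leaky integration toward the input
    spike = (v >= v_th).astype(v.dtype)      # fire a 0/1 spike at threshold
    v = np.where(spike > 0, v_reset, v)      # hard reset after a spike
    return spike, v

# Run a small population for a few time steps
rng = np.random.default_rng(0)
v = np.zeros(4)
for t in range(8):
    s, v = lif_step(v, rng.uniform(0.0, 1.0, size=4))
    print(t, s)
```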

At large scale, synchronous computation becomes difficult.

How to build spiking neural networks?

Limited in scale and performance due to the lack of large-scale learning algorithms.

Back-propagation: deep SNNs are difficult to optimize.

Key Problems

  1. In the era of large models: to improve performance, is there any approach other than simply stacking more parameters?

  2. Starting from large models, …

dendritic computation

Dendritic spiking neural networks: 1. intrinsic complexity…

Current large models: based on external complexity (scaling-law driven)

New approach: internal complexity (intrinsic-complexity driven)

  1. Parallel dendritic spiking neuron model: dendritic computation, neuronal-dynamics computation, parallel neuron computation.

  2. Event-driven linear self-attention spiking networks for large models (perhaps linear growth is already the best one can do? see the sketch after this list).

  3. How to accelerate the training of SNN large models? On GPUs, the operators SNNs rely on (decay, …) are implemented inefficiently.
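A minimal sketch of the linear-attention idea behind item 2 (illustrative only; the feature map and normalization are my assumptions, and spike-driven variants further replace the float multiplications with masking and accumulation):

```python
import numpy as np

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Causal linear attention: O(N * d^2) time instead of O(N^2 * d).

    Q, K: (N, d); V: (N, d_v); phi is a positive feature map.
    """
    d, d_v = Q.shape[1], V.shape[1]
    S = np.zeros((d, d_v))        # running sum of phi(k_t) v_t^T
    z = np.zeros(d)               # running sum of phi(k_t) for normalization
    out = np.zeros_like(V)
    for t in range(Q.shape[0]):
        q, k = phi(Q[t]), phi(K[t])
        S += np.outer(k, V[t])
        z += k
        out[t] = (q @ S) / (q @ z + 1e-6)
    return out

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((5, 8)) for _ in range(3))
print(linear_attention(Q, K, V).shape)  # (5, 8)
```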

Model & algorithm -> training platform -> software toolchain …

Research Progress

Significant performance improvements for SNNs:

2019: SNNs could not be trained on ImageNet.

2025: SNNs are comparable to ANNs' SOTA level, with energy efficiency improved by 5-30x.

Main challenges in SNNs:

  1. complex spatiotemporal dynamics

  2. binary spike representations

The spike signal is non-differentiable (quantization error).

surrogate gradient error

  1. training acceleration.

the Reset mechanism

Triton framework (adapted to GPUs)

How to train large-scale SNNs effectively and efficiently?

STBP method: spatio-temporal back-propagation. It tackles the non-differentiability problem by approximating the gradient so that it can propagate through the network.

gradient approximation.
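A minimal surrogate-gradient sketch in PyTorch (my own illustration of the general technique, not the exact STBP formulation): the forward pass keeps the hard 0/1 Heaviside spike, while the backward pass substitutes a smooth derivative; the rectangular window and its width are assumptions.

```python
import torch

class SpikeFn(torch.autograd.Function):
    """Heaviside spike forward, rectangular surrogate derivative backward."""

    @staticmethod
    def forward(ctx, v, v_th=1.0, width=1.0):
        ctx.save_for_backward(v)
        ctx.v_th, ctx.width = v_th, width
        return (v >= v_th).float()           # binary spike

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        # Approximate d(spike)/dv by 1/width inside a window around the threshold
        surrogate = ((v - ctx.v_th).abs() < ctx.width / 2).float() / ctx.width
        return grad_out * surrogate, None, None

spike = SpikeFn.apply

# Gradients now flow through the thresholding step
v = torch.randn(4, requires_grad=True)
spike(v).sum().backward()
print(v.grad)
```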

TDBN method (threshold-dependent batch normalization):

How to prove that the TDBN approach prevents vanishing/exploding gradients in deep SNNs?

Directly trained SNNs: MS-ResNet (membrane-shortcut ResNet)

Attention SNNs: multi-dimensional attention

Spike-driven Transformer: a new computing paradigm (self-attention involves only masking and addition; see the sketch after the version notes below)

  • V2: versatility; handles image classification, object detection, and semantic segmentation concurrently.

  • V3: binary spike firing is a mechanistic defect; integer training, Spike YOLO for object detection.
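A toy sketch of why spike-driven attention reduces to masking and addition (my own simplification; the function name and the absence of scaling/softmax are assumptions, not the paper's exact operator): when Q, K, V are 0/1 spike tensors, the dot products degenerate into selecting rows (masking) and accumulating them (addition), with no multiplications.

```python
import numpy as np

def spike_driven_attention(Qs, Ks, Vs):
    """Toy spike-based attention on binary (N, d) spike matrices.

    K^T V is a sum of outer products of 0/1 vectors: pure accumulation.
    Multiplying by a binary Q just selects and sums rows: masking + addition.
    """
    N, d = Qs.shape
    kv = np.zeros((d, Vs.shape[1]))
    for t in range(N):
        idx = np.nonzero(Ks[t])[0]       # key spikes choose which rows to update
        kv[idx] += Vs[t]                 # addition only, no multiplication
    out = np.zeros((N, Vs.shape[1]))
    for t in range(N):
        idx = np.nonzero(Qs[t])[0]       # query spikes act as a mask
        out[t] = kv[idx].sum(axis=0)
    return out

rng = np.random.default_rng(0)
Qs, Ks, Vs = ((rng.random((6, 8)) < 0.2).astype(float) for _ in range(3))
print(spike_driven_attention(Qs, Ks, Vs).shape)  # (6, 8)
```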

SVL: spike-based vision-language pretraining framework

Open-source training platform for SNNs: SpikingJelly (惊蛰)

Asynchronous sensing-computing neuromorphic chip.

Spike-based dynamic computing with an asynchronous sensing-computing neuromorphic chip.

Static power: 0.41 mW; typical power: 1-5 mW.

A network model with internal complexity bridges artificial intelligence and neuroscience.

SpikeGPT: a generative language model based on SNNs.

SNN-based large models: SpikeLM (ternary SNN).

SNN-based large models: SpikeLLM, which addresses outliers in neuronal states by varying the time steps of spiking neurons.

Unified linear model framework: MetaLA

It unifies linear Transformers, SSMs, and linear RNNs.
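These families can all be read as instances of a single gated linear recurrence; the notation below is my own shorthand for the unification, not MetaLA's exact parameterization:

```latex
% Generic linear-recurrence cell over tokens x_1, ..., x_N.
% Linear attention, SSMs, and linear RNNs differ mainly in how
% A_t, B_t, C_t are parameterized (static vs. data-dependent).
h_t = A_t \odot h_{t-1} + B_t \, x_t, \qquad y_t = C_t \, h_t
```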

SpikingBrain (瞬悉)

Spike-based neuromorphic large models on FPGAs.

Brain-inspired edge devices (traditional vision tasks) <-> brain-inspired cloud (NLP/generative tasks).

Michael Hausser

hausser@gmail.com, website

neural codes:

  • spatial codes: identity; spatial patterns; number (sparse or dense)

  • activity codes

The cortex exhibits high degrees of variability, so is this noise or information (signal)?

We can't record all neurons in the brain simultaneously.

-> Strategy: introduce a defined, small amount of noise: one extra action potential (AP) in a single neuron.

  • single-cell I/O curve: probability of an extra spike.

p × K = 0.019 × 1500 ≈ 28 extra spikes per spike (28 ± 13)

28^5 ≈ 17,000,000 -> a single neuron can have a huge impact as the perturbation compounds over ~5 synaptic stages.

patch-clamp and silicon probe recordings in vivo.

  • single-neuron perturbations grow, and grow fast.

  • the cortex is highly sensitive to noise

  • the neural code must be robust to perturbations.

Using light to probe and control neurons. Advantages: non-invasive; inert; precise; multiplexable; targetable.

Two revolutions in using light to probe neural circuits:

  1. recording (Ohki et al., Nature 2005)

  2. manipulation (Boyden et al., Nature Neuroscience 2005)

The challenge: combining recording and manipulation.

Conventional optogenetics: unknown number and spatial distribution of activated cells; cells targeted by genetic rather than functional identity; synchrony imposed across the network.

Replaying the neural code with light (Hausser & Smith 2007): calcium sensor + channelrhodopsin

all-optical toolkit

two-photon optogenetics of dendritic spines and neural circuits.

Spatial light modulator (SLM): a programmable beam splitter for digital holography.

Microscope design.

optical insertion of single spikes in single neurons.

Conceptual goals: read & write.

The state of the art: probing subcellular connectivity across the brain.