Abstract

Continuous Attractor Neural Networks (CANNs) traditionally rely on pre-wired recurrent connectivity to model spatial representations, path integration, and anticipatory dynamics. However, the biological mechanisms through which this structured connectivity emerges via learning remain relatively unexplored.

This work presents a theoretical framework revealing how continuous attractor connectivity and its computational properties self-organize through Hebbian plasticity, firing-rate adaptation, and global inhibition.

We show that translationally invariant inputs naturally drive the emergence of stable, Gaussian-profiled feedforward weights. Crucially, anticipatory dynamics arise spontaneously within these feedforward architectures, shifting the activity bump forward without requiring recurrent excitatory collaterals.

This predictive shift can be linearly amplified across multilayer networks, consistent with anticipatory activity observed in the superficial layers of the entorhinal cortex. Furthermore, introducing recurrent interactions allows the network to learn connections capable of self-sustaining a moving bump of activity.

Finally, by modulating the network with an external, time-varying baseline current that encodes speed, the system adjusts its intrinsic velocity to function as a precise unidirectional path integrator.

Ultimately, this study suggests that prospective coding and path integration are not manually engineered features, but rather naturally co-emergent properties of a single self-organizing competitive network.

Introduction

Attractor neural networks have long served as canonical models for understanding how neural populations store and process information. In the classical formulation, Hopfield networks encode discrete patterns as stable fixed-point attractors arising from Hebbian synaptic plasticity [1].

Continuous attractor neural networks (CANNs) extend this framework to continuous variables, such as spatial position or orientation, by sustaining localized activity bumps that evolve along low-dimensional manifolds. This formulation has proven effective in modeling a variety of neural systems, including multiple environment-specific representations of space in the rodent hippocampus [2], as well as grid-cell [3] and head-direction responses [4, 5].

In these architectures, the population activity manifests as a localized bump on the representational manifold, where individual neurons fire only when the encoded variable falls within their specific receptive fields.

Many of these systems support computations that require the internal representation to evolve coherently in time, such as prospective coding and path integration—the ability to estimate position or orientation by integrating self-motion cues.

Maintaining a representation that leads the current input is essential for both real-time control and long-term planning.

While this mechanism is often understood as a way to compensate for processing delays [6], the utility of anticipation extends further; by representing future states, the system can proactively prepare for incoming stimuli and facilitate planning.

Consistent with this requirement, anticipatory activity has been extensively documented across a wide variety of neural systems. For example, the visual system employs predictive mechanisms to compensate for delays during stimulus tracking [7, 8]. Similar anticipatory control is observed in birds during flight prior to water entry [9], and in head-direction cells, which encode the future orientation of rodents with a lead time of approximately $25$ ms [10].

In the head-direction system, anticipatory dynamics are closely linked to path integration [11]. This computation is a hallmark of grid-cell and head-direction systems in the entorhinal cortex [12, 13]. The prevailing view is that grid cells perform path integration by combining speed signals, encoded in the firing-rates of speed cells [14], with directional inputs [15, 16], while incorporating environmental cues for error correction [17, 18].

Prospective coding has also been reported in neural representations of both speed and position in the entorhinal cortex [14, 19], suggesting that these two computations may be intrinsically related. However, in CANN models, path integration [3] and anticipatory coding via firing-rate adaptation [20–22] are typically reproduced separately and generally rely on pre-wired recurrent architectures.

Within this modeling framework, canonical CANNs typically rely on symmetric recurrent connectivity to sustain a static bump of activity capable of tracking a moving input. However, several extensions have shown that richer dynamical regimes can emerge from modest modifications to this class of models. Mechanisms such as asymmetric connectivity [23], explicit inhibitory populations [24, 25], or intrinsic neuronal adaptation [20] can generate moving bumps and anticipatory dynamics, providing candidate mechanisms for these computations.

In parallel, analytical work has focused on tractable formulations of CANNs. A prominent example is the Gaussian connectivity model with quadratic activation function and divisive normalization [26–29], where the network dynamics can be diagonalized in a Hermite-polynomial basis.

This approach enables substantial dimensionality reduction and allows for closed-form characterization of bump shape, stability, and dynamical response. However, this analytical tractability comes at the cost of assuming a predefined connectivity structure.

In contrast, the question of how such structured connectivity emerges through learning remains relatively unexplored. Although Hebbian and related learning rules have been applied to continuous attractor systems—including grid-cell models [30–32], motor-sequence networks [33], and head-direction networks [34]—these studies are generally not analytically tractable and tend to emphasize functional outcomes over the conditions required for the emergence and stability of the underlying connectivity.

As a result, a theoretical understanding of how continuous attractors arise from learning, and how their computational properties are shaped by plasticity, is still lacking.

In this work, we develop a theoretical framework to characterize the emergence of continuous attractor connectivity and its link to network function. Using grid-cell-like dynamics as a reference, we investigate how learning gives rise to both prospective coding and path integration. Following a similar approach to [31], we first analyze the emergence of structured feedforward connectivity and show how it supports low-dimensional representations and generates anticipatory dynamics.

Notably, prospective coding arises spontaneously within feedforward architectures, without requiring recurrent excitatory collaterals. This mechanism is consistent with biological constraints, as recurrent excitatory connections are prominent in deep layers.

In contrast, superficial layers—where these computations are actually observed [19, 31]—lack comparable recurrent structure and instead rely more strongly on inhibitory interactions mediated by interneurons [35–37]. We then incorporate recurrent interactions to study the emergence and stability of attractor states across parameter regimes.

Finally, we provide an analytical treatment of path integration, focusing on the network’s ability to calibrate its internal velocity and maintain accurate position estimates.

Results

A. The Model with Feedforward Connections

We consider a two-layer neural network. The first, input, layer consists of $N_{\mathrm{in}}$ neurons that encode a moving stimulus propagating in a single direction with constant velocity, $v$, along a segment of length $L$. Each neuron is tuned to exhibit a Gaussian response centered at a given position, with width $\sigma_{R}$.

For simplicity, we identify a neuron by the position $x$ where its peak response is achieved. We assume homogeneous density along the segment, given by $\rho_{\mathrm{in}} = N_{\mathrm{in}}/L$. The moving stimulus, or tutor, follows a periodic trajectory, restarting at $x = 0$ each time it reaches $x = L$.

Throughout this work, we assume $L\gg\sigma_{R}$ in both our theoretical analysis and simulations, ensuring that integrals across the finite domain can be treated as integrals over the entire real line.

Assuming the density is high enough to approximate the population by a continuous distribution, the activity at time $t$ of a neuron centered at position $x$ is given by

$$ R(x,t) = A_{R}\mathcal{N}(x; vt, \sigma_{R}) $$

where $\mathcal{N}(x;\mu,\sigma) = \frac{1}{\sqrt{2\pi \sigma^{2}}}\exp{\left[-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right]}$ denotes a Gaussian with mean $\mu$ and standard deviation $\sigma$. With this convention, the total drive to the layer is set by $A_{R}$, while the Gaussian profile distributes this activity across neurons.
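
As a minimal sketch of Eq. 1 on a discrete grid, the snippet below evaluates the tutor activity for a population with uniformly spaced preferred positions. The function names, the parameter values, and the wrap-around distance used at the segment boundaries are illustrative assumptions, not part of the model specification.

```python
import numpy as np

def gaussian(x, mu, sigma):
    """Normalized Gaussian N(x; mu, sigma) with standard deviation sigma."""
    return np.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2)) / np.sqrt(2.0 * np.pi * sigma ** 2)

def tutor_activity(x, t, v, L, A_R, sigma_R):
    """Input-layer activity R(x, t) for a stimulus moving at constant speed v."""
    z = (v * t) % L  # the stimulus restarts at 0 each time it reaches L
    # Assumption: shortest signed distance on a ring, so the bump wraps at the edges.
    dist = (x - z + L / 2.0) % L - L / 2.0
    return A_R * gaussian(dist, 0.0, sigma_R)

# Hypothetical parameter values, chosen so that L >> sigma_R.
L, N_in, A_R, sigma_R, v = 100.0, 1000, 3.0, 1.5, 2.0
x = np.arange(N_in) * (L / N_in)    # preferred positions, uniform density
R = tutor_activity(x, t=12.5, v=v, L=L, A_R=A_R, sigma_R=sigma_R)
total_drive = R.sum() * (L / N_in)  # Riemann sum of R over the segment
```

Because the Gaussian is normalized, the integrated activity `total_drive` approximates $A_{R}$ regardless of $\sigma_{R}$, which is the sense in which $A_{R}$ sets the total drive.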

A second layer of $N_{c}$ neurons and density $\rho_{c}$ receives input from the tutor layer. Its dynamics are that of a CANN model with firing-rate adaptation, as described in [20]:

$$ \begin{aligned} \tau\frac{\partial U(x,t)}{\partial t} &= -U(x,t) - V(x,t) + I(x,t)\\ \tau_{v}\frac{\partial V(x,t)}{\partial t} &= -V(x,t) + m U(x,t) \end{aligned} $$

where $U(x, t)$ denotes the membrane potential of a neuron at position $x$ and time $t$, $V(x,t)$ the firing-rate adaptation variable, and $m$ the strength of the adaptation.
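
The two coupled equations can be integrated numerically with a simple forward-Euler scheme. The sketch below is one possible discretization, not necessarily the integrator used in the simulations; the step size and parameter values are hypothetical.

```python
import numpy as np

def euler_step(U, V, I, dt, tau, tau_v, m):
    """One forward-Euler update of the membrane potential U and adaptation V."""
    dU = (-U - V + I) / tau
    dV = (-V + m * U) / tau_v
    return U + dt * dU, V + dt * dV

# Hypothetical values: adaptation slower than the membrane (tau_v > tau).
tau, tau_v, m, dt = 1.0, 5.0, 0.5, 0.01
I = np.array([0.0, 1.0, 3.0])  # constant input current per neuron
U = np.zeros_like(I)
V = np.zeros_like(I)
for _ in range(20000):
    U, V = euler_step(U, V, I, dt, tau, tau_v, m)
```

For a constant input the system relaxes to $U = I/(1+m)$ and $V = mU$, which shows how adaptation scales down the steady-state response by a factor $1+m$.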

Note that $x$ does not necessarily represent the physical position of the external stimulus, but rather the coordinates of the neuron within its internal neural manifold. The term $I(x, t)$ is the total input current to the neuron, which may include both feedforward input from other layers and recurrent contributions within the layer itself, and can vary with $x$ if neurons receive different inputs.

In this first section, we focus on the case where $I(x, t)$ is the feedforward drive from the input layer:

$$ I(x,t) = \rho_{\mathrm{in}}\int_{x^{\prime}}\mathrm{d}x^{\prime}\,J(x,x^{\prime})R(x^{\prime},t), $$

where the integration is done over the interval $[0, L)$ and $J(x, x^{\prime})$ denotes the feedforward synaptic weight from neuron $x^{\prime}$ in the input layer to neuron $x$ in the competitive layer.

The firing rate $r(x, t)$ of the neural population is modeled as a nonlinear function of the membrane potential $U(x, t)$:

$$ r(x,t) = \frac{[U(x,t)]_{+}^{2}}{B}, \quad B = 1 + k\rho_{c}\int_{x^{\prime}}\mathrm{d}x^{\prime}\,[U(x^{\prime},t)]_{+}^{2}, $$

where $[\cdot]_{+} = \max(\cdot, 0)$ denotes rectification. The firing rate is subject to global divisive normalization, governed by the inhibition parameter $k$. The integral in the normalization term $B$ runs over the entire domain, without explicit spatial limits. This general notation allows the model to account for a wide range of neural codes, encompassing both initially unstructured populations and those that have acquired a specific spatial organization.
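
A discrete version of this nonlinearity can be sketched as follows, with the integral replaced by a Riemann sum over a uniform grid. The function name and the grid spacing `dx` are assumptions introduced for illustration.

```python
import numpy as np

def firing_rate(U, k, rho_c, dx):
    """Rectified-quadratic firing rate with global divisive normalization."""
    U_plus_sq = np.maximum(U, 0.0) ** 2             # [U]_+^2
    B = 1.0 + k * rho_c * np.sum(U_plus_sq) * dx    # normalization term B
    return U_plus_sq / B

# Hypothetical values for a three-neuron toy population.
U = np.array([-1.0, 2.0, 3.0])
k, rho_c, dx = 0.5, 2.0, 0.5
r = firing_rate(U, k, rho_c, dx)
population_rate = rho_c * np.sum(r) * dx
```

Since $B$ grows with the summed squared potential, the population rate $\rho_{c}\int\mathrm{d}x\,r(x,t)$ stays below $1/k$ however large $U$ becomes, which is how the single parameter $k$ implements competition.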

To incorporate learning, we introduce Hebbian plasticity in feedforward weights between the two layers, in the form:

$$ \frac{\partial J(x,x^{\prime})}{\partial t} = \eta_{J}r(x,t)[R(x^{\prime},t)-\alpha_{J}J(x,x^{\prime})^{\beta}] $$

where $\eta_{J}$ is the learning rate. We set $1/\eta_{J}\gg \tau_{v}$, so that plasticity evolves on a slower timescale than the neural dynamics. In this expression, the first term implements Hebbian potentiation: synapses are strengthened in proportion to the co-activation between the presynaptic drive $R(x^{\prime}, t)$ and the postsynaptic activity $r(x, t)$. The second term introduces a decay mechanism that mimics a local homeostatic process, with the exponent $\beta > 0$ setting how steeply the decay grows with the weight and $\alpha_{J} > 0$ setting its overall strength.

Finally, to respect Dale’s law, we restrict $J(x, x^{\prime})$ to excitatory connections. In simulations, any update that results in a negative weight is clipped at zero.
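
The plasticity rule of Eq. 6, together with the clipping at zero, can be sketched as a single explicit update. The matrix layout (`J[i, j]` is the weight from input neuron `j` to competitive neuron `i`), the Euler discretization, and the toy values are assumptions made for illustration.

```python
import numpy as np

def hebbian_step(J, r_post, R_pre, dt, eta_J, alpha_J, beta):
    """One explicit update of the feedforward weights, clipped to stay excitatory.

    Assumes J >= 0 on entry so that J ** beta is well defined for fractional beta.
    """
    # Postsynaptic rate gates both the potentiation and the decay term.
    dJ = eta_J * r_post[:, None] * (R_pre[None, :] - alpha_J * J ** beta)
    return np.maximum(J + dt * dJ, 0.0)  # Dale's law: negative weights are set to zero

# Toy example with hypothetical values.
J = np.array([[0.5, 0.2],
              [0.1, 0.0]])
r_post = np.array([1.0, 0.0])  # only the first competitive neuron is active
R_pre = np.array([0.0, 2.0])   # only the second input neuron is active
J_new = hebbian_step(J, r_post, R_pre, dt=1.0, eta_J=0.1, alpha_J=1.0, beta=1.0)
```

In this example the weights onto the silent postsynaptic neuron are unchanged, because both terms of the rule are multiplied by $r(x,t)$.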

B. Self-Organized Feedforward Connectivity

1. Gaussian shape

Simulations of a network without recurrent connectivity show that a translationally invariant input, combined with Hebbian learning, naturally produces a Gaussian profile in the feedforward weights. Fig. 1A illustrates the resulting system configuration. Global inhibition induces competition among neurons, Hebbian plasticity strengthens connections of consistently active neurons, while synaptic depression and adaptation prevent runaway activity. Without adaptation, a single neuron could dominate and learn all stimuli; with adaptation, suppression ensures that activity is distributed across the population.

To study the steady states analytically (see Appendix A for derivations), we assume Gaussian-shaped equilibrium profiles for the activity variables (see also [20]):

$$ \begin{aligned} U^{\mathrm{eq}}(x,z) &= A_{u}\mathcal{N}(x;z,\sigma_{u})\\ V^{\mathrm{eq}}(x,z) &= A_{v}\mathcal{N}(x;z-d,\sigma_{u})\\ r^{\mathrm{eq}}(x,z) &= A_{r}\mathcal{N}\left(x;z,\frac{\sigma_{u}}{\sqrt{2}}\right) \end{aligned} $$

where $z = vt$ is the position of the moving stimulus. Here, $A_{u}$, $A_{v}$, and $A_{r}$ are the amplitudes of the respective variables, $\sigma_{u}$ is the bump width, and $d$ represents the lag of the adaptation variable $V(x, t)$ behind the moving bump, reflecting the delayed effect of adaptation on network activity. All amplitudes and $d$ depend on system parameters (details in Appendix A).

Since learning is much slower than bump dynamics, Hebbian updates can be averaged over a full stimulus cycle, which is equivalent to averaging Eq. 6 over all possible positions of the bump. By setting the averaged time derivative to zero, the equilibrium condition yields

$$ J^{\mathrm{eq}}(x,x^{\prime}) = \left[\frac{\langle r^{\mathrm{eq}}(x^{\prime},z)R(x,z)\rangle_{z}}{\alpha_{J}\langle r^{\mathrm{eq}}(x^{\prime},z)\rangle_{z}}\right]^{1/\beta} $$

where $\langle\cdot\rangle_{z}$ denotes the average over all possible $z$ values.
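
The averaging in Eq. 10 can also be carried out numerically. The sketch below fixes the neuron appearing in $r^{\mathrm{eq}}$ at the origin, evaluates the two averages on a grid of bump positions $z$, and returns the equilibrium weight as a function of the displacement $s = x - x^{\prime}$. The grids, the parameter values, and the choice to treat $\sigma_{u}$ as a free input are assumptions for illustration only.

```python
import numpy as np

def gaussian(x, mu, sigma):
    """Normalized Gaussian N(x; mu, sigma) with standard deviation sigma."""
    return np.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2)) / np.sqrt(2.0 * np.pi * sigma ** 2)

def averaged_equilibrium_weights(s, z, A_R, A_r, sigma_R, sigma_u, alpha_J, beta):
    """Evaluate Eq. 10 by averaging over bump positions z (rate neuron fixed at 0)."""
    r = A_r * gaussian(0.0, z, sigma_u / np.sqrt(2.0))        # r_eq at the origin, shape (Nz,)
    R = A_R * gaussian(s[:, None], z[None, :], sigma_R)       # input at displacement s, shape (Ns, Nz)
    numerator = (R * r[None, :]).mean(axis=1)                 # <r_eq R>_z
    denominator = alpha_J * r.mean()                          # alpha_J <r_eq>_z
    return (numerator / denominator) ** (1.0 / beta)

# Hypothetical parameter values.
A_R, A_r, sigma_R, sigma_u, alpha_J, beta = 2.0, 1.0, 1.0, 2.0, 0.5, 0.5
s = np.linspace(-10.0, 10.0, 401)
z = np.linspace(-30.0, 30.0, 6001)
J_eq = averaged_equilibrium_weights(s, z, A_R, A_r, sigma_R, sigma_u, alpha_J, beta)
var_J = np.sum(s ** 2 * J_eq) / np.sum(J_eq)  # second moment of the weight profile
```

The rate amplitude $A_{r}$ cancels in the ratio, so the learned profile depends on the postsynaptic activity only through its width.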

Because the equilibrium activity profiles $r(x^{\prime}, z)$ and $R(x, z)$ are Gaussian functions, Eq. 10 admits a Gaussian solution. Evaluating the averages results in

$$ J^{\mathrm{eq}}(x,x^{\prime}) = A_{J}\mathcal{N}(x;x^{\prime},\sigma_{J}) $$

where

$$ A_{J} = \left(\frac{A_{R}}{\alpha_{J}C_{\beta}}\right)^{1/\beta},\quad C_{\beta} = \sqrt{\frac{(2\pi\sigma_{J}^{2})^{1-\beta}}{\beta}} $$

For self-consistency, all Gaussian terms in Eqs. 2, 3 and 10 must share the same width. This requirement leads to the relations

$$ \sigma_{J} = \sqrt{\frac{3\beta}{2-\beta}}\sigma_{R},\quad \sigma_{u} = \sqrt{\frac{2\beta + 2}{2-\beta}}\sigma_{R} $$
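
One way to see where these relations come from is sketched below. It uses only the fact that variances add when Gaussians are convolved and that raising a Gaussian to the power $1/\beta$ multiplies its variance by $\beta$, under the stated requirement that $U$ shares the width of the feedforward current $I$; the full derivation is in Appendix A.

$$ \begin{aligned} \sigma_{u}^{2} &= \sigma_{J}^{2} + \sigma_{R}^{2} && \text{(Eq. 4: convolution of } J \text{ with } R\text{)}\\ \sigma_{J}^{2} &= \beta\left(\frac{\sigma_{u}^{2}}{2} + \sigma_{R}^{2}\right) && \text{(Eq. 10: average of } r^{\mathrm{eq}}R\text{, then power } 1/\beta\text{)}\\ \sigma_{J}^{2}\left(1 - \frac{\beta}{2}\right) &= \frac{3\beta}{2}\sigma_{R}^{2} \;\Rightarrow\; \sigma_{J}^{2} = \frac{3\beta}{2-\beta}\sigma_{R}^{2}\\ \sigma_{u}^{2} &= \frac{3\beta + (2-\beta)}{2-\beta}\sigma_{R}^{2} = \frac{2\beta+2}{2-\beta}\sigma_{R}^{2} \end{aligned} $$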

These expressions establish a monotonic dependence of the learned widths on the parameter $\beta$, while also constraining its admissible range. Requiring real-valued solutions implies $0 < \beta < 2$, and the limit $\beta\to 2$ leads to a divergence of both widths. In particular, $\beta = 0.5$ corresponds to the critical value for which the learned neural representation, in terms of the firing rate $r(x, z)$, has the same width as the input, i.e., $\sigma_{u}/\sqrt{2} = \sigma_{R}$. For $\beta < 0.5$, the neural code is narrower than the input, yielding a more spatially precise representation. For $\beta > 0.5$, the neural code becomes broader and therefore less spatially precise.
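
The width and amplitude relations of Eqs. 12 and 13 are simple enough to evaluate directly. The helper names below are hypothetical; the formulas are transcribed from the equations above.

```python
import numpy as np

def learned_widths(beta, sigma_R):
    """Return (sigma_J, sigma_u) from Eq. 13; real-valued only for 0 < beta < 2."""
    if not 0.0 < beta < 2.0:
        raise ValueError("beta must lie in the open interval (0, 2)")
    sigma_J = np.sqrt(3.0 * beta / (2.0 - beta)) * sigma_R
    sigma_u = np.sqrt((2.0 * beta + 2.0) / (2.0 - beta)) * sigma_R
    return sigma_J, sigma_u

def weight_amplitude(beta, sigma_J, A_R, alpha_J):
    """Return the amplitude A_J of the learned weight profile from Eq. 12."""
    C_beta = np.sqrt((2.0 * np.pi * sigma_J ** 2) ** (1.0 - beta) / beta)
    return (A_R / (alpha_J * C_beta)) ** (1.0 / beta)

# At the critical value beta = 0.5 the rate profile has the input width.
sigma_J_crit, sigma_u_crit = learned_widths(0.5, sigma_R=1.0)
rate_width_crit = sigma_u_crit / np.sqrt(2.0)
```

Scanning `beta` between 0 and 2 with `learned_widths` reproduces the monotonic broadening described above, with both widths diverging as `beta` approaches 2.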

Starting from random initial weights, we simulated the system across a range of $\beta$ values. As shown in Figs. 1BC, the learned Gaussian connectivity profiles recover an amplitude and standard deviation that closely align with the theoretical predictions of Eqs. 12 and 13. Fig. 1D presents snapshots of an example connectivity matrix at various stages of the simulation, illustrating its temporal evolution towards the predicted Gaussian profile. Together, the alignment between these analytical derivations and computational simulations shows that the network robustly self-organizes from random initial conditions into a stable, predictable spatial representation.

2. Weight stability

We next study the stability of the connectivity profiles by means of a perturbative approach. In the regime of extremely weak inhibition ($k\ll 1$), small perturbations of the synaptic weights, $J(x, x^{\prime})$, obey the linearized dynamics

$$ \tau_{J}(x,x^{\prime})\frac{\partial\delta J(x,x^{\prime})}{\partial t} = \int\mathrm{d}x^{\prime\prime}\, K(x,x^{\prime},x^{\prime\prime})\delta J(x^{\prime\prime},x^{\prime}) - \delta J(x,x^{\prime}) $$

where $K(x, x^{\prime}, x^{\prime\prime})$ denotes the effective interaction kernel and $\tau_{J}(x, x^{\prime})$ is an effective, spatially dependent time constant (see Appendix B for explicit expressions and derivation). We note that both $J(x, x^{\prime})$ and $K(x, x^{\prime}, x^{\prime\prime})$ depend only on the relative coordinates. This allows rewriting the dynamics in terms of the displacement variables $s = x - x^{\prime}$ and $s^{\prime\prime} = x^{\prime\prime} - x^{\prime}$ and a reparametrized perturbation $w_{x^{\prime}}(s,t) = J(s+x^{\prime},x^{\prime}, t)$ (see Appendix B). Under this transformation, the dynamics reduce to an operator equation of the form

$$ \tau_{J}(s)\frac{\partial w_{x^{\prime}}(s,t)}{\partial t} = \hat{L}[w_{x^{\prime}}(s,t)], $$

where $\hat{L}$ is a linear integral operator defined by

$$ \hat{L}[w_{x^{\prime}}(s,t)] = \int\mathrm{d}s^{\prime\prime}\, K(s,s^{\prime\prime})w_{x^{\prime}}(s^{\prime\prime},t) - w_{x^{\prime}}(s,t) $$

We then seek the modal solutions $w_{x^{\prime}}(s,t) = \sum_{i}c_{i}(x^{\prime})f_{i}(s)e^{\lambda_{i}t}$, which lead to the generalized eigenvalue-eigenfunction problem

$$ \hat{L}[f_{i}(s)] = \lambda_{i}\tau_{J}(s)f_{i}(s) $$

The system can only be stable if all the eigenvalues satisfy $\operatorname{Re}(\lambda_{i}) < 0$. Importantly, the spectrum depends only on the displacement variables (not on $x^{\prime}$), so this condition defines a global stability criterion (see Appendix B for details on the spectral decomposition). Numerical spectral analysis reveals that the eigenvalue with the highest real part (from now on referred to as $\lambda_{\max}$) remains negative across a broad parameter range, including biologically plausible parameters. Specifically, we show in Fig. 2 the change in $\lambda_{\max}$ across a wide range of input densities ($\rho_{\mathrm{in}}$) and synaptic damping strengths ($\alpha_{J}$), focusing on these because other biologically relevant parameters did not yield such prominent variations.
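
The numerical side of this analysis can be sketched by discretizing the displacement variable and solving the resulting matrix eigenvalue problem. The actual kernel $K(s, s^{\prime\prime})$ and time constant $\tau_{J}(s)$ are given in Appendix B and are not reproduced here; the Gaussian kernel and the quadratic $\tau_{J}$ below are hypothetical stand-ins chosen only to make the example runnable.

```python
import numpy as np

def leading_eigenvalue(K, tau_J, ds):
    """Return the eigenvalue with the largest real part of L f = lambda * tau_J * f."""
    n = K.shape[0]
    L_op = K * ds - np.eye(n)      # discretized operator: integral term minus identity
    M = L_op / tau_J[:, None]      # divide row i by tau_J(s_i), i.e. diag(tau_J)^-1 L
    eigvals = np.linalg.eigvals(M)
    return eigvals[np.argmax(eigvals.real)]

# Hypothetical stand-ins for the quantities defined in Appendix B.
s = np.linspace(-10.0, 10.0, 201)
ds = s[1] - s[0]
width, gain = 1.0, 0.8
K = gain * np.exp(-(s[:, None] - s[None, :]) ** 2 / (2.0 * width ** 2)) / np.sqrt(2.0 * np.pi * width ** 2)
tau_J = 1.0 + s ** 2
lam_max = leading_eigenvalue(K, tau_J, ds)
```

With the true kernel substituted for the stand-in, repeating this computation over a grid of $\rho_{\mathrm{in}}$ and $\alpha_{J}$ values would give a map of $\lambda_{\max}$ of the kind described for Fig. 2.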

Our theoretical analysis complements the simulation results. In the biologically plausible parameter regime (and beyond), the system remains theoretically stable as long as the inhibition is sufficiently weak. However, it is of biological interest to determine what happens in a higher inhibition regime where analytical manipulation becomes challenging. Our simulations demonstrate that increasing the inhibition factor $k$ does not lead to instabilities. In fact, all simulation results presented in this work involving learned feedforward connectivity correspond to simulations with non-negligible values of $k$. Thus, both approaches converge on a single conclusion: the learned feedforward weights are fundamentally stable, ensuring a robust spatial representation across a wide spectrum of network parameters.

C. Bump Asymmetry and Anticipative Coding

In the preceding sections we have shown that Gaussian connectivity profiles and their resulting population activity profiles constitute stable solutions when driven by translationally invariant moving inputs. This self-organizing property facilitates the construction of hierarchical feedforward architectures, wherein each successive layer learns to represent the manifold provided by its predecessor. We now investigate the structural distortions of the population bump profile as a function of the network’s intrinsic parameters.

From a functional perspective, these distortions are naturally characterized by an expansion in Hermite polynomials (see Appendix A for a derivation, or [28] for a comprehensive treatment). In this framework, the Gaussian profile corresponds to the zeroth-order Hermite mode, while deviations from symmetry are captured by higher-order terms. The odd-order modes are of particular interest as they govern the displacement of the bump’s center of mass. Specifically, our analysis reveals that increasing the adaptation strength systematically enhances the contribution of the first-order mode, inducing a pronounced asymmetry in the activity profile.

To quantify this effect, we define an ansatz for the steady-state moving bump solution:

$$ U_{\gamma}^{\mathrm{eq}}(x,z) = A_{u}\mathcal{N}(x;z,\sigma_{u})\left(1 + \gamma\frac{x-z}{\sigma_{u}}\right) $$

where $\gamma$ scales the contribution of the first-order Hermite polynomial and serves as the primary metric for profile asymmetry. For small perturbations ($\gamma\ll 1$), this expansion is equivalent to a first-order Taylor approximation of a shifted Gaussian:

$$ \mathcal{N}(x;z+\sigma_{u}\gamma,\sigma_{u}) \approx \mathcal{N}(x;z,\sigma_{u})\left(1 + \gamma\frac{x-z}{\sigma_{u}}\right) $$

This approximation illustrates that a positive $\gamma$ corresponds to a forward displacement of the activity peak, effectively encoding a predictive representation of the trajectory. Conversely, a negative $\gamma$ implies retrospective coding. Thus, the emergence of a non-zero first-order Hermite coefficient provides a formal mathematical signature of either prospective or retrospective neural dynamics.
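
The quality of the first-order approximation in Eq. 19 can be checked numerically. The sketch below compares the shifted Gaussian with the ansatz of Eq. 18 for two small values of $\gamma$ and computes the center of mass of the ansatz; the grid and the values of $\gamma$ are arbitrary choices.

```python
import numpy as np

def gaussian(x, mu, sigma):
    """Normalized Gaussian N(x; mu, sigma) with standard deviation sigma."""
    return np.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2)) / np.sqrt(2.0 * np.pi * sigma ** 2)

def first_order_bump(x, z, sigma_u, gamma):
    """Gaussian bump plus first-order Hermite correction (Eq. 18 with A_u = 1)."""
    return gaussian(x, z, sigma_u) * (1.0 + gamma * (x - z) / sigma_u)

def center_of_mass(x, profile):
    """Center of mass of a sampled profile on a uniform grid."""
    return np.sum(x * profile) / np.sum(profile)

x = np.linspace(-10.0, 10.0, 2001)
z, sigma_u = 0.0, 1.0
max_errors = {}
for gamma in (0.1, 0.05):
    shifted = gaussian(x, z + sigma_u * gamma, sigma_u)
    max_errors[gamma] = np.max(np.abs(shifted - first_order_bump(x, z, sigma_u, gamma)))
com = center_of_mass(x, first_order_bump(x, z, sigma_u, 0.1))
```

Halving $\gamma$ reduces the maximum discrepancy by roughly a factor of four, as expected for an error of order $\gamma^{2}$, and the center of mass of the ansatz sits at $z + \sigma_{u}\gamma$.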

这种近似说明，正的 $\gamma$ 对应于活动峰值的前向位移，有效地编码了轨迹的预测表示。相反，负的 $\gamma$ 意味着回顾性编码。因此，非零第一阶 Hermite 系数的出现提供了预测性或回顾性神经动力学的正式数学特征。

By substituting the ansatz from Eq. 18 into the governing dynamical equations (Eqs. 2 and 3) and projecting the system onto the first two Hermite modes, we derive an explicit relationship between the asymmetry parameter $\gamma$ and the spatial lag $d$ between the excitatory variable $U (x, t)$ and the adaptive variable $V (x, t)$ (detailed in Appendix A 2 a). To facilitate the analysis, we introduce the following dimensionless variables:

$$ u = \frac{\tau v}{\sqrt{2}\sigma_{u}},\quad y = \frac{d}{\sqrt{2}\sigma_{u}},\quad \Gamma = \frac{\tau_{v}}{\tau}. $$

The steady-state solutions for the asymmetry $\gamma$ and the spatial displacement $y$ are given by:

$$ \gamma = \frac{my}{\Gamma yu + 1} - u $$

and

$$ y = \frac{(m+1)-\Gamma u^{2}}{2\Gamma u} \left(\sqrt{1 + \frac{4u^{2}\Gamma(\Gamma + 1)}{[\Gamma u^{2} - (m+1)]^{2}}}\right) $$

Fig. 3 shows that the equilibrium value of $\gamma$ increases with adaptation strength $m$, with the exact relationship modulated by the input speed $v$. This demonstrates that adaptation introduces a systematic asymmetry into the bump profile. Importantly, the sign and magnitude of the asymmetry are not dictated by adaptation alone, but emerge from the interaction between adaptation and input velocity. Larger $m$ accelerates the decay of neural activity, and through global inhibition this decay allows previously silent neurons to become active, shifting the bump in the direction of motion. If activity decays quickly enough, the network can anticipate the stimulus (positive asymmetry). Conversely, weak adaptation or very high input velocity prevents sufficient recruitment of new neurons, resulting in delayed responses and retrospective coding (negative asymmetry).
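
The sign change between prospective and retrospective coding can be read off Eq. 21 directly. The sketch below evaluates $\gamma$ for a given dimensionless lag $y$, which is treated here as an independent input for illustration; in the full solution $y$ is itself determined by $m$, $u$, and $\Gamma$. The function names and numerical values are hypothetical.

```python
def asymmetry(m, y, u, Gamma):
    """Asymmetry gamma from Eq. 21 for a given dimensionless lag y."""
    return m * y / (Gamma * y * u + 1.0) - u

def is_prospective(m, y, u, Gamma):
    """True when the bump leads the input, i.e. gamma > 0."""
    return asymmetry(m, y, u, Gamma) > 0.0

# Hypothetical values: same lag and speed, with and without adaptation.
gamma_no_adapt = asymmetry(m=0.0, y=0.5, u=0.1, Gamma=5.0)
gamma_adapt = asymmetry(m=2.0, y=0.5, u=0.1, Gamma=5.0)
```

Without adaptation ($m = 0$) the expression reduces to $\gamma = -u$, so the bump always lags the input, and the lag grows with the dimensionless speed. For positive $y$, anticipation requires $my > u(\Gamma y u + 1)$.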

图 3 显示，$\gamma$ 的平衡值随着适应强度 $m$ 的增加而增加，确切的关系受到输入速度 $v$ 的调制。这表明适应在峰值构型中引入了系统的不对称性。重要的是，不对称性的符号和大小不仅由适应决定，而是由适应与输入速度之间的相互作用产生。较大的 $m$ 加速了神经活动的衰减，并且通过全局抑制，这种衰减允许先前沉默的神经元变得活跃，从而将峰值向运动方向移动。如果活动衰减得足够快，网络可以预测刺激（正不对称）。相反，弱适应或非常高的输入速度会阻止新神经元的充分招募，从而导致延迟响应和回顾性编码（负不对称）。
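The dependence of $\gamma$ on $m$ and $v$ can be explored numerically with a minimal sketch of the two steady-state expressions. The sketch takes the bracket in the expression for $y$ to be $\sqrt{1+\dots} - 1$, the root for which the lag vanishes as $u \to 0$, and it is restricted to $0 < \Gamma u^{2} < m+1$. The function names and the parameter values are illustrative assumptions, not taken from the original simulations.

```python
import math


def spatial_lag(u, m, Gamma):
    """Dimensionless lag y between U and V for dimensionless speed u.

    Assumes the closed form y = b / (2 Gamma u) * (sqrt(1 + ...) - 1)
    with b = (m + 1) - Gamma u^2, valid here only for u > 0 and b > 0.
    """
    b = (m + 1.0) - Gamma * u**2
    if u <= 0.0 or b <= 0.0:
        raise ValueError("sketch covers 0 < Gamma*u**2 < m + 1 only")
    root = math.sqrt(1.0 + 4.0 * u**2 * Gamma * (Gamma + 1.0) / b**2)
    return b / (2.0 * Gamma * u) * (root - 1.0)


def asymmetry(u, m, Gamma):
    """Steady-state asymmetry gamma = m y / (Gamma y u + 1) - u."""
    y = spatial_lag(u, m, Gamma)
    return m * y / (Gamma * y * u + 1.0) - u


if __name__ == "__main__":
    Gamma = 2.0  # illustrative ratio tau_v / tau
    for u in (0.05, 0.1, 0.3):
        row = [round(asymmetry(u, m, Gamma), 4) for m in (0.2, 0.5, 1.0, 2.0, 4.0)]
        print(f"u = {u}: gamma(m) = {row}")
```

To first order in $u$, these expressions reduce to $y \approx u(\Gamma+1)/(m+1)$ and $\gamma \approx u(m\Gamma-1)/(m+1)$, so in the slow-input limit the asymmetry changes sign near $m\tau_{v}/\tau = 1$ and grows with $m$, in line with the trend described for Fig. 3.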

In a feedforward architecture composed of multiple stacked layers, each successive layer not only reconstructs the underlying one-dimensional manifold but also displays an activity bump shifted forward with respect to its predecessor (see schematic in Fig. 4A). Importantly, these predictive shifts do not strictly rely on a deep architecture; similar anticipatory behavior can be achieved within a single layer by tuning the adaptation strength. Rather, the primary role of a multilayer hierarchy is to provide a robust framework in which these shifts emerge naturally and are progressively amplified through learning. Fig. 4B illustrates this process: the correlation between each connectivity matrix and its theoretically expected Gaussian shape tracks how every layer evolves from random initial weights toward a structured profile.

在由多个堆叠层组成的前馈架构中，每个连续层不仅重建了基础的一维流形，而且显示出相对于其前任向前移动的活动峰值（见图 4A 的示意图）。重要的是，这些预测性位移并不严格依赖于深层架构；通过调整适应强度，可以在单层内实现类似的预期行为。相反，多层层次结构的主要作用是提供一个稳健的框架，通过该框架，这些位移可以自然地出现，并通过学习逐步增强。图 4B 展示了这一演化过程，跟踪网络中每一层如何从完全随机的初始化逐步转向结构化的分布，通过测量每个连接矩阵与其理论上预期的 Gaussian 形状之间的相关性。

Analytically, this progressive potentiation means that for a system of $M$ layers, the final layer is expected to display a cumulative forward shift given by $\Delta = M\sigma_{u}\gamma$ (see Figs. 4C and D). However, this linear amplification comes at the cost of error accumulation across successive processing stages. This compounding error imposes a natural upper bound on the viable depth of the feedforward network, as sequential information degradation eventually compromises the accuracy and stability of the spatial prediction.

从分析上讲，这种逐步增强意味着对于一个由 $M$ 层组成的系统，最终层预计会显示出由 $\Delta = M\sigma_{u}\gamma$ 给出的累积前向位移（见图 4C 和 D）。然而，这种线性放大是以连续处理阶段中误差积累为代价的。这种复合误差对前馈网络的可行深度施加了自然的上限，因为顺序信息退化最终会损害空间预测的准确性和稳定性。

This provides a mechanistic interpretation of how anticipatory codes may arise naturally through learning in dynamic environments. Such predictive shifts are consistent with experimental observations in the medial entorhinal cortex, where neurons in superficial layers exhibit activity that anticipates the animal’s future trajectory [14, 19]. Our results suggest that hierarchical feedforward structures can be learned through self-organizing rules, enabling anticipatory coding without requiring strong recurrent interactions. This is consistent with the observation that a relatively low density of recurrent connections is found in superficial layers of the entorhinal cortex [35–37], precisely where anticipatory activity is most prominent.

这为预测性编码如何通过在动态环境中学习自然产生提供了机制解释。这种预测性位移与内嗅皮层中实验观察到的现象一致，其中浅层神经元表现出预测动物未来轨迹的活动 [14, 19]。我们的结果表明，分层前馈结构可以通过自组织规则学习，从而实现预测性编码，而无需强烈的循环相互作用。这与观察到的现象一致，即在内嗅皮层的浅层中发现了相对较低密度的循环连接 [35–37]，正是在这些地方预测性活动最为显著。

D. Learning Recurrent Weights

1. Stability Analysis

In this section, we incorporate recurrent connectivity into the proposed framework (see schematic in Fig. 5A). While the feedforward architecture described above accounts for anticipatory shifts, it does not provide a mechanism for path integration. Estimating position from self-motion requires the continuous integration of proprioceptive signals, such as velocity and heading direction, which in neural systems is typically implemented through recurrent dynamics. In particular, CANN models rely on structured recurrent connectivity to translate velocity inputs into coherent shifts of the activity bump along the manifold.

在本节中，我们将循环连接纳入所提出的框架（见图 5A 的示意图）。虽然上述前馈架构解释了预测性位移，但它并没有提供路径积分的机制。从自我运动中估计位置需要连续整合本体感觉信号，例如速度和航向方向，在神经系统中通常通过循环动力学实现。特别是，CANN 模型依赖于结构化的循环连接，将速度输入转化为沿流形的活动峰值的相干位移。

Accordingly, we extend the model by introducing a recurrent contribution to the synaptic current. The total synaptic current $I(x, t)$ is reformulated as the sum of feedforward and recurrent contributions:

$$ I(x,t) = \rho_{\mathrm{in}}\int_{x^{\prime}}\mathrm{d}x^{\prime}\,J(x,x^{\prime})R(x^{\prime}, t) + \rho_{c}\int_{x^{\prime}}\mathrm{d}x^{\prime}\,W(x,x^{\prime})r(x^{\prime},t), $$

where $W (x, x^{\prime})$ represents the recurrent connectivity matrix. The temporal evolution of $W (x, x^{\prime})$ is governed by a Hebbian plasticity rule analogous to the one defining the feedforward weights $J(x, x^{\prime})$ (Eq. 6):

$$ \frac{\partial W(x,x^{\prime})}{\partial t} = \eta_{W}[r(x,t)r(x^{\prime},t) - \alpha_{W}r(x^{\prime},t)W(x,x^{\prime})^{\beta}] $$

因此，我们通过引入循环贡献到突触电流来扩展模型。总突触电流 $I(x, t)$ 被重新表述为前馈和循环贡献的总和:

$$ \begin{align} I(x,t) = \rho_{\mathrm{in}}\int_{x^{\prime}}\mathrm{d}x^{\prime}\,J(x,x^{\prime})R(x^{\prime}, t) + \rho_{c}\int_{x^{\prime}}\mathrm{d}x^{\prime}\,W(x,x^{\prime})r(x^{\prime},t), \end{align} $$

其中 $W (x, x^{\prime})$ 表示循环连接矩阵。$W (x, x^{\prime})$ 的时间演化由类似于定义前馈权重 $J(x, x^{\prime})$ 的 Hebbian 可塑性规则（Eq. 6）控制:

$$ \begin{align} \frac{\partial W(x,x^{\prime})}{\partial t} = \eta_{W}[r(x,t)r(x^{\prime},t) - \alpha_{W}r(x^{\prime},t)W(x,x^{\prime})^{\beta}] \end{align} $$
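On a discretized manifold, the recurrent plasticity rule can be sketched as a forward-Euler update. The following is a minimal illustration, not the simulation code used for the figures: the function name, the time step `dt`, the clipping of weights at zero, and the parameter values in the demo are all assumptions introduced here.

```python
import numpy as np


def hebbian_step(W, r, eta_W, alpha_W, beta=0.5, dt=1.0):
    """One Euler step of dW/dt = eta_W [r(x) r(x') - alpha_W r(x') W^beta].

    W[i, j] stores W(x_i, x'_j); r holds the firing rates of the layer.
    """
    post = r[:, np.newaxis]  # r(x), varies along rows
    pre = r[np.newaxis, :]   # r(x'), varies along columns
    dW = eta_W * (post * pre - alpha_W * pre * np.power(W, beta))
    # Assumption: weights are kept non-negative so that W**beta stays real.
    return np.maximum(W + dt * dW, 0.0)


if __name__ == "__main__":
    r = np.array([0.5, 1.0, 2.0])  # frozen rates, for illustration only
    W = np.full((3, 3), 0.01)
    for _ in range(5000):
        W = hebbian_step(W, r, eta_W=0.1, alpha_W=1.0)
    print(W)
```

With the rates held fixed and $r(x^{\prime}) > 0$, the update settles where the bracket vanishes, that is at $W(x,x^{\prime}) = [r(x)/\alpha_{W}]^{1/\beta}$. This makes explicit how $\alpha_{W}$ sets the weight amplitude; in the full model the rates evolve together with the weights, which this frozen-rate demo ignores.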

Under the conditions specified in this work (where spatial inputs come from Gaussian tuning curves in the tutor layer), this learning rule gives rise to a Gaussian connectivity profile that we will express as $A_{W}\mathcal{N}(x; x^{\prime},\sigma_{W})$. To ensure structural consistency across the network, we focus on the case $\beta = 0.5$. This specific parameterization allows the system to converge toward feedforward and recurrent connectivity matrices with matched standard deviations, thereby harmonizing the spatial scales of both input sources (see Eq. 13 and Fig. 1B).

在本文中指定的条件下（空间输入来自引导层中的 Gaussian 调谐曲线），该学习规则产生了一个 Gaussian 连接构型，我们将其表示为 $A_{W}\mathcal{N}(x; x^{\prime},\sigma_{W})$。为了确保网络的结构一致性，我们关注 $\beta = 0.5$ 的情况。这种特定的参数化允许系统收敛到具有匹配标准差的前馈和循环连接矩阵，从而协调两个输入源的空间尺度（见 Eq. 13 和图 1B）。

Numerical simulations show that the system can learn structured feedforward and recurrent connectivity. Fig. 5B shows the evolution of the correlation between each learned connectivity matrix and the theoretical prediction. By arguments analogous to the feedforward case, the expected recurrent connectivity is approximately Gaussian with standard deviation equal to $\sigma_{R}$ (for $\beta = 0.5$).

数值模拟表明，系统可以学习结构化的前馈和循环连接。图 5B 显示了每个学习到的连接矩阵与理论预测之间相关性的演变。通过类似于前馈情况的论证，预期的循环连接大约是 Gaussian，其标准差等于 $\sigma_{R}$（对于 $\beta = 0.5$）。

After learning the recurrent attractor connectivity, the system can self-sustain a moving bump of activity whose intrinsic speed $v_{\mathrm{int}}(m)$ depends monotonically on the adaptation strength [20]:

$$ v_{\mathrm{int}}(m) = \frac{\sqrt{2}\sigma_{u}}{\tau_{v}}\sqrt{\frac{m\tau_{v}}{\tau} - \sqrt{\frac{m\tau_{v}}{\tau}}} $$

在学习了吸引子循环连接之后，系统能够自我维持一个活动的移动峰值，其内在速度 $v_{\mathrm{int}}(m)$ 单调依赖于适应强度 [20]:

$$ \begin{align} v_{\mathrm{int}}(m) = \frac{\sqrt{2}\sigma_{u}}{\tau_{v}}\sqrt{\frac{m\tau_{v}}{\tau} - \sqrt{\frac{m\tau_{v}}{\tau}}} \end{align} $$

In this and subsequent sections, we present theoretical results corresponding to a corrected version of Eq. 25. We rescale the theoretical formula by an empirical factor of $0.72$ to account for inaccuracies introduced by the projection method’s Gaussian ansatz. This correction factor was derived from simulations across various combinations of $\tau$ and $\tau_{v}$ (see Appendix A 2 b). Fig. 5C shows that the resulting bump velocities are in good agreement with the corrected theoretical prediction.

在本节和随后的章节中，我们展示了对应于 Eq. 25 的修正版本的理论结果。我们通过经验因子 $0.72$ 对理论公式进行重新缩放，以考虑投影方法的 Gaussian 假设引入的不准确性。该修正因子是通过对 $\tau$ 和 $\tau_{v}$ 的各种组合进行模拟得出的（见附录 A 2 b）。图 5C 显示，结果峰值速度与修正后的理论预测非常一致。
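The intrinsic-speed formula and its empirical rescaling can be written as a short helper. This is a sketch under stated assumptions: the function name and the example parameter values are hypothetical, and returning zero when $m\tau_{v}/\tau \leq 1$ (where the expression under the square root is not positive) is an assumption of this sketch rather than a statement from the text.

```python
import math

CORRECTION = 0.72  # empirical rescaling factor quoted in the text


def intrinsic_speed(m, sigma_u, tau, tau_v, corrected=True):
    """v_int(m) = sqrt(2) sigma_u / tau_v * sqrt(k - sqrt(k)), k = m tau_v / tau."""
    k = m * tau_v / tau
    if k <= 1.0:
        # Assumption: no self-sustained motion below the threshold k = 1.
        return 0.0
    v = math.sqrt(2.0) * sigma_u / tau_v * math.sqrt(k - math.sqrt(k))
    return CORRECTION * v if corrected else v


if __name__ == "__main__":
    # Illustrative values only: sigma_u = 1, tau = 1, tau_v = 2.
    for m in (0.4, 1.0, 2.0, 4.0):
        print(m, intrinsic_speed(m, sigma_u=1.0, tau=1.0, tau_v=2.0))
```

Since $k - \sqrt{k}$ increases with $k$ for $k > 1$, the helper reproduces the monotonic dependence of the intrinsic speed on the adaptation strength above threshold.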

Regarding stability, the complexity of the coupled dynamics makes an analytical treatment challenging. We therefore resort to extensive numerical simulations to identify the parameter region in which excitatory recurrent connectivity can emerge and remain stable. By performing a grid search in the $(\alpha_{J}, \alpha_{W})$ plane, we find that $\alpha_{W}\geq \alpha_{J}$ is required for the system to effectively learn the feedforward connectivity. Since $\alpha_{J}$ is inversely related to the feedforward weight amplitude $A_{J}$, and $\alpha_{W}$ to $A_{W}$ (see Eq. 12), this result indicates that the regime in which the system learns to encode space requires the tutor-driven input to dominate over the recurrent signal (see Fig. 6B). Interestingly, recurrent weights can be learned provided that the recurrent signal is not too weak (see Fig. 6C). Moreover, the system fails to maintain stable feedforward connectivity for sufficiently small $\alpha_{J}$ (see Fig. 6B). A plausible explanation for this behavior follows from the approximation underlying the stability analysis: learning must remain slow. If the input current becomes too strong, the time derivative of the weights can increase to the point where the slow-learning-by-averaging condition is violated. Overall, while extreme input strengths compromise stability, the identified regime shows that a network can simultaneously self-organize its feedforward and recurrent architectures, giving rise to self-sustained continuous attractor dynamics.

关于稳定性，耦合动力学的复杂性使得分析处理具有挑战性。因此，我们求助于广泛的数值模拟，以确定兴奋性循环连接可以出现并保持稳定的参数区域。通过在 $(\alpha_{J} , \alpha_{W} )$ 平面上执行网格搜索，我们发现系统有效学习前馈连接需要 $\alpha_{W}\geq \alpha_{J}$。由于 $\alpha_{J}$ 与前馈权重幅度 $A_{J}$ 成反比，而 $\alpha_{W}$ 与 $A_{W}$ 成反比（见 Eq. 12），这一结果表明，系统学习编码空间的区域需要由导师驱动的输入主导循环信号（见图 6B）。有趣的是，只要循环信号不太弱，就可以学习循环权重（见图 6C）。此外，对于足够小的 $\alpha_{J}$，系统无法维持稳定的前馈连接（见图 6B）。这种行为的一个合理解释来自于稳定性分析的近似：学习必须保持缓慢。如果输入电流变得过强，权重的时间导数可能会增加到违反慢学习通过平均条件的程度。总体而言，虽然极端输入强度会损害稳定性，但所识别的最佳区域清楚地表明，网络可以同时自组织其前馈和循环架构，从而成功地产生自维持的连续吸引子动力学。

E. Unidirectional Path Integration

Finally, we assess whether the competitive neural layer can function as a self-organized, unidirectional path integrator. In biological systems, velocity is time-varying; in contrast, the emergent attractor in our model propagates at a speed determined by intrinsic parameters: adaptation strength ($m$), bump width ($\sigma_{u}$), and the characteristic time constants ($\tau, \tau_{v}$). Consequently, without an additional control signal, the network cannot support accurate path integration under variable velocity conditions.

最后，我们评估竞争神经层是否可以作为自组织的单向路径积分器。在生物系统中，速度是随时间变化的；相比之下，我们模型中出现的吸引子以由内在参数决定的速度传播——适应强度 ($m$)、峰值宽度 ($\sigma_{u}$) 和特征时间常数 ($\tau , \tau_{v}$)。因此，在可变速度条件下，网络无法支持准确的路径积分。

To overcome this limitation, we introduce an external control signal in the form of a spatially uniform background current that encodes the instantaneous speed—an approach similar in spirit to the velocity inputs utilized in established path integration models (see [3] and [32]):

$$ I(x,t) = I_{\mathrm{ff}}(x,t) + I_{\mathrm{rec}}(x,t) + g[v(t)-v_{\mathrm{int}}(m)], $$

where $g$ is a positive gain factor and $v(t)$ is the external speed (which can vary dynamically in time).

为了克服这一限制，我们引入了一个外部控制信号，以空间均匀的背景电流的形式编码瞬时速度——这种方法在精神上类似于已建立的路径积分模型中使用的速度输入（见 [3] 和 [32]）:

$$ \begin{align} I(x,t) = I_{\mathrm{ff}}(x,t) + I_{\mathrm{rec}}(x,t) + g[v(t)-v_{\mathrm{int}}(m)], \end{align} $$

其中 $g$ 是一个正增益因子，$v(t)$ 是外部速度（可以随时间动态变化）。

In what follows, we exploit the analytical tractability of the model to provide a mechanistic account of how this input rescales the effective self-sustained speed of the bump of the attractor, thereby enabling controlled modulation of its propagation speed.

在接下来的内容中，我们利用模型的解析可处理性，提供了一个机制性的解释，说明这种输入如何重新调整吸引子峰值的有效自维持速度，从而实现其传播速度的受控调制。

We analyze the system’s intrinsic dynamics in the absence of spatial feedforward input ($I_{\mathrm{ff}} = 0$). To derive a theoretical understanding of the motion, we employ the projection method. Unless otherwise noted, for the sake of simplicity we assume a constant external speed $v$. We later show through numerical simulations that our results generalize to time-varying speeds. We introduce a modified ansatz that incorporates the effect of the global input current by allowing for a uniform baseline shift. Concretely, we propose that the solutions retain the shape of the original equilibrium profiles, but are offset by a constant term that depends on the input speed:

$$ \begin{aligned} U_{\epsilon}^{\mathrm{eq}}(x,z) &= U^{\mathrm{eq}}(x,z) + \epsilon_{u}I_{\mathrm{speed}},\\ V_{\epsilon}^{\mathrm{eq}}(x,z) &= V^{\mathrm{eq}}(x,z) + \epsilon_{v}I_{\mathrm{speed}}, \end{aligned} $$

where $U^{\mathrm{eq}}(x,z)$ and $V^{\mathrm{eq}}(x,z)$ denote equilibrium solutions of the form defined in Eqs. 7 and 8. The scaling factors $\epsilon_{u}$ and $\epsilon_{v}$ are constrained by the steady-state conditions of the network dynamics:

$$ \epsilon_{u} + \epsilon_{v} = 1,\quad \epsilon_{v} = m\epsilon_{u}. $$

我们分析了在没有空间前馈输入的情况下系统的内在动力学 ($I_{\mathrm{ff}} = 0$)。为了推导运动的理论理解，我们采用投影方法。除非另有说明，为了简化起见，我们假设外部速度 $v$ 是恒定的。我们稍后通过数值模拟表明，我们的结果可以推广到变化速度。我们引入了一个修改后的假设，通过允许均匀基线偏移来纳入全局输入电流的影响。具体来说，我们提出解保持原始平衡构型的形状，但由一个依赖于输入速度的常数项偏移:

$$ \begin{align} U_{\epsilon}^{\mathrm{eq}}(x,z) &= U^{\mathrm{eq}}(x,z) + \epsilon_{u}I_{\mathrm{speed}},\\ V_{\epsilon}^{\mathrm{eq}}(x,z) &= V^{\mathrm{eq}}(x,z) + \epsilon_{v}I_{\mathrm{speed}}, \end{align} $$

其中 $U^{\mathrm{eq}}(x,z)$ 和 $V^{\mathrm{eq}}(x,z)$ 表示 Eqs. 7 和 8 中定义的形式的平衡解。缩放因子 $\epsilon_{u}$ 和 $\epsilon_{v}$ 受网络动力学稳态条件的约束:

$$ \begin{align} \epsilon_{u} + \epsilon_{v} = 1,\quad \epsilon_{v} = m\epsilon_{u}. \end{align} $$
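The two constraints on the scaling factors can be solved directly: substituting $\epsilon_{v} = m\epsilon_{u}$ into $\epsilon_{u} + \epsilon_{v} = 1$ gives $\epsilon_{u} = 1/(1+m)$ and $\epsilon_{v} = m/(1+m)$. The sketch below evaluates the resulting baseline offsets. It assumes that $I_{\mathrm{speed}}$ denotes the uniform term $g[v - v_{\mathrm{int}}(m)]$ of the total current; the function names and numerical values are hypothetical.

```python
def speed_current(v, v_int, g):
    """Spatially uniform control current g * [v - v_int] (assumed to be I_speed)."""
    return g * (v - v_int)


def baseline_offsets(m, i_speed):
    """Constant offsets added to U^eq and V^eq under the modified ansatz.

    Solves eps_u + eps_v = 1 together with eps_v = m * eps_u.
    """
    eps_u = 1.0 / (1.0 + m)
    eps_v = m / (1.0 + m)
    return eps_u * i_speed, eps_v * i_speed


if __name__ == "__main__":
    # Illustrative values only.
    i_speed = speed_current(v=1.5, v_int=1.0, g=4.0)
    du, dv = baseline_offsets(m=3.0, i_speed=i_speed)
    print(i_speed, du, dv)
```

The offsets always sum to $I_{\mathrm{speed}}$, and stronger adaptation moves a larger share of the baseline shift from $U$ to $V$. When the external speed equals the intrinsic speed, the control current and both offsets vanish.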
