Abstract

One approach to invariant object recognition employs a recurrent neural network as an associative memory. In the standard depiction of the network's state space, memories of objects are stored as attractive fixed points of the dynamics.

I argue for a modification of this picture: if an object has a continuous family of instantiations, it should be represented by a continuous attractor. This idea is illustrated with a network that learns to complete patterns.

To perform the task of filling in missing information, the network develops a continuous attractor that models the manifold from which the patterns are drawn.

From a statistical viewpoint, the pattern completion task allows a formulation of unsupervised learning in terms of regression rather than density estimation.

Introduction

A classic approach to invariant object recognition is to use a recurrent neural network as an associative memory. In spite of the intuitive appeal and biological plausibility of this approach, it has largely been abandoned in practical applications. This paper introduces two new concepts that could help resurrect it: object representation by continuous attractors, and learning attractors by pattern completion.

In most models of associative memory, memories are stored as attractive fixed points at discrete locations in state space. Discrete attractors may not be appropriate for patterns with continuous variability, like the images of a three-dimensional object from different viewpoints. When the instantiations of an object lie on a continuous pattern manifold, it is more appropriate to represent objects by attractive manifolds of fixed points, or continuous attractors.

To make this idea practical, it is important to find methods for learning attractors from examples. A naive method is to train the network to retain examples in short-term memory. This method is deficient because it does not prevent the network from storing spurious fixed points that are unrelated to the examples.

A superior method is to train the network to restore examples that have been corrupted, so that it learns to complete patterns by filling in missing information.

Learning by pattern completion can be understood from both dynamical and statistical perspectives. Since the completion task requires a large basin of attraction around each memory, spurious fixed points are suppressed.

The completion task also leads to a formulation of unsupervised learning as the regression problem of estimating functional dependences between variables in the sensory input.

Density estimation, rather than regression, is the dominant formulation of unsupervised learning in stochastic neural networks like the Boltzmann machine. Density estimation has the virtue of suppressing spurious fixed points automatically, but it also has the serious drawback of being intractable for many network architectures.

Regression is a more tractable, but nonetheless powerful, alternative to density estimation.

In a number of recent neurobiological models, continuous attractors have been used to represent continuous quantities like eye position, direction of reaching, head direction, and orientation of a visual stimulus. Along with these models, the present work is part of a new paradigm for neural computation based on continuous attractors.

Discrete versus continuous attractors

Figure 1 depicts two ways of representing objects as attractors of a recurrent neural network dynamics. The standard way is to represent each object by an attractive fixed point, as in Figure 1a. Recall of a memory is triggered by a sensory input, which sets the initial conditions. The network dynamics converges to a fixed point, thus retrieving a memory. If different instantiations of one object lie in the same basin of attraction, they all trigger retrieval of the same memory, resulting in the many-to-one map required for invariant recognition.

In Figure 1b, each object is represented by a continuous manifold of fixed points. A one-dimensional manifold is shown, but generally the attractor should be multidimensional, and is parametrized by the instantiation or pose parameters of the object. For example, in visual object recognition, the coordinates would include the viewpoint from which the object is seen.

The reader should be cautioned that the term "continuous attractor" is an idealization and should not be taken too literally. In real networks, a continuous attractor is only approximated by a manifold in state space along which drift is very slow. This is illustrated by a simple example, a descent dynamics on a trough-shaped energy landscape. If the bottom of the trough is perfectly level, it is a line of fixed points and an ideal continuous attractor of the dynamics. However, any slight imperfections cause slow drift along the line. This sort of approximate continuous attractor is what is found in real networks, including those trained by the learning algorithms to be discussed below.
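
The trough example can be made concrete with a minimal sketch. The energy function $E(x, y) = \tfrac{1}{2}y^{2} + \epsilon x$, the step size, and the function name `descend` are illustrative assumptions rather than anything specified in the text. With $\epsilon = 0$ the line $y = 0$ consists of fixed points; a small tilt $\epsilon$ leaves relaxation onto the line fast but introduces slow drift along it.

```python
def descend(x, y, tilt, steps=50, lr=0.1):
    """Gradient descent on the assumed energy E(x, y) = 0.5 * y**2 + tilt * x."""
    for _ in range(steps):
        x -= lr * tilt   # dE/dx = tilt: slow drift along the trough
        y -= lr * y      # dE/dy = y: fast relaxation onto the trough bottom
    return x, y

# Perfectly level trough: y relaxes, x does not move at all.
x_flat, y_flat = descend(1.0, 1.0, tilt=0.0)

# Slightly tilted trough: y still relaxes quickly, x drifts slowly.
x_tilt, y_tilt = descend(1.0, 1.0, tilt=1e-3)
```

In both runs `y` decays geometrically toward zero, while `x` either stays put (level trough) or moves by a small amount proportional to the tilt, which is the behavior the term "approximate continuous attractor" refers to.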

Dynamics of memory retrieval

The preceding discussion has motivated the idea of representing pattern manifolds by continuous attractors. This idea will be further developed with the simple network shown in Figure 2a, which consists of a visible layer $x_{1}\in \mathbb{R}^{n_{1}}$ and a hidden layer $x_{2}\in \mathbb{R}^{n_{2}}$. The architecture is recurrent, containing both bottom-up connections (the $n_{2}\times n_{1}$ matrix $W_{21}$) and top-down connections (the $n_{1} \times n_{2}$ matrix $W_{12}$). The vectors $b_{1}$ and $b_{2}$ represent the biases of the neurons. The neurons have a rectification nonlinearity $[x]_{+} = \max\{x, 0\}$, which acts on vectors component by component.

There are many variants of recurrent network dynamics: a convenient choice is the following discrete-time version, in which updates of the hidden and visible layers alternate in time. After the visible layer is initialized with the input vector $x_{1}(0)$, the dynamics evolves as

$$ \begin{align*} x_{2}(t) &= [b_{2} + W_{21}x_{1}(t-1)]_{+} \\ x_{1}(t) &= [b_{1} + W_{12}x_{2}(t)]_{+} \tag{1} \end{align*} $$

If memories are stored as attractors, iteration of this dynamics can be regarded as memory retrieval.
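
The alternating update can be sketched in a few lines of NumPy. The function names `rectify` and `retrieve`, the layer sizes, and the random (untrained) parameters below are assumptions made for illustration only.

```python
import numpy as np

def rectify(v):
    """Componentwise rectification [v]_+ = max(v, 0)."""
    return np.maximum(v, 0.0)

def retrieve(x1, W21, W12, b1, b2, T):
    """Iterate the alternating dynamics for T time steps from the visible state x1(0)."""
    x2 = None
    for _ in range(T):
        x2 = rectify(b2 + W21 @ x1)   # bottom-up update of the hidden layer
        x1 = rectify(b1 + W12 @ x2)   # top-down update of the visible layer
    return x1, x2

# Illustrative sizes and random parameters (assumptions, not trained values).
rng = np.random.default_rng(0)
n1, n2 = 6, 4
W21 = 0.5 * rng.standard_normal((n2, n1))
W12 = 0.5 * rng.standard_normal((n1, n2))
b1, b2 = np.zeros(n1), np.zeros(n2)
x1_0 = rng.random(n1)

x1_T, x2_T = retrieve(x1_0, W21, W12, b1, b2, T=3)
```

Running one iteration followed by two more gives the same result as running three at once, which is the sense in which retrieval is repeated application of a single map.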

Activity circulates around the feedback loop between the two layers. One iteration of this loop is the map $x_{1}(t-1)\to x_{2}(t) \to x_{1}(t)$. This single iteration is equivalent to the feedforward architecture of Figure 2b. In the case where the hidden layer is smaller than the visible layer, this architecture is known as an autoencoder network. Therefore the recurrent network dynamics (1) is equivalent to repeated iterations of the feedforward autoencoder. This is just the standard trick of unfolding the dynamics of a recurrent network in time, to yield an equivalent feedforward network with many layers. Because of the close relationship between the recurrent network of Figure 2a and the autoencoder of Figure 2b, it should not be surprising that learning algorithms for these two networks are also related, as will be explained below.

Learning to retain patterns

Little trace of an arbitrary input vector $x_{1}(0)$ remains after a few time steps of the dynamics (1). However, the network can retain some input vectors in short-term memory as "reverberating" patterns of activity. These correspond to fixed points of the dynamics (1); they are patterns that do not change as activity circulates around the feedback loop.

This suggests a formulation of learning as the optimization of the network's ability to retain examples in short-term memory. Then a suitable cost function is the squared difference $|x_{1}(T)-x_{1}(0)|^{2}$ between the example pattern $x_{1}(0)$ and the network's short-term memory $x_{1}(T)$ of it after $T$ time steps. Gradient descent on this cost function can be done via backpropagation through time.

If the network is trained with patterns drawn from a continuous family, then it can learn to perform the short-term memory task by developing a continuous attractor that lies near the examples it is trained on. When the hidden layer is smaller than the visible layer, the dimensionality of the attractor is limited by the size of the hidden layer.

For the case of a single time step ($T = 1$), training the recurrent network of Figure 2a to retain patterns is equivalent to training the autoencoder of Figure 2b by minimizing the squared difference between its input and output layers, averaged over the examples. From the information theoretic perspective, the small hidden layer in Figure 2b acts as a bottleneck between the input and output layers, forcing the autoencoder to learn an efficient encoding of the input.

For the special case of a linear network, the nature of the learned encoding is understood completely. Then the input and output vectors are related by a simple matrix multiplication. The rank of the matrix is equal to the number of hidden units. The average distortion is minimized when this matrix becomes a projection operator onto the subspace spanned by the principal components of the examples.

From the dynamical perspective, the principal subspace is a continuous attractor of the dynamics (1). The linear network dynamics converges to this attractor in a single iteration, starting from any initial condition. Therefore we can interpret principal component analysis and its variants as methods of learning continuous attractors.
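
The single-iteration convergence of the linear network can be checked numerically. The sketch below assumes linear neurons, zero biases, zero-mean synthetic data, and weights read off from a singular value decomposition rather than learned by gradient descent; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2 = 5, 2   # visible and hidden layer sizes (assumed for illustration)

# Synthetic zero-mean examples whose variance is concentrated in two directions.
X = rng.standard_normal((200, n1)) * np.array([3.0, 2.0, 0.3, 0.2, 0.1])
X -= X.mean(axis=0)

# The top n2 right singular vectors of X are its leading principal components.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
W21 = Vt[:n2]       # bottom-up weights, n2 x n1
W12 = Vt[:n2].T     # top-down weights, n1 x n2
P = W12 @ W21       # one loop iteration of the linear network: x1 -> P x1

x0 = rng.standard_normal(n1)   # arbitrary initial condition
x_once = P @ x0
x_twice = P @ x_once
```

Because the rows of `W21` are orthonormal, `P` is a projection operator of rank `n2`, so `x_twice` equals `x_once`: after one iteration the state already lies on the principal subspace and stays there.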

Learning to complete patterns

Learning to retain patterns in short-term memory only works properly for architectures with a small hidden layer. The problem with a large hidden layer is evident when the hidden and visible layers are the same size, and the neurons are linear. Then the cost function for learning can be minimized by setting the weight matrices equal to the identity, $W_{21} = W_{12} = I$. For this trivial minimum, every input vector is a fixed point of the recurrent network (Figure 2a), and the equivalent feedforward network (Figure 2b) exactly realizes the identity map. Clearly these networks have not learned anything.

Therefore in the case of a large hidden layer, learning to retain patterns is inadequate. Without the bottleneck in the architecture, there is no pressure on the feedforward network to learn an efficient encoding. Without constraints on the dimension of the attractor, the recurrent network develops spurious fixed points that have nothing to do with the examples.

These problems can be solved by a different formulation of learning based on the task of pattern completion. In the completion task of Figure 3a, the network is initialized with a corrupted version of an example. Learning is done by minimizing the completion error, which is the squared difference $|x_{1}(T)-d|^{2}$ between the uncorrupted pattern $d$ and the final visible vector $x_{1}(T)$. Gradient descent on completion error can be done with backpropagation through time.
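
As a rough sketch of how gradient descent on the completion error might be set up, the following NumPy code unrolls the dynamics for $T$ steps and backpropagates through the unrolled loop by hand. The function name `completion_cost_and_grads`, the layer sizes, and the small positive random parameters are assumptions for illustration; this is not the experimental code.

```python
import numpy as np

def completion_cost_and_grads(x0, d, W21, W12, b1, b2, T):
    """Completion error |x1(T) - d|^2 and its gradients via backpropagation through time."""
    # Forward pass: keep pre-activations and states for every time step.
    x1s, x2s, a1s, a2s = [x0], [], [], []
    for _ in range(T):
        a2 = b2 + W21 @ x1s[-1]
        x2 = np.maximum(a2, 0.0)
        a1 = b1 + W12 @ x2
        a2s.append(a2)
        x2s.append(x2)
        a1s.append(a1)
        x1s.append(np.maximum(a1, 0.0))
    err = x1s[-1] - d
    cost = float(err @ err)

    # Backward pass: propagate the error back through the unrolled loop.
    gW21, gW12 = np.zeros_like(W21), np.zeros_like(W12)
    gb1, gb2 = np.zeros_like(b1), np.zeros_like(b2)
    g1 = 2.0 * err                           # gradient with respect to x1(T)
    for t in reversed(range(T)):
        da1 = g1 * (a1s[t] > 0)              # through the visible rectification
        gW12 += np.outer(da1, x2s[t])
        gb1 += da1
        da2 = (W12.T @ da1) * (a2s[t] > 0)   # through the hidden rectification
        gW21 += np.outer(da2, x1s[t])
        gb2 += da2
        g1 = W21.T @ da2                     # gradient with respect to the earlier visible state
    return cost, (gW21, gW12, gb1, gb2)

# Illustrative setup (assumed sizes and parameters).
rng = np.random.default_rng(0)
n1, n2, T = 5, 3, 2
W21 = 0.1 * rng.random((n2, n1))
W12 = 0.1 * rng.random((n1, n2))
b1, b2 = 0.1 * np.ones(n1), 0.1 * np.ones(n2)
d = rng.random(n1)     # uncorrupted pattern
x0 = d.copy()
x0[:2] = 0.0           # corrupted version used to initialize the network
cost, grads = completion_cost_and_grads(x0, d, W21, W12, b1, b2, T)
```

A gradient step would subtract a small multiple of each entry of `grads` from the corresponding parameter. Setting `x0 = d` recovers the retention cost of the previous section, so the same routine covers both formulations.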

This new formulation of learning eliminates the trivial identity map solution mentioned above: while the identity network can retain any example, it cannot restore corrupted examples to their pristine form. The completion task forces the network to enlarge the basins of attraction of the stored memories, which suppresses spurious fixed points. It also forces the network to learn associations between variables in the sensory input.
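
A small numerical check illustrates why the identity solution fails the completion task. The pattern `d` and the choice of which components to zero are hypothetical; because the pattern is nonnegative, the rectified and linear networks behave identically here.

```python
import numpy as np

n = 4
W21 = W12 = np.eye(n)    # the trivial solution W21 = W12 = I
b1 = b2 = np.zeros(n)

def iterate(x1):
    """One pass around the feedback loop with identity weights."""
    x2 = np.maximum(b2 + W21 @ x1, 0.0)
    return np.maximum(b1 + W12 @ x2, 0.0)

d = np.array([0.2, 0.9, 0.5, 0.7])   # hypothetical uncorrupted pattern
corrupted = d.copy()
corrupted[1:3] = 0.0                 # zero out two components

retention_error = float(np.sum((iterate(d) - d) ** 2))
completion_error = float(np.sum((iterate(corrupted) - d) ** 2))
```

The retention error is zero, but the completion error equals the squared magnitude of the zeroed components, since the identity network simply passes the corruption through.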

Locally connected architecture

Experiments were conducted with images of handwritten digits from the USPS database described in [12]. The example images were $16\times 16$, with a gray scale ranging from $0$ to $1$. The network was trained on a specific digit class, with the goal of learning a single pattern manifold. Both the network architecture and the nature of the completion task were chosen to suit the topographic structure present in visual images.

The network architecture was given a topographic organization by constraining the synaptic connectivity to be local, as shown in Figure 4a. Both the visible and hidden layers of the network were $16\times 16$. The visible layer represented an image, while the hidden layer was a topographic feature map. Each neuron had $5\times 5$ receptive and projective fields, except for neurons near the edges, which had more restricted connectivity.
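
One way to express this connectivity constraint is a binary mask over the $256 \times 256$ weight matrix. The sketch below assumes that fields are centered on each neuron and simply clipped at the image border; the text does not spell out the edge handling, and the helper name `local_mask` is hypothetical.

```python
import numpy as np

def local_mask(size=16, field=5):
    """Boolean (size*size) x (size*size) mask allowing only local connections."""
    r = field // 2
    idx = np.arange(size)
    # near[i, k] is 1 when rows (or columns) i and k are within r of each other.
    near = (np.abs(idx[:, None] - idx[None, :]) <= r).astype(int)
    # The Kronecker product combines the row and column neighborhoods, so entry
    # [i*size + j, k*size + l] is 1 only if |i - k| <= r and |j - l| <= r.
    return np.kron(near, near).astype(bool)

mask = local_mask()
rng = np.random.default_rng(0)
W21 = mask * rng.standard_normal(mask.shape)   # masked bottom-up weights
W12 = mask.T * rng.standard_normal(mask.shape) # masked top-down weights
```

With this mask an interior hidden neuron sees 25 visible pixels, while a corner neuron sees only 9. During training the mask would be reapplied after every update, or the gradient multiplied by it, so that non-local weights stay at zero.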

In the pattern completion task, example images were corrupted by zeroing the pixels inside a $9\times 9$ patch chosen at a random location, as shown in Figure 3a. The location of the patch was randomized for each presentation of an example. The size of the patch was a substantial fraction of the $16\times 16$ image, and much larger than the $5\times 5$ receptive field size. This method of corrupting the examples gave the completion task a topographic nature, because it involved a set of spatially contiguous pixels. This topographic nature would have been lacking if the examples had been corrupted by, for example, the addition of spatially uncorrelated noise.
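
The corruption procedure is simple to sketch. The function name `corrupt` is hypothetical, and the code assumes the patch always lies entirely inside the image, which the text does not state explicitly.

```python
import numpy as np

def corrupt(image, rng, patch=9):
    """Return a copy of a square image with a randomly placed patch x patch block zeroed."""
    size = image.shape[0]
    # Upper-left corner chosen so that the patch fits inside the image (assumption).
    top = rng.integers(0, size - patch + 1)
    left = rng.integers(0, size - patch + 1)
    out = image.copy()
    out[top:top + patch, left:left + patch] = 0.0
    return out

rng = np.random.default_rng(0)
example = np.ones((16, 16))          # stand-in for a 16 x 16 digit image
corrupted = corrupt(example, rng)    # a fresh patch location on every call
```

Calling `corrupt` anew at each presentation of an example reproduces the randomization of the patch location described above.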

Figure 3b illustrates the dynamics of pattern completion performed by a network trained on examples of the digit class "two." The network is initialized with a corrupted example of a "two." After the first iteration of the dynamics, the image is partially restored. The second iteration leads to superior restoration, with further sharpening of the image. The "filling in" phenomenon is also evident in the hidden layer.

The network was first trained on a retrieval dynamics of one iteration. The resulting biases and synaptic weights were then used as initial conditions for training on a retrieval dynamics of two iterations. The hidden layer developed into a topographic feature map suitable for representing images of the digit "two." Figure 4b depicts the bottom-up receptive fields of the 256 hidden neurons. The top-down projective fields of these neurons were similar, but are not shown.

This feature map is distinct from others because of its use of top-down and bottom-up connections in a feedback loop. The bottom-up connections analyze images into their constituent features, while the top-down connections synthesize images by composing features. The features in the top-down connections can be regarded as a "vocabulary" for synthesis of images. Since not all combinations of features are proper patterns, there must be some "grammatical" constraints on their combination. The network's ability to complete patterns suggests that some of these constraints are embedded in the dynamical equations of the network. Therefore the relaxation dynamics (1) can be regarded as a process of massively parallel constraint satisfaction.

Conclusion

I have argued that continuous attractors are a natural representation for pattern manifolds. One method of learning attractors is to train the network to retain examples in short-term memory. This method is equivalent to autoencoder learning, and does not work if the number of hidden units is large. A better method is to train the network to complete patterns. For a locally connected network, this method was demonstrated to learn a topographic feature map. The trained network is able to complete patterns, indicating that syntactic constraints on the combination of features are embedded in the network dynamics.

Empirical evidence that the network has indeed learned a continuous attractor is obtained by local linearization of the network (1). The linearized dynamics has many eigenvalues close to unity, indicating the existence of an approximate continuous attractor. Learning with an increased number of iterations in the retrieval dynamics should improve the quality of the approximation.
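
The linearization test can be sketched as follows. For the rectified dynamics, the Jacobian of one loop iteration is $D_{1} W_{12} D_{2} W_{21}$, where $D_{1}$ and $D_{2}$ are diagonal matrices marking which visible and hidden neurons are active. The toy weights below are hand-built for illustration and are not the trained network; the tolerance used to count eigenvalues near unity is likewise an arbitrary assumption.

```python
import numpy as np

def loop_jacobian(x1, W21, W12, b1, b2):
    """Jacobian of one iteration x1(t-1) -> x1(t) of the rectified dynamics at x1."""
    a2 = b2 + W21 @ x1
    a1 = b1 + W12 @ np.maximum(a2, 0.0)
    D2 = np.diag((a2 > 0).astype(float))   # active hidden neurons
    D1 = np.diag((a1 > 0).astype(float))   # active visible neurons
    return D1 @ W12 @ D2 @ W21

# Toy network whose loop map projects onto the first two visible coordinates.
W21 = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0]])
W12 = W21.T
b1, b2 = np.zeros(3), np.zeros(2)
x1 = np.array([0.5, 0.5, 0.5])

J = loop_jacobian(x1, W21, W12, b1, b2)
eigvals = np.linalg.eigvals(J)
n_marginal = int(np.sum(np.abs(eigvals - 1.0) < 1e-6))
```

In this toy case two eigenvalues equal one and the third is zero, corresponding to a two-dimensional plane of fixed points. For a trained network one would instead expect eigenvalues only close to unity, reflecting slow drift along the approximate attractor.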

There is only one aspect of the learning algorithm that is specifically tailored for continuous attractors. This aspect is the limitation of the retrieval dynamics (1) to a few iterations, rather than iterating it all the way to a true fixed point. As mentioned earlier, a continuous attractor is only an idealization; in a real network it does not consist of true fixed points, but is just a manifold to which relaxation is fast and along which drift is slow. Adjusting the shape of this manifold is the goal of learning; the exact locations of the true fixed points are not relevant.

The use of a fast retrieval dynamics removes one long-standing objection to attractor neural networks, which is that true convergence to a fixed point takes too long. If all that is desired is fast relaxation to an approximate continuous attractor, attractor neural networks are not much slower than feedforward networks.

In the experiments discussed here, learning was done with backpropagation through time. Contrastive Hebbian learning is a simpler alternative. Part of the image is held clamped, the missing values are filled in by convergence to a fixed point, and an anti-Hebbian update is made. Then the missing values are clamped at their correct values, the network converges to a new fixed point, and a Hebbian update is made. This procedure has the disadvantage of requiring true convergence to a fixed point, which can take many iterations. It also requires symmetric connections, which may be a representational handicap.

This paper addressed only the learning of a single attractor to represent a single pattern manifold. The problem of learning multiple attractors to represent multiple pattern classes will be discussed elsewhere, along with the extension to network architectures with many layers.
