We are searching data for your request:

**Forums and discussions:**

**Manuals and reference books:**

**Data from registers:**

**Wait the end of the search in all databases.**

Upon completion, a link will appear to access the found materials.

Upon completion, a link will appear to access the found materials.

I am reading this paper on fear conditioning, where the following is given:

- The n-dimensional population vector (activity of n neurons) evoked by the conditioned stimulus (CS+, auditory tone)
**before**conditioning - The n-dimensional population vector (activity of n neurons) for the CS+
**after**conditioning

The authors write that: "[After conditioning] the CS+ population vector rotated out of the plane defined by the US [unconditioned stimulus] and the initial CS+". They conclude that this out-of-plane rotation corresponds to "new learning".

**My question**: Why is out-of-plane-rotation of neural activity vectors equated with a new representation/new learning?

They are not writing about "after conditioning" as you write, but, importantly, **after extinction**.

What they observe is that, during learning, the response to the conditioned stimulus (CS+) becomes more similar to the response to the US. In the n-dimensional space in which a population response lives, you can describe this as the population vector evoked by CS+ pointing closer to same direction as the population vector from the US.

We expect that during extinction, the CS+ vector will become different from the US compared to at the peak of conditioning. There are two general possibilities for how that could happen. The first possibility (perhaps the simplest one to expect) is that with extinction the CS+ vector returns to where it was initially, before the conditioned stimulus was paired with the US. You could describe this as merely *forgetting* what has been learned: responses to the US and CS go back to how they were before conditioning and it's as if nothing was ever learned.

However, that's not what these authors observe:

During within-session extinction, the CS+ representation did not revert and gained no more similarity to its initial representation before learning

Although they do observe that with extinction the CS+ vector moves away from the US, it doesn't move back towards the old response, it points in a new direction that is further both from the US response and from the original CS response. That can be described as an "out of plane" rotation, not towards either of the other vectors.

In case it's hard to think about these things in N-dimensional space, we can think of a more familiar situation in our own 3-dimensional world.

Let's say the US causes a response corresponding to "12 o'clock". The conditioned stimulus causes a response corresponding to "3 o'clock" before any training. As you do some conditioning training, pairing the US and CS, the CS response moves towards "1 o'clock", beginning to resemble the US response. During extinction, we might expect the CS response to move back towards "3 o'clock", maybe we'll find it at "2 o'clock". This would represent rotation in the "plane of the CS and US" made by the clock. Instead, the hand points out away from the wall, so it is further from both the original US at 12 and original CS at 3. That's the "new learning".

## Author Summary

The question of how the brain generates movement has been extensively studied, yet multiple competing models exist. Representational approaches relate the activity of single neurons to movement parameters such as velocity and position, approaches useful for the decoding of movement intentions, while the dynamical systems approach predicts that neural activity should evolve in a predictable way based on population activity. Existing representational models cannot reproduce the recent finding in monkeys that predictable rotational patterns underlie motor cortex activity during reach initiation, a finding predicted by a dynamical model in which muscle activity is a direct combination of neural population rotations. However, previous simulations did not consider an essential aspect of representational models: variable time offsets between neurons and kinematics. Whereas these offsets reveal rotational patterns in the model, these rotations are statistically different from those observed in the brain and predicted by a dynamical model. Importantly, a simple recurrent neural network model also showed rotational patterns statistically similar to those observed in the brain, supporting the idea that dynamical systems-based approaches may provide a powerful explanation of motor cortex function.

**Citation:** Michaels JA, Dann B, Scherberger H (2016) Neural Population Dynamics during Reaching Are Better Explained by a Dynamical System than Representational Tuning. PLoS Comput Biol 12(11): e1005175. https://doi.org/10.1371/journal.pcbi.1005175

**Editor:** Byron Yu, Carnegie Mellon University, UNITED STATES

**Received:** March 31, 2016 **Accepted:** September 24, 2016 **Published:** November 4, 2016

**Copyright:** © 2016 Michaels et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Data Availability:** All relevant modeling data are within the paper and its supporting information files. We also compared our model with third party data (Churchland et al, 2012 http://doi.org/10.1038/nature11129) that was downloaded from: http://churchlandlab.neuroscience.columbia.edu/links.html. This data can be accessed there or by contacting the authors of the original study: Mark Churchland (email: [email protected]) Krishna Shenoy (email: [email protected]).

**Funding:** This work was supported by Deutsche Forschungsgemeinschaft (SCHE 1575/1-1 & SFB 889, Project C09). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests:** The authors have declared that no competing interests exist.

## Introduction

Our current understanding of sensory information processing in the brain has been built heavily upon the observations of neurons that respond only to particular sensory inputs. One classical and common conception is that neurons with higher selectivity, i.e., responding to a narrower range of sensory input parameters, provide more information about sensory input [1–4]. In auditory cortex, selectivity of single neurons increases as spiking evolves following the onset of a stationary sound stimulus. Specifically, during the first tens to hundreds of milliseconds following stimulus onset, neurons tend to respond to a large range of sounds. Average spiking rates are high during this transient epoch and settle to lower sustained values during the later epoch. Differential adaptation leads to many fewer neurons active during the sustained response epoch, resulting in a sparser representation and, correspondingly, a neuron population of higher selectivity [5–8]. This feature of neuronal adaptation following stimulus onset is shared among multiple sensory systems [9–11]. Based on this selectivity difference, the current theory regarding neuronal responses in auditory cortex posits that the relatively unselective (i.e., dense) onset responses predominately encode the presence of sounds whereas the more selective (i.e., sparse) sustained responses predominately encode the identity of sounds [5, 12, 13].

The approach of interpreting informativeness based on neuronal response selectivity, although intuitively appealing, contains potential caveats upon closer consideration. For example, due to the limited size of individual neuronal receptive fields, stimulus information must be encoded collectively by multiple neurons [14, 15]. Therefore, a larger population of highly selective neurons would be required to encode an entire stimulus space compared to less selective neurons.

Current theory predicts that the sustained response epoch of auditory cortical responses should carry more information on average than the onset response epoch because stimulus identification is a less constrained task than detection. Here, we tested this prediction by investigating the sound encoding dynamics of a population of 171 neurons in marmoset primary auditory cortex (A1). Our results contradicted current theory and implied a novel sensory encoding principle potentially applicable to other sensory systems.

## Results

We trained two rhesus monkeys to modulate neural activity to drive movements of a computer cursor to hit targets in a two-dimensional workspace (Figure 1B). The family of BMI mappings that we used is represented by:

where x t is the cursor state (position and velocity), u t comprises the recorded M1 activity, and A , B , and b are the parameters of the BMI mapping. All experiments began with a closed-loop calibration of an *intuitive BMI mapping*, which was designed to provide proficient control on par with the majority of studies in the field (Serruya et al., 2002 Velliste et al., 2008 Ganguly and Carmena, 2009 Suminski et al., 2010 Hauschild et al., 2012 Ifft et al., 2013 Sadtler et al., 2014). Subjects indeed demonstrated proficient and stable control of the BMI, with success rates of nearly 100%, and movement times on average faster than one second (Figure 1—figure supplement 1).

The BMI provides an ideal paradigm for studying internal models because it simplifies several key complexities of native limb control. First, native limb control involves effectors with non-linear dynamics, and the causal relationship between the recorded neural activity and limb movements is not completely understood. In contrast, the causal relationship between recorded neural activity and BMI cursor movements is completely specified by the experimenter (through A , B and b in Equation 1), and can be chosen to be linear (as in Equation 1). Second, native limb control involves multiple modalities of sensory feedback (e.g., proprioception and vision), which makes it difficult for the experimenter to know how the subject combines sources of sensory information. In the BMI, task-relevant sensory feedback is limited to a single modality (vision), which is completely specified by the experimenter ( x t in Equation 1). Finally, the neural activity that directly drives the BMI is completely specified by the recorded population activity ( u t in Equation 1), whereas typically only a subset of neurons driving limb movements is recorded. We can thereby reinterpret the full set of BMI control signals using an internal model in a more concrete manner than is currently possible with limb movements.

### Subjects compensate for sensory feedback delays while controlling a BMI

Because internal models have not previously been studied in a BMI context, we sought evidence of internal prediction. A hallmark of internal prediction is compensation for sensory feedback delays (Miall et al., 2007 Shadmehr et al., 2010 Farshchiansadegh et al., 2015). To assess the visuomotor latency experienced by a subject in our BMI system, we measured the elapsed time between target onset and the appearance of target-related activity in the recorded neural population (Figure 2A). The delays we measured ( 100 ms, monkey A 133 ms, monkey C) are consistent with visuomotor latencies reported in arm reaching studies of single-neurons in primary motor cortex (Schwartz et al., 1988). Next, we asked whether subjects produced motor commands consistent with the current cursor position, which was not known to the subject due to visual feedback delay, or whether motor commands were more consistent with a previous, perceived position (Figure 2B,C and Figure 2—figure supplement 1). If subjects did not compensate for visual feedback delays and aimed from the most recently available visual feedback of cursor position, we would expect errors to be smallest at lags of 100 ms and 133 ms relative to the current cursor position for monkeys A and C, respectively (dashed red lines in Figure 2C). Rather, we found that these error curves had minima at lags close to 0 ms (dashed black lines in Figure 2C), indicating that motor commands through the BMI mapping pointed closer to the targets when originating from the current cursor position than from any previous position. This finding suggests that subjects use an internal model to internally predict the current cursor position.

###### Subjects compensate for sensory feedback delays while controlling a BMI.

(**A**) The visuomotor latency experienced by a subject in our BMI system was assessed by measuring the elapsed time between target onset and the first significant ( p < 0.05 ) decrease in angular error. If that first decrease was detected τ + 1 timesteps following target onset, we concluded that the visuomotor latency was at least τ timesteps (red dashed lines). For both subjects, the first significant difference was highly significant (** p < 10 - 5 , two-sided Wilcoxon test with Holm-Bonferroni correction for multiple comparisons n = 5908 trials monkey C: n = 4578 trials). (**B**) Conceptual illustration of a single motor command (black arrows) shifted to originate from positions lagged relative to the current cursor position (open circle). In this example, the command points farther from the target as it is shifted to originate from earlier cursor positions. (**C**) Motor commands pointed closer to the target when originating from the current cursor position (zero lag) than from outdated (positive lag) cursor positions that could be known from visual feedback alone (** p < 10 - 5 , two-sided Wilcoxon test monkey A: n = 33,660 timesteps across 4489 trials monkey C: n = 31,214 timesteps across 3639 trials). Red lines indicate subjects’ inherent visual feedback delays from panel A. Shaded regions in panels A and C (barely visible) indicate ± SEM.

Because we have not yet explicitly identified the subject’s *internal* model, motor commands were defined in this analysis using the BMI mapping, which is *external* to the subject. If the internal model bears similarities to the BMI mapping, it is reasonable to use the BMI mapping as a proxy for the internal model to assess feedback delay compensation. With evidence that subjects engage an internal model during BMI control, we next asked whether we could explicitly identify an internal model from the recorded neural activity.

### Internal model mismatch explains the majority of subjects’ control errors

The BMI mapping, which determines the cursor movements displayed to the subject, provides one relevant, low-dimensional projection of the high-dimensional neural activity. With evidence that subjects use an internal model during closed-loop BMI control, we asked whether mismatch between an internal model and the actual BMI mapping could explain the subject’s moment-by-moment aiming errors. This requires identifying the subject’s internal model, which could reveal a different projection of the high-dimensional neural activity, representing the subject’s internal beliefs about the cursor state. Because of the closed-loop nature of the BMI paradigm, the subject continually updates motor control decisions as new visual feedback of the cursor becomes available. To resolve these effects, the internal model needs to operate on a timescale of tens of milliseconds (in this case, a single timestep of the BMI system) on individual experimental trials. The extraction of such a rich internal model has been difficult prior to this study due to the lack of an appropriate statistical framework.

To overcome this limitation, we developed an internal model estimation (IME) framework, which extracts, from recorded population activity, a fully parameterized internal model along with a moment-by-moment account of the internal prediction process (Figure 3A). In the IME framework, the subject internally predicts the cursor state according to:

###### Mismatch between the internal model and the BMI mapping explains the majority of the subjects’ cursor movement errors.

(**A**) At each timestep, the subject’s internal state predictions ( x

t ) are formed by integrating the visual feedback ( x t-3 ) with the recently issued neural commands ( u t-2 , u t-1 , u t ) using the internal model ( A

). We defined cursor states and internal state predictions to include components for position and velocity (i.e., x t = [ p t v t ] , x

t ] ). (**B**) Cursor trajectory (black line) from a BMI trial that was not used in model fitting. Red *whisker* shows the subject’s internal predictions of cursor state as extracted by IME. The critical comparison is between the actual cursor velocity ( v t black arrow) and the subject’s internal prediction of cursor velocity ( v

t red arrow). (**C**) Cross-validated angular aiming errors based on IME-extracted internal models are significantly smaller than cursor errors from the BMI mapping (** p < 10 - 5 , two-sided Wilcoxon test monkey A: n = 5908 trials monkey C: n = 4577 trials). Errors in panel B are from a single timestep within a single trial. Errors in panel C are averaged across timesteps and trials. Errors in panels B and C incorporate temporal smoothing through the definition of the BMI mapping and the internal model, and are thus not directly comparable to the errors shown in Figure 2C, which are based on single-timestep velocity commands needed for additional temporal resolution. Error bars (barely visible) indicate ± SEM.

t is the subject’s internal prediction about the cursor state (position and velocity), u t is a vector of recorded neural activity, and A

are the parameters of the subject’s internal model. This form of the internal model was chosen to be analogous to the BMI mapping from Equation 1 so that the actual BMI mapping lies within the family of internal models that we consider. Additionally, this formulation aligns with recent studies of skeletomotor (Shadmehr and Krakauer, 2008) and oculomotor (Frens, 2009) control, and a vast literature of control theory (Anderson and Moore, 1990).

The primary concept of the IME framework is that, at each timestep, the subject internally predicts the current cursor state by recursively applying Equation 2 (starting from the most recently available sensory feedback) and generates neural activity consistent with aiming straight to the target relative to this internal prediction (see the 'Framework for internal model estimation (IME)' subsection in 'Materials and methods' and Figure 3—figure supplement 1). At each timestep, IME extracts the entire time-evolution of the subject’s internal state prediction using Equation 2 as an internal forward model. This evolution can be visualized in the form of a whisker (Figure 3B) that begins at the cursor position of the most recently available feedback and unfolds according to the extracted internal model. At each new timestep, the subject forms a new internal prediction that incorporates newly received visual feedback. If the internal model exactly matches the BMI mapping, the subject’s internal predictions would exactly match the cursor trajectory. A visualization of an example internal model and BMI mapping is given in Figure 3—figure supplement 2.

The central hypothesis in this study is that movement errors arise from a mismatch between the subject’s internal model of the BMI and the actual BMI mapping. The alternative to this hypothesis is that the subject’s internal model is well-matched to the BMI mapping, and movement errors result from other factors, such as “noise” in the sensorimotor system, subjects’ inability to produce certain patterns of neural activity, or subjects disengaging from the task. Our key finding is that recorded neural commands were markedly more consistent with the task goals when interpreted through subjects’ internal models than when viewed through the BMI mapping (Figure 3C). Subjects’ internal models deviated from the actual BMI mappings such that control errors evaluated through extracted internal models were substantially smaller than actual cursor errors: extracted internal models explained roughly 65% of cursor movement errors (70%, monkey A 59%, monkey C). Although this finding does not preclude other factors (e.g., spiking noise or subject disengagement) from contributing toward movement errors, it does suggest their contribution is substantially smaller than previously thought, due to the large effect of internal model mismatch.

We found that the majority of the explanatory power of extracted internal models was in their ability to identify structure in the high-dimensional neural activity (Figure 3—figure supplement 3). This structure was captured in the internal model by the mapping from high-dimensional neural activity to low-dimensional kinematics ( B

in Equation 2), which need not match the BMI mapping ( B in Equation 1). Consistent with this finding, internal models fit to low-dimensional behavior rather than high-dimensional neural activity were not able to explain cursor errors (Figure 3—figure supplement 4).

That a majority of cursor errors can be explained by mismatch of the internal model is not to say that control through the BMI mapping was poor–in fact control was proficient and stable (Figure 1B and Figure 1—figure supplement 1). Rather, extracted internal models predicted movements that consistently pointed straight to the target, regardless of whether the actual cursor movements did (Figure 4A) or did not (Figure 4B and Figure 4—figure supplement 1) point straight to the target. On most trials, BMI cursor trajectories proceeded roughly straight to the target (Figure 4A). On these trials, internal model predictions aligned with actual cursor movements, resulting in small errors through both the BMI mapping and the extracted internal model. In a smaller subset of trials, actual cursor movements were more circuitous and thus had relatively large errors. Previously, the reason behind these seemingly incorrect movements was unknown, and one possibility was that the subject simply disengaged from the task. When interpreted through the extracted internal model, however, neural activity during these circuitous trials appears correct, suggesting that the subject was engaged but was acting under an internal model that was mismatched to the BMI mapping (Figure 4B and Figure 4—figure supplement 1). In other words, when armed with knowledge of the subject’s internal model, outwardly irrational behavior (i.e., circuitous cursor movements) appears remarkably rational. Across all trials, the majority of neural activity patterns had low or zero error as evaluated through extracted internal models, regardless of whether errors of the actual cursor movements (i.e., through the BMI mapping) were large or small (Figure 4C and Figure 4—figure supplement 2).

###### Neural activity appears correct through the internal model, regardless of how the actual cursor moved.

(**A**) Typical trial in which the cursor followed a direct path (black) to the target. Internal model predictions (red whiskers) also point straight to the target. (**B**) Trial with a circuitous cursor trajectory. Internal model predictions point straight to the target throughout the trial, regardless of the cursor movement direction (same color conventions as in panel A). (**C**) Timestep-by-timestep distribution of BMI cursor and internal model errors. Neural activity at most timesteps produced near-zero error through the internal model, despite having a range of errors through the BMI mapping. (**D**) Hypothetical internal model (red) and BMI mapping (black) relating 2D neural activity to a 1D velocity output. This is a simplified visualization of Equations 1 and 2, involving only the B and B

parameters, respectively. Each contour represents activity patterns producing the same velocity through the internal model ( v

, red) or BMI mapping ( v , black). Because of internal model mismatch, many patterns result in different outputs through the internal model and the BMI. However, some patterns result in the same output through both the internal model and the BMI (gray line). Here we illustrate using a 2D neural space and 1D velocity space. In experiments with q -dimensional neural activity and 2D velocity, activity patterns producing identical velocities through both the internal model and the cursor span a ( q - 4 ) -dimensional space.

When cursor trajectories were circuitous, it was not uncommon for some internal model predictions (whiskers) to match the actual cursor movement while others did not, even within the same trial (Figure 4B). Given a single internal model, how can some patterns of neural activity result in whiskers aligned to the cursor trajectory, while others patterns produce whiskers that deviate from the cursor trajectory? This is possible due to mathematical operation of mapping from high-dimensional neural activity patterns to low-dimensional cursor states. Figure 4D provides a conceptual illustration of a simplified BMI mapping:

and a simplified internal model:

each of which relies only on a mapping ( B or B

) from neural activity ( u t ) to cursor velocity ( v t or v

here (without considering A , b , A

from Equations 1 and 2) because of the aforementioned finding that the majority of the internal model mismatch effect is captured by differences between B and B

(Figure 3—figure supplement 3). Given a mismatched BMI mapping (black lines) and internal model (red lines), many neural activity patterns will produce different velocities through the BMI mapping versus the internal model. However, a subset of activity patterns (gray line) will produce identical velocities through both the BMI mapping and the internal model. These patterns lie in the nullspace of B - B

(i.e., solutions to the equation Bu t = B

u t ). In the example trials shown in Figure 4A,B and Figure 4—figure supplement 1, internal model predictions (red) that match the actual cursor movement (black) correspond to neural activity patterns along the gray line in Figure 4D. Predictions not matching the cursor movement correspond to neural activity patterns anywhere off the gray line in Figure 4D.

### Two alternative hypotheses do not explain the effect of internal model mismatch

The data presented thus far support our central hypothesis that internal model mismatch is a primary source of movement errors. Next we asked whether it might be possible to have arrived at this result under the alternate hypothesis that the internal model is well-matched to the BMI mapping. We address two specific cases of this alternative hypothesis and show that they do not explain the observed effect of internal model mismatch.

First, we explored the possibility that the subject might have a well-matched internal model, but has systematic difficulties producing the neural activity patterns required to drive the cursor in all directions in the 2D workspace using the BMI mapping. This could result in an estimated internal model that appears to be mismatched to the BMI mapping. Although M1 cannot readily produce all possible patterns of high-dimensional neural activity (Sadtler et al., 2014), we observed that subjects could readily produce the full range of movement directions through the BMI mapping (Figure 3—figure supplement 5). Gaps between producible movement directions were typically less than 1/4 of a degree, which is substantially smaller than the cursor errors shown in Figure 3C. This suggests that our main finding of internal model mismatch cannot be explained by subjects’ inability to produce particular neural activity patterns.

Second, we explored the possibility that the subject intended to produce neural commands that were correct according to the BMI mapping, but that those intended commands were corrupted by “noise” that is oriented such that errors appear smaller through the extracted internal model than through the BMI mapping. Here we define noise as spiking variability not explained by the desired movement direction under the BMI mapping. If spiking variability is correlated across neurons, it is possible to identify a mapping that best attenuates that variability. To determine whether correlated spiking variability could explain the effect of internal model mismatch, we simulated neural activity according to this alternative hypothesis in a manner that preserved the statistics of the real data (Figure 3—figure supplement 6). If this simulation produced results that match our findings from the real data, it would indicate that our main finding can be explained by the alternate hypothesis. However, this was not the case. Simulated neural activity was more consistent with the BMI mapping than the extracted internal model, which contrasts with our finding from the recorded neural activity.

### Statistical controls for validating observed effects

To further validate the main results presented above, we implemented four statistical controls. First, we ensured that our findings were not simply artifacts of overfitting the data. Second, we removed the high-dimensional structure from the neural activity while preserving the cursor movements, and show that resulting extracted internal models no longer provided explanatory power. Third, we ensured that internal model predictions do not trivially point toward the targets. Finally, we explored a variety of forms for the internal model and found that a simplified form does not account for the data. Here we describe each of these four statistical controls in additional detail.

One possible concern when interpreting the findings presented above is that internal models might be simply overfitting the data. To rule out this possibility, all findings presented throughout this paper are cross-validated (see the 'Computing cross-validated internal model predictions' subsection in 'Materials and methods'). Internal models were fit using a subset of trials as training data. Then, trials that were held out during fitting were used to evaluate each extracted internal model. If the extracted internal models had overfit the training data, we would expect those internal models to generalize poorly to the held-out data. However, this was not the case. Internal models explained the majority of cursor errors in the held-out data (Figure 3C), demonstrating that extracted internal models captured real, task-relevant structure in the recorded neural activity.

In addition to properly cross-validating our results, we performed a control analysis to show that extracted internal models identified reliable, task-appropriate structure in the high-dimensional neural activity. Here we extracted internal models using neural activity that had been shuffled across timesteps in a manner that preserved the cursor movements through the BMI mapping (Figure 3—figure supplement 7). If our results could be explained by internal models that simply overfit noise in the data, we would expect internal models fit to these shuffled data data sets to again explain a majority of cursor errors. However, internal models extracted from these shuffled data sets could no longer explain cursor errors, indicating that IME does not identify effects when they do not exist in the data. This result is consistent with our findings that the majority of the explanatory power of extracted internal models relies on structure in the high-dimensional neural activity (Figure 3—figure supplement 3), and that cursor errors cannot be explained by internal models when high-dimensional neural activity is replaced by low-dimensional behavioral measurements during model fitting (Figure 3—figure supplement 4).

If an internal model prediction points toward the target, it is not trivially due to our inclusion of straight-to-target aiming during model fitting (see the 'Computing cross-validated internal model predictions' subsection in 'Materials and methods'). Although target positions were used during model fitting, they were never used when computing internal model predictions from the data (e.g., when constructing the whiskers in Figure 3B, Figure 4A,B, and Figure 4—figure supplement 1). Each whisker was constructed in a held-out trial using only visual feedback (consisting of a single timestep of cursor position and velocity), the recorded neural activity up through the current timestep, and the internal model extracted from the training data. Because of our aforementioned cross-validation procedures, when the neural command u t is used to compute the movement error at timestep t , that neural command had not been seen previously (i.e., it was not used when fitting the internal model, when estimating the subject’s internal cursor state prediction, when calibrating the BMI mapping, nor when determining the current position of the actual BMI cursor). A whisker that points straight to the target in the held-out data thus reveals that, when interpreted through the subject’s internal model, the recorded neural activity would have driven the cursor straight to the target.

Finally, we explored a variety of approaches to modeling the subject’s internal tracking process and found that models demonstrated similarly high degrees of explanatory power as long as they could capture high-dimensional structure in the neural activity. However, a simplified internal model that does not account for any form of internal forward prediction was not consistent with our data (Figure 3—figure supplement 8).

### Internal model mismatch explains limitations in speed dynamic range

A major limitation in BMI performance is the ability to control cursor speed (Gilja et al., 2012 Golub et al., 2014). Gilja et al. (2012) and Golub et al. (2014) have proposed solutions to improve control of BMI speed (in particular, with respect to stopping the BMI cursor at targets). However, it is still an open question as to why BMI speed control is deficient in the first place. In addition to explaining the subjects’ aiming errors, we asked whether mismatch between the internal model and BMI mapping could also explain subjects’ difficulty in controlling cursor speed. Using the extracted internal model, we could compare the subject’s intended speed (from the internal model) to the speed of the actual BMI cursor at each timestep. We found that low intended speeds were systematically overestimated, and high intended speeds were systematically underestimated by the BMI mapping (Figure 5A). Furthermore, we discovered that the subjects intended to hold the cursor steadier during the initial hold period and move the cursor faster during the movement than what occurred during experiments (Figure 5B). Note that we make no assumptions about movement speed when extracting the internal model (see the 'Framework for internal model estimation (IME)' subsection in 'Materials and methods').

###### Internal model mismatch limits the dynamic range of BMI cursor speeds.

(**A**) BMI cursor speeds across the range of intended (i.e., internal model) speeds. At low intended speeds, BMI speeds were higher than intended, whereas for mid-to-high intended speeds, BMI speeds were lower than intended. Shaded regions indicate ± SEM. (**B**) During the hold period prior to target onset, intended speeds were significantly lower than those produced through the BMI mapping. During movement, intended speeds were significantly higher than those produced through the BMI. Error bars indicate ± SEM (** p < 10 - 5 , two-sided Wilcoxon test monkey A: n = < 5006 , 5908 >trials monkey C: n = < 3008 , 4578 >trials). In panels A and B internal models were used to predict intended speed on trials not used during model fitting.

To gain insight into this speed mismatch, we can use extracted internal models to examine the discrepancies between intended and actual speeds at the level of individual units and on the timescale of a single 33-ms timestep (Figure 5—figure supplement 1). These systematic differences between intended and actual cursor speeds indicate that internal model mismatch limits realizable dynamic range of BMI movement speeds. These findings suggest that the longstanding deficiencies in BMI speed control may be a consequence of internal model mismatch.

### Perturbations drive internal model adaptation

A key feature of an internal model is its ability to adapt. Arm reaching studies have demonstrated behavioral evidence of internal model adaptation (Shadmehr and Mussa-Ivaldi, 1994 Thoroughman and Shadmehr, 2000 Joiner and Smith, 2008 Taylor et al., 2014). Behavioral learning has also been demonstrated in the context of BMIs (Taylor, 2002 Carmena et al., 2003 Jarosiewicz et al., 2008 Ganguly and Carmena, 2009 Chase et al., 2012 Sadtler et al., 2014). While these BMI studies suggest that subjects adapt their internal models to better match the BMI mapping, a direct assessment has been difficult without access to those internal models. With the ability to extract a subject’s internal model, here we asked whether extracted internal models adapt in accordance with perturbations to the BMI mapping (Figure 6). In one monkey, an initial block of trials under an intuitive BMI mapping was followed by a block of trials under a perturbed BMI mapping. All data analyzed prior to this section were recorded during intuitive trials. The intuitive and perturbed mappings were of the form of Equation 1, but each used different values in the matrix B . The perturbed BMI mapping effectively rotated the pushing directions of a subset of the recorded units, such that the global effect resembled a visuomotor rotation (see the 'Behavioral task' subsection in 'Materials and methods'). Previous studies have shown that perturbations of this type can be learned by monkeys (Wise et al., 1998 Paz et al., 2005 Chase et al., 2012).

###### Extracted internal models capture adaptation to perturbations.

(**A**) Cross-validated angular errors computed by interpreting monkey A neural activity through BMI mappings and internal models. The intuitive BMI mapping (blue) defined cursor behavior during the intuitive and washout trials. The perturbed BMI mapping (red) defined cursor behavior during the perturbation trials. The late intuitive internal model (yellow) was extracted from the last 48 intuitive trials (yellow bar). A time-varying internal model (green) was extracted from a moving window of the 48 preceding trials. Values were smoothed using a causal 24-trial boxcar filter and averaged across 36 experiments. (**B**) Differences between monkey A’s time-varying internal model and the BMI mappings, assessed through the high-dimensional neural activity. For each round of 16 trials, neural activity from those trials was mapped to velocity through the time-varying internal model, the intuitive BMI mapping, and the perturbed BMI mapping. Signed angles were taken between velocities computed through the time-varying internal model and the intuitive BMI mapping (blue) and between velocities computed through the time varying internal model and the perturbed BMI mapping (red). Values were averaged across 36 experiments.

For each experiment, we interpreted recorded population activity through the intuitive and perturbed BMI mappings, as well as through two views of the subject’s internal model: a time-varying internal model extracted from a moving window of 48 trials, and a late intuitive internal model extracted from the last 48 intuitive trials. We could then quantify changes in the subject’s internal model and assess which BMI mapping or internal model was most consistent with the neural activity, relative to task goals (Figure 6A). To avoid circularity, trials used to evaluate the BMI mappings and internal models were not used when fitting the internal models nor when calibrating the BMI mappings.

Errors through the intuitive BMI mapping describe the actual cursor performance during the intuitive and washout trials (thick blue traces analogous to cursor errors in Figure 3C), and how that mapping would have performed had it been in effect during the perturbation trials (thin blue trace). Similarly, errors through the perturbed BMI mapping describe the actual cursor performance during the perturbation trials (thick red trace), and how that mapping would have performed had it been in effect during the intuitive and washout trials (thin red traces). Behavioral learning was evident in that errors through the perturbed BMI mapping were large in early perturbation trials and decreased continuously throughout the perturbation trials. A detailed characterization of this behavioral learning can be found in (Chase et al., 2012).

Our key finding in this analysis is that extracted internal models adapted in a manner consistent with the BMI perturbations (Figure 6B). During the perturbation trials, the time-varying internal model adapted to better match the perturbed BMI mapping (red trace trends toward zero). Similarly, during the washout trials, the time-varying internal model adapted to better match the intuitive BMI mapping (blue trace trends toward zero). Had the subject’s internal model not adapted, or if the adaptation was not reflected in the extracted internal model, we would expect the traces in Figure 6B to be flat. Rather than being static entities, the extracted internal models were dynamic with timescales independent of experimenter-induced changes to the BMI mapping.

Consistent with our central hypothesis, internal model mismatch was present throughout the intuitive, perturbation, and washout trials. During intuitive trials, errors through the time-varying internal model were substantially lower than errors through the intuitive BMI mapping (green trace lower than blue trace in Figure 6A), which is consistent with our main findings in Figure 3C. Because the subject’s internal model adapts, errors through the time-varying internal model remained substantially smaller than errors through the BMI mappings across the perturbation and washout trials as well (green trace remains low across Figure 6A). Although behavioral errors decreased over the course of the perturbation and washout trials, internal model mismatch was still present following adaptation (red and blue traces are nonzero during late perturbation and washout trials, respectively, in Figure 6B).

It could have been that this internal model mismatch was only substantial during early intuitive trials before the subject had accrued enough experience to form a stable internal model. This was not the case. The subject’s internal model was stable throughout the intuitive session, as evidenced by the nearly constant angular differences between velocities mapped through the time-varying internal model and the BMI mapping (red and blue traces are roughly flat in Figure 6B) and the nearly identical errors through the time-varying internal model and the (static) late intuitive internal model (green and yellow traces overlap in Figure 6A wherever cross-validated errors can be computed). Consistent with a stable internal model, behavioral performance was stable throughout the intuitive trials (blue trace is flat in Figure 6A), and internal model mismatch explained a steady fraction of behavioral errors (green trace is also flat, and substantially lower than blue trace). During the perturbation session, the subject’s internal model diverges from this stabilized state (yellow trace diverges from green trace in Figure 6A and both traces are non-constant in Figure 6B).

## Acknowledgements

We thank Dr John Bender, Dr Josh Martin and Mr Alan Pollack for suggestions and help at various stages of this project.

### APPENDIX

#### Tetrode data analysis

Each electrode within a tetrode bundle records action potentials from multiple neural sources. In order to correlate specific electrical impulses to the activity of single neurons, we followed spike-sorting procedures laid out in detail elsewhere (Daly et al., 2004 Bender et al., 2010). We used the MClust toolbox for MATLAB (version 3.5 authors A. D. Redish et al., University of Minnesota). Initial, automated clustering was performed by the program KlustaKwik (version 1.5, author K. Harris, Rutgers University) and then imported into MClust for further refinement and analysis. By sequentially superimposing spike waveforms from pairs of initial clusters, we determined which should be combined into a single cluster and which are more likely to comprise their own clusters. An example of the spike waveforms viewed in MClust is shown in supplementary material Fig. S1A. In this case, three distinct patterns of spike shapes can be seen. Unit 6-1-1 has a larger peak on electrode 3 and smaller peaks on the remaining three electrodes. Unit 6-1-2 has similarly sized spikes on electrodes 1, 2 and 3, while those on electrode 4 are smaller. Unit 6-1-3 has a larger peak on electrode 4 and smaller peaks on the remaining three electrodes.

In the end, all similar initial clusters were merged, and the quality of the resulting clusters was examined by plotting different combinations of electrodes and shape metrics (including energy, peak height, time to peak and principal components) in 3-dimensions. supplementary material Fig. S1B shows two different 3-dimensional views of the relationships between the waveform energy recorded on three of the four channels of tetrode 6-1.

Three metrics can be used to determine the quality of the resulting clusters. For every cluster, the isolation distance (ID) is the ratio of the cluster's standard deviation to the distance from extra-cluster spikes. High values of ID typically indicate that spikes from a different neuron are incorrectly included in a cluster. The ID values for the clusters in supplementary material Fig. S1 signify around a 1% rate of spikes being incorrectly included. Furthermore, the distribution of interspike intervals (ISIs) may also provide information about the quality of the resulting clusters (supplementary material Fig. S1C). Very short ISIs (<1–2 ms) are unlikely to occur in a single unit because of its refractory period, so an ISI histogram with a substantial number of occurrences at small ISIs suggests that multiple neurons may be included within a cluster. Finally, the *L*_{ratio} is a measure of how close to the center of a given cluster are spikes from other clusters and indicates the chance that spikes from a single neuron are incorrectly excluded from a cluster. Our *L*_{ratio} values suggest that perhaps 10% of the spikes in each of our clusters may have been incorrectly excluded.

## 6. Intrinsic Coding: Representation of Visual Space

A number of neurophysiological studies have used MDS to analyze population coding of visual shape (Kayaert et al., 2005 Kiani et al., 2007 Lehky & Sereno, 2007 Murata et al., 2000 Op de Beeck et al., 2001 Rolls & Tovée, 1995 Young & Yamane, 1992). Only one has applied an intrinsic approach to visual space (Sereno & Lehky, 2011). Using unlabeled neurons, it produced a representation of visual space that was relational rather than atomistic. Therefore, the global structure of space came out naturally without additional assumptions or analyses. It was possible to extract relative stimulus positions from neural populations not only in the dorsal visual stream (lateral intraparietal cortex) but also the ventral stream (anterior temporal cortex) of monkeys. Further, whereas the dorsal representation of space was quite metrically accurate, the ventral stream representation was only topologically (or categorically) accurate.

A widespread view in studies of monkey extrastriate visual processing is that large RFs throw away spatial information to produce spatially invariant object representations by pooling spatially localized responses received from earlier levels (e.g., Tanaka, 1996 Gochin, 1996, in the neurophysiology literature Riesenhuber & Poggio (1999) in the theoretical literature). Instead, in a modeling study using intrinsic coding, large RF diameters produced the most accurate reconstructions of space (Lehky & Sereno, 2011). The better performance of large RFs in intrinsic coding holds true whether the population is noise free (see Figure 5) or noisy (see Figures 6a and 6b). In contrast, small RF diameters, as would occur in the earliest visual areas, produced poor representations of space (see Figure 5c).

Example of intrinsic coding of visual space. Multidimensional scaling was used to recover stimulus locations from a population of model neurons without noise. The radius of receptive fields was defined as one space constant of gaussian tuning curve, so the diameter was 2 . Spacing between RF peaks was 0.25 , although previous work (Lehky & Sereno, 2011) shows that results are independent of RF spacing for noise-free systems. (a) Physical stimulus locations. Forty locations are arranged in a radial grid. (b) Recovered positions using large receptive fields, producing an accurate representation of space. (c) Recovered positions using small receptive fields, producing a highly distorted representation of space. Locations in the outer ring (lightest green) have curved inward, so that the representation is not even topologically accurate. In panels b and c, recovered positions were linearly rescaled by a Procrustes transform to allow quantitative comparison with physical locations. Stress is an error measure, with smaller values indicating better fit between recovered locations and physical locations. (Adapted from Lehky & Sereno, 2011.)

Example of intrinsic coding of visual space. Multidimensional scaling was used to recover stimulus locations from a population of model neurons without noise. The radius of receptive fields was defined as one space constant of gaussian tuning curve, so the diameter was 2 . Spacing between RF peaks was 0.25 , although previous work (Lehky & Sereno, 2011) shows that results are independent of RF spacing for noise-free systems. (a) Physical stimulus locations. Forty locations are arranged in a radial grid. (b) Recovered positions using large receptive fields, producing an accurate representation of space. (c) Recovered positions using small receptive fields, producing a highly distorted representation of space. Locations in the outer ring (lightest green) have curved inward, so that the representation is not even topologically accurate. In panels b and c, recovered positions were linearly rescaled by a Procrustes transform to allow quantitative comparison with physical locations. Stress is an error measure, with smaller values indicating better fit between recovered locations and physical locations. (Adapted from Lehky & Sereno, 2011.)

Comparison of population coding of visual space under intrinsic and extrinsic methods, using noisy neural populations. Population characteristics were identical in each case. The radius of receptive fields was defined as one space constant of gaussian tuning curve, so the diameter was 2 . Spacing between RF peaks was 0.25 . Uncorrelated gaussian noise was proportional to neural responses, with a standard deviation of noise equal to 0.3 of response amplitude for each neuron. (a, b) Intrinsic coding, using multidimensional scaling on unlabeled neurons with large and small RFs. Details as in Figure 5. Performance was better with large RFs. (c, d) Extrinsic coding, using weighted peak averaging on labeled neurons with large and small RFs. Performance was better with small RFs.

Comparison of population coding of visual space under intrinsic and extrinsic methods, using noisy neural populations. Population characteristics were identical in each case. The radius of receptive fields was defined as one space constant of gaussian tuning curve, so the diameter was 2 . Spacing between RF peaks was 0.25 . Uncorrelated gaussian noise was proportional to neural responses, with a standard deviation of noise equal to 0.3 of response amplitude for each neuron. (a, b) Intrinsic coding, using multidimensional scaling on unlabeled neurons with large and small RFs. Details as in Figure 5. Performance was better with large RFs. (c, d) Extrinsic coding, using weighted peak averaging on labeled neurons with large and small RFs. Performance was better with small RFs.

## Abstract

Reaching and grasping in primates depend on the coordination of neural activity in large frontoparietal ensembles. Here we demonstrate that primates can learn to reach and grasp virtual objects by controlling a robot arm through a closed-loop brain–machine interface (BMIc) that uses multiple mathematical models to extract several motor parameters (i.e., hand position, velocity, gripping force, and the EMGs of multiple arm muscles) from the electrical activity of frontoparietal neuronal ensembles. As single neurons typically contribute to the encoding of several motor parameters, we observed that high BMIc accuracy required recording from large neuronal ensembles. Continuous BMIc operation by monkeys led to significant improvements in both model predictions and behavioral performance. Using visual feedback, monkeys succeeded in producing robot reach-and-grasp movements even when their arms did not move. Learning to operate the BMIc was paralleled by functional reorganization in multiple cortical areas, suggesting that the dynamic properties of the BMIc were incorporated into motor and sensory cortical representations.

**Citation:** Carmena JM, Lebedev MA, Crist RE, O'Doherty JE, Santucci DM, Dimitrov DF, et al. (2003) Learning to Control a Brain–Machine Interface for Reaching and Grasping by Primates. PLoS Biol 1(2): e42. https://doi.org/10.1371/journal.pbio.0000042

**Academic Editor:** Idan Segev, Hebrew University

**Received:** August 14, 2003 **Accepted:** September 3, 2003 **Published:** October 13, 2003

**Copyright:** © 2003 Carmena et al. This is an open-access article distributed under the terms of the Public Library of Science Open-Access License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**Competing interests:** The authors have declared that no conflicts of interest exist.

**Abbreviations:** BMI, brain–machine interface BMIc, closed-loop brain–machine interface 1D, one dimensional 3D, three dimensional DOF, degree of freedom DTC, directional tuning curve DTD, directional tuning depth DTE, directional tuning of neural ensemble EMG, electromyogram GF, gripping force HP, hand position HV, hand velocity M1, primary motor cortex MIP, medial intraparietal area ND, neuron dropping PETH, perievent time histogram PMd, dorsal premotor cortex PP, posterior parietal cortex S1, primary somatosensory cortex SMA, supplementary motor area

## Results

The HD system of rodents encodes the time-dependent azimuth θ(*t*) of the animal's head relative to a landmark-anchored reference direction (Fig. 1) and plays a functional role for spatial navigation (Butler et al., 2017). Notably, during head turns, HD cells start to fire before their preferred direction θ* _{j}* is reached, as though the neurons anticipated the future HD (Blair and Sharp, 1995). Faster rotations lead to stronger shifts in time than slow turns. Denoting the time-dependent AHV by ω(

*t*), the shift can be approximated by the following: where the time constant τ

*is called the ATI of cell*

_{j}*j*and describes the extent to which HD firing seems to be shifted forward in time (Blair and Sharp, 1995 Blair et al., 1997 Taube and Muller, 1998) for a visualization, see Fig. 2

*A*, right.

Population encoding and decoding of HD. ** A**, The HD system acts as an internal compass, encoding the time-dependent azimuth θ(

*t*) of the animal's head with respect to a landmark-anchored reference direction.

**, Polar plot of idealized HD tuning curves and activities. HD cells are broadly tuned (∼90°) so that ∼25% of all cells are active at any moment. The tuning curves, highlighted for 20 neurons by the inner blue and red lines, have an elevated baseline (innermost circle), and their maximum is achieved at the cell's preferred direction, as illustrated for the red cell. The activation levels of the different cells at a particular time**

*B**t*are denoted by the disk radii. The population vector θ̂(

*t*) is calculated from these stochastic activities and provides the time-dependent direction estimate.

**, Temporal evolution of the HD activity packet for a small clockwise head movement θ(**

*C**t*), shown as a dashed line. Blue, red, and gray dots in the raster plot represent spikes generated with inhomogeneous Poisson processes whose rates are determined by the current HD via the tuning curves shown on the left. Model parameters: peak firing rate,

*f*= 50 Hz background firing rate,

_{max}*f*= 2 Hz.

_{bg}Tuning curves: shapes and parameter ranges. ** A**, Left, Middle, Illustration of von Mises tuning curves for different choices of the minimum, mean, and maximum values of the peak firing rate

*f*and tuning width σ (for anticipatory time delay τ = 0). Right, Shift of a tuning curve during fast head turns (AHV: ±360°/s) note the different scale on the θ-axis.

_{max}**, Parameter distributions from which the tuning curves were sampled to generate inhomogeneous populations (beta distributions whose ranges were adapted to the experimental observations). The distributions have means, SDs, and range values are as shown in Table 1.**

*B*The rodent HD system consists of multiple processing stages along which ATIs decrease from stage to stage: from the lateral mammillary nuclei (LMN, τ ≈ 40–75 ms) to the anterodorsal thalamic nucleus (ADN) and retrosplenial cortex (τ ≈ 25 ms) to the postsubiculum, where neurons tend to neither lag nor anticipate the cell's preferred firing direction (Taube, 2007). As with other HD cell parameters (see also Table 1), ATIs vary from cell to cell (Blair et al., 1997 Taube and Muller, 1998). At the mechanistic level, ATIs have been explained by an interplay between afferent dynamics and movement statistics (see also Zhang, 1996 van der Meer et al., 2007 Tsanov et al., 2014), but a consistent functional interpretation of ATIs is still missing.

As we will show, anticipatory firing improves the accuracy with which the present HD can be decoded from time-dependent HD signals. For concreteness, we use the population-vector decoder (Georgopoulos et al., 1983, 1986), which provides a reliable estimate of variables encoded by the distributed activity pattern of a neural ensemble (Seung and Sompolinsky, 1993 Salinas and Abbott, 1994 Glasius et al., 1997 Stemmler et al., 2015). We focus on HD cells in the ADN, which contains the highest fraction of HD cells (Taube and Bassett, 2003) and is considered a “core region” (Taube, 2007) or even “central hub” (Peyrache et al., 2017) of the HD system. To cover the dynamic character of head movements, we explicitly consider time-varying stimuli (Mosheiff et al., 2017). We take a computational approach and use experimentally recorded HD trajectories θ(*t*) to generate artificial spike trains (Fig. 3). The periodic tuning curves of individual HD cells can be captured by von Mises functions (Zhang, 1996) whose tuning width, peak firing rate, and background firing rate are chosen according to the available experimental literature (Fig. 2*B*). Spike generation is modeled by inhomogeneous Poisson processes (see Eq. 1). From the spike trains of this simulated neural population, we compute an estimated trajectory θ̂(*t*) based on the population-vector decoder. We then measure the accuracy of the HD system by analyzing the distribution of θ̂(*t*) and comparing its mean 〈θ̂(*t*)〉 with the true trajectory θ(*t*), as described in Materials and Methods. Using this framework, we systematically explore the HD representation and characterize the effect of anticipatory firing. We derive an analytic approximation for the population code's accuracy and use numerical simulations to check its validity and to study populations with inhomogeneous tuning curves. An approach based on a reconstruction from experimentally measured spike trains, as in Johnson et al. (2005) for the postsubiculum, is currently not feasible because simultaneous recordings of HD cells in ADN are only available for small populations of less than 20 HD neurons (Peyrache et al., 2015).

Encoding model and HD trajectory. ** A**, Tuning curves, modeled as von Mises functions, of an inhomogeneous population of HD cells in the ADN (

*N*= 100). The parameters of the von Mises functions were drawn from the distributions shown in Figure 2.

**, Time-dependent HD θ(**

*B**t*) obtained from a two LED recording of a foraging rat (black line). In addition, spike times from one realization of inhomogeneous Poisson processes based on the rat's movement trajectory are shown for each neuron.

### HD trajectories

The HD trajectories θ(*t*) for this study are taken from a dataset that was recorded in the laboratory of G. Buzsáki (Mizuseki et al., 2009). Because the HD is a circular variable, temporal averages and expectation values need to be consistent with circular statistics (see Materials and Methods). In particular, the temporal average θ̄* _{T}*(

*t*) of a trajectory segment θ(

*s*) within some time interval [

*t*−

*T*,

*t*], is defined by: so that for constant AHV (i.e., θ(

*t*) = θ

_{0}+ ω ·

*t*) the trajectory average corresponds to the midpoint of the trajectory segment, θ̄

*(*

_{T}*T*/2) = θ

_{0}. To avoid overly complicated mathematical expressions, we will omit indices, such as

*T*in θ̄

*(*

_{T}*t*), or variables, such as

*t*in θ̄

*(*

_{T}*t*), unless this could cause misunderstanding.

### Tuning curves and spike-train model

HD cells in the ADN have been studied in great detail (see e.g., Knierim et al., 1995, 1998 Blair and Sharp, 1995 Blair et al., 1997, 1998 Goodridge and Taube, 1997 Stackman and Taube, 1998 Taube and Muller, 1998 Taube, 2007 Clark et al., 2009, 2012 Peyrache et al., 2015, 2017 Tan et al., 2015). Their stereotypical tuning profiles show a single hump and are well fit by von Mises functions (Zhang et al., 1998): with peak firing rate *f _{max,j}*, background firing rate

*f*, “concentration” parameter κ

_{bg,j}*, and preferred direction θ*

_{j}*. The cell's tuning width σ*

_{j}*can be calculated as σ*

_{j}*= 180°/( ). Based on these tuning curves, each model neuron generates an inhomogeneous Poisson spike train (see Eq. 1) with time-dependent rate λ*

_{j}*(θ(*

_{j}*t*)). If we use λ without subscript, we refer to a neuron with preferred direction θ = 0.

According to Taube and Bassett (2003), there are *N* ≈ 12,000 HD cells in ADN per brain hemisphere. We therefore study ensembles up to this size but also consider smaller *N* to analyze size-dependent effects.

The tuning parameters of HD cells vary strongly from neuron to neuron. To evaluate the effect of such variations on the neural code, we describe the parameters by probability distributions whose ranges, means, and SDs are estimated from the literature (see Table 1 Materials and Methods). Sampling the parameters of each model neuron from these distributions results in what we call an inhomogeneous population (Fig. 3). To model the preferred HD of each cell, the equidistant arrangement is altered by adding to each θ* _{j}* a random shift chosen from a Gaussian distribution with mean 0 and SD 2π/

*N*. The mathematical analysis focuses on homogeneous populations. Here, neurons differ only by their (equidistant) preferred direction but share all other parameters (so that we omit the subscript

*j*), chosen as the mean values in Table 1. For simulations of inhomogeneous populations, we sample ATIs from the distribution shown in Figure 2 for homogeneous populations, we assume τ = 25 ms for all cells, unless noted otherwise.

### Causal population vector reconstruction

The population vector *PV*(*t*) is a weighted sum, *PV*(*t*) = ∑ _{j=1} N *k _{j,T}*(

*t*)

*V*, where the

_{j}*V*are unit vectors in θ

_{j}*direction and*

_{j}*k*

_{j}_{,}

*(*

_{T}*t*) denotes the number of spikes fired between time

*t*−

*T*and

*t*(Georgopoulos et al., 1983, 1986). Introducing the phasors exp(

*i*θ

*) = cos(θ*

_{j}*) +*

_{j}*i*sin(θ

*), the population vector can be written as ∑*

_{j}_{j=1}N

*k*(

_{j,T}*t*) ·

*e*

*i*θ

*its direction is given by the argument of this complex number. Defining the spike-count vector*

_{j}*K⃗*(

_{T}*t*) = (

*k*

_{1,T}(

*t*),

*k*

_{2,T}(

*t*), …,

*k*(

_{N,T}*t*)), the animal's HD can be decoded from the experimental or modeled HD cell responses as follows: The decoder θ̂

*(*

_{T}*t*) is chosen to obey causality: only spikes from spike-count windows ending at time

*t*are used to estimate θ(

*t*), as illustrated in Figure 4. This choice reflects the fact that any downstream processing step of the HD signal can only depend on spikes that were fired in the past.

Causal decoding and effect of the spike-count window. ** A**, Within a causal framework, only spikes from the past can be used to estimate the encoded variable. To decode the time-dependent HD θ(

*t*), shown as dashed black line at the time of the solid black line, one might, for example, rely on spikes generated within the gray area. The blue line indicates the time-dependent estimate θ̂(

*t*), which was calculated from the shown spike responses using the population-vector decoder.

**,**

*B***, Estimated HDs θ̂(**

*C**t*) using a short (

**) and a long (**

*B***) spike-count window**

*C**T*. For the short spike-count window, θ̂(

*t*) jitters strongly around the true value θ(

*t*), whereas for longer

*T*the estimated HD is much smoother but lags behind the true trajectory because spikes from the more distant past are used.

### Read-out accuracy: a function of systematic lag (bias) and random noise (variance)

The theoretical framework now at hand allows us to calculate how accurately the current HD θ(*t*) can be determined from the time-dependent population activity of non-anticipatory HD cells in a moving animal. Because the PV estimator θ̂* _{T}*(

*t*) relies exclusively on spikes fired before time

*t*, the computed HD always lags behind the current HD θ(

*t*). In statistical terms, this means that θ̂

*(*

_{T}*t*) is biased with respect to θ(

*t*). Additionally, because the spiking behavior of an ensemble of neurons is noisy, θ̂

*(*

_{T}*t*) exhibits random fluctuations, which are measured by the variance

*V*. As shown in Materials and Methods, the (circular) error

*D*for the HD read-out from a neural ensemble is given by the sum of the (squared) bias

*B*and the variance

*V*as follows: In the following, we describe the behavior of the bias

*B*and the variance

*V*of the causal PV estimator and demonstrate at which point anticipatory firing introduces an improvement of the read-out accuracy.

To determine the bias *B* of the PV estimator, we start by evaluating the estimator's average at a given instance of time. As derived in Materials and Methods, This means that, when averaging out the noise induced by stochastic spiking, the PV estimator provides the average HD of an animal during the spike-count time *T*. Because the bias is given by the difference between 〈θ̂* _{T}*(

*t*)〉 and θ(

*t*) (see Fig. 5), the dynamic nature of the HD induces a systematic error. To visualize this, consider a trajectory segment with constant AHV: θ(

*t*) = θ

_{0}+ ω

*t*. Here, 〈θ̂

*(*

_{T}*t*)〉 ≈ θ(

*t*) − ω

*T*/2, such that the bias amounts to

*B*= ω

*T*/2. The PV estimator is thus biased for every spike-count window for which the average HD does not match the final HD. Generally, longer spike-count windows will lead to larger biases because the HD has more time to change so that the average HD deviates stronger from the endpoint HD (see also Figure 6). We evaluated the average squared bias

*B*2 along all experimentally recorded trajectories as a function of

*T*. The results are shown in Figure 7

*A*, and all share an increase with integration time

*T*.

Circular error measure, bias, and variance. ** A**, Blue line indicates circular distance function. For small circular distances, this function resembles the squared error (red dashed line).

**, A total of 300 realizations of population vectors (blue arrowheads) calculated from a population of**

*B**N*= 100 neurons encoding a circular static variable θ. The mean direction 〈θ̂〉 represents the directional average of all possible realizations. If 〈θ̂〉 does not agree with the value of θ, which θ̂ is supposed to represent, as illustrated in the figure, the estimator is called biased. The (circular) variance describes the directional extent of all possible realizations of the population vector given a certain encoded variable (the spread is caused by the noisy encoding).

Relation between the population vector and the encoded trajectory. ** A**, Distribution of the population vector's direction θ̂(

*t*) for short trajectory segments with constant angular velocities. The segments are shown below the histogram their values at

*t*= 25 ms are highlighted by large dots. The segment plotted with the red dotted line has ω = 360° s −1 and represents fast angular head velocities, as observed for foraging rats. The threefold faster yellow trajectory was chosen to generate a noticeable deviation of the histogram from the static (gray) case. To sample θ̂(

*t*= 25 ms), a homogeneous population with

*N*= 100 neurons and time-dependent Poisson processes was used to generate spike-count vectors

*K⃗*(spike-count window

*T*= 50 ms). Even for fast head turns, the distribution of θ̂

*(*

_{T}*t*= 25 ms) is very similar to the distribution for the static case. In particular, the preferred direction for all three trajectory segments is at 90°, the trajectory average θ̄(

*t*= 25 ms), and the variance of the distribution increases only slightly with increasing ω. Regarding the trajectory endpoints, the estimator θ̂

*(*

_{T}*t*= 25 ms) is thus biased for ω ≠ 0.

**, In essence, the population vector provides a smoothed and delayed copy of the trajectory. Dashed black line indicates an artificial trajectory θ(**

*B**t*). Continuous lines indicate trajectory averages θ̄(

*t*) for various window lengths

*T*. The mean direction of the population vector agrees well with the time-lagged trajectory averages (i.e., 〈θ̂(

*t*)〉 ≈ θ̄

*(*

_{T}*t*)), even for trajectory segments with time-varying ω(

*t*). The mean directions 〈θ̂(

*t*)〉 are shown as dots. Error bars indicate the SD of the distribution from which they were computed.

Bias and accuracy of the population vector. ** A**, Left, Mean bias

*B*for all trajectory segments of the dataset (Mizuseki et al., 2009) as a function of segment duration

*T*. The duration-weighted average of all trajectory segments is also show. Error bars indicate the duration-weighted SD. Light and darker yellow areas represent the sets of trajectory segments between the 25th-75th and 5th-95th percentiles, respectively. Red dashed line indicates the representative trajectory, which was used as a basis of the simulations. For causal decoding without anticipation, the plotted data represent the best achievable accuracy for given

*T*. Because longer time windows admit stronger changes of the HD, the bias term increases with increasing

*T*. Right, Approximation results for the reconstruction accuracy given various population sizes (analytical results: continuous lines). For each population size, there is a spike-count window, which leads to an optimal accuracy. Larger populations can reach a higher HD accuracy. While the contribution of the variance to the total error decreases with the size of the population (dashed lines), the bias term (red dashed line) is independent of the population size. Dots represent simulation results from Monte Carlo simulations using homogeneous populations. Transparent lines indicate simulations results from different inhomogeneous populations (one line per population). As evident from the figure, the approximate solution derived in the text captures the population vector accuracy of homogeneous populations as well as of inhomogeneous populations (for large enough population size).

**,**

*B***, Same as in**

*C***, but for ATIs τ = 25 ms and τ = 50 ms, respectively. Left panels, In both cases, the dip with smallest average bias**

*A**B*is found at

*T*≈ 2τ because for these values the interpolated trajectory ϕ(

*t*) is closest to θ(

*t*), on average. The length of the spike-count window with minimum bias is thus controlled by τ. Right panels, In addition to data for the nonzero τ, the analytic results from

**are plotted as gray background lines. The simulation results (dots and transparent lines) again agree well with the analytic description (continuous, nontransparent lines), especially for large population sizes**

*A**N*. As the bias term has a minimum at

*T*> 0, longer spike-count windows can be used compared with the nonanticipatory case. This reduces the variance and bias of the estimator and results in smaller minimal reconstruction errors. However, too long ATIs can also deteriorate population codes. One such example is shown for which the optimal accuracy of a population with

*N*= 10,000 neurons is worse for anticipatory firing with τ = 50 ms than for τ = 25 ms.

Based on this analysis of the bias, it appears reasonable to use very short spike count times to keep the spikes used for decoding as close as possible to the current time point *t*. This will, however, increase the noise of the read-out as less and less spikes are involved in the estimation and their stochastic variations cannot be averaged out anymore. Together, the two opposing effects define an optimal read-out time for which there is an ideal trade-off between systematic lag and statistical fluctuations.

The amount of error induced by noisy spiking (i.e., the variance *V*) can be approximated analytically using Fisher information. Because this noise component is independent of the AHV, we can refer to results derived for the well-studied static case (for the full result, see Materials and Methods Eq. 13). We find that the variance decreases with the average number of collected spikes as 1/RT, thus decreasing as spikes are collected during longer time windows (Fig. 7*A*, dotted lines).

According to Equation 12, the variance term depends only on the scaling factor (2*NT*) −1 and the ratio (λ̃_{0} − λ̃_{2})/λ̃ _{1} 2 . Comparing different tuning curves, normalized such that they result in the same population firing rate, numerical values for this ratio are 0.083 (von Mises tuning), 0.080 (Gauss tuning), and 0.085 (triangular tuning). This calculation shows that the variance term for von Mises tuning differs by less than 5% from the other two tuning functions often used for HD cells.

The trade-off arising from the two different error sources is in Figure 7 (right column). Along with data points obtained from applying our analytical treatment to an experimentally recorded trajectory, we show results from Monte Carlo simulations (see Materials and Methods). These involved either homogeneous or inhomogeneous HD cell ensembles and confirm our mathematical prediction. In particular, as *N* increases, results from inhomogeneous and homogeneous cell ensembles align because all possible HDs are uniformly covered by tuning curves.

When decoding time-independent stimuli, the read-out window *T* is usually assumed to be a free parameter that needs to be chosen in a biologically or methodologically useful way (see, e.g., Seung and Sompolinsky, 1993 Brown et al., 1998 Zhang and Sejnowski, 1999 Mathis et al., 2012). Considering the dynamic nature of a stimulus, as we do in this study, leads to specific spike-count windows *T* for which the highest possible accuracy is achieved (see also Mosheiff et al., 2017). The bias depends exclusively on the shape of the HD trajectory, whereas the variance depends exclusively on the parameters of the neural ensemble. For a non-anticipatory population of HD cells, the only chance to decrease the optimal read-out error is to invest more neural resources (*N*, *f _{max}*) and thus reduce the contribution of the random spiking noise to the error. A causal decoder cannot directly mitigate the systematic lag. We show in the following paragraph that anticipatory firing does influence the bias term and can lead to significant improvements of the HD code.

### Effect of anticipatory firing in HD ensembles

As proposed by Blair and Sharp (1995), we assume that the shift of the preferred HD scales linearly with AHV. Writing λ̃* _{j}* for the average firing rate of an anticipatory neuron in a homogeneous population and inserting Equation 15 for the preferred direction, we obtain the following: This formulation treats anticipation in HD cells as a shift in the angular trajectory given by ϕ(

*t*) = θ(

*t*) + ω(

*t*)τ, as illustrated in Figure 8. If the AHVω(

*t*) were constant between

*t*and

*t*+ τ, the product ω(

*t*)τ equals the total turning angle in that time interval.

Effects of anticipatory firing on population-vector decoding. Effect of anticipatory firing on the distribution of θ̂(*t*) and its bias. ** A**, Bottom row represents the three trajectory segments with constant AHVs ω (0, 360°/s, 3 × 360°/s) used for the simulations (as in Fig. 6). Three top rows represent distribution of θ̂(

*t*) for τ = 10 ms, τ = 25 ms, and τ = 50 ms, respectively. An ATI of τ = 10 ms leads to a reduction of the bias compared with the nonanticipatory case shown in Figure 6. The bias is reduced as the expected directions 〈θ̂(

*t*)〉 = 25 ms (vertical lines) are closer to the trajectory's endpoints (depicted as dots). For τ = 25 ms, the condition τ =

*T*/2 (see main text) is fulfilled, and the expected directions agree exactly with the trajectory's endpoints. In this situation, θ̂

*(*

_{T}*t*) is an unbiased estimator of θ(

*t*). For τ = 50 ms, the bias is as large as in the nonanticipatory case, but with opposite orientation. The width of the distributions does not depend on the ATI.

**, Raster plots of 100 HD cells encoding the trajectory segment θ(**

*B**t*) (black dashed line) without anticipation (τ = 0). Blue line indicates the decoded trajectory θ̂(

*t*), which lags behind the true trajectory θ(

*t*).

**, With an anticipation of τ = 25 ms, the decoded trajectory no longer lags behind the true trajectory. Red line indicates the modified trajectory ϕ(**

*C**t*) = θ(

*t*) + ω(

*t*)τ.

**, Same as in Figure 6, but for anticipatory firing with τ = 25 ms. Continuous lines indicate the lagging circular averages of the modified trajectory ϕ(**

*D**t*). They agree well with the mean directions 〈θ̂

_{τ}(

*t*)〉, which were computed from the anticipatory spike trains.

Accordingly, the expected direction of a homogeneous population of anticipatory neurons is given by the following: The subscript τ indicates that we are describing an anticipatory HD population. Because the trajectory ϕ(*t*) corresponds to an estimate of the future HD, this change of the expected direction θ̂_{T,τ}(*t*) can lead to a compensation of the bias of θ̂(*t*) compared with the non-anticipatory case (Eq. 20).

To better understand this effect, we first examine trajectory segments with constant AHV ω (i.e., with θ(*t*) = θ_{0} + ω · *t*). In this case, ϕ(*t*) = θ_{0} + ω · (*t* + τ) = θ(*t* + τ) so that ϕ(*t*) is a time-shifted version of θ(*t*). We obtain for the expected direction of the causal estimator θ̂(*t*). The ATI τ thus temporally shifts the expected direction of θ̂(*t*) along the trajectory θ(*t*). Evaluation of the average bias term for a linear trajectory segment leads to *B* = |ω·(*T*/2 − τ)|. Choosing τ within (0, *T*) reduces the bias compared with the situation without anticipation. For constant AHV, it is even possible to fully remove the bias by choosing τ = *T*/2, as illustrated in Figure 8.

Anticipation also reduces the bias when the velocity of the HD θ(*t*) varies in time. To quantify this effect in the experimentally relevant case, we recalculate *B* as for Figure 7*A*, but with anticipatory firing. We use Equation 21 to compute the expected HD. The results of this analysis are shown in Figure 7*B*, *C* (left panels) for τ = 25 ms and τ = 50 ms, respectively. In stark contrast to populations without anticipation (Fig. 7*A*), *B* does not increase monotonically with *T* but has a minimum. Similar to the case of trajectories with constant AHV, the average bias *B* is smallest for *T* ≈ 2τ. In this case, the implicit interpolation into the future described by ϕ(*t*) has the optimal size. However, because of the time-varying nature of the AHV, the bias is not completely canceled.

In contrastg, the variance of θ̂(*t*) does not depend on the ATI τ, as seen by considering trajectory segments with constant AHV or the data shown in Figure 8. To describe the reconstruction error *D* for anticipatory firing analytically, we directly adopt the variance term from Equation 13 and replace the bias term by the anticipatory bias as analyzed above. We now numerically calculate the average circular error *D* with homogeneous and inhomogeneous populations. For the inhomogeneous populations, the ATIs are sampled from the distribution depicted in Figure 2 (for τ = 50 ms, the distribution of τ is shifted by 25 ms to the right). Figure 7*B*, *C* (right panels) shows the analytical result from Equation 13 together with simulation results for ATIs of τ = 25 ms (to mimic experimental ATIs) and τ = 50 ms (to illustrate the effect of longer anticipation).

As in Figure 7*A*, the data shown in Figure 7*B*, *C* demonstrate a qualitative agreement between the analytic description and the simulations. Most notably, the results for populations with realistic ATI distribution align closely with the results for homogeneous populations (for sufficiently large *N*) and show that cell-to-cell ATI variability does not reduce the advantages of anticipatory firing. Compared with populations without anticipation, the minimal errors of anticipatory populations are shifted toward longer spike-count windows *T* because the bias does not increase monotonically with *T*, but rather has a minimum for *T* > 0. Consequently, spikes can be collected over longer time windows (as long as the bias remains small), which reduces the variance. For τ = 25 ms, this improved trade-off leads to a reduction of the minimal error for every population size shown in Figure 7*B*. For large ATIs, however, the minimum achievable bias can become so high that the minimal reconstruction errors are larger than without anticipation (Fig. 7*C*). Hence, anticipation only improves the accuracy of the causal population-vector decoder if the ATIs lie within a certain interval.

In Figure 9*A*, we show how much the minimum achievable error can be reduced by anticipatory firing with different ATIs. To generate this figure, the function *A*(*T*) was numerically minimized for τ = 0 ms and various values τ > 0 ms. The minimum errors for τ = 0 ms are shown in the inset as a function of population size *N*. The main plot presents the (average) fraction of the minimum *A* with and without anticipatory firing. The figure reveals that anticipatory firing can lead to large improvements in reading out the HD compass: the experimentally determined average value of τ = 25 ms increases the accuracy by up to 40%. Figure 9*B* demonstrates how large the population size *N*_{0} of a non-anticipatory population has to be to reach the same accuracy as an anticipatory population of size *N*. For a large range of population sizes and ATIs, non-anticipatory populations need to be more than three times as large as populations of anticipatory neurons (Fig. 9*B*). For τ = 25 ms, the ratio *N*_{0}/*N* is particularly high for (biologically plausible) population sizes between 1000 and 10,000 neurons. These results reveal that the experimentally reported value of τ = 25 ms lies exactly in the range that leads to an optimal improvement of the HD code given the broadly tuned neurons and the temporal statistics of the head movements.

Anticipation improves the accuracy of the HD compass. ** A**, Inset, The best accuracy

*A*(

*T*) achievable for a homogeneous population with nonanticipatory neurons. The main figure represents the relative change of this smallest error if HD cells fire in an anticipatory manner. Data points are duration-weighted averages over all trajectory segments. Error bars indicate the duration-weighted SDs. For population sizes between 1000 and 10,000 neurons, the biologically realistic value τ = 25 ms belongs to the best options and improves the accuracy by up to ∼40%. Anticipation can, however, also deteriorate the code quality, in our examples, for τ = 50 ms and population sizes

*N*> 6000.

**, Increasing the population size reduces the minimum error, too. The figure represents how much larger the population size**

*B**N*

_{0}of a nonanticipatory population needs to be to yield the same effect as an anticipatory population of size

*N*(color code as in

**). For example, the performance of a population with**

*A**N*= 10,000 neurons and an ATI of τ = 25 ms matches that of a population with

*N*= 40,000 without anticipation.

## References

Victor J, Purpura K: Nature and precision of temporal coding in visual cortex: A metric-space analysis. J Neurophysiol. 1996, 76: 1310-1326.

van Rossum MCW: A novel spike distance. Neural Computation. 2001, 13: 751-763. 10.1162/089976601300014321.

Kreuz T, Haas JS, Morelli A, Abarbanel HDI, Politi A: Measuring spike train synchronization. J Neurosci Methods. 2007, 165: 151-161. 10.1016/j.jneumeth.2007.05.031.

Aronov D, Reich DS, Mechler F, Victor JD: Neural coding of spatial phase in V1 of the macaque monkey. J Neurophysiol. 2003, 89: 3304-3327. 10.1152/jn.00826.2002.

Houghton C, Sen K: A new multineuron spike train metric. Neural Computation. 2008, 20: 1495-1511. 10.1162/neco.2007.10-06-350.

The Matlab source code for calculating and visualizing all ISI-distances as well as information about their implementation can be found under. [http://inls.ucsd.edu/