Clone-structured graph representations enable flexible learning and vicarious evaluation of cognitive maps

Cognitive maps are mental representations of spatial and conceptual relationships in an environment, and are critical for flexible behavior. To form these abstract maps, the hippocampus has to learn to separate or merge aliased observations appropriately in different contexts in a manner that enables generalization and efficient planning. Here we propose a specific higher-order graph structure, clone-structured cognitive graph (CSCG), which forms clones of an observation for different contexts as a representation that addresses these problems. CSCGs can be learned efficiently using a probabilistic sequence model that is inherently robust to uncertainty. We show that CSCGs can explain a variety of cognitive map phenomena such as discovering spatial relations from aliased sensations, transitive inference between disjoint episodes, and formation of transferable schemas. Learning different clones for different contexts explains the emergence of splitter cells observed in maze navigation and event-specific responses in lap-running experiments. Moreover, learning and inference dynamics of CSCGs offer a coherent explanation for disparate place cell remapping phenomena. By lifting aliased observations into a hidden space, CSCGs reveal latent modularity useful for hierarchical abstraction and planning. Altogether, CSCG provides a simple unifying framework for understanding hippocampal function, and could be a pathway for forming relational abstractions in artificial intelligence.

The room in Fig. 2a was also learned without observing action sequences. The CSCG recovered an imperfect layout of the room, showing that observing the actions helps: their absence produces a degradation, but not necessarily a catastrophic failure. Smaller rooms are recovered perfectly, despite aliasing and the absence of action observations.

The agent observes only the identifier of the executed action; there is no meaning associated with it. That is, every time the agent moves west, it knows it is doing the same thing, but it has no prior knowledge of what that thing is.

In the case of empty rooms, most of the observations received by the agent are identical (the empty observation, as it wanders through empty space), and it is hard for the agent to localize itself in the room, since only the walls and corners provide context. We experimented with CSCG learning in empty rooms of different sizes, using 50,000 steps in a room of size (4 + d) × (6 + d), where d is a parameter controlling the room size. For rooms of size 6 × 8 and below, EM learning recovers the structure of the room exactly. For larger sizes, it starts to make some mistakes in its understanding of the room, slightly decreasing its predictive ability as the room grows; see Supplementary Fig. 1a. Supplementary Fig. 1b shows the learned transition matrix in graph form for a room of size 9 × 11. The graph looks almost perfect, but following the path between observations '1' and '3' should traverse seven observations of type '7', whereas there are only six: the CSCG has merged two physical locations in the room (which share a large neighborhood of identical sensory cues) into the same perceived location.
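The aliasing problem in empty rooms can be made concrete with a small sketch. The function below (our illustration, not the paper's code) generates an action/observation sequence for a random walk in an empty room, where all interior cells emit the same symbol and only wall and corner cells differ, so almost every observation is aliased:

```python
import numpy as np

def empty_room_walk(height, width, n_steps, seed=0):
    """Random walk in an empty room; returns (actions, observations).

    Observation coding (an assumption for illustration): interior cells
    all emit symbol 0, wall cells emit 1, and corner cells emit 2, so
    almost all observations are aliased and only the border gives context.
    """
    rng = np.random.default_rng(seed)
    moves = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}  # N, S, W, E
    r, c = rng.integers(height), rng.integers(width)
    actions, obs = [], []
    for _ in range(n_steps):
        a = int(rng.integers(4))
        dr, dc = moves[a]
        nr, nc = r + dr, c + dc
        if 0 <= nr < height and 0 <= nc < width:  # bump into wall otherwise
            r, c = nr, nc
        # count how many borders the agent is touching: 0, 1, or 2 (corner)
        on_border = int(r in (0, height - 1)) + int(c in (0, width - 1))
        actions.append(a)
        obs.append(on_border)
    return np.array(actions), np.array(obs)

acts, obs = empty_room_walk(6, 8, 50000)
```

Sequences like this, with the action identifiers carrying no a-priori meaning, are what the CSCG learns from in the empty-room experiments.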

Learning the structure of a room would be trivial if each observation were unique to a single room location. In that case, a CSCG with only one clone would learn the correct solution in one EM step. We experiment with different numbers of unique symbols randomly placed in a room.
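The fully-unique case is easy to verify directly. In the sketch below (our toy, not the paper's code), every cell emits a unique symbol, so a one-clone model reduces to counting (symbol, action) → next-symbol transitions, and a single pass over the data recovers a deterministic room graph:

```python
import numpy as np

def learn_unique_room(height, width, n_steps=5000, seed=0):
    """When every cell emits a unique symbol, learning the room graph
    reduces to counting (symbol, action) -> next-symbol transitions."""
    rng = np.random.default_rng(seed)
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    sym = np.arange(height * width).reshape(height, width)  # unique symbols
    r, c = 0, 0
    counts = {}  # (symbol, action) -> {next_symbol: count}
    for _ in range(n_steps):
        a = int(rng.integers(4))
        dr, dc = moves[a]
        # moving into a wall leaves the agent in place (clipped coordinates)
        nr = min(max(r + dr, 0), height - 1)
        nc = min(max(c + dc, 0), width - 1)
        counts.setdefault((int(sym[r, c]), a), {})
        counts[(int(sym[r, c]), a)][int(sym[nr, nc])] = \
            counts[(int(sym[r, c]), a)].get(int(sym[nr, nc]), 0) + 1
        r, c = nr, nc
    return counts

counts = learn_unique_room(4, 6)
# each (symbol, action) pair has exactly one successor: the map is deterministic
assert all(len(successors) == 1 for successors in counts.values())
```

With aliased symbols, the same counting procedure would conflate locations, which is exactly what the multiple clones of a CSCG are there to disentangle.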

Reusing learned graphs as a schema

Using the learned transition graph as a schema assumes that the transition structure remains constant between the learned setting and the new setting, while the mapping to the observations, i.e., the emission matrix, is relearned. Relearning the emission matrix can also tolerate some imperfections in the mapping between the learned graph and the new environment it is trying to model. If the new layout is smaller than the original room, the smaller layout will still be learned well using only parts of the original transition matrix. Here we show that learning layouts larger than the original results in a gradual degradation of prediction accuracy.
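Schema reuse can be sketched as EM in which only the emission matrix is re-estimated while the transition matrix stays frozen. The code below is a minimal illustration under our own assumptions (a plain HMM forward-backward pass and a toy deterministic ring environment), not the paper's implementation:

```python
import numpy as np

def relearn_emissions(T, E, obs, n_iters=20):
    """EM that re-estimates only the emission matrix E, keeping the
    transition matrix T fixed, as when a learned graph is reused as a
    schema for an environment with new observation mappings."""
    n_states, n_symbols = E.shape
    for _ in range(n_iters):
        # E-step: scaled forward-backward under the fixed T
        alpha = np.zeros((len(obs), n_states))
        beta = np.ones((len(obs), n_states))
        alpha[0] = E[:, obs[0]] / n_states
        alpha[0] /= alpha[0].sum()
        for t in range(1, len(obs)):
            alpha[t] = (alpha[t - 1] @ T) * E[:, obs[t]]
            alpha[t] /= alpha[t].sum()
        for t in range(len(obs) - 2, -1, -1):
            beta[t] = T @ (E[:, obs[t + 1]] * beta[t + 1])
            beta[t] /= beta[t].sum()
        gamma = alpha * beta
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: update emissions only; T is untouched
        E_new = np.zeros_like(E)
        for t, o in enumerate(obs):
            E_new[:, o] += gamma[t]
        E = (E_new + 1e-9) / (E_new + 1e-9).sum(axis=1, keepdims=True)
    return E

# toy demo: a deterministic 4-state ring whose symbols have been relabeled
n = 4
T = np.roll(np.eye(n), 1, axis=1)                   # state i -> i+1 (mod n)
obs = np.array([(t + 2) % n for t in range(200)])   # relabeled observations
rng = np.random.default_rng(0)
E0 = rng.random((n, n))
E0 /= E0.sum(axis=1, keepdims=True)
E = relearn_emissions(T, E0, obs)
```

Because the ring dynamics are already correct, EM only needs to align states with the new symbols, which is why reuse converges much faster than learning from scratch.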

In the original experiment, we showed that reusing the graph for a learned 6 × 8 room speeds up learning for a new room with an identical layout but a different observation mapping. We further tested this with three novel environments whose layouts differed from the original 6 × 8 layout: 7 × 9, 8 × 10, and a third, larger layout. Note that the prediction accuracies using an imperfectly matched graph are still significantly higher than the chance-level accuracy of 0.05 obtained without the use of a graph.

Event-specific representations in maze elongated with novel observations

In the maze-elongation experiment in Fig. 5d, the maze was elongated by repeating two observations along the horizontal arms. We performed an additional experiment to test whether the clone activity traces would be similarly preserved if the maze were instead elongated with a novel observation. Fig. 3 shows the resultant clone activity traces. As in the original experiment, the clone activity traces are preserved across the expanded section, because the novel observation appears as observation noise and smoothing in the CSCG ensures that the history is maintained through the noisy section.

The successor representation (SR) [25, 60] loses precise temporal information and, as a result, contains strictly less information than the CSCG. Additionally, unlike the CSCG, the successor representation assumes full observability of the state, so it cannot be derived from partial or aliased observations. We take the CSCG learned from the aliased 6 × 8 room in Fig. 2a and generate the SR from the CSCG transition matrix. We then identify which clones correspond to which spatial locations by observing which clones activate in each location during inference. In Supplementary Fig. 5a we visualize the SR for each hidden state. We also compute the eigenvectors of the matrix containing the SR of each state. We visualize these eigenvectors in Supplementary Fig. 5b and observe grid patterns of various scales, similar to [60].
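The SR construction from a transition matrix has a closed form, M = Σₖ γᵏTᵏ = (I − γT)⁻¹. A minimal sketch (the discount γ and the toy ring environment are our assumptions for illustration):

```python
import numpy as np

def successor_representation(T, gamma=0.98):
    """Successor representation M = (I - gamma * T)^(-1); row i gives the
    expected discounted future occupancy of each state starting from i."""
    n = T.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * T)

# toy example: a random walk on a ring of 10 states
n = 10
T = np.zeros((n, n))
for i in range(n):
    T[i, (i - 1) % n] = T[i, (i + 1) % n] = 0.5

M = successor_representation(T)
# eigenvectors of (a symmetrized) SR give periodic, grid-like patterns
eigvals, eigvecs = np.linalg.eigh((M + M.T) / 2)
```

Applied to a CSCG transition matrix instead of this toy ring, each SR row can then be projected back onto the room through the clone-to-location mapping, which is the visualization described above.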
Supplementary Movie 1 shows the inferred position in the agent's cognitive map (which has been learned from data). The agent only observes the current color (and not even its own actions). There are two patches (marked in black) that have identical colors, so at the beginning of exploration, the agent's belief in the cognitive map (right) is split between the two possible realities. As soon as the agent exits the duplicated patch, it can figure out its precise location and track it properly from that point on, as shown by the lack of ambiguity in the cognitive map when the agent returns to the repeated patch.
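This split-then-collapse belief dynamic is ordinary Bayesian filtering. The toy model below (a one-dimensional loop of cells whose first two cells share a color; our construction, not the paper's environment) reproduces it: the initial belief is split between the two aliased cells and collapses as soon as a disambiguating color is seen, after which revisiting the aliased patch causes no ambiguity:

```python
import numpy as np

emissions = [0, 0, 1, 2, 3]          # cells 0 and 1 emit the same color
n = len(emissions)
T = np.zeros((n, n))
for i in range(n):
    T[i, (i + 1) % n] = 1.0          # the agent always steps right, wrapping

def filter_beliefs(obs_seq):
    """Forward filtering: posterior over cells after each observation."""
    b = np.ones(n) / n               # uniform initial belief
    beliefs = []
    for o in obs_seq:
        like = np.array([1.0 if emissions[s] == o else 0.0 for s in range(n)])
        b = b * like
        b /= b.sum()                 # condition on the observed color
        beliefs.append(b.copy())
        b = b @ T                    # predict the next position
    return beliefs

obs = [emissions[t % n] for t in range(8)]   # a walk starting at cell 0
beliefs = filter_beliefs(obs)
```

After the first observation the belief has two nonzero entries (the aliased cells); from the second step on it is a single spike that tracks the true position, including when the walk re-enters the aliased patch.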

Supplementary Movie 2: Inferred cognitive map over learning iterations. The CSCG transition matrix is updated after each EM iteration, and the current state of the model is displayed as a cognitive map. To do this, the training data is decoded as a sequence of clones using Viterbi, and the resulting clone transitions are represented in a graph. The layout of the graph is obtained automatically using python-igraph.
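The decode-then-graph step can be sketched as follows. This is a generic Viterbi decoder plus an edge-set builder (our simplification: a plain Python set of edges stands in for the python-igraph graph used for the actual layouts):

```python
import numpy as np

def viterbi(T, E, pi, obs):
    """Most likely hidden-state (clone) sequence for an observation list,
    given transition matrix T, emission matrix E, and prior pi."""
    n = T.shape[0]
    logT, logE = np.log(T + 1e-12), np.log(E + 1e-12)
    delta = np.log(pi + 1e-12) + logE[:, obs[0]]
    back = np.zeros((len(obs), n), dtype=int)
    for t in range(1, len(obs)):
        scores = delta[:, None] + logT      # scores[i, j]: i -> j
        back[t] = scores.argmax(axis=0)     # best predecessor of each j
        delta = scores.max(axis=0) + logE[:, obs[t]]
    path = [int(delta.argmax())]
    for t in range(len(obs) - 1, 0, -1):    # backtrack
        path.append(int(back[t, path[-1]]))
    return path[::-1]

def clone_graph_edges(path):
    """Clone -> clone transitions along the decoded path, as graph edges."""
    return {(a, b) for a, b in zip(path, path[1:])}
```

Feeding the edge set to a graph-layout routine (python-igraph in the paper) then renders one frame of the cognitive map per EM iteration.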