Virtual environments as memory training devices in navigational tasks for older adults

Cognitive training approaches using virtual environments (VEs) might counter age-related visuospatial memory decline and associated difficulties in wayfinding. However, the effects of the visual design of a VE in route learning are not fully understood. Therefore, we created a custom-designed VE optimized for route learning, with adjusted levels of realism and highlighted landmark locations (MixedVE). Herein we tested participants’ route recall performance in identifying direction of turn at the intersection with this MixedVE against two baseline alternatives (AbstractVE, RealisticVE). An older vs. a younger group solved the tasks in two stages (immediate vs. delayed recall by one week). Our results demonstrate that the MixedVE facilitates better recall accuracy than the other two VEs for both age groups. Importantly, this pattern persists a week later. Additionally, our older participants were mostly overconfident in their route recall performance, but the MixedVE moderated this potentially detrimental overconfidence. Before the experiment, participants clearly preferred the RealisticVE, whereas after the experiment, most of the younger, and many of the older participants, preferred the MixedVE. Taken together, our findings provide insights into the importance of tailoring visualization design in route learning with VEs. Furthermore, we demonstrate the great potential of the MixedVE and by extension, of similar VEs as memory training devices for route learning, especially for older participants.

Navigation is a key component of human daily life, both when moving between locations in familiar environments, and when reaching new destinations in unfamiliar environments. Especially in unfamiliar environments, navigation can be a difficult task. Because of the age-related decline of some of the perceptual and cognitive abilities that support navigation, this difficulty increases as people age 1,2 . In this paper, we seek to develop a better understanding of difficulties and facilitators in route learning for older adults. Specifically, we examine the potential of virtual environments (VEs) that are custom-designed for route learning in compensating for age-related decline in navigational skills.
There are numerous technology-driven approaches to assist with wayfinding, and many dedicated devices provide real-time navigation instructions such as mobile phone apps or in-car navigation devices 3 . These devices assist people in navigating in the real world, but they are not necessarily optimized for learning novel routes or entire environments 4 . In fact, the real-time assistance might contribute to the decline in the ability to independently navigate, as a large portion of the mental effort is externalized to the device and no active engagement from the user is necessary 5 . This argument would be in line with cognitive aging propositions of "use it or lose it" 6 . In this context, we view VEs as candidate visuospatial memory training devices. Such a memory training device might benefit everyone, but might be especially meaningful for those who struggle to remember routes when walking in unfamiliar environments, such as is often the case for older adults 7,8 . Training can lead to improvements in the trained domain, such as spatial abilities 9 , route learning performance [10][11][12] , as well as other cognitive skills [13][14][15][16] . VEs are widely used for navigation-related cognitive training 17 , as they provide a safe, controlled substitute for the real world. As such, user errors pose no harm to people while navigating, display design can be personalized, and one can navigate the virtual route as many times as needed without too much physical effort. VEs, however, also possess various limitations. Importantly, most VEs only provide visual stimulation. Other sensory information typically involved in locomotion in the real world, such as vestibular-, proprioceptive-, and efferent-information, are reduced or non-existent in most VEs 18 . The VE setups that stimulate senses other than

Navigation in Older Adults: Remembering, Forgetting, and Training
As mentioned previously, it has been well-documented that aging has a negative effect on navigation performance 7 . Especially in unfamiliar environments, older adults experience greater navigational difficulties than younger adults 7,8,30 . Such difficulties can discourage older adults from exploring new environments, and negatively affect their independence and overall quality of life 31 . These age-related navigation difficulties derive from a decline in the relevant visuospatial abilities and memory capacity, both of which vary widely across individuals 32 . Most memory systems, including visuospatial memory that is necessary for navigation, seem to weaken across the lifespan 33 ; and this has been documented both in virtual and real world experiments [34][35][36][37][38] . As memory declines, people make more misattribution errors 39,40 , that is, an actual experience of an event may be misplaced in time, place or source when retrieved from memory 41,42 . Misattribution errors are common amongst older adults, especially when there are many things to remember 41 . Additionally, it has been shown that older adults overestimate the accuracy of their memories and they are too confident on specific details of their recent experiences 43 . However, the findings on misattribution errors and memory-related overconfidence in older adults might be context dependent. Existing studies are often limited to memorizing lists of words 44 , and in certain cases, to presenting pictures and videos, such as videos of crime-scenes 45,46 . When it comes to route learning in unfamiliar environments, it has been shown that older adults tend to confuse the location of landmarks at critical decision points 47 . At this point, however, we know little about the effects of visual design on misattribution errors and memory-related overconfidence. 
Thus, identifying the optimal design choices for the features of visuospatial training material that facilitates navigational performance in later life seems warranted.

Visuospatial Displays as Training Devices for Route Learning
Learning is the consequence of a complex interplay between sensation, perception, cognition, and experience 48 . In many learning tasks, visuospatial information processing plays a key role. A number of design decisions on how the visuospatial information is represented might affect memory, and consequently, impair or improve route learning [49][50][51] . We find that a VE optimized for route learning should be balanced for the amount, the quality and the position of the presented information 52 . Quality related considerations are beyond the scope of this paper 52,53 . In this paper, we focus on the amount and the position of the presented information, and we examine their impact on route recall in combination (i.e., not independently).
Cognitive load is one of the strongest arguments against visual realism as a display principle 28 . Controlling for the amount of information by varying the levels of realism is one way to address cognitive load in route learning in a VE. Depending on the context, one can also highlight landmarks by using symbols (e.g., arrows, letters, colors, outlining the object) or by placing discrete objects at critical locations serving as landmarks, and it has been shown that such approaches increase their memorability 54 . On the other hand, as mentioned earlier, an important argument in favor of realism is that a high degree of realism might make it easier for people to recognize, name and thus relate to the elements (e.g., trees, benches, windows) on a display 22 as they acquire a meaning 24 . 'Nameability' of items might be helpful in memorizing them, for example, people remember nameable colors better than others 55 . It has been proposed that the verbal memory systems help in such cases, because people do not rely only on visuospatial memory systems for key executive functions such as the encoding, storage and recall of information (i.e., the dual channel theory) 56 . However, in learning from visualizations, the question of 'how much information is too much/too little?' remains persistent 57,58 . In the case of a VE, one might use photo-textures selectively to maintain a sense of realism, and to enable recognition of features, while reducing cognitive load at the same time. On the other hand, a certain level of abstraction guides the attention to task-relevant features 27 , which might facilitate remembering and learning.
Scientific Reports | (2018) 8:10809 | DOI:10.1038/s41598-018-29029-x
Besides the amount of information, the position of a feature within the scene imposes an important consideration for route learning. Navigation studies and landmark theories mention some unambiguously relevant visuospatial elements that are positioned in specific places in the visual scene for route learning [59][60][61][62][63] : Structural, visual, and/or semantic features determine the importance of landmarks [64][65][66][67][68][69] . Specifically, decision points are critical as in these points people 'take mental notes' of a feature and retain that as a landmark; and reportedly, these features are consistently located in the direction of turn 60 . Related to the position of features, or classes of features, it appears that the structural network (i.e., street network and its spatial layout) provides another important visual anchor in route learning, and might contribute to the memorability of a scene 70 .

Our Study
Synthesizing previous work summarized above, we designed an 'optimized VE' for route learning, which we call the MixedVE. The VEs in this study were named based on their relation to abstraction-and-realism; however, note that the optimization is based on two important considerations; (a) reducing the level of realism by removing photo-textures from task-irrelevant parts of the VE (i.e., manipulating the quantity of visual information), and (b) deliberately choosing the locations of the photo-textured elements (i.e., manipulating the position of visual information). Because we are interested in optimizing the MixedVE as a memory training device, we combined these considerations when designing the MixedVE. Also note that, while we do not investigate aspects of quality in this paper, we counterbalance the content of the textures for their semantic qualities based on a previous qualitative assessment for their levels of memorability 52,53 . In sum, in the MixedVE, we highlight selected elements in the scene (i.e., buildings at decision points positioned in the direction of the turn along the route of interest, and the street network) with realistic photo-textures, and suppress the rest by removing photo-textures. Figure 1 shows an illustration of the MixedVE and the other two VEs we used for comparison (AbstractVE and RealisticVE). We chose to compare the MixedVE with a RealisticVE as a high-fidelity representation of the real world. The RealisticVE contains all the visual information including the photo-textures at the navigation-relevant scene elements, however, it does not highlight the navigationally relevant environmental features. The AbstractVE, on the other hand, serves as a baseline condition with no photographic information, and again, no highlighting effect. 
The fact that the AbstractVE contains considerably less information should significantly reduce the cognitive load induced by photo-textures, although it might increase task difficulty otherwise, because of the lack of anchor points.
We previously demonstrated that younger adults overall benefit from the MixedVE compared to the AbstractVE and RealisticVEs in visual, spatial, and visuospatial memory tasks in a route learning context 53 . In this paper, we examine the potential of the MixedVE as a memory training device in route learning, particularly for older people. Our leading hypothesis is that the MixedVE will successfully serve older people as a memory training device in the context of route learning, specifically, in memorizing and identifying the direction of turns at the intersections, because of the following sub-hypotheses:
• Due to the balanced cognitive load and the selective highlighting as a consequence of retaining photo-textures only in navigation-relevant scene elements, irrespective of their age, participants should identify the direction of turn at intersections better (thus, recall the route better) with the MixedVE than with the other VEs, both immediately after the experiment, and a week later.
• Irrespective of age, participants' overall confidence in their responses should better align with their recall accuracy with the MixedVE than other VEs. In addition, overall, older participants should be overconfident in their responses in comparison to younger participants. Thus, the moderating effect of the MixedVE should be more pronounced for the older participant group.
• Before the experiment, both older and younger participants should prefer the RealisticVE 25,28 . After the experiment, younger participants should change their preferences to the MixedVE. Older participants, however, due to the decline in some of the relevant spatial abilities, might not be able to identify which visualization supports them better, and thus should still prefer the RealisticVE after the experiment.
We tested our hypotheses in a between-subject experiment with an older group (65-75 yrs.) and a younger group (20-30 yrs.) as a comparison group. In the experiment, participants watched a driving simulation video, in which they viewed the route from the 'passenger seat' and were asked to memorize the route. After they watched the videos, participants were given various visuospatial recall tasks in two 'recall stages' (immediate vs. delayed by a week) to measure learning. In this paper, we focus on one of the task types; that is, identification of heading direction at intersection points. This is a typical task in route learning studies and previous findings allow us to build our age-related hypotheses for this task type 67,[71][72][73][74] . In the Procedure section, we describe all of the tasks for full disclosure, and elaborate further on our choice on focusing on this task type. We report the main findings for the two other tasks in the Appendix: Additional Analysis. Our three independent variables are visualization type (the three VEs), age (older vs. younger), and recall stage (right after the experiment vs. one week later), whereas we measured three dependent variables: participants' route recall accuracy, their confidence in their recall performance, and their visualization preference before and after the experiment.

Results
We first report the overall route recall accuracies of the younger and older participants (age), for the immediate and delayed recall stages (recall stage) with all three VEs (visualization type). Furthermore, we report the forgetting rates (the difference in recall accuracies between the two recall stages) for both groups. We then analyze participants' confidence in their responses. Since confidence in one's success in solving a task can be viewed as one's "perceived accuracy" on that task, we compare the perceived and the actual accuracies of participants to examine underconfidence or overconfidence (known as calibration error 43 ). Finally, we present participants' visualization preferences, and how these preferences shifted among the three VEs before and after the experiment.
Sample size was estimated via a power analysis using the G*Power software. In all tests in which significant results were obtained, the F test was followed by Bonferroni's post-hoc test for multiple comparisons. Associated p-values < 0.05 are reported as statistically significant, along with the effect sizes (ηp², r, and Cohen's d).
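As a rough illustration of the Bonferroni correction mentioned above, the sketch below multiplies each p-value by the number of comparisons and caps the result at 1. The function name `bonferroni_adjust` and the example p-values are hypothetical and are not taken from the study; the statistical software the authors used for their post-hoc tests is not stated here.

```python
from typing import List, Tuple


def bonferroni_adjust(p_values: List[float], alpha: float = 0.05) -> List[Tuple[float, bool]]:
    """Return (adjusted p-value, significant?) for each raw p-value.

    Bonferroni adjustment: multiply each p-value by the number of
    comparisons m and cap the result at 1.0.
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("at least one p-value is required")
    results = []
    for p in p_values:
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie in [0, 1]")
        p_adj = min(1.0, p * m)
        results.append((p_adj, p_adj < alpha))
    return results


# Hypothetical p-values for three pairwise comparisons between the VEs
# (illustrative numbers only, not results from the experiment).
for p_adj, significant in bonferroni_adjust([0.01, 0.04, 0.5]):
    print(f"adjusted p = {p_adj:.3f}, significant = {significant}")
```

With three pairwise comparisons, a raw p-value has to fall below roughly 0.0167 to remain significant at the 0.05 level after adjustment.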
Even though we did not observe an interaction between age × recall stage × visualization type (F(1, 79) = 0.58, p > 0.05, ηp² = 0.00), we present an overview of the relative recall accuracies of the two age groups in the two stages in Fig. 3. This is accompanied by the inferential statistics in Table 1, to demonstrate how the MixedVE facilitates recall performance better than the other VEs in all conditions.
Participants' confidence in their recall performance. The calibration error was obtained by dividing the recall accuracies by confidence ratings ("perceived accuracies"). For better readability, we scaled the obtained values to diverge from zero, with zero being the perfect match between perceived and actual recall accuracy, and values diverging in opposite directions from zero signifying overconfidence (o) and underconfidence (u).
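The calibration measure described above can be sketched in a few lines. The exact rescaling used in the study is not specified in the text, so the version below makes two assumptions: accuracy and confidence are both expressed as proportions between 0 and 1, and the ratio is shifted as 1 − accuracy/confidence so that zero means a perfect match and positive values mean overconfidence. The function name `calibration_error` and the sign convention are hypothetical.

```python
def calibration_error(recall_accuracy: float, confidence: float) -> float:
    """Illustrative calibration error centred on zero.

    Assumptions (not confirmed by the source): both inputs are
    proportions, and the accuracy/confidence ratio is rescaled as
    1 - ratio, so that
      0  -> perceived accuracy matches actual accuracy,
      >0 -> overconfidence (confidence exceeds accuracy),
      <0 -> underconfidence (accuracy exceeds confidence).
    """
    if not 0.0 <= recall_accuracy <= 1.0:
        raise ValueError("recall_accuracy must lie in [0, 1]")
    if not 0.0 < confidence <= 1.0:
        raise ValueError("confidence must lie in (0, 1]")
    return 1.0 - recall_accuracy / confidence


# Hypothetical participants (values are illustrative only).
print(calibration_error(0.5, 0.5))  # well calibrated
print(calibration_error(0.4, 0.8))  # overconfident
print(calibration_error(0.8, 0.4))  # underconfident
```

Other rescalings of the same ratio (for example, a logarithm) would also centre a perfect match on zero; the choice here is only for illustration.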
A 2 (age) × 2 (recall stage) × 3 (visualization type) mixed-design ANOVA revealed significant differences in participants' calibration errors. The main effects are shown in Similarly as in the recall accuracy analyses, even though the age × recall stage × visualization type 3-way interaction for the calibration error was not statistically significant (F(1, 79) = 1.95, p > 0.05, ηp² = 0.00), we present an exploratory overview of the calibration errors of the two age groups in the two stages in Fig. 5, along with the inferential statistics in Table 2. These results demonstrate that the two age groups may have different calibration error patterns. The younger participants rated themselves relatively accurately (they were slightly underconfident) in the immediate stage with all three VEs. In the delayed stage, the younger participants grew overconfident with the Abstract and Realistic VEs, whereas underconfidence persisted with the MixedVE. The older participants were consistently overconfident in all tested conditions, but clearly with the least calibration errors with the MixedVE (close to zero) in both stages. With the lapse of time, both age groups became significantly overconfident with the RealisticVE.
Preference for specific visualization types. Participants' preferences for the three VEs before and after the experiment are presented in Fig. 6.
As Fig. 6 shows, before the experiment, the younger participants mostly preferred the RealisticVE (88%), while only 12% preferred the MixedVE (none preferred the AbstractVE). For the older participants, this was even more pronounced: 97% preferred the RealisticVE and the remaining 3% preferred the MixedVE (again, none preferred the AbstractVE). After the experiment, however, 69% of the younger participants favored the MixedVE, while 31% kept their initial preference for the RealisticVE (the AbstractVE remained unpopular). Older participants displayed a different pattern: 54% of them still preferred the RealisticVE, a considerable 38% switched to the MixedVE, and 8% preferred the AbstractVE.
The shift in visualization preference from RealisticVE to MixedVE was statistically significant both for the younger (χ²(1) = 67.41, p < 0.001) and for the older (χ²(1) = 41.86, p < 0.001) participants. No shift from MixedVE to RealisticVE occurred. The odds ratio (i.e., the effect size) of the younger participants changing their preference from RealisticVE to MixedVE was 16.03 (7.440, 37.090), whereas for the older participants it was 22.41 (6.635, 119.180). Due to the unpopularity of the AbstractVE (zero values), we did not include it in the chi-square analysis.
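To make the odds-ratio effect size more concrete, the sketch below computes an odds ratio from a 2 × 2 table together with a Wald-type 95% interval on the log scale. This is an assumption-laden illustration: the layout of the table, the interval method, and the counts are all hypothetical, and the text does not state how the values in parentheses above were derived. The function name `odds_ratio_with_ci` is likewise invented for this example.

```python
import math
from typing import Tuple


def odds_ratio_with_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald interval for the 2 x 2 table [[a, b], [c, d]].

    Hypothetical layout: rows = time (after, before the experiment),
    columns = preference (MixedVE, RealisticVE).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not the study's data.
or_value, lower, upper = odds_ratio_with_ci(20, 5, 5, 20)
print(f"OR = {or_value:.2f}, interval = ({lower:.2f}, {upper:.2f})")
```

Because each participant states a preference both before and after the experiment, the observations are paired; a paired analysis would treat the table differently, so this sketch only shows how an odds ratio and its interval relate to cell counts.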

Discussion
Despite the popularity and promise of VEs 17 , little is known about how older adults are affected by differently-designed VEs in route learning tasks in comparison to younger adults. This is surprising given the importance of maintaining spatial functioning and navigational skills to independently conduct daily life activities across the lifespan. Synthesizing knowledge from several disciplines, we designed an experiment to investigate the potential of a custom-designed MixedVE as a memory training device in route learning. We expected that the MixedVE would help all users; however, because of the age-related decline in visuospatial memory capacity, we were particularly interested in the performance of the older participant group. We included a fully photo-textured RealisticVE as the 'gold standard' because this is a high-fidelity representation of the real world, and an AbstractVE with no photo-textures as a baseline, to examine if, and how much, our customized MixedVE improves the memorability of the given route in comparison to these two VEs.
Route recall accuracy improves with the MixedVE irrespective of age. Our findings clearly confirm that the MixedVE improves recall accuracy of all participants in intersection-by-intersection visuospatial route learning tasks (i.e., in identifying direction of turn) considerably and consistently with large effect sizes; both immediately after the experiment, as well as one week later (Fig. 3). Note that these results are consistent across the 'spatial tasks' as well, whereas we do not observe a clear pattern for the 'visual tasks' , possibly because recall accuracies are close to chance level with the visual tasks (see Appendix: Additional analysis for overall recall accuracy results based on visual and spatial tasks). Overall, our findings provide clear support for the notion that design decisions are important for successful utilization of VEs for route learning. Note that since we manipulated two design elements in the MixedVE -adjusting the level of realism, and deliberately selecting the landmark locations 60 -we cannot distinguish whether and how much each of the two manipulations influence route recall performance. However, the purpose of the study was not to disentangle the contribution of these two design decisions, but to evaluate an "optimized" design. This required us to consider previous knowledge about what design decisions might improve route learning and recall performance. Our findings suggest that reducing the amount of realism while keeping crucial (i.e., navigationally-relevant) information, indeed assists participants in both age groups in identification of the turn directions, and by extrapolation, route recall in general. In other words, since we were set to measure an optimized design against baseline alternatives, we will not discuss the separate effects of realism levels and landmark locations; these were shown by others in dedicated experiments. 
Previous work suggests that visualizations that contain too much or too little information can have negative effects on memory performance 28,58 . Our results regarding Abstract and RealisticVEs confirm that both too much and too little information indeed impair performance in a VE-based route learning task, and also importantly, this is true also for older adults (65-75 yrs). As mentioned earlier, another key design decision was the position of the highlighted landmarks in a virtual scene. It is known that people rely on landmarks at specific locations in wayfinding tasks 60,70,75 . Our results with the MixedVE suggest that 'highlighting' landmarks at task-relevant It must be noted that, given that we use VEs as a proxy to real world, using photo-textures for highlighting the features of importance is an appropriate choice, and might transfer well to the real world through the resemblance of detail found in photography. However, the use of photo-textures to highlight the selected features (in this case, buildings and the structural network) is one of the many ways one might design a VE as a memory training device. Other means of highlighting, such as using color or outlining the features of interest, may also prove useful. Therefore, it would be useful to examine other means of highlighting in future experiments for a holistic understanding of highlighting techniques for memory training devices. Also note that the decision to remove realistic detail from a virtual scene immediately triggers the question of where such removal would be most appropriate. Removing realistic textures randomly (or based on other criteria) might lead to different outcomes than what we observed in our study. Because we aimed to optimize the MixedVE for route learning, we retained the realistic detail at locations that are relevant to route learning, and for the task examined in this paper, our design decisions provided benefits to the participants.
Our main findings in the recall accuracy analyses confirm an age-related difference disfavoring older adults in route learning performance with medium to large effect sizes 7,8,29,30 . Overall, younger participants recalled routes more accurately than the older participants, irrespective of the visualization type and recall stage (Fig. 3). A closer inspection reveals that age and visualization type do not interact: recall performance for both older and younger participants was best with the MixedVE, and the two age groups' recall performance was similar in the two stages. While the age-related memory decline and its various effects on cognitive functions are well documented 47 , studies that examine age differences in connection to levels of realism in visuospatial displays are rare. In this study, we observe that the abundance or lack of visual information does not seem to affect the older group differently than the younger. Our findings suggest that the complications linked to "too little" and "too much" visual information are fundamental problems that transcend age-related differences.
In contrast to the other two VEs, recall performance did not significantly decline after a week with the MixedVE for either age group (Fig. 3a). This finding is important, because it suggests that removing unnecessary information from a realistic VE and leaving it only in navigation-related locations (compare MixedVE vs. RealisticVE); while highlighting relevant information in navigation related locations -in this case, with realistic photo-textures -(compare MixedVE vs. AbstractVE), support learning beyond short-term route memorization.
A surprising finding regarding the two recall stages was that the forgetting rates of the older participants were not higher than those of the younger ones after one week. Thus, our findings in the context of route learning in VEs support the notion that age differences in memory are stronger in encoding than in retrieval, as our older participants did not necessarily experience problems in retrieving the information (stored in their memory) one week later. Evidence regarding age differences in encoding versus retrieval is mixed; however, the current understanding is that both are affected by age. An earlier study that tested memory for positions of the pawns in a chess game 76 (a visuospatial task at a different scale) also suggested that it is more the encoding than the retrieval process that is affected by aging. Some other studies, carefully designed to tease apart encoding and retrieval processes experimentally, in contrast, have shown that encoding, retrieval, as well as forgetting rates are negatively affected by aging [77][78][79] . These differences may depend on a multitude of factors, such as the context in which the studies are conducted or the individual differences among the participants. Further research may help to better understand such contradictory observations.
Overall, as both age groups seem to benefit from the MixedVE, we believe that the basic design assumptions of the MixedVE are fitting choices for route learning in VEs, and that the MixedVE, and by extension, similarly designed VEs, have clear potential as memory training devices irrespective of age.
Older participants benefit from the MixedVE in calibrating their confidence. It has been previously shown that older people are overconfident in cued recall tasks unrelated to navigation 43 . Thus, we hypothesized that older participants might also be overconfident in route recall tasks. Our calibration error analysis confirms that older participants indeed overestimate their route recall performance in general, in both the immediate and delayed stages with medium effect sizes. This is somewhat alarming, because in a route learning scenario, arguably, overconfidence can be more of a threat than underconfidence. That is, a false belief that one has 'learned the route' might lead to premature action and complications in wayfinding. From this perspective, the fact that the calibration errors with the MixedVE in the delayed recall stage are near-zero for the older group is a very promising result. In other words, with the MixedVE, older people might be less prone to overestimate their performance, and take fewer risks. The younger group is somewhat underconfident with the MixedVE; however, we believe this is less of a threat, as they might consequently behave more carefully while navigating after learning with the MixedVE, or practice more. Both age groups, but particularly the older group, seem to be overconfident with the AbstractVE and the RealisticVE in the delayed recall stage with medium effect sizes. With the AbstractVE, the overconfidence may at first appear surprising, as with such low accuracy, one would expect the confidence ratings to be low.
Perhaps the visual similarity of the objects to one another, as is the case with buildings in the AbstractVE, led to misattribution errors, resulting in a false sense of familiarity a week later, when recall is harder than immediately after the experiment. Similarly, we observe that both age groups falsely believed that they were doing better with the RealisticVE in the delayed recall. This might be explained by the previously documented mismatch in people's accuracy and confidence in other contexts 80 . In this case, because people could identify particular elements in the visual scene after a week had passed, they falsely believed that these helped them to recall a route. Note that the results regarding the calibration error analysis should be viewed as an exploratory analysis, as the overall interaction of age × visualization × recall stage did not reach significance; while these results allow us to hypothesize, more testing is needed to confirm them.
Overall, the MixedVE afforded a better self-assessment than the other two VEs with medium effect sizes for both age groups, possibly because participants could more precisely recall what they had seen. Importantly, the MixedVE offered a clear advantage for the older group, enabling them to calibrate their confidence to match their performance much better, thus lending itself as a promising candidate for the development of novel training paradigms for all, but especially for older adults.
Participants prefer the RealisticVE before the experiment, but many switch to MixedVE after. We find clear signs of naïve realism 28,81 when participants stated their preferences for the visualization types before the experiment. Both age groups overwhelmingly preferred the RealisticVE before the experiment (younger: 88%, older: 97%). These results provide unambiguous evidence of how strongly people are attracted to realistic displays 81 .
After participants experienced the VEs and solved the route recall tasks, however, we saw dramatic changes in participants' preferences. As predicted, most of the younger participants shifted their preference from the RealisticVE to the MixedVE after the experiment. This suggests that the younger participants successfully identified the assistance they received from the MixedVE, and valued their performance with it (i.e., sometimes people prefer the inferior product knowingly, simply because they like it). Nonetheless, a notable sub-group of younger participants (31%) stayed with their original preference for the RealisticVE. The older participants' preferences after their experience with the VEs show a different pattern than the younger participants': Even though a large number of older participants also switched to MixedVE (38%), the RealisticVE remained their favorite choice also after the experiment (54%). This may be linked to the overall lower exposure and experience with VE technologies. Furthermore, in Smallman and John's 2011 81 naïve realism study, participants with lower spatial abilities did not necessarily change their preference towards less realistic displays after the experiment, even though those with higher spatial abilities did. Perhaps our findings and theirs are linked; one can speculate that people who do not perform too well for various reasons (age or lower spatial abilities) might be less deliberate about the tools they choose. Thus, when designing future visuospatial memory training devices intended for people with limited experience and abilities, it is important to remember that the acceptance of the proposed device might be a barrier to achieving the memory improvement goals, and additional considerations might be necessary.

Conclusions
Motivated by earlier work on cognitive training, and informed by the principles of visualization design, we tested whether one can customize a VE that could eventually be used as a memory training device in a route learning context. Importantly, because visuospatial memory is negatively affected by age, we focused our efforts on understanding how well our candidate memory training device (the MixedVE) would work for older adults. Specifically, we focused on the visual design of the VE, because design choices can have a strong impact on how well a visualization functions, including its memorability. Thus, we examined aspects of design that should be considered for creating memorable VEs, especially for route learning. Our intuition, as well as previous work, suggested that we should represent the world with high fidelity and replicate reality in a simulated environment. However, previous empirical evidence in various other contexts led us to believe that we could improve the design of the VE to better function as a memory training device for route learning if we controlled the amount of visual realism instead. We did not, however, remove redundant information 'randomly'. Instead, we designed the MixedVE, in which we used photo-textures only at the navigation-relevant locations, that is, where we knew people would look for landmarks. By 'translating' the previous empirical evidence into design from two perspectives (realism and landmark use), we essentially highlighted navigation-relevant information in the locations that matter to the viewer to increase their saliency and memorability, and we suppressed less relevant information to reduce cognitive load.
Our results provide new insights for the design of VEs and their possible use as visuospatial cognitive training devices for route learning, especially in older adults. Overall, the MixedVE was more memorable than the others, and it facilitated high recall accuracy in identifying the direction of turn at the intersections (and by extension, in route learning), irrespective of age, in both the short and the long term. The fact that the MixedVE facilitated both immediate and delayed recall in both age groups shows how effectively design choices can improve performance, whether one is old or young. Furthermore, the stable recall performance with the MixedVE even a week after the participants watched the simulated video (only once) clearly demonstrated its promise as a potential training device. Participants' confidence in their performance matched their actual performance better with the MixedVE than with the other VEs, and this was especially evident for older participants. The fact that the MixedVE helps with adjusting for overconfidence in older adults has important positive implications for their potential navigational behavior. Furthermore, a large number of participants preferred the MixedVE to the others after working with it, even though some more design adjustments might be necessary for an older audience.
Taken together, our findings demonstrate the potential of the MixedVE as a memory training device, for all ages but especially for the older adults, which encourages us to continue this line of research. Aside from these applied implications, we developed a better understanding of the age differences in learning from a VE. Specifically, we know more about the effects of combined visualization design choices (realism levels with landmark locations) on the recall accuracy, confidence and visualization preferences of people from two distinctly different age groups in route recall tasks.

Methods
We conducted a controlled experiment with a mixed factorial (2 × 2 × 3) design. Age was a between-subject factor (younger vs. older); visualization type (i.e., Abstract, Mixed, Realistic VEs) and recall stage (i.e., immediate vs. delayed) were within-subject factors. All participants performed route learning tasks in all three VEs and at two stages one week apart. As dependent variables, we measured the recall accuracy in all the tasks, with a focus on the direction of turns at intersection points, where we also measured participants' confidence in their responses, and their visualization preferences.
Participants. In total, 81 participants took part in the study: 42 in the younger group (27 ± 2 yrs., 23 female), and 39 in the older group (70 ± 4 yrs., 17 female). The younger participants were between 20 and 30 years of age and were recruited by word of mouth. The older participants were between 65 and 75 years of age and were recruited using the participant pool of UZH's University Research Priority Program "Dynamics of Healthy Aging" (http://www.dynage.uzh.ch/en.html). This experiment was approved by the Ethical Committee of the Philosophical Faculty, University of Zurich, with the form "Checkliste für die Selbstbeurteilung von Studien auf ethische Unbedenklichkeit". All methods were performed in accordance with the relevant guidelines and regulations. All participants volunteered to participate, gave written informed consent, and could withdraw their participation at any time. All participants completed the Mini-Mental State Examination (MMSE) to measure their cognitive status. They were included in the study only if they scored a minimum of 27 out of 30 82 .

Materials
Apparatus. The experiment was performed in the 3D visualization/virtual reality lab of the GIVA unit of the Department of Geography, of the University of Zurich. Passive drive-throughs of the routes were presented as videos to participants on a large projection screen (230 × 140 cm). The participants were seated at a distance of 2.2 m from the screen to ensure that they could see the whole scene.
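As a rough check on this viewing geometry, the visual angle subtended by the screen can be computed from the reported screen size and viewing distance. The sketch below assumes a viewer centered in front of a flat screen; the function name is illustrative and not part of the original setup.

```python
import math

def visual_angle_deg(extent_cm, distance_cm):
    """Full visual angle (degrees) of an extent centered on the line of sight."""
    return math.degrees(2 * math.atan((extent_cm / 2) / distance_cm))

# Values reported in the text; a centered viewer is an assumption.
SCREEN_W_CM, SCREEN_H_CM = 230, 140
VIEWING_DISTANCE_CM = 220

horizontal = visual_angle_deg(SCREEN_W_CM, VIEWING_DISTANCE_CM)
vertical = visual_angle_deg(SCREEN_H_CM, VIEWING_DISTANCE_CM)
print(f"{horizontal:.1f} x {vertical:.1f} degrees")
```

Under these assumptions the screen spans about 55° horizontally and 35° vertically.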
Stimuli. Participants were shown videos of drive-throughs in a virtual fictitious city. Using procedural modeling, we designed the city to look as homogeneous as possible to control for salient elements which might potentially interfere with route learning. Thus, the city contained buildings and other structures similar in size and architectural style, a similar street network (intersection points with ~90 degree angles) over the whole city, and other visual elements (e.g., trees) that were also kept similar to each other in size and other visual characteristics. We manipulated the design to obtain three different virtual environments (VEs, visualizations), as illustrated in Fig. 1. These three VEs differed in their degree of realism, and they represent the three main experimental conditions:
• a plain grayscale VE without any photo-textures (AbstractVE)
• a color photo-textured VE (RealisticVE)
• a mix of the two above, in which the buildings at all decision points towards the direction of the turn, and the structural network (street floors), are textured using color photography (MixedVE).
Thus, we used photo-textures as a particular type of highlighting choice, because we work with realism as an important concept in route learning for transferability of acquired knowledge to the real world; and the positions of the highlighted landmarks were selected based on landmark theories (we selected the positions that were previously shown to be important positions where people took mental notes).
We created two routes in each of these three VEs. Each route consisted of seven intersections (three left, three right turns and one straight). The videos of the drive-through of these two routes were recorded at the same eye-level (1.50 m), had the same duration (100 sec) and were played back at the same speed (30 km/h). Each participant experienced two videos in all three VEs, adding up to a total of six different videos. Videos of the routes were shown only once. Using a Latin square approach, we systematically rotated the order of the videos.
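One simple way to rotate six videos systematically is a cyclic Latin square, in which every video appears exactly once in every presentation position. The sketch below illustrates that idea only; the cyclic construction, the video labels, and the assignment of participants to rows are assumptions, not a description of the exact scheme used in the experiment.

```python
def cyclic_latin_square(items):
    """Return an n x n Latin square built by cyclically shifting the item list."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]

# Hypothetical labels for the six videos (3 VEs x 2 routes).
videos = [f"{ve}-route{route}"
          for ve in ("Abstract", "Mixed", "Realistic")
          for route in (1, 2)]

orders = cyclic_latin_square(videos)

def order_for(participant_index):
    """Presentation order for a participant, cycling through the square's rows."""
    return orders[participant_index % len(orders)]

for row in orders:
    print(row)
```

A cyclic square balances position but not which video precedes which; a balanced (Williams) design would be needed to also control first-order carry-over effects.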
Task. Participants were instructed to memorize the routes to the best of their ability. Once they watched the video, participants performed a series of different tasks as follows: (1) Identifying whether a scene (screenshot) was on their route ("yes" or "no" answer), based on six screenshots from each path in each VE type (three were correct and three false). We called this set "visual tasks", as the participants would predominantly rely on visual information, while the location of the information was not relevant. (2) Drawing a sketch of the route using top-down screenshots of each VE. We called this set of tasks "spatial tasks", because location and orientation are the key to solving the task, while the visual information is not as important. (3) Identifying the direction of turn at each of the seven intersection points based on a screenshot of the intersection point. We called this set of tasks "visuospatial tasks", because one has to make use of visual cues, as well as location and orientation, to solve the task. In other words, based on previous work 67,72-74,83 , we believed that participants would have to rely on both visual and spatial memory to solve this task. We thus considered this predominantly a visuospatial memory task.
More specifically, because participants responded to questions based on three VEs, for two routes in each VE, with a total of seven intersections in each video, they provided a total of 42 individual responses (3 × 2 × 7) in this task set. As mentioned earlier, the intersection points were presented as screenshots from the videos in a randomized order in the recall phase. Participants were asked to choose the direction in which they continued their route among the given options. Besides the "I don't know" option, participants could mark left, straight, or right, giving them a 33% chance of guessing the correct answer. Approximately one week after the first session (immediate recall stage), participants repeated the tasks without watching the videos again (delayed recall stage).
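The response count and chance level described above can be sketched as follows. The scoring function, the example route, and the example answers are hypothetical; in particular, counting "I don't know" as an incorrect response is an assumption made for illustration.

```python
from fractions import Fraction

N_VES, N_ROUTES, N_INTERSECTIONS = 3, 2, 7
DIRECTIONS = ("left", "straight", "right")
DONT_KNOW = "I don't know"

# Responses per participant in the visuospatial task set: 3 x 2 x 7 = 42.
total_responses = N_VES * N_ROUTES * N_INTERSECTIONS

# Chance of guessing correctly among the three directional options.
chance_level = Fraction(1, len(DIRECTIONS))

def recall_accuracy(responses, correct):
    """Proportion of correct turns; "I don't know" counts as incorrect (assumption)."""
    if len(responses) != len(correct):
        raise ValueError("responses and correct must have the same length")
    hits = sum(r == c for r, c in zip(responses, correct))
    return hits / len(correct)

# Hypothetical route with three left turns, three right turns and one straight.
route = ["left", "right", "straight", "left", "right", "right", "left"]
answers = ["left", "right", "left", DONT_KNOW, "right", "right", "left"]

print(total_responses, float(chance_level), recall_accuracy(answers, route))
```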

Procedure
Upon arrival at the lab, participants signed an informed consent form. We briefed them about the procedure, introduced them to the hardware setup, and answered their questions, if they had any. We then assigned each participant to one of six videos (3 visualization types, 2 routes each). Before starting the actual route learning experiment, we showed participants a representative screenshot from each of the three VEs, and asked them to rate their preferences for a hypothetical route learning task. Immediately after this, the main experiment began. Participants were given a scenario in which someone took them to a market in an unfamiliar neighborhood, and they were told to memorize the route as they would have to navigate the same route later by themselves. After watching each video only once, they answered a set of recall questions based on this specific video, and rated their confidence for each of their responses using a 5-point Likert scale that varied from "Not at all confident (1)" to "Very confident (5)". After solving the tasks with three of the videos, participants could take a short break of approximately three minutes to counter potential fatigue. The last three videos followed in the same fashion. We then asked the participants which of the three VEs they preferred. There were no time limits in the experiment, thus the experimental duration of the first session (immediate recall stage) varied from 1 h to 1 h 40 min. Participants came back six to eight days after the first session for the second session (delayed recall stage). In this stage, participants were not shown the videos again, thus they responded to the questions based on what they could recall from the first session. The duration of the second session varied from 40 min to 1 h.

Appendix: Additional analysis
Spatial task.
A 2 (age) × 2 (recall stage) × 3 (visualization) mixed-design ANOVA revealed significant differences in the sketching task for two out of the three independent variables (no difference for recall stage). Figure 7 depicts the descriptive and inferential statistics; statistically significant differences were observed for (a) age [...], (b) visualization type, F(2, 158) = 11.69, p < 0.001, ηp² = 0.01 (Abstract: 53.6% ± 32.9%, Mixed: 57.6% ± 33.9%, Realistic: 50.5% ± 33.4%) and (c) age × visualization, F(2, 158) = 3.80, p < 0.05, ηp² = 0.01. This interaction was driven by the significantly larger difference in the sketching performance between the Mixed and the Realistic visualizations for the younger participants compared to that of the older (young: 11.0% ± 21.3%, older: 3.0% ± 15.0%, t(149.35) = 2.77, p < 0.01, r = 0.22). Interestingly, the recall stage did not reveal statistically significant differences (immediate: 55.4% ± 32.1%, delayed: 52.4% ± 34.7%), neither did any other of the interactions.
Figure 7. Spatial tasks. Main effects of (a) age, (b) recall stage, (c) visualization type on sketch task, and (d) interactions between age × visualization type (irrespective of recall stage). ***p < 0.001, *p < 0.05. Error bars: SEM.
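The effect size r reported for the age × visualization contrast is consistent with the common conversion r = sqrt(t² / (t² + df)). The sketch below applies that conversion to the reported t and df values; that this particular formula was used in the original analysis is an assumption.

```python
import math

def r_from_t(t, df):
    """Effect size r from a t statistic and its degrees of freedom."""
    return math.sqrt(t ** 2 / (t ** 2 + df))

# Values reported in the text: t(149.35) = 2.77.
r = r_from_t(2.77, 149.35)
print(round(r, 2))  # 0.22
```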
Overall, the results from this task are in line with the visuospatial task. Age and visualization seem to matter for the performance, with the MixedVE resulting in the best performance compared to both the Abstract and the Realistic VEs. Participants' performance in the different recall stages was not significantly different in the spatial task. This might be explained by the active involvement required to fulfill the task. That is, the fact that participants actively drew the path immediately after the experiment may have resulted in them learning to solve this task better than the other tasks, in which they only passively watched the stimuli.
Note that regarding the age differences, the literature suggests that spatial memory functioning tends to decline with age, but visual memory might be 'spared' 84 . Therefore, we have speculated from the start that the results for the visual task would be different than other memory tasks, which load on spatial memory more heavily. Initially, the results from the visual task do not seem to agree with the results from the visuospatial and the spatial tasks. Especially the interaction between age × visualization shows a conflicting pattern for the two age groups, with the MixedVE being more supportive for the young but not for the older, who seem to achieve higher recall with the AbstractVE. When examining the exact performance values from the visual task, however, we see that the performance is close to the chance level (50%) for the older participants. Thus, lack of interactions in this case may be due to task difficulty, which likely caused a "floor effect" for the older group and for both age groups in the delayed recall stage. In other words, overall task difficulty could have overshadowed these interactions.
Data availability. The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.