Cultural determinants of the gap between self-estimated navigation ability and wayfinding performance: evidence from 46 countries

Cognitive abilities can vary widely. Some people excel in certain skills, others struggle. However, not all those who describe themselves as gifted are. One possible influence on self-estimates is the surrounding culture. Some cultures may amplify self-assurance and others cultivate humility. Past research has shown that people in different countries can be grouped into a set of consistent cultural clusters with similar values and tendencies, such as attitudes to masculinity or individualism. Here we explored whether such cultural dimensions might relate to the extent to which populations in 46 countries overestimate or underestimate their cognitive abilities in the domain of spatial navigation. Using the Sea Hero Quest navigation test and a large sample (N = 383,187) we found cultural clusters of countries tend to be similar in how they self-rate ability relative to their actual performance. Across the world population sampled, higher self-ratings were associated with better performance. However, at the national level, higher self-ratings as a nation were not associated with better performance as a nation. Germanic and Near East countries were found to be most overconfident in their abilities and Nordic countries to be most under-confident in their abilities. Gender stereotypes may play a role in mediating this pattern, with larger national positive attitudes to male stereotyped roles (Hofstede's masculinity dimension) associated with a greater overconfidence in performance at the national level. We also replicate, with higher precision than prior studies, evidence that older men tend to overestimate their navigation skill more than other groups. These findings give insight into how culture and demographics may impact self-estimates of our abilities.


Scientific Reports | (2023) 13:10844 | https://doi.org/10.1038/s41598-023-30937-w | www.nature.com/scientificreports/

Age, gender, self-estimates and wayfinding performance. Following the data processing and sampling operations described in the "Methods" section, the resulting sample used to explore the relationships between age, gender, self-estimates of navigating skills and wayfinding performance included 383,187 individuals (from 46 countries) who completed six wayfinding levels of the Sea Hero Quest game: 3, 6, 7, 8, 11 and 12 (N Females = 172,969, MeanAge Females = 37.08 ± 14.53, N Males = 210,218, MeanAge Males = 37.14 ± 13.39; for more details, see Supplementary Information, Appendix A, Table S1). The ordinal logistic regression analysis (with self-estimates nominal responses as the dependent variable; gender and age bands as the independent variables) revealed that males were significantly more likely to report good navigating skills than females (LR χ²(1) = 21,895.2, p < 0.001). Based on the model (Supplementary Information, Appendix A, Table S2), males were almost twice as likely as females to rate themselves as very good navigators and approximately half as likely as females to rate their navigating skills as either very bad or bad across all age bands (Fig. 2A). Age was also found to be a statistically significant positive predictor of navigating self-estimates (LR χ²(4) = 1299.5, p < 0.001). The best self-reported navigation ability in the sample was reported by 40-59-year-old males (Fig. 2B); however, older males (between 60 and 70 years old) rated their navigating skills more favourably than the youngest men in the study (19-29-year-olds) despite their much poorer actual wayfinding performance when compared to all younger age bands of the male sample (Fig. 2C). This finding supports the results of earlier studies 22,28 which found similar overestimation of navigating abilities amongst the oldest male participants.
With our larger sample we can now reveal that at each age band there is a preserved ordering of self-ratings in relation to performance, e.g. people who self-rate as very good are the best navigators (Fig. 2C). This provides evidence that when people self-rate they may do so in relation to their peers, both with regard to people of the same age and people of the same gender.
We used a hierarchical multivariate linear regression analysis (Supplementary Information, Appendix A, Tables S3 and S4) to establish whether the reported ratings of navigation skills could predict wayfinding performance as the dependent variable. A model with self-ratings included captures more of the variance than a model without them (F(3) = 536.73, p < 0.001), with self-rated ability the fourth strongest predictor of wayfinding performance amongst all independent variables used in the model (F(3, 370561) = 536.73, p < 0.001, η² = 0.004), behind age (F(1, 370561) = 55,125.38, p < 0.001, η² = 0.123), gender (F(1, 370561) = 19,685.08, p < 0.001, η² = 0.044) and home environment (F(2, 370561) = 587.35, p < 0.001, η² = 0.003), but before education (F(3, 370561) = 361.11, p < 0.001, η² = 0.002) and commute time (F(2, 370561) = 61.27, p < 0.001, η² = 0.0003). Independent variables in both models were tested for multicollinearity with the variance inflation factor and the generalized variance inflation factor; the latter was used because some terms (e.g. categorical predictors) had more than one degree of freedom. Both metrics were close to 1 for each predictor, suggesting that the results of the hierarchical regression models were not due to high correlations between each predictor and the remaining independent variables.
Cultural determinants of self-estimates and wayfinding performance. To explore how cultural differences might relate to self-ratings, we aggregated the data from the 46 countries sampled into the 11 cultural clusters derived by Ronen and Shenkar 56 (Fig. 3A; also see "Methods" for details): Germanic, Latin American, Nordic, Far East, Near East, Confucian Asia, Latin Europe, Eastern Europe, Anglo, Arabic and African. The highest proportions of population with either good or very good self-estimates of their navigating skills were recorded in the Germanic, Eastern European and Latin American cultural clusters (96%, 90% and 88% of their respective samples), whereas the clusters with the most modest beliefs about their navigating skills were Latin Europe, Nordic and Confucian Asia (85%, 84% and 81%, respectively; see Tables S5 and S6). However, more positive self-estimates did not translate to better wayfinding performance at the national level; there was no significant correlation between these two metrics (Fig. 3H). Notably, we replicated our prior finding that GDP correlates with navigation performance 19 (here focusing on wayfinding in 46 countries): Spearman's rho = 0.572, p < 0.001. By contrast, self-estimates do not correlate with GDP (Spearman's rho = 0.11, p = 0.48). Thus, it appears that increased economic wealth in a country is associated with better performance, but not with better self-estimates of performance. We next explored the gap between a country's self-rated performance and actual performance. We found that, as with self-ratings, country clusters showed tendencies to be over- or under-confident as a group. Near Eastern and Germanic countries were most likely to overestimate their ability and Nordic countries most likely to underestimate their ability (Fig. 3).
Examining the geographic spread of the data further highlights the tendency for neighbouring countries and countries within cultural groups to be more similar (Fig. 4). We found a similar pattern for both men and women when the linear models were fit to men and women separately (Supplementary Information, Appendix A, Fig. S1). Thus, overestimation of performance in some countries is not due to, for example, predominantly the men overestimating their performance while the women estimate theirs accurately.
Based on past literature 42,43, we considered that one reason some countries might overestimate ability is that they have a stronger association with male-typical stereotypes, with navigation being one of these stereotypes. If a country feels it is important to be good at male-associated activities, it is possible that nationals of that country would be more confident in these abilities. In support of this, we found that greater masculinity ratings for a country were associated with greater overconfidence at the national level. We also found a similar association with purely self-estimate ratings (Spearman's rho = 0.40, p = 0.008). To explore more broadly the cultural dimensions that might impact the gap, we examined the full set of six metrics developed by Hofstede 45,46 (Supplementary Information, Appendix B, Table S7). We found that masculinity was the strongest predictor of the gap and survived Bonferroni correction for significance in this model. Hofstede's cultural dimension of uncertainty avoidance was also a predictor of the gap in this threshold-corrected model. Nations that are keener to avoid uncertain situations were more likely to overestimate navigation ability (see Supplementary Information, Appendix B). Because we found differences between men and women in self-ratings, we explored how national patterns differed for men and women. We found relatively consistent patterns for both men and women across countries. In countries with high self-ratings, both women and men rated themselves similarly highly, and in countries with low self-ratings, similarly low self-estimates were present for both gender groups. Thus, the patterns observed at the national level do not appear to be strongly influenced by gender, but rather by overall cultural attitudes.
Having previously reported that the navigation performance gap between men and women was correlated with the gender gap index 19, we also explored the self-ratings-performance gap in relation to such a metric. We found a significant positive correlation between the self-rating/performance gap and the gender inequality index of a country (Spearman's rho = 0.40, p = 0.007). Thus, the more unequal a country is in terms of gender, the greater the overconfidence.
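The country-level associations above are reported as Spearman rank correlations. As a rough illustration only, the following sketch computes Spearman's rho as a Pearson correlation on average ranks using the Python standard library. The function names and the toy values are assumptions made for this example; they are not the authors' code and not study data.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical country-level values (illustrative only).
toy_gap = [1, 2, 3, 4, 5]
toy_index = [2, 1, 4, 3, 5]
rho = spearman_rho(toy_gap, toy_index)
```

Because the statistic depends only on ranks, it is insensitive to the scale of either country-level measure, which suits indices with different units.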

Discussion
Here we explored variation across nations in terms of their population's self-estimates at navigating, their ability to perform a virtual navigation task and the gap between these two measures. Examining this in more depth than prior studies, we found that older men were more likely to overestimate their ability. We now report that, while across the global sample individual performance was related to self-estimates of navigation ability, at the national level a nation's self-estimates and performance were not correlated. We found that participants in Germanic countries were much more likely to report high self-estimates than those in other countries, whereas Confucian Asian countries reported the lowest self-estimates. Near Eastern countries showed the largest over-estimation of their abilities, whereas Nordic countries showed the greatest under-estimation of their abilities. We found that across nations the gap between self-estimates and performance was associated with the extent of affirmation for gender roles: the greater the association with masculinity in a country, the more likely citizens of that country were to overestimate performance. We discuss how these findings advance understanding of how self-estimates of ability relate to actual performance for a core human ability: spatial navigation.
Consistent with prior research, we found that across the entire sample male participants were twice as likely as females to rate themselves as very good navigators 5,34-38. Men also displayed a much lower tendency to report their navigating skills as either bad or very bad. Middle-aged and older participants generally held more favourable views about their navigating abilities than younger adults and this effect was particularly strong in the male group. For example, the oldest males in the sample (i.e. 60-70-year-olds) rated their wayfinding skills higher than the youngest 19-29-year-old men despite their performance being much worse than all younger males including the youngest age group. This finding is congruent with the research from van der Ham et al. 29,30 who reported that overestimation of spatial navigation abilities increases with age, with this more common for males than females. It also validates the earlier research by Taillade et al. 22,28 who found that older individuals rated their spatial abilities more positively than younger individuals, but they also displayed poor accuracy in their self-estimates due to age-related decline in performance.
We also replicate our past finding that the GDP of countries correlates with the performance of these countries 19 , and now extend this to reveal that GDP is not correlated with self-ratings. Thus, while the economic wealth of a country (raising education, health care and access to travel) appears to enhance cognition, it does not relate to the way in which a population will self-reflect on their abilities.
Our results provide the largest cross-cultural study of the difference between self-reported navigation skill and ability. While across the whole world population higher self-estimates are associated with better performance, we found no correlation between a nation's average performance and a nation's self-estimated ability. Germanic countries were found to self-rate their abilities much more positively than other countries, with other cultural [...] On the other hand, countries representing the Germanic cluster were more susceptible to overestimation: they performed relatively poorly while expressing very positive beliefs about their spatial navigation abilities. We hypothesised that, due to past evidence for gender bias in self-estimate studies of navigation 40,41, the attitude towards masculinity in the population might manifest as bias in the self-estimates, with high masculinity associated with higher self-estimates. We found that both the population self-estimates and the over/under-estimation in performance were associated with population attitudes to masculinity. This finding supports Oettingen's 44 theory that a domain-specific sense of competence, in this case wayfinding, is likely to vary between countries and cultures based on the shared value systems which an individual has experienced. We show here that this extends not only to a self-reported estimate, but also to over- and under-estimation of performance. Notably, we found little evidence that such relationships with masculinity were different for men and women. Thus, generally, it is not the self-estimates of men or the self-estimates of women within a country (or their gap to performance) that underlie the association with masculinity, but those of the collective population.
Exploring beyond masculinity as a metric we examined other metrics collected by Hofstede 45,46 in a model with corrected thresholds and found that uncertainty avoidance was also significantly correlated with overestimation of abilities, albeit less so than masculinity. This indicates that people in countries where certainty is valued have a tendency to consider themselves good navigators, whether or not they, as a group, are. It is currently unclear why this might be and future research will be useful to explore this.
A limitation of the current research is the treatment of culture at the nation-state level. Within most nations, different cultures exist, and for many nations, different languages are spoken by sub-populations. Prior research has shown that language, attitudes to childhood, and the environment occupied by sub-groups of a nation can have an impact on spatial cognition 18,57-61,61-65. Thus, it seems likely that both the language and the environment of relevant sub-groups will affect the relationship between self-ratings and navigation ability at a population level. One further challenge is that different cultural groups will differ in access to technology and smartphones, and our current results are limited to a self-selected group of participants who have access to such technology. However, our current results are unlikely to be mediated by such effects since Germanic and Nordic countries both have good access to technology and occupy different ends of the self-rating scale for nations. Future research involving more traditional cultures will require careful consideration regarding the use of technology in the assessment of cognition. Another limitation of our method was that we used only one question with four levels to probe self-rated navigation ability, rather than asking a set of questions with known reliability, such as the Santa Barbara Sense of Direction Scale 5. However, in designing the game and setting out to provide 17 different languages, we opted to have a single simple question that would appear clear on a mobile phone screen and be easy to answer. See 62 for a discussion of the design process.
In summary, here we reveal world-wide patterns in self-ratings and over/under-confidence of a core cognitive skill: navigating. We replicate prior age and gender patterns with significantly more power than previous studies. We find that attitudes to masculinity across a nation are associated with self-estimates in ability and their over/under-estimation. Future research will be useful to extend these findings to other cognitive domains.

Methods
Sea Hero Quest game design. The research questions were tested using a large-scale dataset collected via Sea Hero Quest, a mobile game application designed to obtain benchmarks of typical spatial abilities and navigation behaviours of healthy populations worldwide. The game contains eighty wayfinding, path integration, radial maze and other, research-unrelated, game engagement levels along with additional optional socio-demographic questions which asked players about their age, gender, country, education, rating of their navigation skills, hand used for writing, the environment they grew up in as well as their average daily travel time and a typical amount of sleep at night. A detailed description of the game design is provided in Coutrot et al. 19. The ecological validity of Sea Hero Quest has already been confirmed by Coutrot et al. 63 who found that the navigational performance recorded using the game reflects real-life wayfinding behaviour. The full Sea Hero Quest dataset contains behavioural and socio-demographic data collected between April 2016 and April 2019 from over 4.3 million players globally with over 50 million gameplays recorded across all wayfinding and path integration tasks. Access to the full dataset is described in the Data and code availability statement of this manuscript.
Informed consent and ethical considerations. Informed consent was obtained from the participants.
The starting screen of the Sea Hero Quest game included debriefing and informed consent sections with clearly identified goals of the study as well as the purpose and extent of data collection. Additionally, the full explanation of the study was provided within the gameplay (via the 'journal' icon) and the consenting players could access this section and withdraw from playing at any point of the game.
All consenting participants voluntarily downloaded and played the Sea Hero Quest mobile application game. This study was conducted as part of a larger research project which has been approved by the UCL Ethics Research Committee under the Project Number: CPB/2013/015. All experiments and methods including data collection, processing and analysis were carried out in accordance with relevant guidelines and regulations.
Participants and sample size. A subset of 383,187 individuals (representing 46 countries; N Females = 172,969, MeanAge Females = 37.08 ± 14.53, N Males = 210,218, MeanAge Males = 37.14 ± 13.39) out of the total number of over 4.3 million people who downloaded and played the Sea Hero Quest mobile game application was used in the analysis. The sample size was restricted by selecting only those players who provided the self-estimate scores of their navigation skills along with other socio-demographic information such as age and gender, and by narrowing the age range of individuals to between 19 and 70 years old in order to control for fake age demographics as well as previously reported selection bias 19 which was found to translate to an unusual spike in performance in older players. Due to a large number of missing values in the age, gender and self-estimates variables, these data filtering operations excluded 3,394,521 individuals. Also, as not every participant played all game levels, only the records of those who completed six wayfinding testing levels (3, 6, 7, 8, 11, and 12) and the first two practice levels (levels 1 and 2) were considered in this study. By following this criterion, we excluded 398,487 out of the remaining 818,193 individuals who played the game and provided their socio-demographic information. The final sample size was additionally adjusted by (a) removing outliers, defined as players who scored at least two standard deviations above or below the mean of the wayfinding performance metric (see below for details), and (b) keeping the records of only those individuals who represented countries with at least 500 players. These two data processing operations led to 34,065 participants being excluded from the sample. Finally, we removed gameplays of individuals representing countries (e.g. Vietnam, Albania, Saudi Arabia, and Croatia) which did not feature in any of the cultural clusters described in the Cultural clustering of countries subsection below (2454 excluded). Following these data cleaning activities, the resulting final sample size for testing the hypotheses was 383,187 individuals representing 46 countries. The country-by-country descriptives of the sample are provided in the Supplementary Information, Appendix A, Table S1.
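The exclusion counts reported in this subsection can be checked against the final sample size with simple arithmetic. The short sketch below does so; the variable names are illustrative, and the counts are copied from the text above.

```python
# Counts as reported in the "Participants and sample size" subsection.
remaining_after_demographic_filter = 818_193  # provided age, gender and self-estimate; aged 19-70
excluded_incomplete_levels = 398_487          # did not complete levels 1, 2, 3, 6, 7, 8, 11 and 12
excluded_outliers_small_countries = 34_065    # outliers, or countries with fewer than 500 players
excluded_unclustered_countries = 2_454        # countries not present in any cultural cluster

final_sample = (
    remaining_after_demographic_filter
    - excluded_incomplete_levels
    - excluded_outliers_small_countries
    - excluded_unclustered_countries
)

# Reported gender split of the final sample.
n_females, n_males = 172_969, 210_218

print(final_sample)         # 383187
print(n_females + n_males)  # 383187
```

Both totals agree with the reported final sample of 383,187 individuals.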
Wayfinding tasks. For the purpose of this study, we only used a selection of wayfinding levels in which the overall goal was to navigate a boat from the origin to the destination in various spatial layouts. They included a number of checkpoints (represented as buoys) which players had to visit in a specific sequence in order to complete the task. Before each game level, players were first shown a map of the corresponding layout environment which they could look at for as long as they needed. At some levels the maps were obscured which forced players to learn the layouts by exploration and made the navigation tasks more difficult. The first two levels of the game were designed as tutorials to allow players to familiarise themselves with the game controls. The performance on these two tasks was used in the analysis to account for any variation in computer experience amongst the players: the distances travelled on testing wayfinding levels were divided by the sum of distances recorded at Levels 1 and 2. This approach to Sea Hero Quest data processing was first defined by 19 and since then it has been successfully implemented in other works based on data collected with Sea Hero Quest to ensure reproducibility and robustness of the subsequent analyses.
Cultural clustering of countries. One of the focal points of the analysis was to assess the differences in wayfinding performance and navigating self-estimates between countries as well as groups of countries clustered together based on their cultural similarities. In order to determine the groups of culturally similar countries we followed the cultural clustering implemented in the GLOBE 2004 study 64 with further extensions proposed by Ronen and Shenkar 56. The original GLOBE 2004 project grouped 52 still-existing countries into 10 culturally distinct clusters based on cultural dimensions such as power distance, uncertainty avoidance, societal institutional collectivism, assertiveness, gender egalitarianism etc., which largely overlapped with Hofstede's previously introduced cultural dimensions 45. Ronen and Shenkar 56 additionally analysed clusters defined in the 11 most influential studies (including Hofstede's study and the GLOBE project) and proposed 11 global and 15 consensus clusters to group 70 distinct countries. In this study, Ronen and Shenkar's solution has been employed to group the 46 countries represented in the sample into 11 global clusters as displayed in Table 1.

Measuring self-estimates of navigation skills. The self-estimates of one's navigation skills were collected from the players via the optional question "How good are you at navigating?" asked after completion of the second wayfinding task of the game (Level 2). The available answers were distributed on a four-point Likert-type scale without a neutral/average score: 'very bad', 'bad', 'good', and 'very good'. The obtained responses were used as a proxy for measuring the self-reported navigation ability. Our analysis treated the self-estimates responses as either categorical (as recorded by the game) or numeric values.
A numeric version of this variable was obtained by assigning a numeric value to each response according to the following coding: 'very bad' = -2, 'bad' = -1, 'good' = 1, 'very good' = 2. This numeric-type measurement of navigation skills was applied to quantify the average (arithmetic mean) self-estimates for each gender, age band and country as well as to calculate the country-specific gap between the self-reported ability and measured wayfinding performance. The average self-estimated navigation skills for each country and cultural cluster in the sample were estimated with conditional modes by extracting the intercepts for 46 countries in the sample and 11 cultural clusters from linear mixed models with fixed effects for age and gender and random effect for country (or cluster, respectively): Self-Estimated Navigation Skills ~ Age + Gender + (1|Country) and Self-Estimated Navigation Skills ~ Age + Gender + (1|Cluster). We used lme4 65 and lmerTest 66 packages for the R programming language to implement these models.
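The numeric coding described above can be illustrated with a minimal sketch. It maps each response to its stated code and takes a raw arithmetic mean per group; it does not reproduce the mixed-model adjustment for age and gender, which the authors fitted in R with lme4. The function name and the toy records are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Coding stated in the text: there is no neutral midpoint, so 0 is never assigned.
LIKERT_CODES = {"very bad": -2, "bad": -1, "good": 1, "very good": 2}

def mean_self_estimate_by_group(records):
    """Return the arithmetic mean coded self-estimate for each group label.

    `records` is an iterable of (group, response) pairs, where the group
    could be a gender, an age band or a country.
    """
    coded = defaultdict(list)
    for group, response in records:
        coded[group].append(LIKERT_CODES[response])
    return {group: mean(values) for group, values in coded.items()}

# Hypothetical toy responses (not study data).
toy_records = [
    ("A", "very good"), ("A", "good"), ("A", "bad"),
    ("B", "good"), ("B", "very bad"),
]
group_means = mean_self_estimate_by_group(toy_records)
```

With this coding, a positive group mean indicates that favourable responses outweigh unfavourable ones in that group.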
Wayfinding performance metrics. Measuring wayfinding performance of individual gameplays followed the methodology of correcting distance presented by Coutrot et al. 19 and was estimated by calculating the Euclidean distances between point coordinates (sampled at Fs = 2 Hz) for each individual gameplay-trajectory recorded at the first attempt of the relevant game level. It should be noted here that, during the game, players were allowed to repeat levels as many times as they wished and, for that reason, the scope of the analysis was limited to the first attempts only.
Following Coutrot et al. 19, the distances of trajectories in relevant wayfinding tasks were then corrected by dividing them by the sum of the mean Euclidean distances travelled at Levels 1 and 2 across up to the first three attempts (corrected distances). As some participants played Levels 1 and 2 only once but others attempted them multiple times, we considered only up to the first three attempts to control for any undesired outcomes or technical issues encountered in these early tutorial levels and to minimise the influence of a potential training effect on these two levels. The obtained corrected distances for each wayfinding level were additionally scaled (i.e. z-score standardised) to allow unbiased comparison between levels of distinct spatial layouts.
We then calculated the average (i.e. arithmetic mean) distance travelled by each individual player across the six wayfinding tasks used in this analysis by summing the individual corrected and standardised distances obtained for each level and then dividing the sum by the number of levels each player completed. Finally, as the estimated metric was in fact a mean distance travelled from the origin to the destination (i.e. the smaller the value, the better the performance), we reversed its sign for easier interpretation (i.e. the larger the value, the better the performance). It represented a single and unbiased measurement of one's wayfinding performance across multiple wayfinding tasks.
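The steps described in this subsection (trajectory length, correction by the tutorial levels, standardisation, averaging and sign reversal) can be sketched as follows. This is an illustrative reading of the text, not the authors' pipeline: the function names are hypothetical, and the use of the population standard deviation for the z-scores is an assumption, since the text does not say which estimator was used.

```python
from math import dist
from statistics import mean, pstdev

def trajectory_length(points):
    """Sum of Euclidean distances between consecutive (x, y) samples."""
    return sum(dist(a, b) for a, b in zip(points, points[1:]))

def corrected_distance(level_length, level1_attempts, level2_attempts):
    """Divide a test-level length by the summed mean lengths of Levels 1 and 2,
    using at most the first three attempts at each tutorial level."""
    baseline = mean(level1_attempts[:3]) + mean(level2_attempts[:3])
    return level_length / baseline

def z_scores(values):
    """Standardise one level's corrected distances across players.
    Assumption: population standard deviation (pstdev)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def performance(standardised_by_level):
    """Mean standardised distance across a player's levels,
    sign-reversed so that larger values mean better performance."""
    return -mean(standardised_by_level)

# Hypothetical toy values (not study data).
length = trajectory_length([(0, 0), (3, 4), (3, 10)])          # 5 + 6 units
corrected = corrected_distance(length, [10, 12], [11, 11, 11, 50])
standardised = z_scores([1.0, 2.0, 3.0])
score = performance([1.0, -1.0, 3.0])
```

In the toy call to `corrected_distance`, the fourth Level 2 attempt (50) is ignored because only the first three attempts contribute to the baseline.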
Wayfinding performance for countries and cultural clusters was estimated with conditional modes by extracting the intercepts for 46 countries in the sample and 11 cultural clusters from linear mixed models with fixed effects for age and gender and random effect for country (or cluster, respectively): Wayfinding Performance ~ Age + Gender + (1|Country) and Wayfinding Performance ~ Age + Gender + (1|Cluster). We used lme4 65 and lmerTest 66 packages for the R programming language to implement these models.
Measuring the gap between self-estimated navigation skills and wayfinding performance. The gap between self-reported navigation skills and wayfinding performance was estimated as the difference between the min-max normalised (range [0, 1]) self-estimated ability for each country or cultural cluster (i.e. average self-estimates controlled for age and gender, extracted as country/cluster conditional modes from the linear mixed models described earlier) and the min-max normalised (range [0, 1]) wayfinding performance (i.e. average wayfinding performance controlled for age and gender, extracted as country/cluster conditional modes from the linear mixed models defined above). The resulting gap metric was distributed on a scale with a range of [−1, 1], where −1 denoted the maximum possible underestimation of navigation skills compared to the measured wayfinding performance, 0 indicated perfectly accurate estimation of navigation skills across the country/cluster, and 1 expressed the maximum possible overestimation of wayfinding abilities.
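As a minimal sketch of the gap metric defined above, the following code min-max normalises two sets of country-level values and subtracts normalised performance from normalised self-estimates. The inputs stand in for the conditional modes extracted from the mixed models; the function names and toy values are hypothetical.

```python
def min_max(values):
    """Rescale a mapping's values to the range [0, 1]."""
    lo, hi = min(values.values()), max(values.values())
    return {key: (value - lo) / (hi - lo) for key, value in values.items()}

def confidence_gap(self_estimates, performance):
    """Normalised self-estimate minus normalised performance, per country or cluster.

    Positive values indicate overestimation, negative values underestimation.
    """
    s, p = min_max(self_estimates), min_max(performance)
    return {key: s[key] - p[key] for key in s}

# Hypothetical adjusted country-level values (not study data).
toy_self = {"X": 0.0, "Y": 1.0, "Z": 2.0}
toy_perf = {"X": 4.0, "Y": 0.0, "Z": 1.0}
gaps = confidence_gap(toy_self, toy_perf)
# X has the lowest self-estimate and the best performance, so its gap is -1.0.
```

Because both inputs are rescaled within the sample, the gap is relative: a value of 0 means a country ranks equally on both normalised scales, not that its citizens judge their absolute ability correctly.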
Statistical analyses and additional predictors of wayfinding performance. Due to the applied standardisation and outlier removal approaches, the normality assumptions were met for some analyses but not for others, and therefore our methods implemented a range of parametric and non-parametric statistical techniques depending on underlying distributions of wayfinding performance metrics, specific cultural dimensions, global indices and particular nature of research questions. These are indicated in the main text of the manuscript and the supplementary appendices whenever statistical tests are reported. Also, apart from self-estimated navigation skills and typical demographic variables such as age and gender, we used a number of other socio-demographic and behavioural variables such as the highest level of education obtained, the daily amount of commute time, and the type of home environment in which the participants grew up as predictors of wayfinding performance in the hierarchical multivariate linear models reported in the Results section and the Supplementary Information (Appendix A, Tables S3 and S4).

Data availability
Access to the full Sea Hero Quest dataset (including all wayfinding trajectories generated by the players) is available upon registration through a dedicated server at https://shqdata.z6.web.core.windows.net/. Please contact the Lead Contacts directly (H.J.S. and E.M.) to obtain the sample of the full dataset used in this study. Lead Contacts: Hugo J. Spiers (h.spiers@ucl.ac.uk) and Ed Manley (e.j.manley@leeds.ac.uk). Additionally, researchers interested in using the Sea Hero Quest mobile game application can invite participants to play the game and generate wayfinding research data for non-commercial purposes via the SHQ game portal at https://seaheroquest.alzheimersresearchuk.org/. Access information to secondary data sources can be obtained from the following URLs: Hofstede's cultural dimensions (2016): https://geerthofstede.com/research-and-vsm/dimension-datamatrix/. Gallup World Poll 2005-2017 (Life Satisfaction): https://www.gallup.com/178667/gallup-world-pollwork.aspx. Gallup Religiosity Index (2009): https://rationalwiki.org/wiki/Importance_of_religion_by_country. United Nations Development Programme (Education Index 2018, Gender Development Index 2019, and Gender Inequality Index 2018): http://hdr.undp.org/en/data. Global Gender Gap Index 2020: https://www.weforum.org/reports/gender-gap-2020-report-100-years-pay-equality. GDP per capita (2019): https://data.worldbank.org/. All data pre-processing operations, statistical analyses and figures included in this manuscript were made using the R and Python programming languages. The custom code scripts are available from the Lead Contacts (H.J.S. and E.M.) upon request.