The risks of invisibilization of populations and places in environment-migration research

Recent years have seen an increase in the use of secondary data in climate adaptation research. While these valuable datasets have proven to be powerful tools for studying the relationships between people and their environment, they also introduce unique oversights and forms of invisibility, which have the potential to become endemic in the climate adaptation literature. This is especially dangerous as it has the potential to introduce a double exposure where the individuals and groups most likely to be invisible to climate adaptation research using secondary datasets are also the most vulnerable to climate change. Building on significant literature on invisibility in survey data focused on hard-to-reach and under-sampled populations, we expand the idea of invisibility to all stages of the research process. We argue that invisibility goes beyond a need for more data. The production of invisibility is an active process in which vulnerable individuals and their experiences are made invisible during distinct phases of the research process and constitutes an injustice. We draw on examples from the specific subfield of environmental change and migration to show how projects using secondary data can produce novel forms of invisibility at each step of the project conception, design, and execution. In doing so, we hope to provide a framework for writing people, groups, and communities back into projects that use secondary data and help researchers and policymakers incorporate individuals into more equitable climate planning scenarios that “leave no one behind.”


Introduction
Research and knowledge production can be affected by biases and risk contributing to the (re)production of inequality. A common challenge of the research process is the failure to consider certain segments of the population through omission in sampling. Those populations have regularly been named "invisible" or "hidden" (Faugier and Sargeant, 1997; Lambert, 1998). This failure is a direct consequence of a deeply rooted process of structural social marginalization. The "invisibility" of certain segments of society is the result of a complex pattern of inherent inequalities and injustice (Polzer and Hammond, 2008). Through the exclusion of certain marginalized groups, research can also play an instrumental role in the structural power imbalances and various injustices of our societies, and the process of knowledge production can become selective in harmful ways (Bajgar et al., 2019; Parry et al., 2019).
People are made invisible when either they or the categories salient to their lives are not included or are selectively written out at different stages of the research process. Invisibility leads to an underrepresentation of the narratives and problems facing invisible populations, which carries on past the research process and into the zone of policymaking and intervention. The invisible often include disadvantaged or disenfranchised groups of people (e.g. ethnic groups, the homeless; Lambert, 1998) as well as people who do not want to be found or contacted (Brackertz, 2007). Invisible populations tend to be socially vulnerable in other ways as well and can include the elderly, LGBTQ+, sex workers, drug addicts, and mental health recipients, among others (Cruikshank, 2007).
While invisibility has been explored in other contexts in the past, environment-migration research, especially where secondary data is used, introduces new spaces for the production of invisibility through the techniques used to link social and environmental data, the scales at which data are aggregated, and the interest in harmonizing data sets to study large geographies. The increasing availability of digital data reflects economic and human development and has both political and practical implications for the way people are (un)seen and (un)treated. Yet the power of data to sort, categorize and intervene has not yet been explicitly connected to a social justice agenda by the agencies and authorities involved (Taylor, 2017). Similar to other fields, environment-migration research also faces risks of producing invisibility around decisions regarding data analysis and presentation of results, as well as institutional pressures surrounding research interests and funding availability. This paper seeks to broaden the discussion of the risk of rendering populations invisible at all stages of the research process within the field of environment-migration research. By increasing awareness of research decisions that may contribute to the distortion of realities, the risk of invisibility in academic knowledge production, particularly while using secondary data sources, could be mitigated.
At the same time, secondary data allows the production of research at speeds, scales, and costs that cannot be matched by primary research, which brings its own sets of strengths and weaknesses. Unlike so-called primary data, secondary datasets, on which environment-migration research heavily relies (Fussell et al., 2014), are datasets not personally collected by an individual researcher or research team, such as national-scale and regional-scale data collected by governments or intergovernmental organizations (e.g. census data), remotely sensed environmental data, crowd-sourced geolocated data, and citizen or community science data. Broadly, these datasets for migration research have been categorized by primary purpose as administrative (such as border control data, visa data, etc.), statistical (such as census data and household surveys), and innovative (emerging data sources such as social media feeds) (see Migration Data Portal). These valuable datasets support data collection at spatiotemporal scales beyond what an individual researcher can collect and enable more robust, generalizable conclusions to be drawn. However, for many of these datasets, their strength is also their weakness, in that they are broad but not deep, potentially leaving out populations that are hard to sample. This creates the potential for their limitations to become contested and ultimately codified into our understanding of the population-environment nexus, and for missing the opportunity to address climate change adaptation equitably by giving a voice to disadvantaged groups, many of which are disproportionately affected by climate impacts (Chu and Michael, 2019).
Through four targeted case studies representing different stages of the research process, we examine (i) potential causes and implications of data invisibility; (ii) what invisibility means for specific groups; (iii) how data invisibility can become an exacerbating factor in vulnerability; and (iv) how this process may impact policy and service provision. After this analysis, we offer ideas and considerations for improving the visibility of vulnerable people in population-environment research while maintaining consent and reserving the right of refusal for specific individuals and groups that may wish to maintain their "invisibility."

Theoretical background
Dealing with invisible or hard-to-reach populations. The challenge of researching hard-to-reach or hidden populations has been acknowledged for some time, with the most significant body of literature in public health studies (Bonevski et al., 2014; Faugier and Sargeant, 1997; Lambert, 1998; Muhib et al., 2001). Some populations, omitted by survey design, are generally overlooked in data collection (Benoit et al., 2005). Marginalized groups (e.g. those with uncertain legal status), as well as those who live in sensitive areas (e.g. conflict zones, refugee camps, informal settlements), areas experiencing political instability, or simply areas where poor infrastructure increases the cost of conducting research, are often difficult for researchers to access (Atkinson and Flint, 2001).
Invisible or hidden populations could be those who are disadvantaged or disenfranchised (e.g. drug addicts, LGBT, sex workers, mental health recipients, ethnic groups, the homeless; Lambert, 1998), including people who do not want to be found or contacted due to fear of persecution etc. (Brackertz, 2007). Populations initially socially excluded by their status (e.g. irregular migrants) may wish to remain deliberately invisible. This framing also raises the question of "to whom" those populations are invisible (Polzer and Hammond, 2008) and when "being counted" could be a powerful factor of inclusion or, on the contrary, disadvantage the minority, contributing to the fragility of individuals and communities (Hammond, 2008; Parry et al., 2019).
Numerous publications discuss the difficulties of reaching these populations as well as the methodological ingenuity needed to include them in survey sampling (Benoit et al., 2005; Marpsat and Razafindratsima, 2010; Shaghaghi et al., 2011). While the limitations of secondary data sources in capturing the full diversity of our societies are well recognized, the understanding of the scale of the problem remains a critical research gap (Carr-Hill, 2013; Mitlin and Satterthwaite, 2013). How many are potentially "missing" from population counts and from sampling frames of household surveys? And what does it mean to not include and/or count them? Recent academic work illustrates the problem in the estimation of poverty, identifying how such household surveys are inappropriate for obtaining information about the poorest members of society, particularly due to these omissions by design and in practice (Bajgar et al., 2019; Carr-Hill, 2013; Lucci et al., 2018). The consequences are significant: "Population undercounting means that any social program risks ignoring the poorest of the poor. This blindness is a public scandal affecting an estimate of between 300 and 350 million of the poorest in developing countries, leading to an overestimate of progress toward development goals and a substantial under-estimate of inequalities" (Carr-Hill, 2013, p. 40).
Moreover, these studies point out that data quality is not the only source of distortion of reality. The challenge of defining and conceptualizing the phenomena under investigation is also a crucial point that impacts the results of research. For example, in estimating urban poverty, Lucci et al. (2018) revealed that the type of slum definition used can produce extremely different figures.
Given these challenges and consequences, it is essential to examine possible selection biases in all fields of research. This need becomes even more urgent when we consider the recent incentives to use existing data in the context of research funding and production (i.e. the recommendations of Future Earth to the Belmont Forum). As mobile populations and their complex narratives are also at high risk of exclusion from the data, discussing the risk of invisibility when using secondary data in the field of environment-migration research is crucial, as is discussing how invisibility may be (re)produced across stages of the research process and across scales.
The field of environment and migration. The impending challenge of climate change increases the urgency of environment-migration research. Climate change is widely expected to cause substantial harm to human populations and the critical ecosystems relied upon for survival. Climate change is already adversely impacting many areas of the world, and impacts such as increasing temperatures, heat waves, and extreme events are projected to increase in severity, intensity, and frequency. By mid-century, critical thresholds are expected to be met and surpassed, endangering the functioning of significant aspects of the natural, social, and infrastructural environments upon which humans depend for survival and well-being (Matias, 2017; McMichael et al., 2012).
As with other environmental changes, it is believed that climate change will impact population mobility both directly and indirectly (Chen et al., 2017). Under climate stress, migration may serve as a household or community adaptation strategy to undesirable natural and economic conditions (Bardsley and Hugo, 2010; L. M. Hunter et al., 2015; R. A. McLeman and Hunter, 2010; R. McLeman and Smit, 2006). If environmental migration is a positive adaptation strategy, then concern may arise around immobile, or "trapped", populations (Chen et al., 2017). While wealthier households or communities with access to more natural resources may be better able to bear the costs associated with migration, poorer communities may not. This suggests that lack of mobility may be a major obstacle to adaptation for especially vulnerable communities, particularly in the case of a loss of livelihood (Adger et al., 2015; Black et al., 2011; L. M. Hunter et al., 2014).
The impacts of environmental factors on population mobility are highly complex, and environmental factors are rarely the only contributing factors to migration (Adamo, 2010; Obokata et al., 2014). Rather, decisions surrounding migration are driven by economic opportunity, cultural norms, political structures, individual characteristics, etc. (Black et al., 2011; L. M. Hunter et al., 2015). Due to this phenomenological complexity, there is little agreement in the literature on how different environmental impacts affect migration, and different approaches may exclude certain populations. Findings also vary significantly by location (Gray and Wise, 2016).
While influences on human mobility are complex, increases in environmental stress on populations, especially due to climate change, are expected to contribute to shifting patterns of mobility. As such, environmental migration is increasingly being recognized and discussed at policy levels, though more research is required to develop quantified, context-dependent tipping points, which can increase the understanding of exposure and environmental pressures on mobility patterns and risk trends (Matias, 2017; Rigaud et al., 2018). The complexity and scale of environment-migration research create new challenges for integrating and representing social and environmental vulnerabilities without causing research efforts to exclude invisible groups.
Advances in data availability and use of secondary data. Methodologically, environment-migration studies could be seen as divided between detailed empirical case studies on the microlevel and global or national assessments on the macrolevel. Microlevel case studies often draw on self-reported environmental information while larger-scale work more often utilizes secondary data from administrative sources such as censuses and earth-observation data for deriving climate-related parameters (Borderon et al., 2019;Piguet et al., 2018). Each of these approaches has associated advantages and disadvantages. Qualitative methods can provide in-depth and rich insights into migrant experiences but are often too context-specific for generalization. Global and national assessments allow broader results or regional comparisons but may not sufficiently represent the local context or the interactions between the different drivers and actors of migration.
Researchers are thus simultaneously developing increasingly sophisticated methods to respond to the contextual complexities of migration while striving to generalize the effects of the environment across scales. Technical advances and increased data availability further support efforts to combine socio-demographic and environmental data to accelerate scientific innovation (Fussell et al., 2014; Kugler et al., 2019). The use of secondary data (census data, DHS-type survey data[1] or data from observatories or monitoring sites such as HDSS[2], for population data…), and their combination with available environmental data (such as land use data classified from satellite images, weather station data, etc.), thus makes it possible to produce research at a speed, scale, and cost that cannot be equalled (https://terra.ipums.org/home).
Despite this progress, few large-sample studies have examined the evolution and transformation of migration systems under changing environmental conditions, due to the remaining difficulties involved in capturing the dynamic components of both dimensions (with HDSS data: Call et al., 2017; Hunter et al., 2017; Hunter et al., 2014; Lalou et al., 2019; with Terra Populus data: Nawrotzki et al., 2016, 2017; with DHS data: Hallegatte and Rozenberg, 2017). Several challenges explain this: data on internal migration remain limited, longitudinal data are rare and costly to produce, and strategies for data integration across scales remain underdeveloped (Hugo, 2011; Rigaud et al., 2018).
Diversity of secondary data collection and future directions. New sources of data, fueled by rapid technological advancement, offer an increasing amount of migration-related information and could promise better days ahead. The use of innovative data sources for environment-migration research, such as mobile phone records, social media data, and smartphone-based surveys, is likely to increase in the future. A potential benefit is that these large datasets can provide very high-resolution information that has previously been challenging to obtain, including information on real-time or close to real-time migration flows at the individual level and access to hard-to-reach populations (Bell et al., 2016; Lu et al., 2016). However, the limitations of these data sources are significant and deserve serious consideration. Data bias is a concern, as the data inherently select only individuals who have access to a cellular phone or social media technology. Additionally, such high-resolution data raise serious ethical concerns regarding privacy and responsible data management (de Montjoye et al., 2018).
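The selection effect described above can be made concrete with a small arithmetic sketch. Everything in it is a hypothetical illustration and not drawn from any dataset or study cited here: the two groups, their sizes, their migration rates, and their phone coverage rates are invented, and the sketch assumes that, within each group, owning a phone is unrelated to migrating.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """A hypothetical population group (all values are illustrative)."""
    name: str
    population: int
    migration_rate: float  # share of the group that migrated
    phone_coverage: float  # share of the group visible in phone records


def true_rate(groups):
    """Migration rate over the whole population."""
    total = sum(g.population for g in groups)
    migrants = sum(g.population * g.migration_rate for g in groups)
    return migrants / total


def observed_rate(groups):
    """Migration rate among people who appear in the phone data only."""
    visible = sum(g.population * g.phone_coverage for g in groups)
    visible_migrants = sum(
        g.population * g.phone_coverage * g.migration_rate for g in groups
    )
    return visible_migrants / visible


# Invented numbers: the less-covered group also migrates less.
groups = [
    Group("higher-income households", 4000, 0.20, 0.90),
    Group("lower-income households", 6000, 0.05, 0.30),
]

print(f"true migration rate:     {true_rate(groups):.2f}")
print(f"phone-observed rate:     {observed_rate(groups):.2f}")
for g in groups:
    print(f"{g.name}: {1 - g.phone_coverage:.0%} invisible in phone data")
```

Under these invented numbers, the phone-based estimate overstates overall mobility because the group with low coverage is also the group that moves least, and most of that group never appears in the data at all. Reweighting by known group sizes could correct the aggregate figure, but it would not recover the experiences of the people who are missing.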
No data source is ever perfect, and concerns about secondary data have been raised in recent decades. The growing body of "big data" we will have access to in the future may not reflect our populations as accurately as in the past if traditional data sources disappear in the meantime (Dorling and Gietel-Basten, 2017). In times of austerity, governments cut back their spending on official data collection, and an increasing number of countries have replaced the decennial exercise of exhaustive population censuses with population registers or other alternatives (Coleman, 2013). Yet the decennial census is the single largest source of information and the only primary source that has coverage and availability across a wide space. Moreover, many of the newer data sources use census data for sampling strategies and reference points, or ultimately rely on official census data to sanity-check their results (see the post of Martin, 2021). It is therefore rather contradictory that more and more countries are abandoning the practice of carrying out censuses, which are considered archaic and too costly, even though the first action of the global program for strengthening migration data stipulates that "the population census is the most valuable tool to establish a baseline for the size, composition, and well-being of the global population" and that "the global program should provide dedicated financial support to countries to ensure that (a) core migration questions are included in the census, (b) enumerators are properly trained in identifying migrants, (c) information campaigns encourage migrants to participate, (d) disaggregated migration data are collected, analyzed, verified and disseminated, and (e) migration data are exchanged between countries".
Thus, the field of environment-migration studies represents a particularly relevant field in which to understand how invisibility may be created and reinforced due to its complex processes across spatial and temporal scales, methodological challenges, and tendency to depend on secondary data. While solutions to addressing invisibility in environment-migration research will likely depend on the specific research, data, and goals, our conceptual framework (Fig. 1) highlights ways in which invisibility may be introduced across stages of research. Through case studies, we apply this framework to the field of environment-migration research with secondary datasets. In the process, we provide some suggestions as to how researchers may interrogate their own research across the stages of production to prevent unintentional invisibility, in the hope that environment-migration researchers begin to carefully consider invisibility across all stages of research.
Analytical framework: scales of invisibility. While the process of population exclusion/inclusion in data collection has been extensively discussed, less is known about the process of invisibility during other steps of the research process (e.g. research design, data analysis). To date, no published work provides an overall view of the process of inclusion/exclusion in academic knowledge production, despite the urgent need for more studies to measure progress in attaining the Sustainable Development Goals and the calls for inclusive development in a world facing climate change (Gupta et al., 2019; Pelling and Garschagen, 2019). Addressing calls focusing on equity in climate policy requires a more thorough understanding of when and how vulnerable people are made invisible by the academic knowledge production of population-environment research. To highlight this process of exclusion during the modus operandi of academic knowledge production, we suggest using a conceptual framework that illustrates the stages of the research process in which invisibility may be introduced (Fig. 1). To demonstrate the usefulness of this framework, we apply it to the case of environment-migration research.
The conceptual framework (Fig. 1) highlights five scales at which invisibility may be introduced or reinforced in the research process, focusing on research that depends on the use of secondary data.
1. Invisibility in research focus: The first level, representing the base of our conceptual framework, is "invisibility in research focus", in which invisibility may be introduced at the broadest levels of research through geographic biases, academic interests, and biases in what research receives funding. This level reflects how biases within an academic discipline or institution, including trends and "hot topic" research questions, as well as the fact that some populations have more existing data available, may lead to certain geographies, populations, and questions being neglected in scientific investigation from the beginning of project conceptualization. We highlight invisibility in research focus through the example of populations in the Global South being overrepresented in the environment-migration literature.
2. Invisibility in project design: The second level is used to assess ways in which invisibility may be introduced during project design, including in conceptual frameworks, research question and hypothesis formation, and decisions about spatial or temporal scales of analysis. Deliberate consideration at this level of research, before the project has started, is critical for reducing invisibility. To illustrate the risks of invisibility in project design, we introduce the example of mobility versus immobility and how a research focus on mobility may render populations' complex aspirations and motivations invisible.
3. Invisibility in data collection: At this stage, limitations in existing data, including survey design, missing data, and sampling methodologies, may mean that certain populations are obfuscated or not included in the secondary datasets of interest. Researchers utilizing secondary datasets run the risk of misusing the data due to less familiarity with the data collection and original research goals. To highlight the potential for invisibility at the data collection stage, we offer the example of challenges related to addressing the inherent translocality of migration when constrained by secondary data that limits research to either an origin or destination focus.
4. Invisibility in data processing: This invisibilization may occur when researchers make decisions surrounding data cleansing and manipulation, or when combining different datasets, especially across different spatial or temporal scales.
5. Invisibility in data analysis: At the latest stages of the research process, our conceptual framework highlights and identifies examples of "invisibility in data analysis." Once the project has been designed and data has been collected and processed, invisibility may still be introduced at the level of data analysis and reporting of results. Decisions related to the presentation of results may obscure or entirely omit certain populations, leading to invisibility in the research results. In this work, we discuss the example of gender and migration and gendered intra-household migration behavior to highlight how invisibility may be introduced at the level of data analysis and processing in environment-migration studies.
While these examples are by no means comprehensive, our goal in highlighting invisibility beyond the issue of "counting" is to elucidate different ways in which some populations and their stories can be (inadvertently) made invisible in research. In addition to being introduced at any one of the levels highlighted in the conceptual framework, invisibility may be compounded and reinforced across scales, especially when research is dependent on secondary data. Geographic biases may dictate what data is available, while both past and current decisions regarding project design, data collection, processing, and analysis may significantly impact data quality. Depending on the decisions that informed the original data collection, some data may not be appropriate to address other questions, which may force the researcher to reconsider the use of that secondary dataset at all. This paper applies the conceptual framework to the field of environment-migration research to highlight how it may guide researchers through considering invisibility in their own work.

Case studies
Invisibility in research focus: going where the money (and data) is: over- and understudied geographies. In any field, a researcher's focus may be shaped by large-scale external forces including what research is considered a "hot topic" at the time, what funding agencies are willing to fund, what has been previously studied, and "convenience" for the researcher (Hendrix, 2017). Especially in an environment of increased competition for funding, researchers may adapt their research focus to fit the requests of funding agencies (Meirmans et al., 2019; Serrano Velarde, 2018).
While research has considered how researchers are impacted by funding environments, increased competition, and shifting priorities, less focus has been given to how such pressures may reinforce invisibility in research (Serrano Velarde, 2018;Smith, 2010). In this context, choices and trends related to research focus, including geography, funding availability, and academic interest can contribute to the over-studying of certain populations while rendering less-studied populations comparatively invisible in climate migration research. When secondary data is the basis of research, previous biases resulting from funding pressures, geographic bias, or academic interest may be reinforced and compounded in future research.
Within the broader environment-migration research agenda, climate migration research specifically has been shown to focus primarily on populations in the Global South, demonstrating how invisibility may emerge based on the geographic locations that researchers choose to study. In a study of more than 1190 scientific papers and 463 empirical studies of environmental migration, Piguet et al. (2018) reveal that many people may be invisible to current environmentally induced migration research simply because they do not live in geographies that are often studied in this field. They use a mapping exercise to highlight that there are swathes of geographies, primarily in the Global North, that are currently overlooked by environmental migration researchers, as well as geographies, primarily in the Global South, that are over-studied.
Not only does this research focus mean that some geographies in the Global North are understudied, and thus rendered invisible by their lack of inclusion in the scientific literature on environmental migration, but areas that are over-studied may also be rendered invisible due to the continuous perpetuation of false narratives related to researched populations. For example, research focusing on communities in the Global South, which is primarily conducted by researchers from the Global North, risks perpetuating a kind of neo-colonialism in which scientists from the North are framed as the creators and keepers of knowledge (Harsh et al., 2018;Piguet et al., 2018). Such framings often do not include room for the voices of local populations to be included in research design and framing, thus erasing these people's voices and rendering them passive research "subjects" rather than active agents of knowledge creation in their own right.
Piguet et al. (2018) further argue that the geographic mismatch between the Global South and the Global North in the research cannot be explained by the increased vulnerability of these regions to climate change and environmental stress. Rather, they claim that the "over-studying" of the Global South is due to the bias that environmental migration, especially when framed as a problem of "environmental refugees", is primarily a southern problem that could threaten the Global North. The discourse around "climate refugees", "environmental refugees", and "climate conflict" has been shown to be used by development agencies, NGOs, governments, and the media in order to highlight a perceived threat from the Global South to the Global North (Hartmann, 2010). In this way, researchers may play into pre-existing narratives, even unintentionally through research focus, in order to fit their work into existing academic, political, and public frameworks. Funding agencies and organizations may similarly fund research that fits comfortably into existing narratives, thus perpetuating a cycle of invisibility in which the full complexity of the interaction between people, the environment, and migration is lost to an overly simplified narrative. When secondary data that depends on such overly simplified narratives in over-studied geographies is used for research, invisibility is reinforced.
At this level, invisibility and imposed narratives can have very tangible impacts. Collaboration with institutions from the Global North may also shape the research direction of institutions in the Global South based on research priorities from the Global North (Harsh et al., 2018). The inherent asymmetry in relationships, especially in access to resources, impacts collaboration across all levels of interaction, from institutions to individual, day-to-day interactions (Skupien, 2019). Influence from foreign researchers and funding agencies may go so far as to impact public policy in study areas (Whitley et al., 2018). In this way, invisibility in research could inform policies for groups and individuals in a study population and, perhaps more importantly, those left out of a study population. The process of invisibility should then be understood as the outcome of structural imbalances in power and resources rather than as haphazard blind spots in scientific and state knowledge (Parry et al., 2019).
Invisibility in project design: trapped, voluntary, or something else? Forcing individuals into frameworks. Frameworks under which research is informed and projects are designed may reinforce patterns of invisibility in environmental migration scholarship. Environmental migration research projects, especially those utilizing secondary data, may design a project to focus exclusively on the experiences of certain categories of people, thus effectively erasing the experiences of others who do not fall into the researcher's project design framework.
As an example, much environmental migration research has focused on understanding the drivers of migration, with less attention paid to households and individuals who remain in a location, choosing not to migrate (Adams, 2016;Mallick and Schanze, 2020;Zickgraf, 2018). When environmental migration focuses exclusively on migrants, rather than individuals who may stay in place (voluntarily or involuntarily), an entire category of individuals may be rendered invisible (Lubkemann, 2008).
Invisibility may further be produced at the project design level when researchers begin a project with preconceived notions about individuals' motivations to migrate or remain in place. The language of "trapped populations" has been used broadly to describe immobile groups of people, suggesting that the poorest, most vulnerable households may be involuntarily forced to stay in an inhospitable location due to lack of resources to move (Adams, 2016;Black et al., 2013;Zickgraf, 2018). Researchers may be tempted to label mobile categories of people as adaptive migrants and assume that immobile people are trapped and therefore less able to adapt without explicitly exploring migrants' and nonmigrants' motivations. While this framework can be useful to demonstrate how social inequities may impact the migration decision, simply framing groups as migratory or trapped erases a considerable degree of nuance and complexity that exists in the decision to stay. Just as there may be voluntary and involuntary migrants, research should consider a spectrum of involuntary (trapped) and voluntary non-migrants to avoid erasing groups of people who actively choose to remain in a location (Mallick and Schanze, 2020).
Current survey research methods and, therefore, existing datasets in the environmental migration field may be ill-suited to capture the complexity of the decision to migrate or stay, which contributes to the generalization across populations. For example, simple yes/no questions about migratory experience or aspirations may be unable to capture the full spectrum of individual considerations (Adams, 2016). Rather, households that are extremely likely to migrate or stay will give predictable answers, while those in the middle of the continuum will not be accurately represented in data, as their answers will largely depend on context (such as social norms and expectations around migration) and the specific framing of the question (Carling and Schewel, 2018). This dynamic means that individual agency and the complexity of a decision to move or stay may be rendered invisible in data collection, as may groups of people who fall between the extremes of the mobility/immobility spectrum.
The tendency of researchers to introduce arbitrary or uninformed categories into the project design stage can erase individual agency in the decision to stay or migrate, as well as a person's unique cultural, social, or political context. Where environmental migration studies are used to inform policy, this process of invisibility can have very real, serious implications. For instance, a project design that focuses on migration as adaptation, without deeply considering the local context and individual motivations, may jeopardize individuals' right to stay in place by enforcing the idea that mobility is preferred over immobility. As Adams describes, immobility, like mobility, exists on a continuum and therefore requires research focused on the needs and aspirations of an individual before "labeling populations as trapped and promoting relocation" (2016).
Invisibility in data collection: origin or destination? Overlooking the translocal dimension in data collection
To date, the majority of environment and migration literature has relied on social survey data, either cross-sectional or longitudinal, which has been linked to satellite, weather station, or field data of local environmental conditions (Bilsborrow and Henry, 2012;Borderon et al., 2019). Survey data on migration has proven both costly and difficult to collect. Limitations in the data collection process have often led researchers to define migration in terms that are measurable within their data, can be easily or clearly linked with environmental data, and will remain visible later during the analysis process (Eklund et al., 2016). These limitations have led to a privileged partial perspective that emphasizes drivers and outcomes over the process while at the same time making moves differentially visible.
Migration often becomes defined as a single move over a particular distance for a particular period of time. Short-term moves, return moves, and repeat moves are often lost in both cross-sectional and longitudinal designs. Furthermore, the unilocational design of most studies, limited by the costs of collecting data across sites or localities, biases our understanding of migratory lives, ignoring translocal livelihoods, which have become a dominant modality in many locations around the globe (Munshi and Rosenzweig, 2016;Sakdapolrak et al., 2016). Only multi-sited studies would take into account both sending and receiving places, either because of an interest in connections between these places or because of an interest in comparisons between the place of origin and place of destination (e.g., to produce analyses on the causes or consequences of migration). In any case, such studies require information related to several distinct locations, possibly at the global level when considering international migration. This exceeds what is usually contained in conventional data sources (Beauchemin, 2014;Neumann and Hilderink, 2015).
Yet, a translocal livelihoods approach, which considers migrants and their households of origin as translocally connected, is often key to understanding the livelihood situation in their places of origin as well as in their destinations, and also their need for or aspiration to further migration (Greiner and Sakdapolrak, 2013). In origin-based projects, the migrants themselves can be made invisible, especially temporary migrants, who may have moved for periods shorter than specified in the questionnaire or are in the survey household at the time the survey is administered. In destination-based projects, the population left behind would be made invisible. Key mechanisms between the migrant(s) in the place of destination and the household in the place of origin could also be overlooked if the data collection does not allow those places to be viewed as translocally linked through structures (e.g. migrant networks, exchange infrastructure), processes (e.g. resource flows, visits, or chain migration), and actors (e.g. migrants, labor agents, etc.).
Within the translocal household itself, established as an entity geographically embedded in different places, the vulnerability of its members can vary greatly. The workloads and household chores of members who remain behind may increase due to the reduction in available labor, although remittances that may be sent by the migrant(s) may improve the overall financial situation. The pressure placed on the migrant(s) in relation to the "success" of the migration and their ability to support the household may also be a difficult mental and financial burden for the migrant(s). Aggregated data at the household level would not allow these disparities to be captured.
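The definitional filtering described in this section can be illustrated with a minimal sketch. It assumes a hypothetical origin-based survey that counts a person as a migrant only if they have been away for at least a minimum duration and are absent when the survey is administered. The `Move` record, the six-month threshold, and the toy records are illustrative assumptions, not taken from any dataset discussed here.

```python
from dataclasses import dataclass


@dataclass
class Move:
    person_id: str
    months_away: int  # length of the absence, in months
    returned: bool    # True if the person was back in the household at survey time


def visible_migrants(moves, min_months=6):
    """Ids counted as migrants under a minimum-duration, currently-absent definition.

    min_months is a hypothetical questionnaire threshold chosen for illustration.
    """
    return {
        m.person_id
        for m in moves
        if m.months_away >= min_months and not m.returned
    }


# Hypothetical move histories reported for one origin community.
moves = [
    Move("A", 24, False),  # long-term migrant, still away
    Move("B", 3, False),   # short-term move, below the threshold
    Move("C", 8, True),    # return migrant, present at survey time
    Move("C", 2, True),    # repeat short move by the same person
]

everyone_who_moved = {m.person_id for m in moves}
counted = visible_migrants(moves)
invisible = everyone_who_moved - counted

print(sorted(counted))    # ['A']
print(sorted(invisible))  # ['B', 'C']
```

In this toy case, two of the three people with migration experience disappear from the analysis purely because of how migration was defined, which is the kind of check a researcher might run before adopting a secondary dataset's migration variable.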
Invisibility in data processing and analysis: gender and migration: creating invisibility along gendered lines
Some methods of data management, processing, and analysis serve to increase invisibility. Often, decisions demanded by statistical analysis (data transformation, removal of outliers, scaling data to match across sources) and data integration (combining qualitative and quantitative methods, incorporating environmental data) require researchers to exclude, simplify, and condense data, potentially leading to the invisibility of certain groups or types of data (Fielding, 2012;Scholes et al., 2013). Additionally, management and analytical strategies including data interpolation, extrapolation, smoothing, and re-scaling are increasingly common as researchers rely more heavily on data collected across broad spatio-temporal scales and by different research bodies. The growing availability of global scale gridded population products has increased the opportunities for researchers to incorporate multiple secondary datasets, which creates a critical need for researchers to carefully consider what products are most appropriate and how processing and analysis may increase invisibility. Particularly when using secondary datasets, researchers may choose to exclude or condense certain data intentionally, to match research foci, cultural norms, or prevailing values of the time (Grady, 1981;Pannucci and Wilkins, 2010).
Who is made invisible varies according to decisions made by the research team to answer questions of interest. Often in climate migration research, there is a habit of creating invisibility along gendered lines. Despite a growing recognition of and interest in the role of gender in climate migration (Hunter and David, 2009;Kartiki, 2011), decisions made by researchers at both the data collection and analysis stages can lead to invisibility of certain genders and types of migration. Gender-related invisibility is generally introduced by analysis methods that ignore variation in exposure, sensitivity, and response capacity associated with climate hazards and migration. A growing body of research demonstrates that gender-related inequalities influence both exposure and sensitivity to climate-related stresses (Alston, 2007;Chindarkar, 2012). Due to these variations in exposure as well as differences in agency, decision-making, and access to migration options, climate migration is often gendered. Invisibility is thus produced, for example, when data analysis ignores gender (of the household head, or gender ratios within a household) during model creation. This is likely to be exacerbated when using secondary datasets that may not explicitly consider variation in exposure. Further, a lack of consideration of variation in gendered cultural norms, which influence who might engage in certain types of migration, can also lead to invisibility. For example, in some contexts, men are more likely to migrate than women, changing the dynamics of household and familial responsibility (Ampaire et al., 2020;Bhatta et al., 2016;Tsikata, 2016). In other contexts, women and girls may be more likely than men to migrate, experience forced migration, or experience more vulnerability upon migration (Hunter and David, 2009;Kartiki, 2011;Rao et al., 2019).
Limitations in agency and access associated with land tenure, requirements to participate in development projects, adaptation options, and gendered caretaking responsibilities often influence migration by women. The outcomes of such migration are highly context-dependent and heavily couched in socio-cultural traditions and norms and changing global economies. Perhaps as often, men and women migrate at similar rates but for different reasons and often with different outcomes (Abdul-Korah, 2011; Ahmed et al., 2016;Rademacher-Schulz et al., 2014). Research and analytical strategies that explicitly examine the gendered drivers and outcomes of such migration, and their interactions with historical vulnerability and cultural norms are critical for developing context-relevant solutions and ensuring the visibility of all potential migrants. A failure of research to disaggregate gendered climate migration processes in secondary data may lead to invisibility, exacerbating existing vulnerabilities and working against efforts to minimize climate stress for the world's least protected.
Analysis that does not consider the complex interactions between gender, environmental exposure, and migration may support investing in strategies that are not viable or that increase inequity. The result may be interventions that hinder endogenous adaptation and disempower certain vulnerable groups. Further, climate interventions that do not incorporate that variation are likely to fail and/or widen gaps in vulnerability across gender (Cook et al., 2019;Hemmati and Röhr, 2009;MacGregor, 2010).
Discussion: mitigating the risk of invisibilization in climate migration research based on existing data
Though solutions for addressing invisibility in environment-migration research depend on the specific questions, data, and goals, we offer a few suggestions for how this can be considered at the different stages of the research process.
At the level of research, scholars should think critically about how they select the geographies and focus areas where they conduct research, and how their research could play into problematic, overly simplistic narratives. Especially when secondary data is used, the geographic and power asymmetries in environment-migration research described here may be perpetuated and exacerbated due to "convenience" and the reliance on existing data. Researchers must acknowledge the tension between convenience, funding pressures, and potential invisibility. In this process, researchers must acknowledge and critically consider that some secondary datasets may be at higher risk for invisibilizing certain populations, while others might be more exhaustive in their coverage. Researchers, regardless of where they ultimately focus their research, should also carefully consider ways to meaningfully include local participation from the very beginning of research development, perhaps in addition to secondary datasets, in order to help ensure that local voices are not silenced.
To avoid the introduction of invisibility at the project design stage, researchers should be considerate of how their conceptual frameworks may impose labels arbitrarily onto groups of people and should think deeply about how these frameworks do or do not consider the complexity of individual agency. While more challenging when using entirely secondary data, researchers should consider ways to supplement data with context and stories from individuals who are in the study population. When researchers have an opportunity to collect their own data, they should design survey or observational interview methods/instruments, drawing on in-country and in-region knowledge and expertise, in a way that captures individual aspirations, needs, and motivations with the most appropriate measures.
When it comes to data collection, researchers need to be clear about what migration mechanisms they are seeking to shed light on and how this bears on the management of data collection vis-à-vis translocal or transnational households. Does it matter for my problem if I understand what happens only in places of emigration? Conversely, if I focus on migrants' places of destination, does this lead to a partial and biased view of my study? And what can I do about transit destinations, about the migratory journey itself? Asking "who counts" in the context of climate-influenced migration and "where do they count" would allow an honest understanding of which data collections are useful for which research questions, or at least allow researchers to work with secondary data in full awareness of their interests and limitations in relation to the topic under study.
In the example of invisibility in data processing and analysis, we highlight gender in climate migration analysis as an example of a common source of invisibility. Solutions to overcoming this potential source of invisibility include adopting a gendered lens in climate migration research and data analysis. Rather than aggregating data across groups, households, communities, regions, or countries, researchers should explicitly consider demographic variables in statistical analyses and in the data collection stages. Analyses that allow households (a common unit of research) to have multiple migration statuses may aid our ability to differentiate gendered activities and outcomes. While this may be done in other scientific fields, it is not as common in climate migration research (Gubhaju and De Jong, 2009). For example, the medical field increasingly recognizes gender and sex as a set of complex phenomena that are simultaneously biological and social, which influence best research practices and analysis methods, such as intersecting control groups and sensitivity analyses (Springer et al., 2012). Climate migration research could adopt similar analytical methods that consider interactions between social context, history, and gender in order to effectively disaggregate data, establish relevant hypotheses, and isolate causal relationships. Such methods may support a more nuanced understanding of climate vulnerability, migration, and gender, potentially leading to more successful interventions. While we acknowledge the statistical challenges involved in considering such groups (e.g. low statistical power, over-fitted models), explicit consideration and acknowledgment of sources of analytical invisibility provide a more robust understanding of research limitations and potential future research directions.
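The contrast between household-level aggregation and gender-disaggregated analysis can be sketched in a few lines. The example below is only illustrative: the person-level records, the household identifiers, and the two gender categories are hypothetical, and a real analysis would need a richer treatment of gender and of statistical uncertainty.

```python
from collections import defaultdict

# Hypothetical person-level records: (household_id, gender, migrated)
people = [
    ("h1", "woman", True),
    ("h1", "man", False),
    ("h2", "woman", False),
    ("h2", "man", True),
    ("h2", "man", True),
    ("h3", "woman", False),
    ("h3", "man", False),
]


def household_flags(records):
    """Aggregate view: a household is 'migrant' if any member migrated."""
    flags = defaultdict(bool)
    for household, _gender, migrated in records:
        flags[household] = flags[household] or migrated
    return dict(flags)


def rates_by_gender(records):
    """Disaggregated view: share of individuals who migrated, by gender."""
    counts = defaultdict(lambda: [0, 0])  # gender -> [migrants, total]
    for _household, gender, migrated in records:
        counts[gender][0] += int(migrated)
        counts[gender][1] += 1
    return {gender: m / total for gender, (m, total) in counts.items()}


flags = household_flags(people)
rates = rates_by_gender(people)

print(flags)  # {'h1': True, 'h2': True, 'h3': False}
print({g: round(r, 2) for g, r in rates.items()})  # {'woman': 0.33, 'man': 0.5}
```

In the aggregate view, households h1 and h2 look identical, although the migrant in h1 is a woman and the migrants in h2 are men. Keeping person-level statuses within each household is one simple way to retain that difference for later modelling.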

Conclusions
Invisibility serves as a tool to examine the complex dimensions of social vulnerabilities and experiences of populations often omitted from climate and migration research. It further draws attention to the many studies that lack categories or sections recognizing these populations' unique lived experiences, which affect data collection and results. Through four examples that correspond to different research stages, we have drawn the following lessons:
• There is a need to critically examine existing datasets before using them in environment-migration research, including considering how the dataset itself may contribute to invisibility. Among secondary data, some may be more at risk of invisibilizing populations whereas others might be more exhaustive in their coverage. Some data might not be appropriate for certain questions, and it is up to researchers to consider this at the earliest stage of a project. Along with this, there is a need to explicitly acknowledge (in methods, results, and discussion sections) when invisibility is reproduced in analyses and what recommendations/future research directions exist as a result. Transparency should be encouraged as it increases our opportunities to monitor, adapt, and improve our research.

• There is a need for more community engagement across all stages of the research process. This includes engagement with vulnerable communities in ways that are sensitive to their histories with science and climate research, and broader engagement with the scientific, policy, and management communities to facilitate long-term engagement and strategies to move beyond invisibility. It may go without saying that such engagement must go beyond the performative. Genuine co-production of knowledge with local communities is time-consuming, requiring investments into trust and relationship building (Djenontin and Meadow, 2018). It also requires that community stakeholders have a real say in guiding the research direction and process in a way that challenges traditional power dynamics in scholarship and funding mechanisms (Djenontin and Meadow, 2018;Mitlin et al., 2020;Turnhout et al., 2020). However, such co-production of knowledge with local stakeholders will help to supplement and fill in gaps of invisibility that reliance on only secondary datasets may produce. In this way, challenging the convenience and speed of only secondary datasets must continue, with emphasis on relationships, respect for communities' needs and cultures, and researcher humility.
• Future environmental migration research should focus on "hard to reach" populations or "vulnerable" populations in a way that respects their agency and privacy. This can include better efforts for participatory research and co-production of knowledge that supports the needs and values of these communities, recognition and valuation of the variety of ways of knowing that may be found within such populations, and/or development and strengthening of consistent, substantive, and equitable opportunities for knowledge sharing and exchange. This can also include developing methodologies that protect the anonymity of individuals while ensuring research accuracy. For instance, Hunter et al. (2021) offer a brilliant example of how to link people and places while balancing research and privacy needs. Using secondary data from one of the 50 health and demographic surveillance systems worldwide, they offer an important first step in exploring anonymization prospects for population-health-environment research utilizing secondary data sources.
The frontier of environmental migration research is advancing fast: more empirical case studies are being produced, and a greater diversity of data is used, which is promising for the field. However, awareness and mechanisms for combating data-related discrimination are not keeping pace with the rate of research and knowledge production. While many populations with a history of invisibility can perhaps benefit from additional equitable research and data collection, there is a broader, more urgent need to identify sources of vulnerability in existing data and ensure that our research does not exacerbate existing social inequities. More important than gathering more data is critically examining the widely used secondary datasets we already have, improving our analysis and integration of existing secondary datasets to minimize invisibility in our results, and developing improved instruments, methodology, etc., for future data collection. The transfer of knowledge, conceptual tools, and ethical practices between disciplines on how to work with and research the most peripheral and marginalized communities must be enhanced. As such, it is necessary to reflect on the construction of a more holistic and inclusive research framework (based on mixed-method and multidisciplinary approaches) that would allow for a better understanding of the social and ethical implications of research programs. The recent call for tackling structural inequality through a better understanding of intersectional vulnerabilities across the mobility continuum reflects this necessity very well (Cundill et al., 2021;Versey, 2021). Such tools and frameworks are necessary to ensure that future research is conducted in a way that does not reinforce existing structural and intersectional inequalities.

Data availability
All data analyzed are included in the paper.