Social media reveal ecoregional variation in how weather influences visitor behavior in U.S. National Park Service units

Daily weather affects total visitation to parks and protected areas, as well as visitors’ experiences. However, it is unknown if and how visitors change their spatial behavior within a park due to daily weather conditions. We investigated the impact of daily maximum temperature and precipitation on summer visitation patterns within 110 U.S. National Park Service units. We connected 489,061 geotagged Flickr photos to daily weather, as well as visitors’ elevation and distance to amenities (i.e., roads, waterbodies, parking areas, and buildings). We compared visitor behavior on cold, average, and hot days, and on days with precipitation compared to days without precipitation, across fourteen ecoregions within the continental U.S. Our results suggest daily weather impacts where visitors go within parks, and the effect of weather differs substantially by ecoregion. In most ecoregions, visitors stayed closer to infrastructure on rainy days. Temperature also affects visitors’ spatial behavior within parks, but there was not a consistent trend across ecoregions. Importantly, parks in some ecoregions contain more microclimates than others, which may allow visitors to adapt to unfavorable conditions. These findings suggest visitors’ spatial behavior in parks may change in the future due to the increasing frequency of hot summer days.


The impact of weather on outdoor recreation
Outdoor recreationists often select their destinations and the timing of their trips based on the climate 15 . Once on-site, weather influences the types of activities chosen, the length of stays, and the amount of satisfaction obtained 16 . However, tourists' sensitivities to and preferences for weather differ depending on the climate of their destination 17 . For instance, tourists in mountain areas or urban areas believe the ideal temperature is lower than the ideal temperature desired by beach tourists 18,19 . There is substantial variation found in the literature for optimal temperatures and thresholds for outdoor recreation, largely because outdoor recreation settings and the activities they support vary widely, and many studies tend to be focused on one or two specific settings 20,21 . For example, precipitation was found to be negatively correlated with summer visitation to a forested and beach park in Canada, and temperature positively correlated with visitation, up to a threshold of 33 °C, after which visitation declined 14 . A different study in five desert U.S. national parks found visitation declined at three parks once a threshold of 25 °C was reached, while two parks did not exhibit a temperature threshold 6 . We utilize nationwide visitation and weather data to analyze the impact of daily weather on the spatial behavior of visitors across multiple settings.
Changing temperature and precipitation patterns are likely to directly impact both the supply of and demand for outdoor recreation opportunities, although the impacts will also differ by activity and geographic region 3,22 . For example, previous research has found the impact of monthly weather averages on visitation to Australian parks varied by climate region 23 . Increased temperatures due to climate change have already expanded the length of the peak season in U.S. national parks 24 .
Warmer than average temperatures generally equate to longer seasons in which individuals can participate in warm-weather recreation activities 1 . However, the ways in which weather impacts park visitation are likely to depend on the geographic features of particular parks. Some outdoor recreation destinations may see visitation decline after reaching a certain temperature threshold (e.g., 25-33 °C), while parks with a greater number of different microclimates accessible to visitors (e.g., mountain parks or those with deep canyons) may continue to experience visitation increases above the threshold 6 .
Most studies to date have not taken into account different microclimates within a single destination. For example, Rutty and Scott 25 found that coastal tourism areas contained varying microclimates, with thermal conditions differing up to 4 °C at various areas of a particular resort. Although some outdoor recreation destinations may appear "too hot" under altered climatic conditions 4 , it is unknown whether visitors may adapt by visiting different areas within a park (e.g., higher altitudes or near bodies of water). By joining the location and date of social media posts with historical weather data, we provide the first high-resolution understanding of how temperature and precipitation impact the spatial behaviors of outdoor recreationists within parks in the U.S.

Using social media data in parks
Over the last decade, researchers have found social media data helpful for informing outdoor recreation management in parks and protected areas 13,26,27 . Social media can provide relatively accurate estimates of visitation to parks and protected areas at annual and monthly scales 28-30 . For example, social media from Flickr was found to be useful to discern monthly trends in visitation to national parks in the western U.S. 29 . Although many land management agencies in the U.S. estimate visitation through surveys, administrative data, and traffic counters 31 , social media data are unique in that they allow for visitation estimates at fine spatial and temporal resolutions. The NPS only produces visitation estimates at the monthly scale 31 , whereas social media data can show temporal trends in visitation at the hourly resolution 32,33 . This is because the time at which a photo was taken and its geographic coordinates are automatically recorded in metadata stored on individuals' smartphones 34 . For instance, one study used multiple years of geotagged Flickr data to understand trends in what time of day, and what day of the week, people tend to visit a national park in Spain 32 . Additionally, geographic coordinates of posts are typically accurate within 5 m if photos are taken with a GPS-enabled device 35 , making the spatial resolution higher than other sources of visitation data.
Researchers have also leveraged the spatial specificity of geotags to show trends in where visitors go within parks and protected areas 32,36-38 . By mapping social media along with other geospatial data, researchers can better understand what factors relate to visitor demand within a park 36,39,40 . For example, previous research has concluded the spatial patterns of Flickr posts in parks differ by season, and the presence of trails was the most important factor predicting Flickr photos in the summer in national parks 36 . The resolution of geotagged social media can be leveraged to understand how visitation patterns relate to infrastructure, like trails and roads, as well as environmental factors like weather.

Correlations between Flickr data and NPS-reported visitation. The correlation between Flickr photo-user-days (PUDs) and NPS-reported visitation across 108 units was Rs = 0.707 (n = 108, p < 0.001). At the monthly scale, the correlation was Rs = 0.709 (n = 540, p < 0.001). These data are summed from 2006 to 2018 and include the months of May-September. This correlation is similar to other studies comparing social media posts in parks to other sources of visitation data 27 . Thus, results suggest geotagged Flickr data are a useful proxy for summer visitation in NPS units.
Descriptive statistics. Table 1 shows all the means and standard deviations by ecoregion for daily maximum temperature at the visitor centers and Flickr points, daily precipitation at the visitor centers and Flickr points, and elevation at the visitor centers and Flickr points. Mean maximum daily temperature at visitor centers was highest in the warm desert ecoregion (37.1 °C) and lowest in the marine west coast forest ecoregion (22.5 °C). Mean daily precipitation at visitor centers was highest in the tropical wet forest ecoregion (6.3 mm) and lowest in the Mediterranean California ecoregion (0.1 mm). Overall, there was not much variation in the amount of daily precipitation at visitor centers compared to Flickr points. Elevation at visitor centers was highest for the cold deserts ecoregion (1829.0 m), and highest for Flickr points in the Northwest forested mountains ecoregion (1999.2 m). Flickr points in the Northwest forested mountains ecoregion had the largest standard deviation for elevation, indicating this ecoregion has the largest range of elevations visitors frequent. Elevation was lowest in the tropical wet forests ecoregion (1.2 m at the visitor centers, and 1.1 m at Flickr points). Table 2 shows the means and standard deviations by ecoregion for the distance from each Flickr point to the nearest road, waterbody, parking area, and building. Mean distance to roads ranged from 9.3 m (Southeastern USA plains) to 165.2 m (temperate Sierras). Across all ecoregions, the mean distance to roads was 63.0 m, and the median distance to a road was 10.9 m. This indicates many visitors to NPS units stay very close to roads in the summer. In most ecoregions, visitors were farther from buildings and designated parking areas compared to roads. These results suggest many visitors may take photos from their cars, or from pullout areas on the side of roads.  
Figure 2 shows the distributions for the difference in daily maximum temperature between the visitor center and individual Flickr point locations. Wider distributions (e.g., Northwest forested mountains ecoregion) indicate more microclimates within the parks, while narrower distributions (e.g., Southeastern USA plains) indicate daily temperatures are similar across the whole park unit. These microclimates represent the differences in temperature between where people visit compared to the visitor center; they do not necessarily represent differences in daily temperature across all park areas. Since some places may be inaccessible, we only explored temperature differences, and thus microclimates, in park areas that receive visitation.
Overall, there is less variation in the difference in daily precipitation between the visitor centers and Flickr point locations. At least 50% of the Flickr points had the same daily precipitation as the visitor centers in every

Differences in visitation patterns between hot and cold days. The cutoff points for what was defined as a cold day, average day, and hot day differ by park unit and can be found in Supplementary Table A1.
The effect of maximum temperature on visitors' elevation and distance to roads, waterbodies, parking areas, and buildings varied by ecoregion (Fig. 3). There is not a consistent trend in how temperature impacts the spatial patterns of visitation across ecoregions for any variable. In some ecoregions (e.g., tropical wet forests, mixed wood plains), visitors stay closer to parking areas and buildings on cold days, but in other regions (e.g., cold deserts, warm deserts), visitors travel farther from infrastructure on cold days. Visitors tend to frequent lower elevations on cold days in most ecoregions, but there is not a consistent trend in elevation on hot days. Although temperature does affect visitors' spatial distributions within parks, the effect sizes were all very small or small. Boxes without values in Fig. 3 indicate there were no statistically significant differences across the three temperature classifications for that particular ecoregion; this does not necessarily mean no difference exists. Some ecoregions had smaller sample sizes (e.g., temperate Sierras at n = 797), while some had very large sample sizes (e.g., Northwest forested mountains at n = 209,173). Statistical power is higher when sample sizes are larger, so we were inherently more likely to detect significant differences in ecoregions with larger sample sizes. Sample sizes for each ecoregion based on temperature and precipitation grouping are available in Supplementary Table B1, and additional statistical information associated with Fig. 3 is available in Supplementary Table C1. Figure 4 shows examples of how spatial distributions differ during cold and hot days for two parks: Yosemite National Park (Northwest forested mountains ecoregion) and Death Valley National Park (warm deserts ecoregion). These maps suggest some trails or regions are more popular on hot days, while others are more popular on cold days. In Yosemite, the map shows visitors are more likely to stay closer to roads on cold days.
This is consistent with the results in Fig. 3 for the Northwest forested mountains ecoregion, where visitors stay 19.6 m closer to roads on cold days compared to average days. In Death Valley, visitors appear more likely to stay near roads on hot days, consistent with results from the warm deserts ecoregion that show visitors stay 12.1 m closer to roads on hot days, and 20.4 m farther from roads on cold days, compared to average days. Maps showing general spatial distributions of visitors in each study site, as well as spatial distributions on cold versus hot days, are available online 41 .
Differences in visitation patterns between wet and dry days. The effect of daily precipitation on visitors' elevation and distance to roads, waterbodies, parking areas, and buildings also varied by ecoregion, although there are some trends across ecoregions (Fig. 5). Overall, on rainy days, visitors were more likely to stay near roads, waterbodies, parking areas, and buildings. However, this trend does not hold for some of the warmest ecoregions (e.g., warm deserts), where visitors were farther from infrastructure on rainy days. In the warmer ecoregions, visitors went to higher elevations on rainy days, but in the cooler ecoregions, visitors stayed at lower elevations on rainy days. Although rain does impact visitors' spatial behavior in all ecoregions, the effect sizes are

Discussion
Our results suggest visitors do change where they go within NPS units based on daily temperature and precipitation. The effect of temperature on elevation and distance to a road, distance to a waterbody, distance to a parking area, and distance to a building varied by ecoregion, with no consistent trends across all ecoregions. Overall, visitors were more likely to stay near infrastructure and waterbodies on days with precipitation, although this is not true in every ecoregion. However, the effect sizes of the differences are mostly very small, suggesting that only a subset of visitors may be affected by weather. Weather impacts visitors differently depending on their activity type and demographic characteristics, so some visitors may be more or less impacted by the weather 43 . The majority of visitors stay very close to roads (i.e., over half are within 11 m of a road); it is possible weather may have less of an impact on visitors who plan to stay near roads, most likely very close to (if not in) a vehicle. More research is needed to determine if and why only certain groups of visitors alter their spatial behavior within parks based on the weather. Climate change is expected to alter the total number of visitors to parks, with the majority of parks in the U.S. expected to see an increase in visitation 4 . This could strain park resources and cause overcrowding in some parks. Since most visitors stay close to roads, it is important to maintain the roads and infrastructure that are already present. Accommodating visitation demand may not require substantial increases in some types of outdoor recreation infrastructure (e.g., trails), but rather a re-thinking of what the typical park experience is for most visitors.
With most visitors choosing to stay extremely close to existing park infrastructure, capital investments should be focused on infrastructure upgrades and developments (e.g., remodeling and expanding visitor centers) that are better able to serve the needs and desires of more visitors in the future.
Previous work has found total visitation to parks is influenced by daily and monthly weather conditions 6,9 . Our findings suggest some visitors will respond to warmer than average temperatures by adapting where they go within a park. For example, some visitors may go to higher elevations on warm days, while other parks may see more visitors at lower elevations, possibly in cooler canyons or near the ocean. In some ecoregions, visitors may also choose to stay closer to roads or bodies of water on exceptionally hot days. Once a visitor is already at a park unit, they can respond to adverse weather by not visiting (i.e., staying in nearby towns), visiting a different location in the park, or changing activities 43 . More research is needed to understand how visitors decide to respond in different ways, and how that varies by user group. Park managers can help visitors adapt to extreme temperatures by providing information on which areas of the park, that are accessible by road, are comparatively cooler. However, not all parks contain microclimates that may allow for adaptation.
Parks in some ecoregions have more microclimates than others. Our analyses showed parks in the warm deserts, cold deserts, and the Northwest forested mountains ecoregions had wide distributions in the difference in temperature between visitors' locations in the park and the temperature at the visitor center. In other ecoregions, such as the Southeast USA plains, visitors were almost always at a location in the park that had the same temperature as the visitor center. Visitors may therefore have a greater ability to adapt and spatially substitute outdoor recreation settings within park boundaries at some parks compared to others. However, we only investigated microclimates with regard to where people currently visit; it is possible that some parks in this study do have microclimates within their boundaries that are not currently visited, but may see visitation in the future.
In parks that do not have varying microclimates, visitors may be less likely to visit on days with unfavorable temperatures rather than change their spatial behavior within the park. This is consistent with previous research showing visitation declined in some Utah national parks once temperatures were above 25 °C, but visitation continued to increase above this threshold in parks that seemingly had more microclimates 6 . Although this analysis only covered the summer season, it is likely that some trends may be attributed to within-season variability. For instance, it is more likely to be cold in May and September, and hot in July and August. In some mountainous parks, certain roads or trails may be closed at the beginning of the summer season until snow melts. Therefore, visitors may not have had the option to visit some park areas on colder than average days. Parks in the Northwest forested mountains ecoregion are the most likely to have areas closed due to snow, so these managerial factors are likely to have the biggest influence in this ecoregion.
In some parks, visitors' spatial behavior may be driven by managerial factors (i.e., closed roads or trails) rather than solely visitors' decisions.
As with any data source, social media has its limitations. Social media may not be representative of the spatial patterns of all park visitors, since only a small portion of total visitors post photos to Flickr 32,44 . Additionally, some parks tend to have substantially more social media posts than other parks, indicating the most popular parks were overrepresented in this analysis. We explored the impact of weather on visitors at the ecoregion level; however, future research is needed to determine if there is additional variation across parks within the same ecoregion. OpenStreetMap was an excellent resource for large-scale volunteered geographic information, but the accuracy of this data source does vary by location and feature 45-47 . While the road and water features appeared to be complete across all NPS units in this study, the parking and building datasets were likely not entirely complete. In other words, some buildings and parking areas were missing, but all of the parking areas and buildings documented on OpenStreetMap did exist in that location. Therefore, the estimated distances to parking areas and buildings are likely overestimates. In addition, distances to features do not necessarily indicate how far a visitor hikes or ventures; a visitor could hike for over 500 m and still be within 10 m of a road.
Our investigation began with an effort to understand how weather may impact visitors' spatial behavior across NPS units. Further studies could explore if weather changes spatial patterns of visitors outside park boundaries, such as to gateway towns and surrounding parklands. Additionally, future work could explore how weather impacts spatial patterns of visitors to parks in other countries. This approach of using social media data to understand spatial patterns could be replicated in other locations that have daily weather data. We found that the effect of daily weather on visitation patterns was not homogeneous across the U.S. Our results indicated large differences across ecoregions, so results from one ecoregion cannot necessarily be extrapolated to parks with differing climates or topography. We would expect parks in other countries may exhibit comparable results to the ecoregion that has the most similar climate and topography; however, this needs additional research. In addition, this analysis demonstrates the utility of social media for revealing visitation patterns within parks at high spatial and temporal resolutions, which can be useful to understand visitor behavior beyond the context of weather-dependencies 13 .

Conclusions
In certain ecoregions, visitors alter the locations they go to within NPS units based on daily weather conditions. The effect of temperature and precipitation on visitors' spatial behavior varies by ecoregion, likely because the climates, topography, and availability of microclimates within parks differ by these ecoregions. Some parks may see an increase in visitors to higher elevations on hot days, while other parks may see more visitors at lower elevations on hot days. Visitors are overall more likely to stay near infrastructure on rainy days. Park managers should expect spatial distributions of summer visitors within parks to change in the future due to increasing numbers of hot days. In parks that contain more microclimates, visitors may have a greater ability to adapt to adverse temperature conditions by spatially substituting one outdoor recreation setting for another.

Methods
Study sites. Study sites include all NPS units in the continental U.S. larger than 10,000 acres (4047 hectares).
NPS units include national parks, national monuments, national recreation areas, and national seashores, among others. Each park unit was assigned both a level I and a level II ecoregion based on the location of the centroid of the unit. Level I ecoregions represent the most general category, while level II ecoregions are more detailed. For nearly all ecoregions we used the level I ecoregions. However, two level I ecoregions (North American deserts and Eastern temperate forests) were split into their level II ecoregions due to their vast size and the number of study sites contained within them. Figure 1 shows the study sites along with the ecoregion categories used in this paper; a full list of all NPS units included in this study and their ecoregion classifications can be found in Supplementary Table E1.
Data collection and processing. All data used in this paper are publicly available. Table 3 lists all datasets used along with their sources. In cases where an R package is listed as a source, we downloaded the data directly through R, using the specified packages to interact with the Application Programming Interfaces (APIs). All R code written for data collection, processing, and analysis is available 41 .
We downloaded Flickr data within the study sites between May and September, from 2006 to 2018, from the Flickr API using Python. We downloaded these data in October 2019. We deleted any photos by the same user, on the same day, within 10 m of another photo posted by the same user; therefore, we only retained one photo per user, per location. This is similar to the concept of PUDs 29,30 , except we only deleted duplicates in close proximity rather than duplicates anywhere within the unit. We did this because we believed it was important to retain posts by the same user if they were in different locations within the park. Sample sizes by unit are available in Supplementary Table F1.
We joined each Flickr point to the daily weather on that day at that location using weather data from Daymet. Daymet contains weather data for every location in the continental U.S., which is modeled from individual weather station data, and has high accuracy 53,59 . Our analysis does not include any Flickr points tagged in an ocean (e.g., off the coast of a national park) because Daymet does not provide weather estimates over oceans. We also connected each Flickr point to the elevation at that particular location. We downloaded data on roads, waterbodies, parking areas, and buildings from OpenStreetMap in December 2019 (specific information on download criteria is in Supplementary Table G1). For each Flickr point, we calculated the straight-line distance to the nearest road, waterbody, parking area, and building.
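The de-duplication rule described above (one photo per user, per day, per location) can be illustrated with a minimal Python sketch. This is not the authors' code: the record keys (`user`, `date`, `lat`, `lon`), the greedy keep-first ordering, and the use of a haversine distance are all assumptions made for illustration.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def dedupe_photos(photos, radius_m=10.0):
    """Keep a photo unless the same user already has a kept photo
    on the same date within radius_m (greedy, in input order)."""
    kept_by_key = defaultdict(list)
    kept = []
    for p in photos:
        key = (p["user"], p["date"])
        is_duplicate = any(
            haversine_m(p["lat"], p["lon"], q["lat"], q["lon"]) < radius_m
            for q in kept_by_key[key]
        )
        if not is_duplicate:
            kept_by_key[key].append(p)
            kept.append(p)
    return kept


# Hypothetical records: the second photo is about 1.4 m from the first.
photos = [
    {"user": "u1", "date": "2015-07-01", "lat": 37.74560, "lon": -119.59360},
    {"user": "u1", "date": "2015-07-01", "lat": 37.74561, "lon": -119.59361},
    {"user": "u1", "date": "2015-07-01", "lat": 37.75000, "lon": -119.59360},
    {"user": "u1", "date": "2015-07-02", "lat": 37.74560, "lon": -119.59360},
    {"user": "u2", "date": "2015-07-01", "lat": 37.74560, "lon": -119.59360},
]
kept = dedupe_photos(photos)
print(len(kept), "of", len(photos), "photos retained")
```

Unlike a strict PUD count, this rule keeps the third record because it is several hundred metres from the first, even though the user and date match.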
Analysis. Social media data validation. We compared the number of Flickr PUDs within each unit between the months of May and September from 2006 to 2018 to the NPS-reported visitation for each unit during the same time period to ensure the Flickr data are a reliable and representative indicator of visitation. PUD indicates that only one photo per visitor was counted each day; duplicate posts by the same visitor on the same day were removed even if they were in different areas of the park. Subsequent analyses used the full dataset filtered to include just one photo per user, per location. We obtained Spearman's correlation coefficient as a measure of association between Flickr PUD and NPS-reported visitation. We used Spearman's rank correlation because the distributions were found to be non-normal after running a Shapiro-Wilk test. Two parks were not included in this analysis because NPS did not have visitation data for these parks during this time period.
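As an illustration of the validation step, Spearman's rank correlation can be computed as the Pearson correlation of the ranked values. The sketch below uses only the Python standard library; the PUD and visitation figures are invented for illustration and are not values from this study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-unit totals (not data from the paper).
flickr_puds = [12, 340, 55, 980, 7]
nps_visits = [15000, 410000, 90000, 2300000, 30000]
rho = spearman(flickr_puds, nps_visits)
print(round(rho, 3))
```

Because only ranks enter the calculation, the coefficient is unaffected by the heavy right skew typical of visitation counts, which is the reason a rank-based measure suits non-normal data.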
Understanding how weather impacts visitors' spatial behavior. We first explored if and how individual parks have different microclimates (i.e., the park offers different areas where visitors can go that may have slightly different climates). We recorded the differences between the daily maximum temperature and precipitation at Flickr points compared to the main visitor center on that day. We plotted distributions of differences by ecoregion to see if visitors were going to places within parks that have substantially different weather than at the visitor centers.
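The microclimate comparison described above can be sketched as follows. The field names (`ecoregion`, `tmax_point`, `tmax_vc`) and all temperatures are hypothetical, and the summary statistics shown here (standard deviation and share of zero differences) are one plausible way to describe the width of each distribution, not necessarily the ones the authors plotted.

```python
from collections import defaultdict
from statistics import mean, pstdev


def temperature_differences(points):
    """Group (tmax at Flickr point - tmax at visitor center) by ecoregion."""
    diffs = defaultdict(list)
    for p in points:
        diffs[p["ecoregion"]].append(p["tmax_point"] - p["tmax_vc"])
    return dict(diffs)


def summarize(diffs, tol=1e-9):
    """Describe the spread of differences; wider spread hints at more microclimates."""
    summary = {}
    for eco, values in diffs.items():
        summary[eco] = {
            "n": len(values),
            "mean": mean(values),
            "sd": pstdev(values),
            "share_zero": sum(1 for v in values if abs(v) < tol) / len(values),
        }
    return summary


# Hypothetical same-day temperatures in degrees Celsius.
points = [
    {"ecoregion": "mountains", "tmax_point": 28.0, "tmax_vc": 30.0},
    {"ecoregion": "mountains", "tmax_point": 22.5, "tmax_vc": 30.0},
    {"ecoregion": "mountains", "tmax_point": 30.0, "tmax_vc": 30.0},
    {"ecoregion": "plains", "tmax_point": 31.0, "tmax_vc": 31.0},
    {"ecoregion": "plains", "tmax_point": 31.5, "tmax_vc": 31.0},
    {"ecoregion": "plains", "tmax_point": 31.0, "tmax_vc": 31.0},
]
summary = summarize(temperature_differences(points))
for eco, stats in summary.items():
    print(eco, stats)
```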
We then investigated the effect of maximum temperature and precipitation on visitors' spatial behavior by grouping visitors by the weather during the day they visited. For maximum temperature, visitors were grouped into three categories: cold day, average day, or hot day, based on the temperature at the visitor center on the day of the visit. Average days were defined as those within one standard deviation from the unit-specific seasonal mean maximum temperature. Cold days were defined as days with a maximum temperature lower than one standard deviation below the unit-specific seasonal mean maximum temperature. Hot days were classified as days with a maximum temperature greater than one standard deviation above the unit-specific seasonal mean maximum temperature. We grouped these observations by unit rather than ecoregion to reduce bias. For instance, one park within an ecoregion could be warmer than the others; grouping by unit avoids having all data from one park classified in the same temperature category. Precipitation was split into two groups based on whether or not there was precipitation at the visitor center on the day of the visit.
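A minimal sketch of this grouping, under stated assumptions: the input is a mapping from day to maximum temperature at one unit's visitor center, the sample standard deviation is used (the text does not say whether the sample or population form was applied), and a day counts as wet if precipitation is greater than zero. Function and variable names are illustrative.

```python
from statistics import mean, stdev


def classify_days(tmax_by_day):
    """Label each day cold, average, or hot relative to the unit's own
    seasonal mean maximum temperature plus or minus one standard deviation."""
    values = list(tmax_by_day.values())
    mu, sd = mean(values), stdev(values)  # assumption: sample SD
    labels = {}
    for day, tmax in tmax_by_day.items():
        if tmax < mu - sd:
            labels[day] = "cold"
        elif tmax > mu + sd:
            labels[day] = "hot"
        else:
            labels[day] = "average"
    return labels


def is_wet(precip_mm):
    """Assumption: any precipitation above 0 mm at the visitor center counts as wet."""
    return precip_mm > 0


# Hypothetical daily maxima (degrees Celsius) for a single park unit.
tmax_by_day = {"d1": 20.0, "d2": 25.0, "d3": 26.0, "d4": 27.0, "d5": 35.0}
labels = classify_days(tmax_by_day)
print(labels)
```

Because the thresholds are computed per unit, a 30 °C day can be "hot" in one park and "average" in another, which is what prevents a warm park from contributing only hot-day observations to its ecoregion.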
We tested if maximum temperature or precipitation affected: (1) the elevations visitors were traveling to within a park; (2) their distance to roads; (3) their distance to waterbodies; and (4) their distance to designated parking areas or buildings. We ran Welch's ANOVA tests to determine if there were differences in the spatial patterns between cold, average, and hot groups. If the results were significant at the 0.05 level, we ran Games-Howell post-hoc tests to determine where the significant differences were (i.e., if differences were between the cold and average group, hot and average, hot and cold, or all three). We used Games-Howell tests because they do not require the assumptions of equal variances or equal sample sizes to be met 60 . Additionally, if there were significant
Table 3. Datasets and sources used in this paper. a These data also include raw polygon files (representing loop roads) that were converted to line features.
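For readers unfamiliar with Welch's ANOVA as used in the Analysis section, the following sketch computes its F statistic and degrees of freedom from the standard textbook formulas, using only the Python standard library. It is an illustration, not the authors' R code: the p-value (which needs the F distribution) and the Games-Howell post-hoc test are omitted and would normally come from a statistics package. The distances are invented.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Welch's one-way ANOVA for groups with unequal variances and sizes.

    Returns (F statistic, numerator df, denominator df).
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    variances = [variance(g) for g in groups]  # sample variances
    # Each group is weighted by the precision of its mean.
    weights = [n / v for n, v in zip(sizes, variances)]
    total_w = sum(weights)
    grand_mean = sum(w * m for w, m in zip(weights, means)) / total_w
    between = sum(w * (m - grand_mean) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total_w) ** 2 / (n - 1) for w, n in zip(weights, sizes))
    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df_num = k - 1
    df_den = (k ** 2 - 1) / (3 * lam)
    return f_stat, df_num, df_den


# Hypothetical distances to the nearest road (m) on cold, average, and hot days.
cold = [5.0, 8.0, 12.0, 7.0, 9.0]
average = [10.0, 30.0, 55.0, 18.0, 42.0, 25.0]
hot = [15.0, 80.0, 140.0, 60.0]
f_stat, df_num, df_den = welch_anova([cold, average, hot])
print(f_stat, df_num, df_den)
```

The precision weights are what let the test tolerate the very unequal group sizes and variances reported across ecoregions.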