Introduction

Human-caused climate change and the concerning new climate reality continue to threaten the way of life of current and future generations. The development of effective mitigation strategies thus remains a key challenge in the 21st century1, one that requires targeting various aspects of everyday life2,3. Mitigation strategies are often categorized as supply-side and demand-side solutions. While the more traditional supply-side solutions focus on technological advancements to, for instance, decarbonize supply chains and improve the energy efficiency of appliances, demand-side solutions directly address individual behavior and needs, for example by promoting reduced car usage or energy consumption4,5. Supply-side solutions are traditionally considered crucial for reaching global climate goals, but there is also growing acknowledgment of the importance of demand-side solutions in tackling climate change from various angles. This shift reflects the understanding that relying solely on potential technological advancements on the supply side is unlikely to accomplish global climate and sustainability objectives in due time5. However, since demand-side solutions aim to address individual behavior, their effectiveness in the real world relies heavily on individuals’ willingness and capacity to adopt these measures3. To devise effective regulations and policies in this area, it is thus crucial to gain a comprehensive understanding of the factors that drive or hinder individual sustainable behavior in the first place. Consequently, researchers have dedicated immense effort to identifying these factors. To date, studies have recognized a wide range of individual and situational factors that are relevant to individual behavior in various life domains, such as the role of social norms, values and structural barriers for electricity consumption or mobility behavior3,6.
While identifying single factors can advance our general knowledge about existing drivers of and barriers to sustainable behavior, the intricacies of their relationships and their absolute and relative strength of influence on sustainable behavior are still largely unknown. This issue has also been raised in the most recent IPCC report: although a lot of evidence links single factors to individual sustainable behavior, there is still a strong need to investigate and understand the mutual interactions of factors and their relative importance for individual behavioral change3.

In the present study, we aim to close this knowledge gap by (1) considering a multitude of potential influencing factors of sustainable behavior and their interactions in an interdisciplinary setting, (2) measuring individual sustainable behavior in total greenhouse gas (GHG) emissions for different life domains, and (3) using machine learning (ML) models to analyze the data. Taking this specific methodological approach is crucial for several reasons. First, most of the potentially influential factors have been obtained in study designs investigating only a single factor or a handful of factors at a time, even though researchers have repeatedly argued that individual sustainable behavior is shaped by a multitude of factors and their interaction with each other7,8,9,10,11,12,13. Studying factors in isolation, and thus not being able to control for the influence of other factors, can lead to severe over- or underestimation of their relative importance for sustainable behavior14. Consequently, if policy makers base demand-side solutions on factors that do not (or only marginally) affect individual behavior, it is not surprising that these measures fail to reach the desired impact on climate change in the real world15,16,17. Relatedly, even when previous study designs considered multiple factors at the same time, scholars often used simple linear models to analyze the data, effectively neglecting complex (e.g., non-linear) associations among the factors. As a result, the obtained results have likely led to biased conclusions about how particular factors influence behavior in the real world14, which, again, can result in poor regulatory decision making. Beyond studying factors in isolation and neglecting their associations, previous study designs often did not capture impactful behavior, that is, behavior high in GHG emissions18,19.
In detail, studies often investigated specific types of sustainable behavior that can be assessed rather easily but may only exert relatively low GHG emissions, such as self-reported recycling or water-saving behaviors. Notably, the determining factors of such low GHG behaviors can differ drastically from those influencing high GHG behaviors8,20, which may have led researchers and policymakers alike to focus on less important factors for climate change mitigation. On the other hand, studies that did measure high GHG behaviors (such as mobility and consumption patterns or energy use) often only measured parts of the respective behaviors (e.g., traveled distance by car but not accounting for car type or fuel consumption) which may also affect the obtained relative importance of factors. As a result, measuring sustainable behavior and related GHG emissions comprehensively and focusing on behaviors with high GHG impacts seems necessary to devise solutions effectively battling climate change8,20,21.

At present, only a few studies satisfy at least some of the criteria outlined above to a certain degree. For instance, studies have analyzed the influence of multiple factors on mobility behavior22, accounted for complex interplays of factors using ML models to identify high-emission households23, or measured behavior in GHG emissions more comprehensively18. However, to the best of our knowledge, no previous work has met all these study requirements in a single analysis, which would be necessary to comprehensively investigate drivers of and barriers to sustainable behavior and to derive effective demand-side solutions.

To provide a comprehensive view on the factors relevant for the demand-side, we first compiled previously identified factors influencing sustainable behavior in an extensive literature review. We then asked a large sample representative of the general German population to report on all these factors as well as their GHG emissions in different life domains. In a single analysis of this data set we used ML models to predict individual behavior measured in GHG emissions with the candidate factors. We also compare the performance of our ML models to simple linear models which are still predominantly used in the literature. Finally, we analyze the relative importance of all factors in our models to identify which factors should best be targeted to initiate changes in high GHG behaviors. Our findings demonstrate that employing ML models for predicting sustainable behavior not only enhances the accuracy of predictions compared to simple linear models, but also facilitates the identification of important factors among numerous contenders. While certain factors investigated in our study are relevant for sustainable behavior across various life domains (e.g., perceived behavioral control, behavioral habits), others exhibit domain-specific relevance (e.g., infrastructural barriers, pro-environmental attitudes). Our results regarding the relative importance of factors for sustainable behavior are largely in line with those of previous studies but we also uncover notable disparities from previous results regarding the importance of some crucial factors (e.g., easy access to public transportation, income, or availability of sustainable food options). We assert that demand-side solutions aimed at mitigating climate change must recognize the intricate interplay of behavioral drivers. Based on our results, we offer recommendations regarding which factors one may want to focus on to maximize the beneficial effects of behavioral change for climate change mitigation.

Results

Study design

To collect data for our study, we ran surveys in German households from April to July 2021. Recruiting and sampling were conducted by the panel provider Respondi AG, which ensures that its panels are representative with respect to national population statistics. Quotas were set for gender, age, and federal state. The final sample size used in the analyses after performing several quality checks was N = 10,993. More information about the sampling procedure and sample composition can be found in the Supplementary Methods and Supplementary Table 1.

In preparation for constructing our survey, we first identified the relevant factors of individual sustainable behavior in a comprehensive literature review and selected factors that are grounded in established theories of sustainable behavior as well as those that were identified as relevant in meta-analyses and large-scale studies. In general, the identified drivers of and barriers to sustainable behavior can be categorized into internal (i.e., person-related) and external (i.e., situation-related) factors10. Internal factors mainly include psychological factors such as individual beliefs, attitudes, values, and intentions. External factors subsume the political, social, economic, and cultural conditions people find themselves in. A detailed description of the literature review and all factors included in our analysis can be found in the Methods and Supplementary Notes 1 and 2.

To measure impactful sustainable behavior, we asked participants to report on their past behavior in the most important domains of everyday life: (1) shelter (electricity and heating), (2) mobility, (3) consumption, and (4) diet. We then calculated people’s GHG footprint in CO2 emission equivalents to quantify an individual’s contribution to climate change based on validated calculation principles for the German population24,25,26. The detailed calculation and life domain selection principles can be found in the Methods. We then used all the collected internal and external factors to predict domain-specific footprints. To account for the multitude of factors and their associations with each other, we used ML models to analyze the data. To this end, we chose popular models used in previous studies investigating GHG emissions and survey data. Specifically, we used Random Forests (RF), Support Vector Machines (SVM) and Lasso Regression (LASSO) and compared their performance to a traditional linear ordinary least squares regression model (LM), which still represents the current practice of predicting climate-relevant behavior. More information about the ML model selection, working principles and advantages over more traditional models can be found in the Methods and Supplementary Methods.

Footprint prediction capacity of internal and external factors

In the first step of our analysis, we aimed to test whether we can predict GHG emissions in the different domains of life through the internal and external factors included in our study (for an overview of all predictors used in the models, see Tables 1–7). To this end, we evaluated the prediction performance of all models on out-of-sample data (i.e., data the models were not fitted on). That is, we cross-validated all model fits on a training dataset (CV) and assessed the final prediction performance on a separate test set. The mean absolute error (MAE) was used as the general prediction error metric and the explained variance (R²) was used to quantify the variance in the GHG emissions that is accounted for by the internal and external factors. A more detailed description of the analysis procedure and rationale can be found in the Methods and the Supplementary Methods.
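The evaluation logic described above can be sketched in a few lines. The following is a minimal illustration, not the authors' pipeline: the function names, the contiguous fold layout and the example values are assumptions made for this sketch, and the numbers are hypothetical rather than taken from the study.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot; negative when worse than predicting the mean."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k):
    """Yield (train_idx, validation_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, val
        start += size


# Hypothetical footprints in arbitrary units (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
good_pred = [1.5, 2.0, 2.5, 4.0]   # close to the observations
mean_pred = [2.5, 2.5, 2.5, 2.5]   # always predicts the sample mean
bad_pred = [4.0, 3.0, 2.0, 1.0]    # worse than predicting the mean

folds = list(k_fold_indices(n_samples=10, k=3))
```

In this toy example, the mean predictor yields an R² of zero and the reversed predictions yield a negative R², which is the situation in which a model fits worse than the mean baseline. In an actual analysis, each model would be fitted on the training indices of every fold and scored on the held-out validation indices before the final evaluation on the separate test set.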

Table 1 Overview of previously investigated factors (predictors) in the models to predict sustainable behavior.
Table 2 Overview of previously investigated factors (predictors) in the models to predict sustainable behavior (continuation of Table 1).
Table 3 Overview of previously investigated factors (predictors) in the models to predict sustainable behavior (continuation of Tables 1–2).
Table 4 Overview of previously investigated factors (predictors) in the models to predict sustainable behavior (continuation of Tables 1–3).
Table 5 Overview of previously investigated factors (predictors) in the models to predict sustainable behavior (continuation of Tables 1–4).
Table 6 Overview of previously investigated factors (predictors) in the models to predict sustainable behavior (continuation of Tables 1–5).
Table 7 Overview of previously investigated factors (predictors) in the models to predict sustainable behavior (continuation of Tables 1–6).

The results show that all ML models outperformed the standard LM. Specifically, the RF exhibited the highest prediction performance of all models across domains (on average), followed by the SVM, LASSO and LM (see Fig. 1 and Supplementary Discussion for full model description and discussion). Not only did the ML models outperform the LM in most domains, but they also consistently exhibited low variance in prediction performance (Fig. 1, Supplementary Table 4 and Supplementary Discussion). In the best-performing RF, predicting individual GHG emissions was most successful for the mobility and diet domains. In these domains, prediction error on the test set was the lowest (Mobility: RF MAE = 0.59, Diet: RF MAE = 0.68) and explained variance the highest across domains (Mobility: RF R² = 33%, Diet: RF R² = 24%). Relatedly, the biggest performance gains of the RF over the LM were also observed in the mobility and diet domains, with the diet domain showing the highest improvement. In all domains, test set performance of the RF matched the performance of the CV on the training set (test set prediction values are within 1 SD intervals of the CV, Fig. 1).

Fig. 1: Prediction performance of individual GHG emissions in all life domains (cross-validation and independent test set).

a Explained variance and b mean absolute prediction error of the Random Forest (RF), Linear Regression (LM), Lasso Regression (LASSO) and Support Vector Machine (SVM) models. Colored error bars indicate performance in the 10-fold cross-validation (CV) on the training set, spanning a 1 SD interval. Star and square indicators represent average performance across folds. Upward-pointing triangles below the error bars indicate prediction performance on the independent test set. Dashed lines indicate the absolute performance difference between the LM and RF. Values are rounded to two decimal places. Negative R² values indicate a worse model fit than always predicting the mean GHG emission (R² = 0) (see Supplementary Methods).
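The comparison between cross-validation and test set performance (test values lying within a 1 SD interval of the CV scores) can be expressed as a simple check. This is a sketch under stated assumptions: the function name is hypothetical, the fold scores are invented for illustration, and the sample standard deviation is used because the text does not specify which SD estimator was applied.

```python
from statistics import mean, stdev


def within_one_sd(cv_scores, test_score):
    """True if the test-set score lies inside mean +/- 1 SD of the CV folds."""
    centre = mean(cv_scores)
    spread = stdev(cv_scores)  # sample SD (assumption; estimator not stated)
    return centre - spread <= test_score <= centre + spread


# Hypothetical fold-wise MAE values from a 10-fold CV (not study results).
cv_mae = [0.58, 0.61, 0.60, 0.57, 0.62, 0.59, 0.60, 0.58, 0.61, 0.59]

consistent = within_one_sd(cv_mae, test_score=0.60)     # inside the interval
inconsistent = within_one_sd(cv_mae, test_score=0.70)   # outside the interval
```

A test score far outside this interval would suggest that the cross-validated estimate does not generalize to the held-out data.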

These results illustrate three important points. First, the low overall prediction performance and high variance in performance of the LMs indicate that using standard LMs could have led to overestimation of model performance in the past, potentially leading to misjudgments of the relative importance of factors for predicting individual behavior. Second, the ML models captured the relationships between internal and external factors and individual GHG emissions more accurately. Third, the performance gains of more complex (non-linear) models like RF and SVM over linear models (LM and LASSO) indicate that internal and external factors have complex relations with and interactive effects on sustainable behavior (Fig. 1), which empirically supports the assumption that individual sustainable behavior is influenced by various interacting factors12.

Key drivers and barriers of impactful sustainable behavior in domains mobility and diet

Besides overall GHG emission prediction performance, we were also interested in evaluating the relative importance of the factors included in our models. Since the RF performed best among all models and the prediction of GHG emissions worked best in the mobility and diet domains, we focused on analyzing factor importance of the RF in these domains. Further, mobility and diet represent the first and second highest contributing domains to households’ overall GHG emissions in many European countries27,28. Thus, it seems promising to develop tailored GHG reduction programs for them. Note that due to the high number of factors investigated in our analysis and our study goal of identifying important factors, we focus in the following only on those factors that contribute considerably to the model prediction (see Methods). All important factors for predicting mobility- and diet-related emissions in our models are depicted in Figs. 2 and 3, sorted by their relative importance.

Fig. 2: Predictor importance for individual GHG emissions for mobility domain in the Random Forest.

Summary (beeswarm) plot and predictor importance (black bars) for life domain mobility. Predictor importance was calculated using Shapley values. The summary plot shows the relationship of individual predictor values with model prediction compared to the average prediction. Dots represent individuals in the dataset; overlapping points are jittered on the y-axis. Individual values on the respective predictors range from low (blue hue) to high (red hue). Positive SHAP values indicate a change in model prediction towards higher emissions. Black bars indicate the overall importance of the predictor for the model prediction performance. Predictors are sorted by their relative importance. Only predictors with average importance (mean SHAP value) above the mean importance of all predictors are shown (i.e., most important predictors), but the plot is based on including all predictors in the model. For more information, see Supplementary Methods.
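For readers unfamiliar with Shapley values, the following sketch computes them exactly for a two-predictor toy model. It is an illustration only: the toy model, its coefficients and the single reference point are hypothetical, and practical SHAP implementations for Random Forests rely on approximations over a background dataset rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, reference):
    """Exact Shapley values of `model` at `instance` relative to `reference`.

    Predictors outside a coalition are set to their reference value.
    """
    n = len(instance)

    def value(coalition):
        mixed = [instance[i] if i in coalition else reference[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = set(subset)
                total += weight * (value(without_i | {i}) - value(without_i))
        phi.append(total)
    return phi


def toy_model(x):
    """Hypothetical model: added travel time matters only for car owners."""
    car, added_time = x
    return 2.0 * car + 1.0 * car * added_time


phi = shapley_values(toy_model, instance=[1, 1], reference=[0, 0])
```

Here the contributions of the two predictors sum to the difference between the prediction for the instance and the prediction for the reference point, and the interaction term is split evenly between the two predictors involved.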

Fig. 3: Predictor importance for individual GHG emissions for diet domain in the Random Forest.

Summary (beeswarm) plot and predictor importance (black bars) for life domain diet. Predictor importance was calculated using Shapley values. The summary plot shows the relationship of individual predictor values with model prediction compared to the average prediction. Dots represent individuals in the dataset; overlapping points are jittered on the y-axis. Individual values on the respective predictors range from low (blue hue) to high (red hue). Positive SHAP values indicate a change in model prediction towards higher emissions. Black bars indicate the overall importance of the predictor for the model prediction performance. Predictors are sorted by their relative importance. Only predictors with average importance (mean SHAP value) above the mean importance of all predictors are shown (i.e., most important predictors), but the plot is based on including all predictors in the model. For more information, see Supplementary Methods.
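The selection rule used in these figures (showing only predictors whose average importance exceeds the mean importance of all predictors) can be sketched as follows. The function name, the predictor names and the SHAP values are hypothetical, and the sketch assumes that average importance refers to the mean absolute SHAP value per predictor.

```python
from statistics import mean


def important_predictors(shap_by_predictor):
    """Rank predictors by mean |SHAP| and keep those above the overall mean."""
    importance = {
        name: mean(abs(v) for v in values)
        for name, values in shap_by_predictor.items()
    }
    threshold = mean(importance.values())
    kept = [name for name, score in importance.items() if score > threshold]
    return sorted(kept, key=importance.get, reverse=True), importance


# Hypothetical per-individual SHAP values (three individuals per predictor).
shap_by_predictor = {
    "car_possession": [0.8, -0.6, 0.7],
    "income": [0.3, -0.3, 0.3],
    "age": [0.05, -0.05, 0.05],
    "gender": [0.01, -0.02, 0.0],
}

kept, importance = important_predictors(shap_by_predictor)
```

Taking absolute values before averaging prevents positive and negative contributions from cancelling, so a predictor that pushes predictions strongly in both directions still ranks as important.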

Within the mobility domain, two of the most important demographic factors predicting GHG emissions are income and professional status. Employed and wealthier people exhibited much higher mobility-related GHG emissions than people who are retired and/or have lower incomes (Supplementary Figs. 4 and 5). This finding is consistent with previous studies investigating impactful behavior and the environmental impact of affluent citizens5,18,20,22,29. Further analysis of these relationships indicated that higher-income individuals are more likely to possess cars (the most important predictor of GHG emissions in our model) and tend to travel by car more frequently (Supplementary Table 5). Again, this finding is in accordance with previous findings30,31. The relationship of GHG emissions with professional status, on the other hand, seemed to be exclusively driven by retired individuals, since all other groups were attributed a comparable amount of GHG emissions in our model (Supplementary Fig. 5). Whereas previous research seems to be inconclusive regarding travel activity at retirement age32,33,34, our results suggest that retired individuals exhibited lower mobility-related GHG emissions. This may be due to their overall low usage of high-GHG means of transportation such as cars, planes, and buses (Supplementary Table 8).

Of the internal factors, perceived behavioral control constituted the most important predictor of mobility-related GHG emissions. Low subjective control over using alternative means of transportation to get to places of interest (e.g., work, grocery stores) was associated with higher GHG emissions (Fig. 2). This result illustrates that if individuals perceived alternative mobility options like public transportation or bike riding as not feasible, they exhibited higher GHG emissions, likely due to more frequent and extensive car use (Supplementary Table 5). Similarly, negative emotions, attitudes (behavioral beliefs) and low personal norms towards alternative means of transportation as well as dissatisfaction with alternative mobility options were associated with higher GHG emissions (Fig. 2). That is, if individuals were dissatisfied with the infrastructural circumstances of public transportation, simply did not like sitting in buses or riding a bike to work, or if their family and friends also did not use alternative options, they may have been more likely to resort to the car (Supplementary Table 5). People’s travel mode habit was another important predictor of GHG emissions in our analysis. Individuals who routinely chose alternative means of transportation over driving had much lower GHG emissions than their counterparts. This finding was expected, since travel mode choices are highly habitual overall35. Although less important for GHG emission prediction overall, behavioral beliefs, norms and perceived behavioral control regarding air travel seem to influence people’s mobility footprint as well. As with car usage, if individuals and their social contacts held positive attitudes towards air travel and saw no feasible alternatives, they were more likely to fly, leading to higher emissions (Supplementary Table 6).

Although these findings suggest that mobility-related GHG emissions are largely under individual control through travel mode choices, which is in line with some previous arguments22,36, we also found external factors affecting travel mode choice to be particularly relevant in predicting GHG emissions. The relative importance of these factors in our models strongly highlights the role of mobility infrastructure and living circumstances. Most prominently, added travel time when using alternative mobility options played a major role (most important predictor after car possession and income) in mobility behavior: if individuals had to spend considerably more time on their daily trips when using alternative mobility options, their footprint increased substantially (Supplementary Fig. 6), presumably due to more extensive car usage (Supplementary Table 7). This result is in line with findings from another recent study showing that relative travel time (among other travel mode attributes) affects subsequent commuting behaviors37. As expected, people reporting low subjective control over and less habitual use of alternative options for their everyday trips also mostly reported high levels of added travel time for these options and vice versa (Supplementary Figs. 8 and 9). Furthermore, travel time requirements seemed to interact with some internal factors. The GHG-reducing effect of positive attitudes and emotions towards travel modes like trains and buses seemed to be less pronounced for individuals who face substantially prolonged travel times when using these alternative means (Supplementary Figs. 10 and 11). Although to a lesser extent, external factors like the physical distance to a train station and living area (city vs. rural) also seem to influence mobility decisions and resulting GHG emissions (Fig. 2, Supplementary Fig. 7, and Supplementary Table 7). This finding is partly consistent with previous study results15,22,38.

Next, we focus on the diet domain (Fig. 3), which highlights the importance of internal factors. Dietary habits were the key determinant of individual GHG emissions. That is, self-reported automaticity of sustainable diet habits was most predictive of a low-emission diet. Although still exhibiting high relative importance, other internal factors like perceived behavioral control, behavioral intentions, and attitudes (behavioral beliefs) regarding dietary behavior, which have previously been claimed to be the main predictors in this domain39,40,41, proved less important than dietary habits. Two main reasons arguably account for this result. First, our footprint calculation incorporated the purchasing frequency of local, seasonal, and organically produced foods as well as individuals’ dietary form. This constitutes a more comprehensive behavioral measurement than that of many of the above-mentioned previous studies (e.g., buying frequency of “eco-friendly”, “green”, or organically produced food, or meat consumption). Second, an individual’s diet (e.g., omnivorous, vegetarian, vegan) seems to be comparatively stable over time and is the single most important determinant of overall diet GHG emissions42,43,44,45. This is arguably why we found dietary habits to be very influential. Moreover, contemporary studies showed diet habits to significantly reduce the influence of other factors on diet behavior44,46, which again speaks to the necessity of investigating factors of sustainable behavior jointly to assess their relative importance.

Value orientations and social norms were also relevant in predicting diet-related GHG emissions. People pursuing goals of social status (power) or pleasure and gratification (hedonism) exhibited higher diet-related GHG emissions. Relatedly, politically conservative individuals showed higher diet-related GHG emissions than liberals (Ideology), and individuals believing that behaving sustainably is part of being a good citizen (Citizen norms: Sustainability) showed lower diet-related GHG emissions overall. These results are in accordance with previous research on meat consumption and veganism showing that, for instance, conservative people and those valuing goals of power reported higher meat consumption and more lapses from a vegetarian and vegan diet44,47,48. Further, our findings suggest that individuals scoring high on conscientiousness and neuroticism eat more sustainably. In general, the literature on the relationship between dietary patterns and personality traits has revealed mixed findings in the past49. Our results, however, align with more recent studies and meta-analyses showing that high conscientiousness is associated with higher fruit and vegetable intake and healthier diets in general, and that vegetarians report higher levels of neuroticism49,50,51. In contrast to many previous studies on ecological food consumption52, we did not find income to be a limiting factor in pursuing sustainable diets in general. Importantly, consumption of ecologically produced foods represents only one part of the diet GHG equation44. Pursuing a conventionally produced, mostly plant-based diet, on the other hand, is usually not more expensive than a meat-based diet53. Thus, lower income may not inevitably lead to high GHG emissions, an assumption corroborated by our results.

Surprisingly, and in contrast to the mobility results, external factors did not seem to be overly relevant for diet-related GHG emissions. The only two relevant external factors were the availability of sustainable food in the restaurants or canteens people usually eat at and whether people eat out regularly. Those who eat out more regularly and reported lower availability of sustainable food options in their local restaurants exhibited higher GHG emissions (Fig. 3). Otherwise, knowledge about sustainable food options and the availability or identifiability of local, seasonal, and organically produced foods in supermarkets, for instance, do not seem to be limiting factors. Notably, this finding is in line with only a few previous studies on sustainable food consumption54,55.

In general, our results regarding the diet domain support claims of more general theories of sustainable behavior proposing human values and pro-environmental attitudes to be precursors of sustainable behavior12,56,57. That is, diet-related CO2 emissions seem to be driven not only by diet-specific internal factors like habits or perceived control but also by individuals’ personality and more general (ideological) beliefs, values, and norms revolving around sustainability, which can form the motivational basis of dietary patterns. However, as argued above, the relative importance of the latter factors is lower in comparison to the former factors (see Fig. 3).

Drivers and barriers of impactful sustainable behavior in domains shelter and consumption

In contrast to the mobility and diet domains, attempts to predict people’s shelter (electricity and heating) and consumption GHG emissions were only partly successful. Notably, previous studies predicting shelter-related GHG emissions showed better performance and found demographic factors (e.g., income, age), external factors (e.g., community zone) and dwelling characteristics (e.g., fuel and apartment type, household size) to be relevant58,59,60. Although we found similar factors like living area, income and age to be among the most important factors in the shelter domain in our models (Fig. 4), the overall prediction accuracy of our models in this domain remains relatively low (see Fig. 1 and Supplementary Discussion).

Fig. 4: Predictor importance for individual GHG emissions for shelter domain in the Random Forest.

Summary (beeswarm) plot and predictor importance (black bars) for life domain shelter. a Domain Shelter: Electricity, b Domain Shelter: Heating. Predictor importance was calculated using Shapley values. The summary plot shows the relationship of individual predictor values with model prediction compared to the average prediction. Dots represent individuals in the dataset; overlapping points are jittered on the y-axis. Individual values on the respective predictors range from low (blue hue) to high (red hue). Positive SHAP values indicate a change in model prediction towards higher emissions. Black bars indicate the overall importance of the predictor for the model prediction performance. Predictors are sorted by their relative importance. Only predictors with average importance (mean SHAP value) above the mean importance of all predictors are shown (i.e., most important predictors), but the plot is based on including all predictors in the model. For more information, see Supplementary Methods.

One explanation for the lower prediction accuracy in our models may be that shelter-related emissions are under lower personal control than emissions in other domains18,61. This is indicated by the relatively high factor importance of perceived control in our models (Fig. 4) and becomes even more apparent when reviewing the determining factors of shelter-related GHG emissions: the amount of energy necessary for space and water heating (and the subsequent GHG emissions) is largely determined by the installed heating source (i.e., fuel type), the type and state of the dwelling, and its size62. An oil heating system produces, all other things being equal, much more GHGs than, for instance, an electric heat pump powered by solar energy63. Similarly, a newly built, low-energy house radiates much less heat to the outside than a mid-18th-century building. At the same time, individuals living in large, single-family homes produce much more GHGs than individuals living in a small flat62. Simply put, if people live in spacious, old buildings with poor insulation and high-emission energy sources, they can heat as frugally as they want and still produce large amounts of GHG emissions18,25. Regarding electricity, the most important determinant of GHG emissions is the generation source61,64. High-GHG electricity generation can quickly undermine the benefits of saving behaviors61. Therefore, consumers receiving electricity generated from renewable sources (e.g., solar, wind or hydroelectric) can use considerably more energy than their non-renewable counterparts and still produce fewer GHGs (see also current net avoidance factors in Germany25,65).

Of course, we could have simply included dwelling characteristics and electricity type in our model as additional external factors. However, in doing so we would have confounded predictor and outcome variables because these variables are already included in carbon footprint calculators (such as the one we used) to estimate individual GHG emissions. For instance, people’s heating behavior (e.g., thermostat setting, ventilation habits) is offset against their dwelling characteristics (e.g., fuel type, insulation, living space, household size) when their footprint is estimated (Supplementary Note 3). Since dwelling and electricity generation characteristics are the main determinants of individual GHG emissions, other internal and external factors in our models can only account for the effect of individual behavior regarding heat and electricity savings on GHG emissions. Individual savings behavior, however, affects the overall footprint to a much lesser extent, as argued before25. The fact that we did not include dwelling characteristics and electricity generation in our models may partly explain the lower prediction accuracies of our models compared to previous studies which did include factors like fuel type, household size, and apartment type to predict shelter related emissions. Nevertheless, we consider it important to not conceptually confound predictors with the measurement of the to-be-predicted behavior. In summary, regulatory efforts on the supply-side (e.g., implementation of renewable electricity generation and energetically efficient dwellings) as well as demand-side (e.g., targeting people’s needs regarding dwelling characteristics) seem to be the most promising approaches to notably reduce shelter GHG emissions.

A challenge for investigating consumption related GHG emissions is the traceability of purchased products’ complete environmental life cycle. That is, calculating accurate footprints would require assessing every product bought in the year of interest, including a detailed description of the brand, type, and origin of goods to estimate product lifecycle emissions66. Even though tracking of singular products might be possible, assessing and calculating direct and indirect GHG emissions for all products that individuals bought over a year is virtually unattainable. As a result, we surveyed more general consumption patterns (e.g., frequency of buying second-hand goods, monthly spending on goods) based on recommendations in previous studies (Supplementary Note 3) to calculate consumption based GHG emissions. However, this estimation method might be more prone to memory errors, leading to less accurate GHG emission values. In fact, the GHG distribution of our sample in this domain differed the most from the expected distribution for the German population (Supplementary Fig. 3). Although in our models we found factors which were relevant for consumer behavior in previous studies (e.g., income59 and habits67) to also be the most important predictors of consumption GHG emissions (Fig. 5), the overall model prediction performance remains relatively low (Fig. 1). For the present study, we found no feasible way of calculating consumption related GHG emissions more accurately. Future research could explore the utility and improvements in calculation accuracy of alternative approaches (e.g., longitudinal recordings of goods consumption via mobile apps).

Fig. 5: Predictor importance for individual GHG emissions for consumption domain in the Random Forest.

Summary (beeswarm) plot and predictor importance (black bars) for life domain consumption. Predictor importance was calculated using Shapley values. The summary plot shows the relationship of individual predictor values with model prediction compared to the average prediction. Dots represent individuals in the dataset, overlapping points are jittered on the y-axis. Individual values on the respective predictors range from low (blue hue) to high (red hue). Positive SHAP values indicate a change in model prediction towards higher emissions. Black bars indicate the overall importance of the predictor for the model prediction performance. Predictors are sorted by their relative importance. Only predictors with average importance (mean SHAP value) above the mean importance of all predictors are shown (i.e., most important predictors) but the plot is based on including all predictors in the model. For more information, see Supplementary Methods.

Discussion

Contemporary solutions to climate change mitigation target individual behavior and needs. Developing these solutions effectively, however, requires a comprehensive understanding of the factors influencing individual sustainable behavior in everyday life as well as their interactions. In this regard, our study provides important insights.

First, we were able to empirically demonstrate that individual sustainable behavior in different life domains is shaped by a multitude of different factors and their mutual interactions. The higher prediction accuracy of the ML models compared to traditional linear regression models commonly used to study factors of sustainable behavior shows that the former, more complex models are more appropriate for investigating predictors of GHG emissions.

Second, we were able to analyze the relative importance of predictors of GHG emissions more accurately than previous studies by considering multiple predicting factors for different life domains simultaneously, accounting for their complex interplay, measuring behavior comprehensively, and validating results on out-of-sample data. We found many important factors in our models for different life domains to resemble those identified in previous studies, but we also found the importance of quite a few factors to deviate from previous literature.

For policy makers, our results suggest that reducing mobility related GHG emissions demands action on multiple levels. First and foremost, alternative mobility options like public transportation, car sharing, or bike riding must become more attractive and widely available. As indicated by our results, people will not refrain from high GHG transportation means such as cars or national flights if alternative options are inaccessible, inappropriate, overly time-consuming, or simply out of their behavioral control, even if they have the desire to use them. These findings are in line with previous studies on external factors of mobility behavior15,38 and also point towards so-called “lock-in effects”: The situational and infrastructural circumstances of individuals’ living sectors may lock them in to certain behaviors (i.e., car usage) to fulfill their needs, like going to work or grocery shopping, which makes direct behavioral interventions less effective68,69,70. These circumstances can only be improved by investing in reliable, fast, and ample public transportation infrastructure and changes in city designs and land use29,70. Unlocking travel mode choices and satisfying individual needs through targeting mobility circumstances is likely to lead to changes in mobility behavior and may also reduce the strong effect of income on mobility GHG emissions found in our analysis and in previous studies3,29. However, when implementing such measures, policy makers must be aware of “rebound effects” (i.e., the saved time and money is spent on other consumables or increased leisure air travel) that may negate some of the GHG reductions29. An alternative approach to reduce the particularly strong effect of income on mobility related GHG emissions may also include more general socio-ecological transformations, such as reducing contracted working times and discussing ways of economic de-growth while still maintaining (or even improving) well-being. However, the wider implications of such measures have been debated29.

Although situational and infrastructural barriers limit individual agency and the potential effectiveness of behavioral interventions, our results indicate internal factors (e.g., attitudes, norms, and habits) to be important for choosing travel modes. Therefore, directly targeting internal factors seems promising for reducing GHG emissions. This may be addressed by supporting habit changes71,72, offering financial incentives to use and continue using alternative options15,17,73,74, or promoting their benefits75 to change people’s attitudes and norms. The impact of more general factors of pro-environmentalism (e.g., environmental education, perceived consequences and responsibility or norms regarding climate change) on reducing mobility related GHG emissions seems negligible, which is in line with some previous studies22,72,76. Nevertheless, implementing regulative measures that target these factors (e.g., pointing out consequences or individual responsibility for climate change) may still lead to behavioral changes in the long run3,17.

Regarding diet related GHG emissions, our results highlight the potential and need for demand-side solutions. Dietary habits seem to be the main component of diet-related GHG emissions. Therefore, extrapolating habit-breaking strategies that have been shown to, for instance, help reduce meat consumption or encourage healthier eating to sustainable diets in general may be promising46,77. Although to a lesser extent, positive attitudes, norms and perceived control towards sustainable diets and sustainability in general are also predictive of diet behavior. Since these factors are precursors of intention formation78 and motivate behavior12, regulatory efforts may be best invested in promoting the multitude of positive effects of sustainable diets (e.g., climate and health benefits17,44). In contrast to previous suggestions, factors like availability of sustainable food options in supermarkets, their identifiability or individual income do not seem to contribute much to dietary choices and thus may not be promising targets for encouraging GHG-emission-friendly diets. Crucially, however, this does not imply that targeting availability and prices of conventional food options (meat products in particular) would also be inefficient in reducing GHG emissions. Quite the contrary, implementing policies that shift financial and structural power from conventional food lobbies and the meat industry to the organic sector may substantially complement the effectiveness of campaigns promoting dietary shifts79.

Furthermore, our findings indicate that shelter-related emissions are strongly influenced by dwelling and electricity generation characteristics. Consequently, a combination of supply-side and demand-side solutions should have substantial effects on GHG emissions in this domain. Making use of technological advancements in dwelling solutions and energetically renovating buildings contribute more to GHG reductions than, for instance, information campaigns on saving energy or promoting environmental awareness. This notion is backed up by multiple studies in environmental science literature showing the importance and climate benefits of directly targeting dwelling characteristics, like house construction and heating systems60,80,81,82. Similar to heating, expanding renewable electricity production has a more direct effect in reducing GHG emissions than merely educating or encouraging people to save electricity61. However, large-scale decarbonization of the electrical grid as well as building and adaptation of more energy efficient homes may not be enough to meet global climate goals. Within the demand-side framework, researchers have advocated for implementing policies that more directly target resource-intensive living standards. As mentioned previously, the size of dwellings and their type (single family vs. multifamily housing) particularly contribute to shelter-related GHGs. Thus, promoting broader societal changes by directly addressing city planning and individual lifestyles, for instance by supporting compact city designs and multifamily housing as well as incentivizing individuals to reduce their living space while still maintaining personal well-being, might also be necessary3,5,68.

Our study comes with a few limitations. Although we did our best to identify all previously investigated internal and external factors influencing sustainable behavior, there is a possibility that certain factors may have eluded our investigation. Nonetheless, based on previous study results and our comprehensive review of existing literature, it seems likely that we managed to include the most important factors in our analyses. Further, whereas the present study investigated individual behavior, emissions directly caused by individuals in their daily life only constitute part of a population’s overall GHG emissions, around 18% of total GHG emissions in Germany65, which seems representative of many Western civilizations28,83,84. Therefore, mitigation efforts must be extended to infrastructural, agricultural, and industrial sectors which emit the remaining part of total GHGs. Notably, average footprints and the observed drivers and barriers of sustainable behavior might not be the same everywhere across the globe85 and thus the relative impact of regulatory measures might also vary between countries as well as over time18,28,86,87. Therefore, future research should extend the proposed design to non-Western countries and also observe changes in behavior and their drivers and barriers over time, another knowledge gap identified in the recent IPCC report3. Being able to flexibly adjust regulative measures in different transition phases and contexts based on knowledge about the respective factors and their interaction could speed up climate change mitigation drastically3. Although we measured impactful behavior by using externally validated GHG footprint calculators, we were unable to examine the calculation principles ourselves due to restrictions in code accessibility. Therefore, future research efforts may focus on further development of reliable GHG assessment tools and make their source code available to interdisciplinary research teams. Due to our overall study aim of identifying the most important factors driving and hindering sustainable behavior, we did not follow up on all potential interactions or relationships between predictors implicitly captured by our ML models. However, since there are likely to be more important interactions that could lead to even more tailored regulative strategies, future studies could use our data set and test for specific interaction effects.

Our findings demonstrate that the interplay of drivers and barriers to individual GHG emissions in everyday life is complex. We thus argue that factors influencing sustainable behavior should be investigated with approaches which are able to account for this complexity. We found an overall high effect of internal factors such as perceived behavioral control, habits, and attitudes on individual GHG-emission-friendly behavior. However, in some life domains, their impact can be altered or even extinguished by external factors such as infrastructural barriers or dwelling characteristics. Policy makers thus need to consider these complex interplays and may focus on the most important factors when designing demand-side solutions to climate change mitigation targeting individual behavior and needs.

Methods

Literature review and selection of influencing factors

We first performed a systematic literature review to identify the relevant factors of individual sustainable behavior. We selected those factors which are grounded in established theories on sustainable behavior as well as those which were relevant in meta-analyses, large-scale studies and climate change mitigation reports. Since the final list of identified factors was extensive, we provide a summarized overview (see Tables 1–7). A more in-depth description of the literature review, identified factors and previous research results can be found in Supplementary Notes 1 and 2 and Supplementary Fig. 2.

In general, the identified drivers and barriers of sustainable behavior can be categorized into internal (i.e., person-related) and external (i.e., situation-related) factors10. Internal factors consisted of constructs derived from the Theory of Planned Behavior, Value-Belief Norm Theory, and habit formation. These include attitudes, (personal) norms and values, perceived behavioral control, behavioral intentions, climate change awareness and personal responsibility, among others. We further considered environmental knowledge, emotions towards sustainable behaviors, political attitudes and voting intention as well as media consumption and demographics as internal factors. External factors subsume political, social, economic, and cultural conditions people find themselves in. They were highly domain-specific and included items like accessibility to information and feedback about energy and heating behavior, situational possibilities of eating sustainably or infrastructural mobility circumstances (e.g., access to public transportation).

Measuring sustainable behavior

We measured impactful sustainable behavior by calculating people’s GHG footprint in CO2 emission equivalents to quantify an individual’s contribution to climate change. Unlike most previous studies, we assessed emissions separately for the most important domains of everyday life: 1. shelter (electricity & heating), 2. mobility, 3. consumption and 4. diet. This is crucial, since individuals who behave sustainably in one life domain (e.g., exclusively use public transportation) do not necessarily behave sustainably in another life domain (e.g., refrain from eating animal products), and identifying a factor as impactful in one life domain does not imply it having the same effect in other domains11,58,88,89. The respective (sub-) domains were derived from previous studies on GHG emission sectors86 and guidelines proposed by the German Federal Environmental Office24 in cooperation with two major non-profit organizations for investigating sustainable behavior in Germany, ifeu gGmbH25 and KlimAktiv gGmbH26. In their work, the authors identified important sectors of GHG emissions and presented suggested methods of assessment and footprint calculation principles, which we adapted. The authors further divided shelter related emissions into electricity and heating related emissions, a division we also adopted. Doing so, we adhered to validated measurement strategies, following most of the best practice measurement principles for footprint calculators90, accounted for the high region-specificity of individual footprints and emission factors18,86 and assessed drivers and barriers of electricity and heating behavior separately. For three out of the four life domains of interest, we used the official footprint calculation tool issued by the German Federal Environmental Office24. For the mobility domain, we calculated individual carbon footprints based on current insights of carbon emission budgeting and emission factors for the mobility domain in Germany. These calculations were performed by the TdLab Geography research group (for more information on the calculation principles, see Supplementary Note 4). Prior to the final calculation, values for each item of the respective calculators were plotted and inspected. Implausible values were identified and removed/recoded by using existing statistics about individual behavior or theoretical maxima of specific items (e.g., >30 liters of fuel consumption for a regular family sedan or the theoretical maximum duration of an inner-European flight). In some cases, we were not able to identify specific cut-off values for items and relied on using boxplot statistics to remove/recode values. After the final calculation, each participant was assigned their designated carbon footprints, represented in CO2 equivalents. The respective mean CO2 equivalents for all life domains of our sample (except for consumption) fell within the expected range of the official governmental report of Germany (Supplementary Fig. 3). Initial preprocessing was done using the R programming language; final preprocessing and calculation steps were performed using the Python programming language. For more information on the specific preprocessing steps, calculation principles for each life domain and formulas, see Supplementary Notes 3 and 4.
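The boxplot-based screening mentioned above can be sketched as follows. This is a minimal illustration and not the authors' code: the conventional 1.5 × IQR fences, the function names `boxplot_fences` and `recode_outliers`, the choice to recode flagged values as missing, and the example data are all assumptions, since the text only states that boxplot statistics were used.

```python
import statistics

def boxplot_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR.

    k=1.5 is the conventional boxplot whisker rule (an assumption here).
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def recode_outliers(values, k=1.5):
    """Replace values outside the fences with None (treated as missing)."""
    lower, upper = boxplot_fences(values, k)
    return [v if lower <= v <= upper else None for v in values]

# Hypothetical calculator item: self-reported fuel consumption in liters.
fuel = [5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 9.0, 45.0]
cleaned = recode_outliers(fuel)  # only the implausible 45.0 is recoded
```

Whether a flagged value is dropped, capped, or set to missing is a separate decision; the sketch recodes to missing so that the remaining answers of that participant stay usable.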

Model selection

To analyze the relative importance of internal and external factors for individual sustainable behavior, we used ML models. Compared to traditional statistical models, a main advantage of ML models is that they are better able to quantify the impact of different internal and external factors in everyday life when many other potentially influential factors are present23,91,92. This is because ML models can learn complex associative patterns (e.g., non-linear relationships, higher order interaction effects and interdependence between variables) directly from the data, without the need to specify all potential patterns beforehand93,94. This feature is crucial, since manifold relationships between internal and external factors influencing sustainable behavior exist in the real world (as argued before) and the models used to predict sustainable behavior must be able to capture this complexity.

Despite their increasing popularity in environmental and social sciences, ML models have not yet been widely applied to identify factors influencing climate-relevant behavior. Although a few recent studies deployed ML models to analyze influencing factors of emissions, the models used either were not able to account for complex interactions between factors60, focused on specific intervention effects on GHG emissions95 or were used to identify overall household emission clusters23. In our study, however, we aim to predict individual GHG emissions (in all relevant life domains) through a multitude of internal and external factors and account for their complex interplay. To do so, we applied different ML models. First, we chose the popular Random Forest (RF) ML model due to its ability to approximate any input-output mapping function, its ability to cope with small to medium-sized datasets and, on average, its higher prediction accuracies on survey data compared to other ML or traditional models91,94,96. We also directly compared the RF’s performance to two other popular ML models, a linear LASSO regression and a non-linear SVM, which delivered promising results in related work on predicting GHG emissions mentioned earlier60,95. We compared the ML models’ performance to a traditional ordinary least squares linear regression model, which represents the current practice of predicting climate-relevant behavior. For detailed information about the ML models used and full model results, see Supplementary Methods and Supplementary Discussion.

Analysis procedure

All analysis steps were performed using the Python programming language version 3.8, as well as computational libraries such as scikit-learn97, numpy98, pandas99 and scikit-optimize100. Prior to the analysis, the complete dataset was shuffled and randomly split into an 80% training set and a 20% testing set. The final sample sizes for the training and testing set per domain can be found in Supplementary Table 2. This split allows us to assess the resulting models’ prediction performance on out-of-sample data. We did so because interpreting models that do not generalize well to out-of-sample data (i.e., over-/underfitted models) can lead to biased conclusions regarding the relationships of predictors with the outcome. Focusing on out-of-sample prediction performance is thus crucial to assess how robustly a model captures patterns in the data for a studied population93. In turn, focusing on prediction performance on out-of-sample data aids our main study goal of assessing the relative importance of individual factors for sustainable behavior and their complex relationships with each other more accurately.
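As an illustration of the shuffled 80/20 split described above, the following sketch uses only the Python standard library. It is not the authors' implementation (the study used scikit-learn, which offers an equivalent utility); the function name `shuffle_split`, the seed, and the participant IDs are hypothetical.

```python
import random

def shuffle_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists.

    The fixed seed is an assumption made for reproducibility of the sketch.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

participants = list(range(100))  # hypothetical participant IDs
train_ids, test_ids = shuffle_split(participants)
```

Keeping the test participants untouched until the final evaluation is what makes the reported performance an out-of-sample estimate.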

All internal and external factors (Tables 1–7) were entered as predictors into the ML models and the linear regression model (LM) with the respective footprints for each life domain as the dependent variables (see Supplementary Note 2 for more detailed information about the predictors). To ensure prediction suitability of the constructs used101, we examined the scale reliability, which revealed good overall internal consistency (α > 0.70) with only a few exceptions (see Supplementary Table 3). Continuous predictor variables and all footprint values were standardized (z-transformed) to ensure comparability across life domains. Categorical predictors were dummy coded (i.e., one-hot encoded) for the LM, LASSO and SVM to work with. To simultaneously get a first estimate of the models’ prediction capability and find the best hyperparameters for the ML models, we employed a nested 10-fold cross-validation technique (CV) on the training set. This procedure represents the current best practice of evaluating model performance in ML settings, since the search for the optimal model settings and the estimation of prediction performance are kept separate102,103,104. Hyperparameters in the ML models during nested CV were tuned using Bayesian Optimization100 or Grid Search. After the CV, all models were once again fitted on the complete training set and evaluated on the held-out test set to validate the estimated prediction performance from the CV. A graphical depiction and detailed description of the whole analysis procedure can be found in Supplementary Methods and Supplementary Fig. 1. The mean absolute error and explained variance R² were used as evaluation metrics. The respective model prediction accuracies are depicted in Fig. 1; a tabular version can be found in Supplementary Table 4. To calculate the predictor importance, we used SHapley Additive exPlanations (SHAP) values105, which represent the contribution (i.e., importance) of each predictor to the final model output for each single observation. In the main text, we only focused on predictors with mean SHAP values greater than the mean SHAP values of all predictors. For more information on SHAP values, see Supplementary Methods.
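The selection rule for the most important predictors can be sketched as follows. This is a hedged illustration rather than the authors' code: it assumes that importance is summarized as the mean absolute SHAP value per predictor (a common convention that the text does not spell out), and the function name `important_predictors`, the predictor names, and the SHAP values are hypothetical.

```python
from statistics import mean

def important_predictors(shap_values):
    """Rank predictors by mean |SHAP| and keep those above the overall mean.

    shap_values maps predictor name -> list of per-observation SHAP values.
    Using the absolute value is an assumption about the importance measure.
    """
    importance = {name: mean([abs(v) for v in vals])
                  for name, vals in shap_values.items()}
    threshold = mean(list(importance.values()))
    kept = [(name, imp) for name, imp in importance.items() if imp > threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)

# Hypothetical SHAP values for four predictors and three participants.
shap_values = {
    "income": [0.40, -0.30, 0.50],
    "habits": [0.20, 0.30, -0.10],
    "perceived_control": [-0.05, 0.05, 0.05],
    "environmental_knowledge": [0.01, -0.02, 0.03],
}
selected = important_predictors(shap_values)
```

In this toy example only the first two predictors exceed the average importance and would be shown in a summary plot, although all four contribute to the model.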