Estimating the scale of the US green economy within the global context

The green economy has previously been defined and measured in various, but limited, ways. This article presents an estimation of the scale of, and employment in, the US green economy using a data triangulation approach that draws on many sources and multiple types of data. This gives an indication of the green economy’s role in economic development and employment at the country level. It also makes it possible to compare the scale of ‘green jobs’ to employment in fossil fuel-related sectors, and to compare the US green economy to other economies. Through the Low Carbon and Environmental Goods and Services Sector (LCEGSS) dataset, the US green economy is estimated to represent $1.3 trillion in annual sales revenue and to employ nearly 9.5 million workers; both figures grew by over 20% between 2012/13 and 2015/16. Comparison with China, OECD members and the G20 countries suggests that the US has a greater proportion of the working age population employed in the green economy (4%) and higher green economy sales revenue per capita. Estimated values for other countries suggest that they too have significant production and consumption in the green economy, and that the US should consider, as other economies are, developing energy, environmental and educational policies relevant to the green economy to remain competitive in these areas. Given the shortcomings of other data sources, this information can contribute to understanding the potential impact of changes to federal-level policies on economic sectors that are vital to combating climate change and protecting the environment.

Introduction: getting at the scale of the green economy
The composition of the US economy has changed significantly in recent decades (Autor et al., 2006; Goodwin et al., 2014; Krippner, 2005) but researchers, business leaders and policymakers have had little information on the size and development of the green economy. Moreover, in the aftermath of the 2007 financial crisis and as a response to persistent unemployment, green economy, green growth and green jobs policies have become increasingly prominent in the US and elsewhere (Deschenes, 2013). In March 2013, due to federal cuts authorised through the budget sequestration, the Bureau of Labor Statistics discontinued the Green Goods and Services survey on employment in the green economy (Barbier, 2014). Its last report detailed positive news for growth in green jobs for 2011 (Bureau of Labor Statistics, 2013a). Although the Department of Energy currently produces some statistics on 'clean jobs' in energy, which is useful for understanding the changes in US energy production, since 2013 there has been a lack of comprehensive data on the development of the US green economy. Current discussions regarding the green economy in the US frequently rely on this clean energy jobs data (E2, 2019; National Association of State Energy Officials and Energy Futures Initiative, 2019). This narrows the potential for debate about the wider US green economy, about responses to climate change from outside of the energy sector, and about how the US economy is responding to other environmental changes.
Understanding the economic size of any sector is a significant challenge and measuring the green economy is particularly difficult (Georgeson et al., 2017b). Measuring it fully from national statistics and/or company surveys frequently requires additional surveys and research, or alternative data collection methods. Without a shared definition of its boundaries, the green economy cannot be identified via standard industry classifications (SIC) (Becker and Shadbegian, 2009), like the North American Industry Classification System (NAICS) used in the US. In addition, SIC codes have significant limitations for measuring new or emerging technologies and sectors (Jacobs and O'Neill, 2003; Kile and Phillips, 2009), and the green economy continues to be defined by innovation and technological change (ECO Canada, 2010).
The challenges in measuring the green economy could be partially overcome through a transactional data approach, a data collection method with certain characteristics of 'big data' (Gandomi and Haider, 2015) that triangulates and cross-checks data from many sources. Using this method, and building on development of low carbon and environmental goods and services data for the UK government (Department for Business Innovation and Skills, 2013), it was possible to estimate the sales revenue and number of jobs in the green economy, including supply chain activities, for the US and other leading economies. This dataset was renamed the Low Carbon and Environmental Goods and Services Sector (LCEGSS), following more recent revisions of the dataset to improve alignment with the Eurostat-developed Environmental Goods and Services Sector (EGSS) classification. This paper first explores prior measurement of the green economy in the US, then presents the definition of LCEGSS and the transactional data approach, and finally reports the results of the study for the US and in comparison with other leading economies.

Previous green economy measurement by the US Federal Government
The lack of reporting and measurement since the loss of funding to the Bureau of Labor Statistics' (BLS) Green Goods and Services (GGS) survey suggests that there is an important gap to be filled by alternative measurement approaches. This is further evidenced by the fact that studies published as recently as 2017 still rely on BLS data (Elliott and Lindley, 2017). Elliott and Lindley noted that 'green' industries and states seemed to be getting greener at a faster rate, but highlighted the difficulties arising from the lack of data over a longer time series from the BLS GGS survey. There have been several previous efforts to measure the green economy or related concepts in the US (Muro et al., 2011; Pew Charitable Trusts, 2009), and there has been considerable debate regarding the main methods of identifying and counting green jobs in the US (Peters et al., 2011). Moreover, Peters provides an excellent overview of the research into the green economy prior to and contemporary to the BLS GGS survey work (Peters, 2014). To review green economy measurement in the context of this study, we will briefly discuss federal data collection and estimation processes, principally the Department of Commerce's 'Measuring the Green Economy' study, and the BLS' Green Goods and Services survey data, as other measurement reports predate the most recent BLS survey and have not been repeated. For example, prior to BLS' data collection, studies had also continued to use the U.S. Census Bureau's 1995 Survey of Environmental Products and Services (SEPS) to classify environmental products (Becker and Shadbegian, 2009), which had a limited definition of environmental products.
The previous federal report to measure green jobs was the Department of Commerce's 'Measuring the Green Economy', which reported 1.8-2.4 m green jobs, based on a 'narrow' and a 'broad' definition (Department of Commerce, 2010). The Department of Commerce's narrow definition covers 497 product and service codes, only measuring products and services where analysts assumed the existence of wide agreement on their classification as green. The categories are: Pollution Control, Renewable/Alternative Energy, Energy Conservation, Resource Conservation, and Environmental Assessment. Their broad measure contains 732 codes, with additional products and services where their classification as green 'may be more open to debate' (Department of Commerce, 2010). Some examples of additional activities in the broad definition are: nuclear energy electricity generation, biofuels, medium density fibreboard and botanical gardens/zoos. The BLS Green Goods and Services data (GGS) used company survey methods in order to estimate private and public employment in goods and services that benefit the environment or conserve natural resources (Bureau of Labor Statistics, 2013b). It estimates that there were 3.4 million green jobs in 2011 (2.6% of all jobs), an increase of 158,000 on the previous year (Bureau of Labor Statistics, 2013a). From a sample of 120,000 firms, employment numbers are estimated from the reported percentage of revenue derived from green products from sampled firms that responded to the survey and the overall employment of the firm. The estimation process has to account for non-response and also for incomplete or single-year responses (through imputation) (Bureau of Labor Statistics, 2013c).
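The revenue-share logic described for the GGS survey can be illustrated with a minimal sketch. It assumes that a firm's green employment is simply its total employment multiplied by its reported green revenue share; the firm names and figures are invented, and the survey weighting, non-response adjustment and imputation steps that the BLS applies are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class FirmResponse:
    """One survey response (illustrative fields, not the BLS record layout)."""
    name: str
    total_employment: int
    green_revenue_share: float  # fraction of revenue from green products, 0.0 to 1.0


def estimate_green_jobs(firms):
    """Sum of employment scaled by green revenue share across responding firms."""
    return sum(f.total_employment * f.green_revenue_share for f in firms)


# Invented example responses.
firms = [
    FirmResponse("Firm A", 200, 0.25),   # a quarter of revenue is green
    FirmResponse("Firm B", 50, 1.0),     # all revenue is green
    FirmResponse("Firm C", 1000, 0.0),   # no green revenue
]
green_jobs = estimate_green_jobs(firms)
print(green_jobs)
```

In this simplified form a large firm with a small green share can contribute as many green jobs as a small specialist firm, which is one reason the choice of counting rule matters for the totals discussed in this section.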
On the one hand, it has been suggested that, although the BLS estimates are higher than other studies prior to 2013, given the methods used and the data available, they are likely to be more accurate (Peters, 2014). On the other hand, it is also argued that the BLS green jobs data does not represent the total number of green jobs in the US economy (Pollack, 2012) for several reasons: it excludes any business that earns less than 50% of its revenue from green products and services; it does not cover all industries, only a subset of the full North American Industry Classification System (NAICS) classification; and it does not include 'process' jobs (it has a process definition but reporting has generally been provided for the 'output' definition). The NAICS classification was updated in 2012 to better partition some sectors of the green economy, but BLS GGS reporting released in 2012 still used the NAICS 2007 classification, which led to underreporting of certain major sectors of the green economy, such as solar photovoltaic, solar thermal and wind turbine manufacturing (Pollack, 2012). The 2013 release of GGS revised previous 2010 data using the 2012 revision of NAICS and released 2011 data (Bureau of Labor Statistics, 2013a); the revised 2010 data had an increase of 114,000 in GGS employment (Bureau of Labor Statistics, 2013c).
The Economic Policy Institute (EPI) review of output-based BLS data suggests that while the methods that the BLS employed are sound, it can be argued that the BLS methods are too conservative, in particular through their exclusion of value chain activities (Pollack, 2012). Such activities are measured within the LCEGSS approach to fully understand the economic impact of green economy activities. Including supply chain activities has been identified as important for economic measurement of the green economy (Pollack, 2012;Pollin and Wicks-Lim, 2008), but this also contributes to the difference in scale of the estimates derived by this study. Research prior to the release of the BLS GGS data suggested that a mix of different data sources would be useful to federal agencies given the time taken to identify green jobs at the firm or individual level using surveys (Peters et al., 2011).
Another former programme of the BLS was the Green Goods and Services Occupations (GGS-OES) programme, which surveyed firms to estimate green employment shares based on occupational staffing patterns (Bureau of Labor Statistics, 2012). The GGS-OES found 1.95 million jobs from firms where all output is from green activities, 6.11 million green jobs in firms with some green activity, for a total of 8.06 million green jobs (6.3% of all employment) (Peters, 2014).
Differences in the methods of counting jobs and what sectors to include when delineating the green economy will naturally lead to variation between data collection methods. For example, there will be significant differences in totals depending on how jobs are counted where only part of the role relates to the green economy, or where only part of a company's output relates to identified green activities. There are broader methodological variations too; LCEGSS uses data from a variety of scales, including project, transaction, firm, industry and product, whereas other US green economy reports have estimated green jobs from firm level data using revenue share as a proxy for green employment (Muro et al., 2011).
However, significant empirical contributions emerged from the use of the BLS GGS data, and from debates regarding the definition and measurement of green economic activities in the US. This includes research where the BLS data has been linked to the O*NET database to enable the task-based identification of green jobs (Vona et al., 2018a, 2018b). This approach, and the relationship of the BLS data to NAICS codes, permits analyses of the distribution of green jobs that are disaggregated by location and by education level. Linking BLS data to the O*NET, in particular, is an important contribution, as the O*NET database from the U.S. Department of Labor is the only government green occupations study that uses empirical criteria to define 'green', but it does not provide any estimates of employment (Peters, 2014).
Others have used the GGS data to critically assess the definition of 'green jobs', as well as the claims that have been widely made about the impact of green occupations on the labour market and on job creation (Deschenes, 2013), which also highlighted the need for more empirical research into measuring the green economy. This has parallels with the conclusions of prior research based on the SEPS definition of environmental products and services (Becker and Shadbegian, 2009), which suggested that while green firms may have higher output and factor use per worker, they were in many ways similar to non-green production. In both studies, the data used and their conclusions highlight the need for definitions of the green economy that can support timely data collection, and that can be updated to reflect new technologies.

Methods
The methodology triangulates transactional and operational business data to estimate economic values, frequently where government statistics are not available. It can estimate the sales and employment in the green economy, the share of the country's economy taken up by the green economy, growth in the green economy and the green economy sectors that are leading that growth. This can estimate the contribution to the country's economy of the green economy, the progress made and national priority areas. The methodology, developed by kMatrix Ltd, uses a number of different data sources and data types (transactional, procurement, insurance, industrial benchmarking) to arrive at estimates of economic value that would not be possible from a single data source. Each data point requires at least 7 data sources for 'triangulation', but in the Low Carbon and Environmental Goods and Services Sector (LCEGSS) dataset, the average number of data sources for each observation is 56. The transactional triangulation methodology has been used to: estimate climate change adaptation within ten megacities (Georgeson et al., 2016b), provide data on global private sector investment in clean energy R&D (Georgeson et al., 2016a), analyse global provision of climate and weather information (Georgeson et al., 2017a), and estimate global climate change adaptation spending relating to health (Watts et al., 2017). It has also been assigned official statistics status in order to provide trade statistics to the UK Government's Defence and Security Organisation (Department for International Trade Defence and Security Organisation, 2015).
The transactional triangulation methodology also measures supply chain activity. Transactional data has advantages for measuring full economic impact, but it is not directly comparable to national statistics. A 'core versus supply chain' analysis has been conducted on the approximately 3800 activities in the LCEGSS dataset; the ratio is 45% core to 55% supply chain. Data collection therefore includes both activities by companies that are specialists in LCEGSS and non-specialist companies that operate within the value chain.
Definition of the low carbon and environmental goods and services sector. LCEGSS uses a wide range of different data types and sources, and a sectoral definition that is both 'top-down' and 'bottom-up'. It is a pragmatic estimate of the green economy that collects and measures data only where sufficient evidence is available to support inclusion. The definition was originally developed in conjunction with early efforts by the UK government to define 'Environmental Technologies' in 2007, in response to the limitations of the UK industry classification system in accurately estimating the economic value of environmental protection within the UK economy. During development, the dataset was tested against 'known' sectors with existing SIC values to check its consistency with SIC-derived values. The 'Environmental Technologies' dataset aimed to identify environmental technologies across a range of sectors, which were primarily related to environmental protection, with less emphasis on low carbon activities. The development of the dataset in subsequent revisions led to a sectoral definition that covered environmental protection, renewable energy and low carbon activities. This revised definition, renamed the Low Carbon and Environmental Goods and Services (LCEGS) dataset, was used for UK national reporting by the UK Department for Business, Innovation and Skills (BIS) between 2008/09 and 2012/13. Subsequent revisions of the definition analysed new sources of data and new economic activities. The revised dataset has been renamed the Low Carbon and Environmental Goods and Services Sector (LCEGSS) to reflect efforts to better align the environmental protection, renewable energy and resources management sections of the definition with Eurostat's EGSS. It has been used for research purposes in partnership with the C40, amongst others.
LCEGSS contains 26 sub-sectors (described as 'Level 2' in the definition's taxonomy), which are grouped into three broad categories: Environmental, Low Carbon and Renewable Energy (see Table 1, and a more detailed version in the Supplementary Materials). The Environmental and Renewable Energy sectors largely represent distinct sectors within the broader economy, whereas the Low Carbon sector contains a number of economic activities that exist in a range of traditional industries.
The LCEGSS definition covers 3800 discrete goods and services (described as 'Level 5' in the taxonomy), which are derived from sector supply chain activities (such as componentry and assemblies) and value chain activities (such as R&D, supply and training). The revisions to the LCEGSS definition added 953 activities, both through economic activities that have been identified and added to the definition, and through the identification of additional data sources that allowed the inclusion of economic activities in data collection. Seven hundred and seventy-eight of these relate to Energy from Waste, 49 to Biodiversity, 40 to Environmental Consultancy, 25 to Water and Waste Water Treatment, and 61 are split across a further eight subsectors. Other major revisions include dividing offshore and onshore wind to reflect their differing supply chain activities. To illustrate how the taxonomy functions, an example of the data taxonomy and values for Air Pollution Control for the US for 2015/16 is available in the Supplementary Materials. The development process for the LCEGSS definition has reflected the lack of an internationally agreed definition of the 'green economy' or related sectors. In defining a new sector for measurement, decisions are required where no agreed boundaries for inclusion exist. However, internal quality assurance processes ensure internal consistency in the definition, and the methodology has been externally peer-reviewed or audited on a number of occasions, most recently in January 2017. In defining the boundaries of LCEGSS, decisions had to be made on the inclusion and classification of particular activities. For example, the definition of geothermal energy increasingly refers to both 'deep vertical' and 'shallow horizontal' heat sources. The highest growth in the sector is generated in horizontal applications at a one to two-metre depth, principally for private dwellings, which contributes to the size of the geothermal energy subsector in LCEGSS. At the city level, for example, shallow geothermal applications account for between 93 and 100% of the Geothermal subsector value in the LCEGSS dataset. By comparison, other 'green economy' categorisations measure certain shallow geothermal applications under 'Renewable Heat' or 'Construction' (alongside HVAC).
In the 'Low Carbon' sector, the LCEGSS definition includes industries where low carbon measurement is practical and some consensus exists around what should be included, such as low carbon activities within industries that account for high levels of carbon emissions, such as Building Technologies and Energy Management from Construction, and Electric Vehicles from Transport. Other industries are included because of their significance in responding to climate change, such as Carbon Finance from Finance and Insurance and Environmental Consulting from Professional Services. The historical development of LCEGSS in the UK meant that some subsectors, such as Carbon Finance and Nuclear Power, were included due to preference and policy relevance in the national context. LCEGSS does not currently measure low carbon activities from all industrial sectors. Current and future research is developing a more comprehensive method for classifying and identifying green and low carbon activities across a wider range of traditional industries.
Compiling and classifying economic activities and data. The process of compiling the LCEGSS definition for measurement was iterative and both 'bottom-up' and 'top-down'. The first stage was to search for data sources for activities that fit the 'ideal' definition of LCEGSS. Then, based on the robustness of available evidence, the decision was taken to include or omit aspects of the 'ideal' definition. The resulting definition is therefore pragmatic and only includes economic activities for which multiple sources could be identified.
The deployment of this methodology enables the reporting of comparable estimates of LCEGSS activities from multiple countries. LCEGSS can be applied across multiple geographies (both between nations and at a subnational level) by triangulating international and national data sources. In addition, using a wide range of data types affords a better understanding of each activity. This has benefits for the identification of economic activities, the 'in or out' definitional decisions and the classification within the LCEGSS taxonomy. For example, the use of procurement data can assist in better identifying the 'purpose' of a product or service.
Economic activities were only measured where there was a 'footprint' of economic activity, not economic potential. Green economy sectors with a high potential for future growth but no currently measurable sales activity cannot be measured. Therefore, some subsectors with significant potential, like Wave & Tidal, are measured based upon their current market presence. This approach may understate their future value but does not inflate LCEGSS values based on early stage investments that may not succeed at scale. With definitions established, rule sets and decision trees were written to filter source data to be included in LCEGSS measurement. Rule sets and filters determine what proportion of an economic activity is included in the sector analysis. In some cases, this is straightforward; activities that are clearly and directly associated with renewable energy sources like Wind, Geothermal, and Wave and Tidal are automatically included in the relevant LCEGSS sub-sector. Some accompanying economic activities are more difficult to allocate, such as the engineering support services that are part of the wind energy supply chain, but that may be located within data sources relating to the general engineering sector. Filters determine the environmental characteristics of different products, components or materials, such as those that can be identified to save energy, reduce heat loss, use fewer raw materials, produce less waste, or assist companies to meet environmental standards. Filters also assess the end-use of more generic products or services to determine their inclusion in LCEGSS. For example, filters would be used to assess whether the economic value of road maintenance is due to routine wear and tear from traffic volumes (planned) or in response to new weather, climate, or environmental conditions (unplanned and additional).
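As a rough illustration of how a rule set might determine the proportion of an activity's value that counts towards LCEGSS, the sketch below applies an inclusion share to each activity. The activity names, sales values and shares are hypothetical and chosen only for illustration; the actual rule sets and decision trees are not published in this article.

```python
# Hypothetical rule set: share of each activity's sales value attributed to LCEGSS.
# Activities directly associated with renewable energy are included in full (1.0);
# the partial shares are invented for illustration only.
INCLUSION_SHARE = {
    "wind_turbine_manufacture": 1.0,
    "general_engineering_support": 0.5,
    "road_maintenance": 0.25,
}


def lcegss_value(activity, sales_value):
    """Return the part of sales_value that the rule set attributes to LCEGSS."""
    share = INCLUSION_SHARE.get(activity, 0.0)  # activities without a rule are excluded
    return sales_value * share


# Invented sales values per activity.
sales = {
    "wind_turbine_manufacture": 500.0,
    "general_engineering_support": 300.0,
    "road_maintenance": 1000.0,
    "office_cleaning": 50.0,
}
included = {name: lcegss_value(name, value) for name, value in sales.items()}
total_included = sum(included.values())
print(included, total_included)
```

A flat lookup is the simplest possible stand-in for a decision tree; in practice the share for an activity such as road maintenance would presumably depend on evidence about end-use rather than a fixed constant.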
Multiple filtering processes are required in differing combinations across the LCEGSS classification to filter relevant activities. The transactional triangulation methodology assesses 'how' or 'why' an activity is carried out and 'where' it is used, whereas industry classification systems classify based on 'what' an activity is. In the methodology, the interrogation of additional data sources (both additional sources and a variety of data types) permits this improved assessment and classification. The use of procurement data, for instance, can improve assessment and classification of the end purpose of a product or service. In an indicative, simplified set of filters relating to climate change, if an activity fulfils one of a set of purposes or needs that can be identified as strictly related to mitigation, then it is included for further filtering within LCEGSS. This methodology has been previously used to estimate spending on 'Adaptation & Resilience to Climate Change' (Georgeson et al., 2016b); therefore, if an activity can be identified as strictly related to adaptation then it is included for further filtering in that dataset.
The process also uses technology filters for LCEGSS that assess whether a particular technology or process can be identified as relevant for inclusion in LCEGSS. Data to inform progress through the filtering process are drawn from market, technology, supply chain and procurement sources, although technology filters are not relevant to all LCEGSS activities. The technology filters include decision gates like:
• Is the new technology or process an immediate and beneficial replacement in reducing resources, reducing emissions or reducing energy consumption?
• Does the new technology or process provide robust short or medium term environmental benefits?
• Does the new technology or process provide a solution to new requirements in law or regulation?
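The decision gates listed above can be expressed as simple boolean checks. The sketch below is illustrative only: the field names are invented, and because the text does not state how the gates are combined, the number of gates that must be passed (`min_gates`) is an assumption exposed as a parameter.

```python
from dataclasses import dataclass


@dataclass
class TechnologyAssessment:
    """Evidence gathered for one technology or process (illustrative field names)."""
    name: str
    immediate_beneficial_replacement: bool      # reduces resources, emissions or energy use
    robust_short_or_medium_term_benefit: bool   # robust environmental benefit
    meets_new_legal_requirement: bool           # answers new law or regulation


def gates_passed(tech):
    """Return the names of the decision gates that the technology satisfies."""
    gates = {
        "replacement": tech.immediate_beneficial_replacement,
        "environmental_benefit": tech.robust_short_or_medium_term_benefit,
        "regulation": tech.meets_new_legal_requirement,
    }
    return [name for name, satisfied in gates.items() if satisfied]


def passes_technology_filter(tech, min_gates=1):
    # min_gates is an assumption: the source does not say how gates are combined.
    return len(gates_passed(tech)) >= min_gates


# Invented examples.
led_retrofit = TechnologyAssessment("LED retrofit", True, True, False)
generic_item = TechnologyAssessment("generic item", False, False, False)
print(gates_passed(led_retrofit), passes_technology_filter(generic_item))
```

Returning the list of gates passed, rather than a single yes or no, keeps a record of why an activity was carried forward to further filtering.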
Filters also provide sufficient confirmatory evidence of end purpose (generally through procurement-related data) to determine whether a product or service has been used for an environmental purpose. More detailed disaggregation of product or service data is frequently required to ensure that only LCEGSS-related activities are included and to prevent over-reporting due to the inclusion of non-environmental activity value. This requires interrogating data values at a level of disaggregation greater than 'Level 5' in the taxonomy (which is equivalent to a product, service or economic activity) to filter out non-LCEGSS-related value from economic activities. This disaggregation is beneficial to the measurement of economic sectors with hard-to-define boundaries (through overlap with other sectors) or hard-to-define content (activities that may include both relevant and invalid purposes). This level of data mining was also necessary for measuring climate change adaptation and weather and climate information services using the transactional triangulation methodology (Georgeson et al., 2016b, 2017a).
Data acquisition. The data acquisition methodology is based on a system originally developed at Harvard Business School for triangulating transactional and operational business data to estimate economic values in areas where government statistics and standard industry classifications are not available (Jaikumar, 1986). This system, referred to as 'profiling', takes approaches from business intelligence and related fields to track technological and industrial change. It has been established within business intelligence literature (and related fields) that there are significant volumes of information for compilation and aggregation to analyse markets and industries, but this information is often dispersed and unsorted (Zanasi, 1998). Attempts to define these approaches have often taken process-based or demand-driven frameworks, as the historical development of these processes was as an input into corporate decision-making (Baars and Kemper, 2008; Jourdan et al., 2008; Lackman et al., 2000; Pirttimäki, 2007).
The filters, rules and decision trees used to select relevant LCEGSS activities are central to data acquisition; the accuracy of the rules and the availability of sufficient robust and reliable data are the basis for estimating economic values. A five-step process is outlined in Fig. 1; the five broad stages are the framework for the specific steps of the methodology detailed below. The data triangulation process takes large quantities of unstructured, singular and fragmented data to construct the dataset of LCEGSS value estimates. Figure 1 suggests a linear process; however, there is a degree of iteration in data acquisition. The definitions and data collected must be tested and validated during the process. Unlike a SIC-based approach, the transactional triangulation process involves the definition of the activities to be measured. An iterative process, which allows for feedback and adjustments, is therefore necessary. Through these methods, the transactional triangulation methodology is capable of tracking changing and emerging industries; the LCEGSS definition has been revised and extended more than once since data collection began in 2006/7. This is important for measuring the green economy; by comparison, the process of publishing the Eurostat EGSS definition took over 10 years and limitations to the classification of 'Resource Management' remain.
The data triangulation methodology and the underlying data used to produce LCEGSS data have some characteristics that are typical of 'big data' approaches (Gandomi and Haider, 2015): higher volume, higher velocity, and high variety. It uses a significantly higher number of sources than other approaches, processes data more quickly than survey-based approaches, and handles data from a variety of sources in a number of different types. It is not, however, directly comparable with values derived from estimations produced by national statistics agencies.
For each transaction listed in the LCEGSS dataset, a minimum of seven separate sources must independently record the transaction for it to be confirmed and included in our database. Across the entire LCEGSS database, the average number of sources for each data point is 56. At the country/territory level, the average number of sources for each transaction ranges from 52 (Faroe Islands) to 215 (Australia). These databases have been tracked in a Data Management system and their continued relevance and utility has been verified over a number of years. Sources are screened to remove duplicate references to a single source and then shortlisted by removing outliers and unreliable sources. This shortlist is then screened again to stress test inconsistent values and remove them if necessary. From the remaining sources, a value is estimated. These estimates are 'reality tested' by comparing activity values within and across economic or industrial sectors or, where available, with recognised industry benchmarks and government statistics.
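The screening sequence described in this paragraph can be sketched as follows. Only the minimum of seven sources comes from the text; the outlier rule (relative deviation from the median) and the use of the median as the final estimator are assumptions made for illustration, since the article does not specify either, and the reported values are invented.

```python
from statistics import median

MIN_SOURCES = 7  # minimum number of independent sources stated in the text


def triangulate(reports, max_rel_dev=0.5):
    """Estimate a value from (source_id, value) reports, or return None.

    max_rel_dev is a hypothetical outlier threshold: values further than this
    fraction from the median of all sources are dropped before estimating.
    """
    # Step 1: remove duplicate references to a single source (keep the first).
    unique = {}
    for source_id, value in reports:
        unique.setdefault(source_id, value)

    # Step 2: require the minimum number of independent sources.
    if len(unique) < MIN_SOURCES:
        return None

    # Step 3: shortlist by dropping outliers (assumed rule, see docstring).
    values = list(unique.values())
    centre = median(values)
    shortlist = [v for v in values if abs(v - centre) <= max_rel_dev * abs(centre)]

    # Step 4: estimate a value from the remaining sources (assumed: median).
    return median(shortlist)


# Invented reports: seven distinct sources, one duplicate and one outlier.
reports = [
    ("s1", 8.7), ("s2", 9.2), ("s3", 9.0), ("s4", 8.9),
    ("s5", 9.1), ("s6", 8.8), ("s7", 30.0), ("s1", 8.7),
]
estimate = triangulate(reports)
print(estimate)
```

The 'reality testing' against industry benchmarks and government statistics mentioned in the text would follow as a separate step and is not modelled here.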
Much of this data is already in the public domain, although it requires the corroboration of multiple sources and triangulation between different sources (financial, legal, academic, industry, trade association, procurement, government) before it can be validated and transformed into more usable data. The triangulation process and use of proxy data demonstrate two key characteristics of 'big data' research methodologies (high volume and high variety). The methodology can:
a. select from multiple sources of pre-existing data (mature sectors);
b. select from more limited sources of pre-existing data and combine this with triangulated data to achieve more robust results; or
c. where no pre-existing sources are found, use triangulated data to create the sources necessary for analysis (emerging sectors).
As an example, for one historical data point for services relating to corporate governance for climate change, the consulting sector data reported that 250 major corporates commissioned work in 2010/11 (the consulting sector data frequently does not report values, for commercial reasons). Investor relations and fund management sector data reported that £8.75 m was spent on this work overall, and trade association data independently reported that some £9.2 m had been spent. Along with additional sources, triangulating data from these multiple sources is the basis for deriving more accurate estimates of the value of this economic activity. A more detailed example of the value estimation process is available in the Supplementary Materials.

Data sources. Given the range of industries and sectors covered, a wide range of data sources is required. These include local, national and international sources that have been commissioned, and relevant published data and research. Where other green economy studies may have used a single one of these sources (Yi and Liu, 2015), for LCEGSS the sources include:

• a wide range of industry/trade associations (from major national and international industry associations to federations and trade bodies for specialised sectors and manufacturers, including the Solar Trade Association).

A total of 1589 data sources are used across the LCEGSS dataset. The process uses general and specific sources, but it is weighted towards sector-specific sources. The number of sources used to compile a single data point (the estimated Sales value) for each of the 3800 lines of economic activities in the LCEGSS definition for each city or country is calculated and collated. The triangulation of data from multiple sources contributes to reducing the impact of biases inherent in certain sources of data. To further minimise this, all sources are tracked and managed for accuracy and reliability over time.
New sources of data become available regularly, but these are then subject to an 'incubation' period within the data management system. This establishes the frequency (of relevance) and credibility of the source before it is included in any analysis.
Fig. 1 The data acquisition process for the transactional triangulation methodology

ARTICLE PALGRAVE COMMUNICATIONS | https://doi.org/10.1057/s41599-019-0329-3

Monitoring new and existing sources enables the quality of data sources to be improved over time; new sources are monitored for inclusion and older sources are removed if their reliability deteriorates. For each source, a historical log records source name, source value, year it relates to, the number of times used, 'hit rate' (confidence or reliability) and whether it will be accepted for a specific research purpose. The source management datasets relate to each calculated value within any sector data. These data sources are monitored closely internally and are routinely spot-checked each year, and reviewed by data users as part of any peer-review or audit. There is a separate data management system for each sector in the data collection, as sources can be relevant to multiple sectors. Once added into the data source management systems, data sources are tracked and assessed for each individual data collection purpose.
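One way to picture the historical log described above is as a simple record per source. The sketch below is illustrative only: the class name `SourceLogEntry`, the 0 to 1 scale for the hit rate and the `min_hit_rate` floor are assumptions, not details of the actual data management system.

```python
from dataclasses import dataclass


@dataclass
class SourceLogEntry:
    """One row of a hypothetical source log, mirroring the fields in the text."""
    source_name: str
    source_value: float
    year: str          # period the value relates to, e.g. "2015/16"
    times_used: int
    hit_rate: float    # confidence or reliability, scaled 0 to 1 (assumption)
    accepted: bool     # accepted for a specific research purpose


def usable_sources(log, min_hit_rate=0.8):
    """Return entries that are accepted and meet an assumed reliability floor."""
    return [e for e in log if e.accepted and e.hit_rate >= min_hit_rate]
```

A source whose reliability deteriorates would simply drop below the floor, or be marked as not accepted, and fall out of later estimates.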
For LCEGSS, revenue data are produced to an average 'confidence range' of 85%; and employment data are produced to an average confidence range of 83%. Confidence ranges are a function of the range of source values assembled for each data point. Each final data point is the mean of the final range of values (after outliers are removed). The confidence range is the difference between the mean value and the most extreme values in the range. An 85% confidence range means that the difference between the mean and the extreme values is 15%. Data estimates were returned for 226 countries and territories.
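The definition of the confidence range can be written out as a short sketch. It assumes that the "difference between the mean and the extreme values" is measured relative to the mean, which is one reading of the text; the function name `confidence_range` is hypothetical, and the input is taken to be the final range of values after outliers have been removed.

```python
def confidence_range(values):
    """Return (mean, confidence range in %) for a shortlisted set of values."""
    if not values:
        raise ValueError("at least one source value is required")
    centre = sum(values) / len(values)
    if centre == 0:
        raise ValueError("relative deviation is undefined for a mean of zero")
    # Largest gap between the mean and either extreme, relative to the mean.
    max_dev = max(max(values) - centre, centre - min(values)) / abs(centre)
    return centre, 100.0 * (1.0 - max_dev)
```

For example, source values of 85, 100 and 115 have a mean of 100 and extremes that sit 15% away from it, which corresponds to an 85% confidence range under this reading.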
Employment. Employment values in LCEGSS are a measure of the estimated employment numbers across all aspects of the supply chain. National, regional, city and other economic data sources were used to estimate current employment levels for each sector activity. Where employment information is scarce, or where employment is estimated as a proportion of a company's sales, a comprehensive range of case study materials is assessed to provide industry-specific ratios and benchmarks. The employment figures for LCEGSS can be used to analyse the labour intensity of economic activities across sectors.
Sales per FTE. Productivity is frequently defined as a ratio of a volume measure of output to a volume measure of input (Organisation for Economic Co-operation and Development, 2001). There are many different measures of productivity, but from the measures available in the LCEGSS dataset, we were principally able to produce a proxy measure of labour productivity based on gross output. It provides an estimate of how efficiently labour is combined with other factors of production. As a proxy measure of productivity, it has a number of limitations and should only be regarded as a partial measure of productivity that reflects the joint influence of a number of factors, and it should not be interpreted as the productivity of individuals in the labour force (Organisation for Economic Co-operation and Development, 2001). Although it is frequently reported as output/hour, the UK Office for National Statistics notes that labour efficiency can be measured as output/hour, output/worker and output/job (Office for National Statistics, 2017). Given the data available, output in sales revenue ($m) per job (full-time equivalent) is the most appropriate method.
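As a minimal sketch of this proxy measure, the ratio can be computed directly from sales and employment estimates. The function name `sales_per_fte` is hypothetical, and the inputs used in the example are the rounded headline US estimates reported in this article ($1.3 trillion in sales and nearly 9.5 million FTEs), not values from the underlying dataset.

```python
def sales_per_fte(sales_usd_m, fte):
    """Sales revenue ($ millions) per full-time equivalent job."""
    if fte <= 0:
        raise ValueError("fte must be positive")
    return sales_usd_m / fte


# Rounded headline US estimates: $1.3 trillion is 1,300,000 $m.
us_ratio = sales_per_fte(1_300_000, 9_500_000)
print(f"US LCEGSS sales per FTE: ${us_ratio:.3f} million")
```

With these rounded inputs the ratio is a little under $0.14 million per FTE; as the text stresses, this is a partial proxy and not a measure of individual productivity.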
Limitations. The transactional triangulation methodology is different to national statistics, but methods have been developed over time to enable it to be more comparable to traditional data sources. Constructing a definition for measurement of a new sector is complicated by differences between countries in how products and services are described and how these are assigned to industry codes. Therefore, the compilation of transactional data has to overcome variations in how the same activities are recorded in different countries and sectors. The data definition process has to identify how different descriptions vary, group together those that describe the same activities, and then create or adopt a universally applicable description to aid global data collection and reporting. Therefore the 'language' of LCEGSS does not map directly to any national industry descriptors, but it has wide relevance and is based on the descriptions used in industry where possible, especially in the case of more 'mature' sectors where an agreed language for definitions has been established.
Data collection using this methodology means that a sector definition will only include product and service activities that have a traceable economic footprint in the form of a trading history. Publicly funded or academic research and any technologies that have not yet reached the market are not included in the sector definition. This is influenced by the nature of the industry and market-focused sources accessed in the data collection process.
LCEGSS measures economic activities across existing industries and does not just measure environmental protection activities, but it does not currently measure the full extent of the 'green economy' in all existing industrial sectors. As noted, this is partially a consequence of the lack of consensus on how to classify varying categories of low carbon, environmental, green and sustainable economic activities that exist within individual industries. Future research aims to construct such a classification to develop a full 'green economy' model for data collection.
The methodology used means that LCEGSS is not an exact fit with any existing classification systems, nor particular national measurement frameworks. However, while this is a limitation in some ways (especially from the perspective of national accounting), there are advantages from a research perspective; comparison between sectors and countries is possible without the significant time or resource requirements of rewriting the national classifications or accounting systems. Data collection for LCEGSS could be described as an 'overlay' system that can operate above national industry classification systems to better report and analyse the green economy in the short term, without the reclassification of industrial codes required to achieve a measurable definition using industry classifications. Moreover, by using global data sources, some of the limitations of the reporting systems for smaller countries can be overcome by accessing external data and the use of internal and external data sources permits the measurement of trade flows between countries.
Calculating comparison values. GDP (nominal) data (2015 estimates) were taken from the April 2016 update of the International Monetary Fund's World Economic Outlook. Comparisons would be different using data adjusted for purchasing power parity.
While data for many countries are available in the LCEGSS dataset, given the lack of data availability in the US and reduced discussion of the definition of the green economy in the wake of the end of the GGS survey, country data for the US was deemed to be an important focus of the study. More recently, given the revival in contemporary political debates of the concept of a 'Green New Deal', more up-to-date and comprehensive analysis of the green economy through the LCEGSS data could be an important contribution. Although the data was originally developed in the UK, it was decided to compare the US to China, as the other nation with a similar size of LCEGSS sales estimates, as well as the G20 and the OECD, as other important international groups of industrialised or market-orientated economies that also include the major European nations.
As the US and China are analysed and presented separately from these country groupings, the G20 comparison refers to the 19 member states of the G20, minus the US and China, and excluding the European Union and observer country Spain. Similarly, the OECD comparison includes all states that are members of the OECD, excluding the US. Population data (2015 estimates) were also taken from the April 2016 update of the International Monetary Fund's World Economic Outlook. Estimates of working age population were taken from the 2015 revision of World Population Prospects, published by the Population Division of the UN Department of Economic and Social Affairs.
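The comparison values described in this subsection can be sketched as follows. This is an illustrative outline rather than the study's code: the function and key names are hypothetical, and computing a group's aggregate value as a ratio of group totals (rather than an average of country ratios) is an assumption.

```python
KEYS = ("sales_usd", "fte", "gdp_usd", "population", "working_age_population")


def group_totals(data, members, exclude=frozenset()):
    """Sum each input over the group members, skipping excluded countries."""
    selected = [data[c] for c in members if c not in exclude]
    return {k: sum(row[k] for row in selected) for k in KEYS}


def comparison_values(sales_usd, fte, gdp_usd, population, working_age_population):
    """Derive the three comparison ratios from raw totals."""
    return {
        "sales_pct_gdp": 100 * sales_usd / gdp_usd,
        "sales_per_capita": sales_usd / population,
        "employment_pct_working_age": 100 * fte / working_age_population,
    }
```

A grouping such as the G20 without the US and China would then be computed as `comparison_values(**group_totals(data, g20_members, exclude={"US", "China"}))`, where `data` maps each country name to its five inputs and `g20_members` is the list of member states.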
Results: the state of the US green economy. Through the LCEGSS data, the US green economy is estimated to represent $1.3 trillion in annual sales revenue and to employ nearly 9.5 million Full-Time Equivalents (FTE), both of which have grown by over 20% in the last three years. Comparison with China, OECD members and the G20 countries shows the US has a greater proportion of the working age population employed (4%) and higher sales revenue per capita in the green economy. It also demonstrates that other countries have huge potential to develop their green economy and the US needs to develop energy, environmental and educational policies to remain competitive.

Figure 2 shows the estimation of the US green economy using the LCEGSS 'Level 1' definitions for Environmental, Low Carbon and Renewable Energy sectors, for both sales revenue and jobs estimated in FTEs, for the four financial years for which data is available. 'Renewable Energy' accounts for a greater proportion of employment than of sales revenue; this suggests renewable energy sectors are particularly important for green economy job creation. On the other hand, the 'Environmental' sectors, which may be more 'mature' in many cases, deliver a greater amount of revenue per FTE.
LCEGSS in the US has grown from $1.1 trillion and 8 m FTEs in 2012/13 to $1.3 trillion and 9.5 m FTEs in 2015/16. This represents about 7% of the US annual GDP (although this is an indicative comparison). The employment estimate is similar in size to the estimate derived from the BLS' GGS-OES data of 8.06 m green jobs in 2011 (Bureau of Labor Statistics, 2012;Peters, 2014). However, it is higher than other previous estimates from the Bureau of Labor Statistics Green Goods and Services survey (3.4 m for 2011) (Bureau of Labor Statistics, 2013a), the Brookings Institution (2.7 m for 2010) (Muro et al., 2011), and the Department of Commerce (1.8-2.4 m for 2007) (Department of Commerce, 2010). There are likely to be several reasons for this. First, this methodology allows us to track activities within the whole supply chain of green economy sectors and thus better gauge the fuller economic impact of the green economy. Second, there are differences in how revenue or economic production is measured and in how jobs are counted and recorded. Third, there are differences between the LCEGSS definition and the definitions used in previous studies.
In terms of jobs, it should be noted that, due to differences in methods and definitions, these values are directly comparable neither with the previous Bureau of Labor Statistics (BLS) 'green jobs' survey nor with Department of Energy (DoE) survey data on energy sector jobs. The DoE survey counts the number of 'qualifying workers' (Department of Energy, 2017) (any worker who spends part of their time on activities within the definition) rather than calculating the equivalent number of full-time jobs, and the BLS data is recognised to undercount solar jobs. The availability of multiple years of data reveals an interesting trend: the growth in jobs appears to be faster than the growth in sales revenue, suggesting that the US has been successful in driving green economy employment. The LCEGSS sales figures are not, however, adjusted for inflation.
Although data published by other organisations, such as the Department of Energy, are compiled differently and take different definitions of employment, there is value in comparisons. The DoE 'Energy and Employment' report estimates that there are 467,000 'qualifying workers' with fossil fuel mining jobs, but notes that coal mining has declined from 90,000 related jobs to 53,000 since 2012. Fossil fuels employment overall has declined and there are currently fewer than 200,000 jobs in fossil fuel electricity generation. There are further jobs in manufacturing, construction and other sectors related to fossil fuel extraction and generation, but there are estimated to be much greater levels of employment in 'clean energy' sectors and their supply chains.
Results: the US in the global green economy. Another significant benefit of LCEGSS data is that it is an internally consistent measurement approach that can be used to compare data internationally. This is unlike national statistics data collection, whose methods, codes, sectoral definitions and reliability vary between nations. The number of sources used to compile each line of data in LCEGSS, and the use of a consistent and comparable definition of the 'green economy', allow us to compare LCEGSS in the US to other major economies, such as China, other G20 countries and OECD members. Figure 3 shows comparisons of revenue, employment and productivity between the US, China, the OECD countries (minus the US) and the member states of the G20 (minus the US and China). Figure 3a demonstrates that while the US is a significant part of the global green economy, other major economies also have significant LCEGSS sectors that have the capacity to expand and compete with the US. Figure 3b's comparisons of LCEGSS as a percentage of GDP suggest how important the green economy is within each country or grouping.
We suggest, however, that the stark differences in LCEGSS's 'share' of the economy be treated with caution, given the differences in methodologies and the difficulty in collecting accurate and comparable GDP statistics between countries. Sales per capita values suggest that the US generates significant value for the economy given its population.

Fig. 3 Comparing the US green economy to China, OECD member states (exc. the US) and G20 member states (exc. the US and China). a LCEGSS sales ($ billions), with totals for OECD (exc. US) and G20 (exc. US and China). b LCEGSS sales as a percentage of GDP (%), with aggregate values for OECD (exc. US) and G20 (exc. US and China). c LCEGSS sales per capita ($), with aggregate values for OECD (exc. US) and G20 (exc. US and China). d Employment in LCEGSS (millions of FTEs), with totals for OECD (exc. US) and G20 (exc. US and China). e Employment in LCEGSS as a percentage of working age population (%), with aggregate values for OECD (exc. US) and G20 (exc. US and China). f LCEGSS 'Productivity' (Output per Job) ($ millions/FTE), with aggregate values for OECD (exc. US) and G20 (exc. US and China).
When it comes to green jobs, an interesting picture emerges. While LCEGSS employment may be of similar scale in China, in the US it is estimated to represent a much larger share of the working age population: over 4%. This represents a large constituency of workers whose livelihoods rely on the green economy.
Results: growth in the US green economy. The LCEGSS data allow us to look at which subsectors of the green economy are reporting the highest growth in sales revenue. Sectoral growth figures for 2015/16 in percent are shown in Table 2. These data suggest that there is strong growth in a number of renewable energy sectors, and lower, although still healthy, growth in well-established environmental protection sectors. Again, these values should be treated with some caution as they are not adjusted for inflation. Nevertheless, growth rates of between 6 and 9% per year in a number of sectors, before taking inflation into account, compare favourably to US GDP growth of 2.6% in 2015.
Considering the estimates that we have analysed of jobs in the US green economy, and DoE data on fossil fuel jobs (467,000 in mining, 200,000 in electricity generation and several hundred thousand elsewhere in manufacturing and construction related to fossil fuels), we can analyse the proposed impacts of the 'America First Energy Policy' (Donald J. Trump for President Inc., 2016).
The US added over 1,500,000 FTEs in LCEGSS sectors between 2012/13 and 2015/16, whereas the USEER has reported a decline in coal jobs of 37,000 for the same period. The proposed 'America First Energy Policy' suggested that another 400,000 new jobs could be created in the fossil fuel sector. This would firstly have to reverse the current decline in jobs. It is likely that this would require an enormous expansion in exploration and use of fossil fuel, which seems incompatible with the economic trends working against coal and oil in particular. US oil is already expensive compared to other nations and the decline in coal consumption was primarily caused by cheaper natural gas (49%), lower-than-expected demand (26%) and growth in renewable electricity generation (18%) (Houser et al., 2014). The proposed policy suggests that an increase in economic output of $700 billion 'over 30 years' would be possible, implying an annual average increase of $23 billion. By comparison, LCEGSS data suggest that revenue in the US green economy has increased by over $60 billion per year for the last four years (Fig. 2).
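The arithmetic behind this comparison can be checked with a short sketch. It uses only the rounded figures quoted in the text, so the results are indicative; the helper name `average_annual_change` is hypothetical, and treating the four financial years of LCEGSS data as three year-on-year intervals is an assumption about how the annual average was derived.

```python
def average_annual_change(start, end, intervals):
    """Average change per interval between two values."""
    if intervals <= 0:
        raise ValueError("intervals must be positive")
    return (end - start) / intervals


# Proposed policy: $700 billion of additional output spread over 30 years.
proposed_per_year = average_annual_change(0, 700e9, 30)

# LCEGSS sales: rounded headline values for 2012/13 and 2015/16, i.e. four
# financial years of data and therefore three year-on-year intervals.
green_per_year = average_annual_change(1.1e12, 1.3e12, 3)

# Coal-related jobs reported by the DoE: 90,000 falling to 53,000.
coal_jobs_lost = 90_000 - 53_000

print(f"Proposed policy: ${proposed_per_year / 1e9:.1f} billion per year")
print(f"LCEGSS sales:    ${green_per_year / 1e9:.1f} billion per year")
print(f"Coal jobs lost:  {coal_jobs_lost:,}")
```

With these rounded inputs, the proposal works out at roughly $23 billion per year and LCEGSS sales growth at roughly $67 billion per year, in line with the figures quoted above.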

Concluding remarks
The estimated scale of the green economy ($1.3 trillion and employing over 4% of the working age population) strongly suggests that it is a significant contributor to US economic development and the economic well-being of millions of people across the US. It was also a key contributor to the US recovery after the 2007 financial crisis (Aldy, 2013). Existing federal policies to support the private sector (including clean energy initiatives) have assisted US businesses to grow and create jobs (Obama, 2017), and the data herein suggests that growth in jobs in the green economy may be faster than growth in estimated sales value in some sectors of the green economy. Economic initiatives and environmental regulations can, potentially, drive innovation and economic development (Ambec et al., 2013;Porter and van der Linde, 1995), rather than holding it back. This data suggests that many countries have huge potential to generate higher green employment and growth. For example, China has announced that it aims to generate 13 million clean energy jobs by 2020 (Reuters, 2017) and is positioning itself as a new leader in international climate discussions. The economic case for driving economic growth and job creation through fossil fuels has weakened based on the employment estimates in fossil fuels, and there are genuine risks of stranded assets. To safeguard US economic development and job creation, we suggest that economic, environmental and education policies need to be developed to support the US green economy in the context of global developments in the green economy.
The data analysed in this study provides valuable estimates of economic activity in the green economy, where other datasets are no longer updated or do not provide comprehensive measurement of the green economy. While it has limitations, like all datasets, it suggests that alternative data collection processes have the potential to fill gaps in data availability where other methods are currently unable to provide data. The methodologies of business and market intelligence have a long track record in industry and the private sector, and where resource needs may be too onerous, or time is required to make changes to official industrial classifications, triangulated data estimated using methods like those used in this study can provide valuable insights.
This study has provided the basis to restart the previously fruitful and important debates regarding how to define and measure the green economy in the US, and the value of doing so to better assess claims made about the green economy and green jobs. The study presents a newer, broader definition of the green economy, which includes data estimates of both sales and employment, which has data available for the various subsectors that are included in the LCEGSS taxonomy, and which measures value chain activities. The data therefore have a number of novel characteristics and benefits that give them significant potential to contribute to improving the understanding of how economies are changing and how economic policies could be designed based on alternative data collection processes such as this. Future research can continue to explore the definition of the green economy, as well as the composition of the green economy and green jobs in the US and other major economies, including at the state or subnational level.

Data availability
The datasets generated and analysed during the current study are not publicly available due to reasonable commercial interests held by partners in this study, but the aggregate data analysed in this study are available in the Supplementary Materials.