Towards understanding policy design through text-as-data approaches: The policy design annotations (POLIANNA) dataset

Despite the importance of ambitious policy action for addressing climate change, large and systematic assessments of public policies and their design are lacking as analysing text manually is labour-intensive and costly. POLIANNA is a dataset of policy texts from the European Union (EU) that are annotated based on theoretical concepts of policy design, which can be used to develop supervised machine learning approaches for scaling policy analysis. The dataset consists of 20,577 annotated spans, drawn from 18 EU climate change mitigation and renewable energy policies. We developed a novel coding scheme translating existing taxonomies of policy design elements to a method for annotating text spans that consist of one or several words. Here, we provide the coding scheme, a description of the annotated corpus, and an analysis of inter-annotator agreement, and discuss potential applications. As understanding policy texts is still difficult for current text-processing algorithms, we envision this database to be used for building tools that help with manual coding of policy texts by automatically proposing paragraphs containing relevant information.


I. Search terms
As mentioned in the main text, we used the "expert search function" of EUR-Lex to identify laws of interest. The search queries below are intentionally broad in order to capture legal acts concerned with climate change mitigation in general as well as acts concerned with specific mitigation actions (i.e., different technologies to decarbonise relevant sectors). From the set of legal acts found with these search queries, we identified the 18 legal texts used for developing the POLIANNA dataset.

General rules:
○ Knowledge from other parts of the text (not within that section) should not be used for making a coding decision; use only information in the immediate text surrounding the passage in question.
○ We are interested in information that can be extracted from the text, not in the structure of the text itself. For example, we want to know the number of a directive, not all the places where it is referenced. If that directive is referred to later in the text without providing any other name or number to specify it (e.g., by "the directive"), there is no need to code that.
○ Scope of text to code:
■ Text quoted for an amendment should be coded.
■ We also annotate headings if the necessary information becomes clear from the language.
■ Independent of the topic, all text is relevant, e.g., we also code non-energy text parts. We do not annotate the preamble.
○ Length of the span that is highlighted:
■ The length of a span can be chosen freely (unitization).
■ The span should be as short as possible without loss of primary information.
■ Sometimes longer spans can be subdivided into different (potentially overlapping) spans.
■ Avoid including articles such as "the" or "a" (unless in the middle of a span).
○ Tenses:
■ We annotate irrespective of the tense.
○ Two adjacent spans with the same label should in general be annotated separately, e.g., "Directives 70/156/EEC [Ref_OtherPolicy] and 80/1268/EEC [Ref_OtherPolicy]". However, where terms frequently occur together, those may also be annotated as one (e.g., "batteries and accumulators" [Tech_LowCarbon]).
○ Some spans may be annotated by several different tags from different features or layers. For example, annotating instrument types and policy design characteristics may be appropriate for the same span, e.g., "national inventory system" can be RegulatoryInstr and Form_monitoring.
○ For the curation, we use all correct overlapping answers.
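The rules on overlapping spans and multiple tags per span can be illustrated with a minimal sketch. The `Span` class, the character-offset representation, and the example sentence are assumptions made for illustration; they are not the storage format of the dataset.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical span record: character offsets plus one tag."""
    start: int  # offset of the first character (inclusive)
    stop: int   # offset just past the last character (exclusive)
    tag: str    # e.g. "RegulatoryInstr"


def overlaps(a: Span, b: Span) -> bool:
    """Return True if the two spans share at least one character."""
    return a.start < b.stop and b.start < a.stop


def tags_by_extent(spans):
    """Collect the tags of all spans that cover exactly the same characters."""
    grouped = defaultdict(set)
    for span in spans:
        grouped[(span.start, span.stop)].add(span.tag)
    return dict(grouped)


# Invented sentence; only the phrase and its two tags come from the rules above.
text = "Member States shall establish a national inventory system."
phrase = "national inventory system"
start = text.index(phrase)
stop = start + len(phrase)

spans = [
    Span(start, stop, "RegulatoryInstr"),
    Span(start, stop, "Form_monitoring"),
]
```

Here the same extent carries two tags from different features, so `tags_by_extent(spans)` maps one `(start, stop)` pair to both tags, while `overlaps` treats merely adjacent spans as distinct.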
Rules specific to parts of the coding scheme:

• Tradable permit [TradablePermit]
○ Should be used to identify descriptions of markets and mechanisms that use tradable permits. We do not label terms that describe the permits themselves as TradablePermit: "Emission credits", e.g., are Resource_Other.
○ "Energy Performance Contracting (EPC)" and "electricity market" are also examples of TradablePermit.

• Regulatory instrument [RegulatoryInstr]
○ Self-referential, repeated descriptions alone like "this Regulation" should not be labelled. What follows these self-referential descriptions, however, should be labelled: "This Regulation establishes a mechanism [Unspecified] for…"
○ Mandatory standards and certification schemes are included.
○ Verbs on their own such as "prohibit", "require", etc. should not be labelled as RegulatoryInstr.
○ "Rules and regulations" are RegulatoryInstr (unless the term "rules" is used only very broadly).

• Tax incentives [TaxIncentives]
○ Tariffs in general are TaxIncentives.

• Subsidies and direct incentives [Subsidies_Incentives]
○ Also includes "direct price support schemes".

• Public Investment [PublicInvt]
○ Includes "international development" action if this refers to an investment.

• Education and outreach [Edu_Outreach]
○ Includes labelling schemes to inform or educate.
• Unspecified [Unspecified]
○ Expressions that are not further specified like "support scheme", "implementing measures" and "market based measures" should be labelled as Unspecified.

Policy design characteristics:
○ Rules:
■ Mainly, we are interested in identifying statements about a policy's start and end date. Dates that relate to when a policy is approved or adopted are not relevant in this sense.
• E.g., do not label: "Done at Brussels, 18 June 2020."
■ We consider a policy to be in force as long as it is not explicitly repealed, revoked, or amended.
■ Compliance [Time_Compliance] vs policy duration time [Time_PolDuration]: When addressees are required to do something in a given time frame, this should be labelled as Time_Compliance. As in the following example, this might be described as a certain timeframe within the duration of the overarching policy:
• "The manufacturer shall ensure that its average specific emissions of CO2 do not exceed the following specific emission standards: (a) for the calendar year 2020 [Time_Compliance], the specific emissions…" The time indicated here does not pertain to the duration of the policy as such but to the time frame during which the addressees have to do something.
■ The tag Time_Resources is to be used to distinguish between resources themselves and the time dimension:
• "There are authorised to be appropriated, for activities under this section, a total of EUR 50,000,000 [Resource_MonSpending] for fiscal years 1997 and 1998 [Time_Resources]..."
■ Preferably, only code expressions for time itself, not prepositions such as "by", "on", "after" etc.
• "The Commission shall assess that obligation, with a view to submitting, by 2023 [Time_Compliance], a legislative proposal ..."
• "By 25 June 2019 [Time_Monitoring] and every two years thereafter [Time_Monitoring], the Commission shall review the list of feedstock set out in Parts A and B of Annex IX…"
■ The tag Time_InEffect is to be used to identify the date from which a policy is in effect, not the date a policy was passed or adopted.

that are concerned with certain economic activities. They can be described at varying levels of abstraction from broad (e.g., "private" sector), to more detailed (e.g., "industrial", "residential", or "transportation" sectors), to very detailed (e.g., "utilities" or "electricity undertakings").
• For example: "In performing the task referred to in point (i) of paragraph 1, transmission system operators

○ Rules:
■ Primarily, we are interested in "material" goals related to climate change mitigation. These can be expressed in either qualitative ("mitigate climate change") or quantitative terms ("50% reduction of greenhouse gas emissions until 2030"). The latter can be very technical or detailed (e.g., "increase energy efficiency in federal agency's use of photocopiers by 3.4% until 2021") but should still be included.
■ Objectives need to be distinguished from the means of achieving them, i.e., policy instrument types. In the example below, the (qualitative) objective is described in the second part of the sentence ("achieve the…") whereas the first part of the sentence ("the hydrogen…") describes how to achieve this objective:
• "To authorise the hydrogen [Tech_LowCarbon] research, development, and demonstration programs [RD_D]… to achieve the more economic hydrogen production and use [Objective_QualIntention]".
■ For Objective_QuantTarget, we are interested in substantive information pertaining to the target, such as its coverage (e.g., greenhouse gases, sectors), the reductions quantified and the related baseline, and its time frame. Additional information, e.g., linking to other parts of a policy, should not be included when labelling.
■ Regarding monitoring, only label its form, such as "report" etc., not further specifications of its content, but include specifications such as "yearly", "biannual" etc.
• For example: "a detailed yearly report [Form_monitoring] on the usage of pencils…"
■ How monitoring is conducted, i.e., Form_monitoring, can be described both through verbs and nouns. Also, monitoring can be "forward-looking" in the sense of using projections or modelling.

Reference to treaties, constitutions, agreements, white papers, overarching strategies. For example, the Paris Agreement.

○ Rules:
■ Often, other policies are mentioned to clarify or define matters. Such "technical" references should be labelled as Ref_OtherPolicy. In general, highlighting should be as concise as possible, e.g., by only highlighting a reference's main identifier as in "Article 7 of Decision No 406/2009/EC [Ref_OtherPolicy]".
■ We are also interested in expressions that detail how another policy is amended or changed. Note that another policy can also be (partly) repealed through the text at hand. Such expressions should be labelled as Ref_PolicyAmended as in "Regulation (EU) 2019/2088 [Ref_PolicyAmended] should further be amended to…".
■ In text segments amending other policies, references to yet other policies should be labelled as well.
■ When it is not perfectly clear from the text that a mentioned policy is being amended, it should be labelled as Ref_OtherPolicy.
■ Be aware that the same policy mentioned in a text can fulfil different roles in various contexts: it can be referenced for clarification and thus be Ref_OtherPolicy in one context while being amended and thus be Ref_PolicyAmended in another.
■ In general, only annotate the name of the policy (not "Article" or "Section"). For example, "Decision No 406/2009/EC" (instead of "Article 7 of Decision No 406/2009/EC").
■ If a policy is referenced by a general name (e.g., "the Directive") after being mentioned before, only highlight the part where the policy is explicitly named.
■ The IPCC and its reports do not need to be labelled.
■ The following example should be labelled as Ref_OtherPolicy, and not Ref_PolicyAmended, since it is mentioned only in passing that the policy is also amended: "if the CO2 emission and fuel consumption figures have been determined in accordance with the requirements of Directive 80/1268/EEC [Ref_OtherPolicy], as amended by this Directive".
■ Targets in themselves are not Ref_Strategy_Agreement, as they are not strategies.
■ Occasionally, formulations like "without prejudice to Union and national laws" are used. This should be labelled as Ref_OtherPolicy as the expression points to the whole corpus of Union and Member state law(s) pertinent to this specific circumstance. For example, (the fundamental principles of) EU competition law are seen as being of overriding importance.
■ When the expression "internal energy market" is used, it should be labelled as Ref_Strategy_Agreement as this expression refers to an element of the EU's common market that is codified in the Treaty on the Functioning of the European Union. Similarly, references to the "(European) Energy Union" should also be labelled as Ref_Strategy_Agreement.
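The rule of highlighting only a reference's main identifier can be sketched as a small pre-annotation helper. The regular expression and the function name `main_identifier` are assumptions for illustration; the pattern covers only the identifier formats quoted in the rules above, not every citation style used in EU law, and it is not part of the annotation tooling.

```python
import re

# Hypothetical pattern for identifiers such as "Decision No 406/2009/EC",
# "Regulation (EU) 2019/2088" or "Directive 80/1268/EEC".
ACT_ID = re.compile(
    r"(?:Directive|Regulation|Decision)"  # type of legal act
    r"(?:\s+\((?:EU|EC|EEC)\))?"          # optional prefix such as "(EU)"
    r"(?:\s+No)?"                         # optional "No"
    r"\s+\d+/\d+(?:/(?:EU|EC|EEC))?"      # number with optional suffix
)


def main_identifier(reference):
    """Return the first act identifier found in the text, or None."""
    match = ACT_ID.search(reference)
    return match.group(0) if match else None


print(main_identifier("Article 7 of Decision No 406/2009/EC"))
print(main_identifier("the Directive"))
```

A generic mention such as "the Directive" yields no match, in line with the rule that only the explicitly named policy is highlighted.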

Technology and application specificity:
• Rules:
○ We are interested in identifying expressions of low carbon technologies, energy carriers, and applications. Generally, this includes:
■ Compound words that include technology words (e.g., in "aircraft operator") are not to be labelled as Tech_Other.
■ Enabling or auxiliary technologies such as energy efficiency tools etc. are included in Tech_LowCarbon. Similarly, expressions that mention electricity transmission and distribution systems as a means to integrate more renewables should be labelled as Tech_LowCarbon. However, if transmission and distribution systems are discussed only in general terms (e.g., as in "secure, reliable, and efficient transmission system"), these should be labelled as Tech_Other.
■ Energy-efficient buildings and building renovations or retrofits should be labelled Tech_LowCarbon, particularly when expressions like "deep (energy) renovation," "zero carbon buildings," or "passive houses" etc. are used.
■ Expressions describing (light and/or heavy-duty) vehicles as such should be labelled Tech_Other (as, e.g., in "This Regulation establishes CO2 emissions performance requirements for new passenger cars [Tech_Other] and for new light commercial vehicles [Tech_Other]..."). Only expressions clearly describing electric vehicles should be labelled as Tech_LowCarbon (as, e.g., in "foster electric vehicle [Tech_LowCarbon] deployment"). If the focus is more specifically on electrification of vehicles by the use of batteries, this should be labelled as App_LowCarbon.
■ Hydrogen should only be labelled if it is part of an application or technology description.
■ The expression "electricity generation" can, in some contexts, also be used to describe a sector being targeted. In these cases, the expression should be labelled as Addressee_sector and not Tech_Other.

This is a datasheet for the dataset POLIcy design ANNotAtions (POLIANNA) based on the questions developed by Gebru et al. [2021]. Many aspects regarding the data are described in the main text of the accompanying paper entitled "Towards understanding policy design through text-as-data approaches: The policy design annotations (POLIANNA) dataset". Here, we keep answers short in order to reduce redundancy.

Motivation
• For what purpose was the dataset created? Was there a specific task in mind? Was there a specific gap that needed to be filled? See the accompanying paper. For reference, the abstract reads: "Despite the importance of ambitious policy action for addressing climate change, large and systematic assessments of public policies and their design are lacking as analysing text manually is labour-intensive and costly. POLIANNA is a dataset of policy texts from the European Union (EU) that are annotated based on theoretical concepts of policy design, which can be used to develop supervised machine learning approaches for scaling policy analysis. The dataset consists of 20,577 annotated spans, drawn from 18 EU climate change mitigation and renewable energy policies. We developed a novel coding scheme translating existing taxonomies of policy design elements to a method for annotating text spans that consist of one or several words. Here, we provide the coding scheme, a description of the annotated corpus, and an analysis of inter-annotator agreement, and discuss potential applications. As understanding policy texts is still difficult for current text-processing algorithms, we envision this database to be used for building tools that help with manual coding of policy texts by automatically proposing paragraphs containing relevant information."
• Who created the dataset (e.g., which team, research group) and on behalf of which entity (e.g., company, institution, organization)? The dataset was created by the authors of the accompanying paper listed at the top of this datasheet. The work was not done on behalf of any entity.

• Who funded the creation of the dataset? If there is an associated grant, please provide the name of the grantor and the grant name and number. It was supported by the Swiss National Science Foundation (grant number CRSK-1 190936) and an ETH Career Seed Grant SEED-24 19-2, funded by the ETH Zurich Foundation.

Composition
• What do the instances that comprise the dataset represent? Are there multiple types of instances? The dataset consists of annotated text divided into spans by the annotators (unitizing). Every span corresponds to one instance, which is annotated with three labels in a hierarchical fashion. The lowest-level label is a "tag", which belongs to a "feature", which in turn belongs to a "layer." We organize the dataset by "articles", each of which corresponds to one article in a legislative text. Those splits have no other meaning but to structure the data.
• How many instances are there in total (of each type, if appropriate)? There are in total 20,577 annotated spans. A summary overview can be found in the paper and can be computed with notebooks available in the online repository at https://github.com/kueddelmaier/POLIANNA.
• What data does each instance consist of? "Raw" data (e.g., unprocessed text or images) or features? Each instance consists of a segment of raw text with minimal pre-processing that was segmented manually.
• Is there a label or target associated with each instance? There is a label associated with each instance, and the same span may occur several times with a different label. See above and the paper for a detailed description of the labels.
• Is any information missing from individual instances? There is no information missing from individual instances.
• Are relationships between individual instances made explicit? Individual instances may overlap in the text that is included in the span, as such overlap was explicitly allowed during annotation.
• Are there recommended data splits (e.g., training, development, validation, testing)? We recommend splitting by legislative article, so that overlapping spans do not leak between splits.
• Are there any errors, sources of noise, or redundancies in the dataset? If so, please provide a description. The dataset is a curated version based on annotations by two annotators. We extensively evaluated inter-annotator agreement in the main paper. In addition, policy text is at times repetitive. As such, parts of sentences in different text passages can be exactly the same, and the dataset can therefore contain redundant annotations that belong to different text but read the same.
• Is the dataset self-contained, or does it link to or otherwise rely on external resources? The dataset is self-contained.
• Does the dataset contain data that might be considered confidential (e.g., data that is protected by legal privilege or by doctor-patient confidentiality, data that includes the content of individuals' non-public communications)? Does the dataset contain data that, if viewed directly, might be offensive, insulting, threatening, or might otherwise cause anxiety? The data and labels do not contain confidential or offensive data.
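The recommendation to split by legislative article can be sketched as a grouped split in which all spans of an article end up in the same partition. The record layout (dictionaries with `article_id`, `text` and `tag`), the function name `split_by_article`, and the toy records are assumptions for illustration and do not reflect the repository's actual file format.

```python
import random


def split_by_article(spans, test_fraction=0.2, seed=0):
    """Assign whole articles to train or test so that no article is in both."""
    article_ids = sorted({span["article_id"] for span in spans})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(article_ids)
    n_test = max(1, round(len(article_ids) * test_fraction))
    test_ids = set(article_ids[:n_test])
    train = [s for s in spans if s["article_id"] not in test_ids]
    test = [s for s in spans if s["article_id"] in test_ids]
    return train, test


# Toy records: five hypothetical articles with two spans each.
toy_spans = [
    {"article_id": f"article_{i}", "text": f"span {j}", "tag": "Unspecified"}
    for i in range(5)
    for j in range(2)
]

train, test = split_by_article(toy_spans)
```

Because the shuffle operates on article identifiers rather than on individual spans, two overlapping spans from the same article can never fall on different sides of the split.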

Collection process
• How was the data associated with each instance acquired? Was the data directly observable, reported by subjects, or indirectly inferred/derived from other data? The raw data was directly acquired, and labels were created based on the description in the paper.
• What mechanisms or procedures were used to collect the data (e.g., hardware apparatuses or sensors, manual human curation, software programs, software APIs)? How were these mechanisms or procedures validated? The data were directly downloaded from the EUR-Lex website (Publications Office of the European Union [2022]) using its download functionality.
• If the dataset is a sample from a larger set, what was the sampling strategy (e.g., deterministic, probabilistic with specific sampling probabilities)? The dataset is a subset of all available European laws. Please see Section "Policy selection" in the paper for a description of our sample selection. Policies were selected deterministically based on certain criteria.
• Who was involved in the data collection process and how were they compensated? The data collection process only involved student assistants at ETH Zürich who received hourly rates according to university policy.
• Over what timeframe was the data collected? Does this timeframe match the creation timeframe of the data associated with the instances (e.g., recent crawl of old news articles)? If not, please describe the timeframe in which the data associated with the instances was created.
• Were any ethical review processes conducted? This was not deemed necessary.

Preprocessing/cleaning/labeling
• Was any preprocessing/cleaning/labeling of the data done (e.g., discretization or bucketing, tokenization, part-of-speech tagging, SIFT feature extraction, removal of instances, processing of missing values)? If so, please provide a description. Please see the paper.
• Was the "raw" data saved in addition to the preprocessed/cleaned/labeled data (e.g., to support unanticipated future uses)? If so, please provide a link or other access point to the "raw" data. The raw data can be accessed at Publications Office of the European Union [2022].
• Is the software that was used to preprocess/clean/label the data available? If so, please provide a link or other access point. We used our own scripts, which are available at https://github.com/kueddelmaier/POLIANNA, and INCEpTION (Klie et al. [2018]), an open-source labeling tool provided by TU Darmstadt.

Uses
• Has the dataset been used for any tasks already? If so, please provide a description. The dataset was used in a PhD thesis in the summer of 2023, which is available at https://doi.org/10.3929/ethz-b-000641426.
• What (other) tasks could the dataset be used for? Please refer to Section "Usage notes" of the paper.
• Is there anything about the composition of the dataset or the way it was collected and preprocessed/cleaned/labeled that might impact future uses? For example, is there anything that a dataset consumer might need to know to avoid uses that could result in unfair treatment of individuals or groups (e.g., stereotyping, quality of service issues) or other risks or harms (e.g., legal risks, financial harms)? Is there anything a dataset consumer could do to mitigate these risks or harms? Some parts of the dataset pertain only to climate change mitigation and renewable energy, and given the selection of documents included in the dataset, they may not cover all possible instances of general policy-making. This means that the dataset may need to be expanded for other use cases in the future. We intended to provide the necessary scripts and descriptions for users to expand the dataset but cannot guarantee that the tools and sources remain directly compatible with those. Users should, however, also be able to recreate the annotations with other tools.
• Are there tasks for which the dataset should not be used? While there are many tasks that the dataset is not appropriate for, there are no known use cases that may lead to intentionally or unintentionally malicious use.

Distribution
• Will the dataset be distributed to third parties outside of the entity on behalf of which the dataset was created? The dataset is publicly available with a digital object identifier (DOI) at https://doi.org/10.5281/zenodo.7569275.

Maintenance
• Who will be supporting/hosting/maintaining the dataset? The dataset will be supported by the authors of the accompanying paper.
• How can the owner/curator/manager of the dataset be contacted (e.g., email address)? The authors can be contacted via their institutional email addresses.
• Will the dataset be updated (e.g., to correct labeling errors, add new instances, delete instances)? If so, please describe how often, by whom, and how updates will be communicated to dataset consumers (e.g., mailing list, GitHub)? There is currently no plan to update the dataset. If updates are made, they will be made as new versions on the data repository.
• If others want to extend/augment/build on/contribute to the dataset, is there a mechanism for them to do so? If so, please provide a description. Will these contributions be validated/verified? There is currently no plan to have others contribute to updates of the datasets. If this changes in the future, it will be communicated at https://github.com/kueddelmaier/POLIANNA.
[Authority_default]: The individual or entity that is making the rule, ensuring its implementation, for enforcing the rule and may apply sanctioning, including an existing individual or entity empowered, directed or required to implement.
○ Legislative authority [Authority_legislative]: The individual or entity that is drafting or voting on legislation.
○ Newly established authority [Authority_established]: A newly established entity that is ensuring the policy's implementation.
○ Monitoring authority [Authority_monitoring]: An individual or entity responsible for monitoring the outcome of the policy, through report, review, or audit. All entities that are part of the monitoring process, and not the primary monitored entity.
○ Default addressee [Addressee_default]: The individual or entity that the rule applies to and needs to ensure its implementation.
○ Resources addressee [Addressee_resource]: The actor that receives a resource.
○ Monitored addressee [Addressee_monitored]: An individual or entity monitored for the outcome of the policy, through report, review, or audit.
○ Sector addressee [Addressee_sector]: Relevant sectors that are covered by the policy.
• Objective [Objective]
○ Quantitative target [Objective_QuantTarget]: A quantitative target or objective of the policy.
○ Qualitative intention [Objective_QualIntention]: A qualitatively stated intention or objective of the policy. This lacks a specific quantity that is targeted, for example increasing the amount of hydrogen produced with renewable electricity sources. Also includes references to unspecified targets.
○ Quantitative target not mitigation [Objective_QuantTarget_noCCM]: A quantitative target or objective of the policy, not pertaining to climate change mitigation (e.g., adaptation or employment targets).
○ Qualitative intention not mitigation [Objective_QualIntention_noCCM]: A qualitatively stated intention or objective of the policy, not pertaining to climate change mitigation (e.g., adaptation or employment targets). This lacks a specific quantity that is targeted.
• Resource [Resource]
○ Monetary spending [Resource_MonSpending]: Resources that are provided through government spending. Can be a concrete sum or unspecific assumption such as "more spending on…". This includes grants, subsidies, allocations of funds.
○ Monetary revenues [Resource_MonRevenues]: Provisions that affect government revenue (positively or negatively). Can be a concrete sum or unspecific assumption such as "increase revenue". This includes, e.g., tax credits (negative), tolls, fees, customs (positive).
○ Other resource type [Resource_Other]: Other resources such as personnel, facilities/equipment, or emissions allowances.
• Compliance [Compliance]
○ Sanctioning form [Form_sanctioning]: Sanctioning provisions and measures.
○ Monitoring form [Form_monitoring]: The form of the monitoring (provisions relating to report, review, or audit; standards and certification schemes).
• Reversibility [Reversibility]
○ Provision for reversibility [Reversibility_policy]: A provision for the extension or termination of the policy.
• Reference to other policies [Reference]
○ Reference to other policy [Ref_OtherPolicy]: External legislative text referenced for objectives, definitions, constraints, or for other reasons.
○ Amendment of policy [Ref_PolicyAmended]: Amendment of another policy, or repeal thereof, that is made through this legislation.
○ Reference to strategy or agreement [Ref_Strategy_Agreement]: Reference to treaties, constitutions, agreements, white papers, blue prints, overarching strategies. For example, the Paris Agreement or key national climate strategies.

○ Other technology [Tech_Other]: Other technologies with no direct role for decarbonization.
• Energy source/carrier specificity [EnergySpecificity]
○ Low-carbon energy source or carrier [Energy_LowCarbon]: A low-carbon energy source or energy carrier (includes biomass and nuclear).
○ Other energy source or carrier [Energy_Other]: Other energy source or energy carrier (includes fossil fuels).

○ Regulatory instrument [RegulatoryInstr]: Wide range of instruments by which a government will oblige actors to undertake specific measures.
○ Tax incentives [TaxIncentives]: Policies to encourage or stimulate certain activities or behaviours through tax exemptions, tax reductions or tax credits on the purchase or installation of certain goods and services.

• Auctions or auctioning mechanisms in themselves are components of instrument types (for example auctioning of permits); if they are kept general, they fall under regulations.
• Intergovernmental agreements, such as the Paris Agreement, are not instrument types as such but are viewed as Ref_Strategy_Agreement; also, references to the European Energy Union should be labelled as Ref_Strategy_Agreement.
• Some expressions refer to technicalities of implementation and these are not annotated as instrument types. For example, in the text "Member states shall ensure a level playing field", the expression "level playing field" is not annotated because it is a technicality of implementation.

General:
• Feed-in tariffs are both RegulatoryInstr and Subsidies_Incentives.
• Regarding monitoring, the distinction between actors doing the monitoring and those being monitored is of most interest to us.

• For example: "Each Member State shall, in 2030, limit its greenhouse gas emissions at least by the percentage set for that Member State in Annex I in relation to its greenhouse gas emissions in 2005…"

• For example: "The Commission [Authority_default] may, if the Member state [Addressee_default] has not met the indicative trajectory by a limited margin, and taking due account of the current and future measures [Unspecified] taken by the Member state [Addressee_default], adopt a decision to release the Member state from the obligation to submit an amended national renewable energy action plan [Reversibility_policy]." The final span would contain additional overlapping annotations ("...Member state [Addressee_default]..."; "...national renewable energy action plan [Ref_Strategy_Agreement]...").

Data and (statistical) information can, under certain circumstances, also be understood as Resource_Other, namely if they are made available to an actor, for example for their economic advantage. In most cases, however, data should not be labelled in this way.
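The hierarchy of tags within features described in this appendix can be held in a simple lookup table, for instance to summarise annotation counts per feature. The sketch below is partial: it includes only features whose tags are listed together in the definitions above, and the names `FEATURE_TAGS`, `TAG_TO_FEATURE` and `count_by_feature` are hypothetical rather than taken from the repository.

```python
from collections import Counter

# Partial feature -> tags mapping, restricted to the features defined above.
FEATURE_TAGS = {
    "Objective": [
        "Objective_QuantTarget",
        "Objective_QualIntention",
        "Objective_QuantTarget_noCCM",
        "Objective_QualIntention_noCCM",
    ],
    "Resource": ["Resource_MonSpending", "Resource_MonRevenues", "Resource_Other"],
    "Compliance": ["Form_sanctioning", "Form_monitoring"],
    "Reversibility": ["Reversibility_policy"],
    "Reference": ["Ref_OtherPolicy", "Ref_PolicyAmended", "Ref_Strategy_Agreement"],
}

# Inverted index: tag -> feature.
TAG_TO_FEATURE = {
    tag: feature for feature, tags in FEATURE_TAGS.items() for tag in tags
}


def count_by_feature(tags):
    """Count tags per feature; tags missing from the partial table go to 'unknown'."""
    return Counter(TAG_TO_FEATURE.get(tag, "unknown") for tag in tags)


example_tags = ["Ref_OtherPolicy", "Ref_PolicyAmended", "Form_monitoring", "Tech_Other"]
print(count_by_feature(example_tags))
```

Extending the table to the remaining features and to the layer level would follow the same pattern, with one more dictionary mapping each feature to its layer.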