Goteo.org civic crowdfunding and match-funding data connecting Sustainable Development Goals

The United Nations’ Sustainable Development Goals (SDGs) highlight priority areas for global sustainable development, such as reducing inequalities and protecting the environment. Digital platforms, such as Goteo.org, facilitate financial support from individuals for SDG-related initiatives through crowdfunding and match-funding campaigns. Match-funding is a type of crowdfunding in which individual donations are matched or multiplied by public and private organizations. There remains a lack of open data, however, to study the effectiveness of match-funding as a way to finance these civic initiatives. The Goteo.org platform’s approach to data transparency and open source principles has allowed these data to be collected, and here we present a dataset for 487 civic crowdfunding campaigns. This dataset presents a unique opportunity to compare the behaviour of different crowdfunding modalities in parallel with the SDGs.
Measurement(s): crowdfunding campaigns • Donation
Technology Type(s): Goteo digital platform • digital curation
Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.12053847

:132 | https://doi.org/10.1038/s41597-020-0472-0 | www.nature.com/scientificdata
Because the SDGs were only introduced in 2015, analyses of their implementation and impact in local contexts are still scarce 20 , as are analyses of their connection with civic mobilisation beyond institutional actions. However, the need and urgency to prioritize them in international research and sustainable development efforts are widely accepted 21 .
Although some recent studies quantitatively address the progress and systemic interrelation of each of the 17 SDGs 22 , and even their possible connection with crowdfunding 23 , our approach makes the first contribution to both fronts from an open data perspective, adding to an emerging trend in the study of the impact of civic technologies 24 . The dataset examines civic crowdfunding, match-funding and the SDGs from two dimensions: (1) the efficiency and behaviour through usual crowdfunding models in contrast with purely match-funding mechanisms; and, (2) their connection with the transversal priorities of the SDGs.
For other studies on the financing of the SDGs to which this dataset can contribute, it is important to emphasise its limited scope. The 487 Goteo campaigns included represent a total volume of 3,497,502 euros from 55,419 donations. Whilst this can serve as an indicator for different approaches and datasets, in a macro context it is small compared with general figures from international organizations and from public and private financing; some experts have estimated that meeting the SDGs globally by 2030 requires annual funding of trillions of euros 25 .

Methods
Regarding the first dimension, distinguishing between modalities of civic crowdfunding, this dataset covers results from 487 crowdfunding campaigns of different types on the Goteo digital platform (https://www.goteo.org/), run between February 2017 and May 2019. Goteo represents a unique approach to data transparency as one of the few open source crowdfunding platforms in the world 16 , allowing full public scrutiny of its main funding dynamics, campaigns and backer behaviours. The dataset includes the typology of campaigns, differentiating between those following the usual crowdfunding mechanism (392) and those with match-funding models (95), which the platform has implemented in recent years for various pilot projects. Among the latter, the dataset distinguishes between campaigns in which match-funding supplements the donations received at the end of the campaign period (the usual format in crowdfunding platforms experimenting with this model), and those in which individual donations from users are dynamically multiplied in real time (through an "ad hoc" formula developed by Goteo). Likewise, in the main dataset provided, each campaign is accompanied by descriptive data (title, subtitle, description of objectives, motivation, social commitment, etc.), data on publication date, original language and URL, and the amount of money requested and obtained, along with other relevant funding statistics.
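The difference between the two match-funding formats can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: the function names, the fixed multiplier and the matcher budget are hypothetical, and the code does not reproduce the "ad hoc" formula developed by Goteo, which is not specified here.

```python
def end_of_campaign_match(donations, minimum, match_amount):
    """Matcher adds a lump sum once, only if the minimum goal was reached.

    Hypothetical simplification of the end-of-period format.
    """
    raised = sum(donations)
    return raised + (match_amount if raised >= minimum else 0.0)


def dynamic_match(donations, multiplier, matcher_budget):
    """Each donation is topped up as it arrives, until the budget runs out.

    Hypothetical stand-in for real-time multiplication; the multiplier and
    budget cap are assumptions, not Goteo's actual formula.
    """
    total = 0.0
    for amount in donations:
        top_up = min(amount * (multiplier - 1), matcher_budget)
        matcher_budget -= top_up
        total += amount + top_up
    return total


# Made-up donations in euros, in order of arrival.
example_donations = [50.0, 100.0, 150.0]
lump_sum_total = end_of_campaign_match(example_donations, minimum=300.0, match_amount=200.0)
real_time_total = dynamic_match(example_donations, multiplier=2, matcher_budget=200.0)
```

In this toy case both formats end with the same total, but in the dynamic format the matched funds are visible to backers while the campaign is still running, which is the behavioural difference the dataset allows to be studied.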
Regarding the second dimension, the civic projects funded by civil society in relation to each SDG theme, this dataset is innovative in presenting a detailed coding based on a double validation process: first, automatic coding according to the "social commitments" defined by users as campaign promoters; then a phase of manual coding, in which we reviewed and refined the specific relationship with one or more of the SDGs in each of the 487 campaigns, based on the presentation and textual content of each one. Adding further elements of discussion to the still emergent literature on SDGs and economic impact 26 , this dataset allows a series of relationships and observations to be established across a corpus of civic crowdfunding campaigns of different modalities, according to their detailed classification regarding SDGs.
As such, each campaign is accompanied by a set of additional data describing its connection with the 17 SDGs, allowing for comparative analysis beyond crowdfunding modalities and relating to the themes of each goal:
• Goal 1: End poverty in all its forms everywhere.
• Goal 2: End hunger, achieve food security, improve nutrition and promote sustainable agriculture.
• Goal 3: Ensure healthy lives and promote well-being for all.
• Goal 4: Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all.
• Goal 5: Achieve gender equality and empower all women and girls.
• Goal 6: Ensure availability and sustainable management of water and sanitation for all.
• Goal 7: Ensure access to affordable, reliable, sustainable and modern energy for all.
• Goal 8: Promote sustained, inclusive and sustainable economic growth, full and productive employment and decent work for all.
• Goal 9: Build resilient infrastructure, promote inclusive and sustainable industrialization and foster innovation.
• Goal 10: Reduce inequality within and among countries.
• Goal 11: Make cities and human settlements inclusive, safe, resilient and sustainable.
• Goal 12: Ensure sustainable consumption and production patterns.
• Goal 13: Take urgent action to combat climate change and its impacts.
• Goal 14: Conserve and sustainably use the oceans, seas and marine resources for sustainable development.
• Goal 15: Protect, restore and promote sustainable use of terrestrial ecosystems, sustainably manage forests, combat desertification, halt and reverse land degradation and halt biodiversity loss.
• Goal 16: Promote peaceful and inclusive societies for sustainable development, provide access to justice for all and build effective, accountable and inclusive institutions at all levels.
• Goal 17: Strengthen the means of implementation and revitalize the global partnership for sustainable development.
The Goteo platform does not explicitly facilitate the connection of crowdfunding campaigns with the SDGs, so potential donors (also called 'backers') are guided exclusively by how each promoter explains the specific theme and civic commitments of their projects.
However, since its inception, the platform has had a system of classification by themes that allows access to campaigns and which (as described below) has evolved to allow an automatic initial reclassification of such themes around the SDGs, before internal review and manual coding.
Data collection and coding process. Goteo, besides its focus on civic crowdfunding campaigns, is characterised by being open source and by providing a series of open data through an API. Of the 1,383 projects published between the start of operations on the platform at the end of 2011 and the time of writing (a total volume of more than 117,000 backers), data on the campaigns active during 28 months (February 2017 to May 2019) were chosen for this dataset. The cut-off date for the dataset (instead of covering all the campaigns since the start of the platform) was decided based on a new thematic classification for Goteo campaigns introduced by the promoters of the platform (the non-profit Goteo Foundation). Since February 2017 projects can be classified according to an impact model called "social commitment", differing from the original ones of:
The data collection process, once the initial corpus of 487 campaigns was identified, involved adding a series of fields linked to each of the SDGs to the first version of the dataset (which held descriptive and performance data). These fields came from the automatic assignment of a positive or negative relationship of campaigns (values of 0 or 1) with each of the 17 SDGs, according to a matrix of analogies between the social commitments reflected in the list above, a codification agreed upon previously by the promoters of the platform.
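The automatic assignment can be pictured as a lookup in a commitment-to-SDG matrix that yields one 0/1 flag per goal. The Python sketch below is illustrative only: the commitment labels and their SDG assignments are hypothetical placeholders and do not reproduce the matrix agreed upon by the promoters of the platform.

```python
# Hypothetical analogy matrix: social commitment -> related SDG numbers.
# Labels and assignments are placeholders, not the actual Goteo matrix.
COMMITMENT_TO_SDGS = {
    "example-commitment-a": {4, 10},
    "example-commitment-b": {13, 15},
}


def auto_code(commitment):
    """Return 0/1 flags for sdg_1..sdg_17 for one campaign."""
    related = COMMITMENT_TO_SDGS.get(commitment, set())
    return {f"sdg_{n}": int(n in related) for n in range(1, 18)}


flags = auto_code("example-commitment-a")
```

A campaign whose commitment is not in the matrix simply receives a row of zeros, which the subsequent manual review can then correct.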
Subsequently, two researchers manually reviewed the codification of all the data linked to the SDGs for each of the 487 campaigns, refining the automatic classification and in most cases limiting the relationship with SDGs to the three most relevant categories. The validation of this second coding was addressed by two specific meetings among researchers and members of Goteo staff, to check and discuss the results of a preliminary pilot coding of 50 campaigns. The variation to the initial automatic classification was 98%, as a percentage of campaigns to which a pre-assigned SDG category was added or removed. This resulted in a significant improvement of the categories linked to the SDG in the dataset after the manual review of the initial automatic coding, which was also discussed and validated in a final meeting between researchers and the Goteo platform promoters.
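As a minimal sketch of how such a variation figure can be computed, the following Python snippet counts the campaigns whose set of SDGs differs between the automatic and the manual coding. The campaign identifiers and SDG sets are made-up examples, and the function name is an assumption.

```python
def coding_variation(auto, manual):
    """Percentage of campaigns where an SDG was added or removed manually."""
    changed = sum(1 for pid in auto if auto[pid] != manual[pid])
    return 100.0 * changed / len(auto)


# Made-up example: SDG numbers assigned to four campaigns in each round.
auto_coding = {"p1": {4, 10}, "p2": {13}, "p3": {5}, "p4": {1, 2}}
manual_coding = {"p1": {4}, "p2": {13}, "p3": {5, 10}, "p4": {1, 2}}

variation = coding_variation(auto_coding, manual_coding)  # 2 of 4 campaigns changed
```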
The dataset also provides a series of additional fields that come from a clustering of the different SDGs into three categories, called "footprints": social, ecological and democratic (Fig. 1). These allow for the visualisation of additional relationships according to the data of the automatic coding of SDGs.
Finally, to facilitate the use of the dataset by third parties beyond campaign behaviour statistics, an English translation of the descriptive fields is provided for all the initiatives from their original language (e.g. Spanish, Catalan, Galician, Basque), allowing textual content analysis to be performed (with a volume of more than 35,000 words in total).
Regarding the provenance, use and license of the data in this work: during the sign-up process, new users of the Goteo platform are informed about the terms, conditions and privacy regarding data. Specifically: "in relation to how some activity data can be reproduced, publicly communicated, transformed or freely extracted in part or in whole, by anyone, in any format, with no restriction of time or territory, for any further legitimate use, but containing no personal data from individuals, in compliance with General Data Protection Regulation (EU) 2016/679 (GDPR)". In this regard, our study is developed on publicly available open data sources, accessible via the Goteo API (http://developers.goteo.org/) as well as the contents of the platform itself at https://en.goteo.org/, and shared under the same conditions of the Creative Commons 3.0 BY-SA license, which is in force on the platform. Given the public availability of the data used for the project, our study has not required the approval or review of an institutional ethics board. The unprocessed data (such as untranslated versions of specific fields) can be accessed directly from the same sources indicated here.

Data Records
The following list describes the different values of the fields in the dataset 27 . Again, to increase the use of the dataset by diverse users in various locations and organizations, the original content has been translated to English, except where indicated.
Goteo dataset coded. This is the main dataset covering the descriptors of 487 Goteo campaigns, after the automatic coding and manual coding processes explained above. The values relate to campaign descriptors originally user-generated by campaign organizers and project leaders, providing content via the Goteo registration form.

• Project identifier
• PROJECT ID -Project identification based on keywords (in different languages, depending on the original version).
• NAME -Project name, as reflected on the Goteo platform (in its original language, not translated to English).
• SUBTITLE -Text content with the full subtitle of the campaign, as reflected on the website (translated to English).
• URL -Active URL of the campaign page.

• Crowdfunding modality
• FUNDING TYPE -Regular crowdfunding behaviour in the campaign is reflected by the value NO MATCH-FUNDING + NO MATCHER, while the other two values indicate dynamic match-funding or funds added by a matcher institution after the minimum goal was reached.
• MATCHFUNDING CALL -Values in this column refer to the match-funding calls to which the campaigns relate. If the value is NO, the campaign behaves like a regular crowdfunding campaign (with no automatic matching of funds).
• MATCHER -If the value in this column is ahoracomparte, the campaign has received additional matching funds at the end of the funding period, provided the minimum amount defined was reached.
• MATCHFUNDING_D -If the campaign is assigned to a specific match-funding call, the value is YES.
• MATCHER_D -Takes the value 'yes' if the campaign has been assigned a "matcher" institution, which adds funds if the minimum funding goal is reached.

• Campaign descriptors
• PUBLISHED -Indicates the date on which the campaign is launched.
• sdg_2 -End hunger, achieve food security and improved nutrition and promote sustainable agriculture.
• sdg_3 -Ensure healthy lives and promote well-being for all at all ages.
• sdg_4 -Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all.
• sdg_5 -Achieve gender equality and empower all women and girls.
• sdg_6 -Ensure availability and sustainable management of water and sanitation for all.
• sdg_7 -Ensure access to affordable, reliable, sustainable and modern energy for all.
• sdg_8 -Promote sustained, inclusive and sustainable economic growth, full and productive employment and decent work for all.
• sdg_9 -Build resilient infrastructure, promote inclusive and sustainable industrialization and foster innovation.
• sdg_10 -Reduce inequality within and among countries.
• sdg_11 -Make cities and human settlements inclusive, safe, resilient and sustainable.
• sdg_12 -Ensure sustainable consumption and production patterns.
• sdg_13 -Take urgent action to combat climate change and its impacts.
• sdg_14 -Conserve and sustainably use the oceans, seas and marine resources for sustainable development.
• sdg_15 -Protect, restore and promote sustainable use of terrestrial ecosystems, sustainably manage forests, combat desertification, halt and reverse land degradation and halt biodiversity loss.
• sdg_16 -Promote peaceful and inclusive societies for sustainable development, provide access to justice for all and build effective, accountable and inclusive institutions at all levels.
• sdg_17 -Strengthen the means of implementation and revitalize the global partnership for sustainable development.

• Footprint codification
• FP ID 1 -Ecological footprint: automatic assignment of values depending on the social commitment category of each campaign.
• FP ID 2 -Social footprint: automatic assignment of values depending on the social commitment category of each campaign.
• FP ID 3 -Democratic footprint: automatic assignment of values depending on the social commitment category of each campaign.

• Campaign performance
• STATUS -Indicates whether the campaign was successfully funded (irrespective of its modality), with "funded", or not, with "unsuccessful".
• DAYS TO REACH -Number of days needed for the campaign to reach its minimum funding.
• NUM REWARDS OFFERED -Indicates the diversity of rewards offered per campaign (i.e. how many different types of reward were available).
• TOTAL REWARDS CHOSEN -Indicates the total number of rewards (of any type) that were selected by backers when contributing specific amounts to a given campaign.
• TOTAL REWARDS RESIGN -Indicates the total number of donations per campaign where users specifically opted not to receive a reward (of any type) in exchange. These can be understood as donors who wanted to help a given project without receiving anything in exchange.
• AMOUNT RETURNED -Indicates the total money returned by Goteo to backers' bank accounts after an unsuccessful campaign finishes (i.e. fails to reach the minimum amount on time).
• AMOUNT REINVESTED -Indicates the total amount of money kept in the Goteo accounts of those users who opted (when donating) to reinvest the money in other projects if the indicated campaign was unsuccessful.
• AMOUNT BACKED -For campaigns with a match-funding modality, indicates the total money donated by individual backers (the same as for campaigns with no match-funding scheme).
• AMOUNT MATCHED -Indicates the total money that has been added by the "matching" institution of a given match-funding campaign.
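As an illustration of how these performance fields can be combined, the following Python sketch compares success rates across funding types and computes the share of a campaign's income contributed by the matcher. The rows are made-up values, not records from the dataset; the second FUNDING TYPE label is a placeholder, and treating AMOUNT BACKED plus AMOUNT MATCHED as the campaign total is an assumption.

```python
from collections import defaultdict

# Made-up rows using the field names described above; not real Goteo data.
rows = [
    {"FUNDING TYPE": "NO MATCH-FUNDING + NO MATCHER", "STATUS": "funded",
     "AMOUNT BACKED": 1200.0, "AMOUNT MATCHED": 0.0},
    {"FUNDING TYPE": "NO MATCH-FUNDING + NO MATCHER", "STATUS": "unsuccessful",
     "AMOUNT BACKED": 300.0, "AMOUNT MATCHED": 0.0},
    {"FUNDING TYPE": "MATCH-FUNDING (placeholder label)", "STATUS": "funded",
     "AMOUNT BACKED": 2000.0, "AMOUNT MATCHED": 1000.0},
]


def success_rate_by_type(rows):
    """Share of campaigns with STATUS == 'funded' for each FUNDING TYPE."""
    totals, funded = defaultdict(int), defaultdict(int)
    for row in rows:
        totals[row["FUNDING TYPE"]] += 1
        funded[row["FUNDING TYPE"]] += row["STATUS"] == "funded"
    return {kind: funded[kind] / totals[kind] for kind in totals}


def matched_share(row):
    """Fraction of backed-plus-matched money that came from the matcher."""
    total = row["AMOUNT BACKED"] + row["AMOUNT MATCHED"]
    return row["AMOUNT MATCHED"] / total if total else 0.0


rates = success_rate_by_type(rows)
```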
Goteo categories descriptive. This set of tables covers the different categories and criteria for relating campaign social commitments (defined by Goteo users), SDGs numeration (double coded, automatic by Goteo staff and manually afterwards by researchers) and the three footprints (defined by Goteo staff).
• SDG ID -Numeric identifier for each of the 17 Sustainable Development Goals.
• SDG TITLE -Title of each SDG value.
• SDG URL -URL for each SDG description on the UN's website (https://www.un.org/sustainabledevelopment/).
• SC ID -Unique numeric identifier related to the social commitment of the campaign, as selected by the users when describing it via the Goteo form.
• SC TITLE -Title of the given social commitment, as defined by the Goteo user.
• DAYS TO REACH (median) -Median number of days needed for the campaign to reach its minimum funding.
• Datafile successful campaigns -The same fields as "Social Commitment clustering" but prior to grouping campaigns (shows the total of 408 successful campaigns).

• SDG stats
• SDG ID -Reflects each SDG after calculating its distribution among the different campaigns (weighting its presence in the total number of projects).
• PROJECTS (count) -Number of projects which have a specific SDG assigned.
• The remaining fields are the same as the "Social Commitment" clustering.
• SDG correlations -This table calculates the variations between the SDGs automatically assigned to each of the campaigns of the dataset and the second round of manual coding.

Goteo donations detail. This table reflects the details of donations to each campaign: amount donated, relation to match-funding mechanisms, and date and time of the transaction. It also reflects whether the receipt of a reward for the donation was declined by the user (a specific feature of Goteo) and the messages of support received with the donation, if any.
• PROJECT ID -Project identification based on keywords (in different languages, depending on the original version).
• D_STATUS -Values referring to whether the donation was processed: 'Collected' means the money was finally processed due to the success of the campaign, while 'Returned' is assigned when the campaign did not reach the minimum funding goal (following an "all-or-nothing" scheme) and the money was returned to users. Finally, 'Returned to wallet' indicates reimbursements that (instead of going back to users' bank accounts) are kept in their Goteo accounts, as previously indicated by them.
• D_AMOUNT -Refers to the money donated (in euros) by the user.
• D_USER -Coded, anonymous identifier of Goteo users for each donation, except for those who chose to donate anonymously (where the value is 0).
• D_DATETIME -Date and time value reflecting the precise time of donation.
• D_REWARD RESIGN -Indicates whether or not the donation reward was accepted in exchange.
• D_CALL -If affirmative, the match-funding call in which the campaign is included.
• D_MATCHER -Indicates the matcher user/institution, in campaigns under a match-funding call.
• D_MATCHES -Indicates a match-funding donation which matches a previous one, from a match-funding institution in campaigns under a match-funding call.
• D_MATCHED -Indicates a regular donation that subsequently receives match-funding, in campaigns under a match-funding call.
• D_SUPPORT MSG -Content of messages of support from backers to the campaign they are donating to, if they sent one (in the original language).
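The donation-level fields can be aggregated per campaign, for example to separate collected from reimbursed money and to count anonymous donations. The Python sketch below uses made-up records with the field names and D_STATUS values described above; the project identifiers and the summary keys are hypothetical.

```python
from collections import defaultdict

# Made-up donation records; D_USER == 0 marks an anonymous donation.
donations = [
    {"PROJECT ID": "demo-project", "D_STATUS": "Collected", "D_AMOUNT": 20.0, "D_USER": 101},
    {"PROJECT ID": "demo-project", "D_STATUS": "Collected", "D_AMOUNT": 50.0, "D_USER": 0},
    {"PROJECT ID": "other-project", "D_STATUS": "Returned", "D_AMOUNT": 10.0, "D_USER": 102},
    {"PROJECT ID": "other-project", "D_STATUS": "Returned to wallet", "D_AMOUNT": 15.0, "D_USER": 101},
]


def summarise(donations):
    """Per-project totals of collected and reimbursed euros, plus anonymous count."""
    summary = defaultdict(lambda: {"collected": 0.0, "returned": 0.0, "anonymous": 0})
    for donation in donations:
        entry = summary[donation["PROJECT ID"]]
        if donation["D_STATUS"] == "Collected":
            entry["collected"] += donation["D_AMOUNT"]
        else:  # 'Returned' or 'Returned to wallet'
            entry["returned"] += donation["D_AMOUNT"]
        if donation["D_USER"] == 0:
            entry["anonymous"] += 1
    return dict(summary)


summary = summarise(donations)
```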

Goteo variable statistics. This additional dataset reflects the different variables of the main dataset regarding Goteo crowdfunding and SDGs, with overall sums and with skewness and kurtosis statistics to characterise their variability.
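For readers who wish to recompute such statistics, the following Python sketch derives skewness and excess kurtosis from central moments. The choice of the population (biased) moment estimators is an assumption, since the exact estimator used for the dataset is not stated here, and the sample values are made up.

```python
def skewness_and_kurtosis(values):
    """Moment-based skewness and excess kurtosis (population estimators)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    if m2 == 0:
        raise ValueError("values have zero variance")
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis


# Made-up sample, e.g. days needed to reach the minimum funding.
sample = [1, 2, 3, 4, 5]
skew, kurt = skewness_and_kurtosis(sample)
```

A symmetric sample such as this one has zero skewness; funding amounts in crowdfunding data are typically right-skewed, which would show as a positive value.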

Technical Validation
In the process of obtaining, coding, combining and preparing the different datasets we performed the following tasks:
(1) Extracting information from the Goteo platform information system through API calls (http://developers.goteo.org/doc/) to different tables of the information system:
• Project register
• "projects"
[Table residue: column headers sdg_2N to sdg_16N]