The Austronesian Game Taxonomy: A cross-cultural dataset of historical games

Humans in most cultures around the world play rule-based games, yet research on the content and structure of these games is limited. Previous studies investigating rule-based games across cultures have either focused on a small handful of cultures, thus limiting the generalizability of findings, or used cross-cultural databases from which the raw data are not accessible, thus limiting the transparency, applicability, and replicability of research findings. Furthermore, games have long been defined as competitive interactions, thereby blinding researchers to the cross-cultural variation in the cooperativeness of rule-based games. The current dataset provides ethnographic, historical information on games played in cultural groups in the Austronesian language family. These game descriptions (N = 907 games) are available and codeable for researchers interested in games. We also develop a unique typology of the cooperativeness of the goal structure of games and apply this typology to the dataset. Researchers are encouraged to use this dataset to examine cross-cultural variation in the cooperativeness of games and further our understanding of human cultural behaviour on a larger scale.

Background and summary
For humans and non-human animals, play is an essential activity that prepares individuals for adult life. Even though play offers few direct and immediate pay-offs and requires substantial energy (Pellegrini et al., 2007), human children spend a large portion of their time playing (Lew-Levy et al., 2020). During play, children imitate adults and acquire culture-specific skills, norms, and behavioural repertoires (Bock and Johnson, 2004). Although human and non-human animals engage in various forms of play, there is one human-specific form of play (Lew-Levy et al., 2020) that humans of many different ages engage in: rule-based games (Rakoczy, 2007).
Games are a type of play characterized by predefined rules that normatively structure the actions and goals of one or more players (Stenros, 2016; Whittaker, 2012). Children as young as 3 understand and selectively enforce the normative rules of such games (Hardecker et al., 2017; Rakoczy, 2007; Rakoczy et al., 2008, 2009). Between the ages of 5 and 6, children naturally begin to engage in rule-based games (Mogel, 2008). Games also play a special role in human culture, in that they simulate behaviour in important cultural activities, such as war or religious practice (Roberts et al., 1959). With regard to culture, game types vary with geographic location (Mogel, 2008), child-rearing practices (Roberts and Sutton-Smith, 1962), and social complexity (Roberts et al., 1959). For example, games of strategy are present in most societies with high levels of political integration and social classes, but are absent in most societies without these (Roberts et al., 1959), suggesting a non-random distribution of games as a function of cultural context (Chick, 2015). As for the function of games in human development, theoretical and empirical evidence is currently lacking. Research on humans and animals suggests play has an important role in the development of social, cognitive, physical, and emotional skills (Krenz, 2001). However, rule-based games have often been excluded from this research (Pellegrini et al., 2007; Smith, 2005), as some have argued that rule-based games do not "foster innovation" (Pellegrini et al., 2007).
To the extent that games have been studied cross-culturally, research has mainly focused on a single category of games: competitive ones. A commonly used definition of games in the anthropological and psychological literature (Avedon and Sutton-Smith, 1971; Barry and Roberts, 1972; Chick, 1998, 2015; Peregrine, 2008; Sutton-Smith, 1962, 1966; Silver, 1978) also includes competition as a prerequisite: "a recreational activity characterized by organized play, competition, two or more sides, criteria for determining the winner, and agreed-upon rules" (Roberts et al., 1959). This view has shaped our understanding of games as competitive interactions and has excluded other forms of games, such as cooperative or solitary ones, from the lens of psychological and anthropological research. As such, little is known about variation in the cooperativeness of games and how the cooperativeness of games might relate to variation in other aspects of the cultural environment.
One way in which the cooperativeness or competitiveness of a game manifests is through its goal structure (Deutsch, 1949; Johnson and Johnson, 2011). Some games emphasize cooperative behaviour between individuals to achieve a shared goal (e.g., hacky sack), others emphasize competitive behaviour between individuals (e.g., chess), and still others emphasize solitary behaviour with neither a shared nor an exclusive goal among players (e.g., jacks). In one of the few studies to examine non-competitive rule-based games, Eifermann (1970) finds variation in the cooperativeness of games played by Kibbutz children and Moshav children, suggesting that games mirror cultural levels of cooperation and egalitarianism. However, the small sample size of cultural groups (N = 2) in this study limits the generalizability of this research.
The current dataset addresses these issues by providing rich descriptions of a large set of games played in Austronesian-speaking cultural groups. Cultural groups associated with the Austronesian language phylogeny (Gray et al., 2009) share common linguistic ancestry (Gray et al., 2009; Greenhill et al., 2008) and cultural features (Goodenough, 1957b; Watts et al., 2015, 2016), and comprise one of the largest language families in the world (Gray et al., 2009). Despite their common linguistic ancestry, these cultural groups exhibit high cultural diversity (Goodenough, 1957a; Watts et al., 2015). Moreover, a significant fraction of these groups is ethnographically well-documented, making them an ideal sample for testing predictions about the distribution and role of games in human cultures.
The Austronesian Game Taxonomy is a unique dataset that can be utilized to investigate questions on the origins, distribution, and function of human games. In addition to the game descriptions (available upon request), we provide the goal structure coding (scheme), several optional filtering steps for researchers to include or exclude games according to the aims of their research, and codes for cross-cultural database matching. We encourage researchers to use the current dataset to test predictions about the distribution of the cooperativeness of games, or to code other aspects of games, such as the type of skill needed to play the game (Roberts et al., 1959), the psychological interdependence of players (Eifermann, 1970), the ages and sex of players, or the use of objects in games across cultures. For example, researchers could ask questions about the role games might play in children's social learning across cultures (Boyette, 2016b), or whether the distribution of games relates to other cultural variables such as social stratification (Boyette, 2016a; Roberts et al., 1959) or levels of intergroup conflict (Richerson et al., 2016).

Methods
Defining games. In most prior cross-cultural studies on games, scholars have defined games competitively (Roberts et al., 1959) and often in terms of 'rule-based games' (Boyette, 2016a; Hewlett et al., 2011). For the purposes of the current study, we have adopted the criteria used by Whittaker (2012), which include non-competitive rule-based games. Importantly, as defined here, games also include non-competitive scenarios and can be played by one or more players. We define a game as an activity with:
1. explicit rules accepted by the player(s),
2. undetermined outcomes or actions,
3. contest or challenge, and
4. non-utilitarian value.
Whittaker (2012) does not define the game criteria in detail; thus, we define these criteria in our own terms. The first criterion, "explicit rules", refers to the constitutive rules of the game, or the regulating means of playing the game. Explicit rules specify the behaviours or actions that the player(s) are allowed or forbidden to perform in order to achieve the goal of the game (i.e., the instructions or rulebook of the game; Vossen, 2004).
"Undetermined outcomes" refers to the end-state of the game and can be as simple as not knowing whether one will achieve the goal of the game or, if there is a winner, not knowing who will win the game. "Undetermined actions" include the uncertainty in the specific actions made by the player(s), the order of the actions during the activity, or the timing of events. In other words, the actions and outcome of the game are not scripted or predetermined, as in a theatrical play.
A contest or challenge can be defined as a real or imaginary obstacle for the player(s) to overcome in order to reach the goal of the game. When this challenge is not overcome, the player(s) do not reach the goal of the game. This contest or challenge can take the form of competition between two teams toward one mutually exclusive goal, or it may take the form of a task in which one individual player plays "against" time, chance, or their own abilities. It is important to note that this criterion does not imply that there must be several players playing the game (a contest or challenge may exist for an individual player playing a game by themselves), and it also does not imply that there must be competition between the players in the game.
The final criterion, a "non-utilitarian value", includes activities that people play "freely and spontaneously" (Whittaker, 2012) and suggests that people choose these activities because they want to play the game (Whittaker, 2012), but not because the game is imposed upon them by others.
Game descriptions that provided insufficient information on the game (e.g., the source mentioned the name of the game or a short description of the game without the rules) were included in the database and potentially merged with additional descriptions from other sources at a later stage. Users of the data who prefer a narrower definition of games (e.g., excluding non-competitive games as in Roberts et al., 1959) may re-code the text excerpts to reflect their views. Users may also want to re-examine the four main databases listed in the section "Search criteria and methodology" for further relevant text excerpts.
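As an illustration only, the four criteria above can be read as a checklist that yields a binary game code (1 = game, 0 = not a game). The following Python sketch is not part of the published materials: the class, field, and function names are hypothetical, and the values assigned to the two example activities are illustrative judgements rather than codes taken from the dataset.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class Activity:
    """One described activity, judged against the four game criteria."""
    explicit_rules: bool        # criterion 1: explicit rules accepted by the player(s)
    undetermined: bool          # criterion 2: outcomes or actions are not scripted
    contest_or_challenge: bool  # criterion 3: an obstacle stands before the goal
    non_utilitarian: bool       # criterion 4: played freely, not imposed by others


def game_code(activity: Activity) -> int:
    """Return 1 if all four criteria hold, otherwise 0."""
    return int(all(astuple(activity)))


# Illustrative judgements (assumptions, not dataset entries).
chess = Activity(True, True, True, True)
scripted_play = Activity(True, False, False, True)  # actions and outcome are scripted

print(game_code(chess), game_code(scripted_play))
```

Treating the criteria as a strict conjunction is a simplification; descriptions with too little information to judge a criterion would need a separate "uncertain" value.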
Defining the goal structure of games. As previous studies have often defined games in a competitive manner, not much is known about the cooperativeness of games. One way to capture potential variation in the cooperativeness of games is to examine the cooperativeness of the structure of the players' goals. The cooperativeness of social interactions can be categorized into three broad types: no interdependence, positive interdependence, and negative interdependence (Deutsch, 1949; Johnson and Johnson, 1974, 2011). No interdependence indicates the independence of individuals' goals: one person is not affected by another person achieving their goal. Positive interdependence refers to the congruity of individuals' goals. For example, if one person reaches their goal, the other person also reaches theirs. Negative interdependence refers to the opposition and misalignment of individuals' goals: if one person reaches their goal, the other person cannot reach theirs.
While this typology of interpersonal goal structures is useful, social interactions are rarely purely cooperative or competitive (Deutsch, 1949). Games can also take on more complex structures due to the interaction of social interdependencies and the dyadic structure of interactions between individuals. Thus, we present a new coding scheme for the cooperativeness of games by expanding these interpersonal goal structures to examine the goal structure of games. In the context of games, we define a 'goal' as the overarching aim of the player as a means to end the game.
For example, in a game of chess, each player has the goal of placing the other's king in checkmate.
We describe our typology of goal structures in detail below and provide a visual guide in Fig. 1. We discuss the most common types of goal structures for the games observed in our dataset here. There are other possible goal structures with more than two units that we do not present.
The description of each goal structure is followed by an example game that is familiar to the first author (i.e., American-European background), followed by one from the AustroGames database.
Our typology includes the following goal structures of games:
Solitary: The players can interact in a game at the same time and usually have an identical goal, but the players neither cooperate nor compete with one another (no interdependence; Johnson and Johnson, 1974). A single player can also play a game by themselves. For example, in a game of hopscotch, players have the identical, non-cooperative, and non-competitive goal of hopping through all of the boxes by themselves. The game tanimalenge (Game_ID: bello04, Pulotu_culture: Rennell and Bellona, Common_name: bite the apple) requires a stick (80 cm long) with a piece of yam, taro, or panna placed on top of the stick. A player attempts to bite the piece of yam off the stick while hopping on one foot with their hands behind their back. If a player succeeds, they retreat into the circle of observers surrounding the stick and join in singing, and the piece of yam is set up for the next player. If a player does not succeed (i.e., puts their foot down or the piece of yam falls), they retreat into the circle and join in singing. There is no winner or loser of the game (Kuschel, 1975).
Competitive: Players compete with one another and do not cooperate with any other players to achieve the goal of the game. There are no teams in this form of game; each player is a unit and competes against the other players (negative interdependence; Johnson and Johnson, 1974). For example, in a game of chess, each player has the goal of placing the opponent's king in checkmate. Each player acts competitively, and the players' goals are mutually exclusive. In the game lafo litupa (Game_ID: samo44, Pulotu_culture: Samoan, Common_name: throwing and catching 100 beans), two players try to catch 100 beans in groups of four before the other player (Culin, 1899).
Competitive vs. Solitary: Some players have identical, individual goals, and are neither cooperating nor competing with one another to reach this goal (no interdependence, as indicated by the white dots in Fig. 1; Johnson and Johnson, 1974). The other individual (i.e., the black dot) has a competing goal with these players (negative interdependence; Johnson and Johnson, 1974). For example, in a game of hide-and-seek, it is one player's goal to find all other players, while the other non-cooperating individuals try to hide for as long as possible, irrespective of whether the other hiding players have been found. A similar game, pe'epe'e akua (Game_ID: hawa49, Pulotu_culture: Hawaiian, Common_name: hide-and-seek), is played outdoors in Hawaii with a "ghost" as the seeker (Culin, 1899; Pukui, 1943).
Fig. 1 The goal structure of players during a game. Each dot represents one player. The colour of the dots represents the goal of the player; different coloured dots represent differing goals; same-coloured dots represent identical goals. A dashed line represents a competitive relationship between players' goals (negative interdependence), a solid line represents a cooperative relationship between players' goals (positive interdependence), and the absence of a line between players indicates neither a cooperative nor a competitive relationship (no interdependence).
HUMANITIES AND SOCIAL SCIENCES COMMUNICATIONS | (2021) 8:113 | https://doi.org/10.1057/s41599-021-00785-y
Competitive vs. Cooperative group: Some players have identical, mutual goals, and cooperate to reach this goal (i.e., positive interdependence; indicated by the white dots in Fig. 1; Johnson and Johnson, 1974). Another player (i.e., the black dot) has a competing goal (negative interdependence; Johnson and Johnson, 1974) with the cooperating individuals. For example, in freeze tag, one individual's goal is to tag all other players, while the opponents aim to stay unfrozen for as long as possible, and can cooperate and 'unfreeze' each other by tapping 'frozen' players on the shoulder. The game hai kaui (Game_ID: bell12, Pulotu_culture: Rennell and Bellona, Common_name: circle game) is played by children in the water. A group of children holds hands to form a circle and one child swims inside of the circle, trying to escape the "net" by swimming through the legs of the others (Kuschel, 1975).
Cooperative group vs. Cooperative group: Players cooperate with some players (positive interdependence; Johnson and Johnson, 1974) and compete with others (negative interdependence; Johnson and Johnson, 1974). The goals of the groups may be identical or non-identical, but they are mutually exclusive between the groups. For example, in soccer, players of one team cooperate to score more goals/points in their opponent's net (while the other team has the opposing goal of scoring in the opposite net). Te fafa tua (Game_ID: vait11, ABVD_language: Tuvalu, Common_name: leapfrog) is played by two teams of 10 or more players, one team of standers, the other of jumpers. The standers form a sturdy line in the sand by wrapping their arms around the waist of the player in front of them and resting their chest on that player's buttocks. One at a time, the jumpers take a running leap onto the backs of the standers, thus straddling the standers and piling up behind and on top of one another. The goal of the jumpers is to break the line of the standers. If a jumper falls from the standers, the teams switch roles. If a stander breaks the line or falls, the jumpers get a point and leap again (Kennedy, 1930).
Cooperative group: All players in this form of game cooperate to achieve the mutually shared goal of the game (positive interdependence; Johnson and Johnson, 1974). There is no competition between any of the players. For example, in a game of hacky sack or footbag, the goal of the game is to kick a small sack of grain back-and-forth between the players for as long as possible, without letting the hacky sack touch the ground. A similar game called te boiri (Game_ID: kiri02, Pulotu_culture: Kiribati, Common_name: kicking a ball in a circle) is played with a ball made out of pandanus leaves (Youd, 1961).
The number of players is often irrelevant to the goal structure of a game: players can join a game without changing the goal structure of that game. However, exceptions to this rule are the competitive units in the "competitive vs. solitary" games and "competitive vs. cooperative group" games. If more than one player competes against the other units in these two types of games, the competitive units become cooperative units because they share a common goal. Additionally, the goal structure of games, as defined here, only considers the player(s) engaged in the game; other people enabling gameplay (e.g., referees) are not included in the goal structure coding scheme.
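The typology above can be summarised as a small decision procedure. The following Python sketch is one possible reading of it, not the coding procedure used for the dataset: the `Side` class and the `classify` function are hypothetical names, a "side" stands for all players who share one goal (one dot colour in Fig. 1), and sides are assumed to have mutually exclusive goals whenever there is more than one of them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Side:
    """Players who share the same goal (one dot colour in Fig. 1)."""
    size: int        # number of players on this side
    cooperate: bool  # True if these players depend positively on one another


def classify(sides: List[Side]) -> Optional[str]:
    """Return a goal-structure label, or None for patterns outside the six types."""
    if not sides:
        raise ValueError("a game needs at least one player")
    if len(sides) == 1:
        # No opposing side: either a cooperating group or independent players.
        side = sides[0]
        return "Cooperative group" if side.size > 1 and side.cooperate else "Solitary"
    if all(side.size == 1 for side in sides):
        # Every player is a unit of their own and competes with the others.
        return "Competitive"
    if len(sides) != 2:
        return None  # structures with more than two units are not covered
    small, large = sorted(sides, key=lambda side: side.size)
    if small.size == 1:
        return ("Competitive vs. Cooperative group" if large.cooperate
                else "Competitive vs. Solitary")
    if small.cooperate and large.cooperate:
        return "Cooperative group vs. Cooperative group"
    return None


# Familiar examples from the text, with assumed player counts.
print(classify([Side(1, False), Side(1, False)]))   # chess
print(classify([Side(1, False), Side(5, False)]))   # hide-and-seek
print(classify([Side(11, True), Side(11, True)]))   # soccer
```

Because the label depends only on whether a side has one or several players and on whether they cooperate, adding a player to a multi-player side leaves the label unchanged, which matches the remark that the number of players is often irrelevant.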

Search criteria and methodology
Four main databases were used to systematically search for information on games in Austronesian cultural groups: the electronic Human Relations Area Files (eHRAF; Murdock, 1983), the resources listed on the Pulotu website (Watts et al., 2015), and two peer-reviewed journals (The Journal of the Polynesian Society (Allen, n.d.) and American Anthropologist (Thomas, n.d.)). An additional 12 data sources were opportunistically obtained by the first author. A total of 1738 sources of data were searched, 219 of which yielded information on games. Further information on data collection is described in subsequent sections. A list of the sources yielding information on games is provided on the GitHub repository.
All sources mentioned in these databases and meeting the criteria mentioned in the subsequent subsections were searched by the first author for passages on games according to the definition as described above. In cases where limited information on the game was provided by the original source (e.g., only the name of the game was mentioned, but not the rules), the information was included in the game database and potentially merged together with similarly referenced descriptions from other sources. In addition to the criteria mentioned below, only sources in the English and German languages were included in the search. Additional information on the geographic location of the society and language(s) spoken were also gathered from the original sources and matched to an Austronesian Basic Vocabulary Database code (ABVD; Greenhill et al., 2008) whenever possible.
For example, on the island of Yap, the game of v at was described around the turn of the 20th century by two separate authors as follows: "v at. Ballgame for boys and girls, always played with only one hand. A four-sided ball made of plaited green coconut pinnae is thrown into the air. The next player must try to hit it from below with the palm of his hand to give it a new blow and to throw it to the next player in the same way. If one player misses the ball, his neighbours pelt him with reserve balls which each one has in his other hand. Older persons also occasionally play" (Müller, 1917).
"First, there is the very popular ball-game. A fairly heavy, yet springy and flexible cube is plaited from two leaf pinnae of a coconut frond, the edges are not too sharp and are soon worn down sufficiently in the course of the game. The players form a circle and one tosses the "ball" into the air. As soon as it comes down, the one standing closest to it hits it strongly from below with the palm of his hand, so that it again flies high into the air, etc." (Salesius, 1906).
The two passages were identified as describing the same game and were coded as a game with a cooperative goal structure. The ethnolinguistic group also aligned with a cultural group on Pulotu (Watts et al., 2015); however, the Austronesian language phylogeny (Gray et al., 2009) does not include Yap (ABVD code: 77). Thus, the game of v at would be excluded from analyses if the Austronesian language phylogeny (Gray et al., 2009) were used as a filtering criterion for games. A similar game played with a ball, and either using a hand or a foot, is described in eight ethnolinguistic groups included in our dataset. Using the AustroGames dataset, researchers could investigate questions about particular games, such as:
other resources relevant to research on many cultural characteristics and practices. The following search criteria were used to collect information on games in the eHRAF:
Pulotu. There were 743 possible resources listed on the Pulotu website (Watts et al., 2015), all of which were examined. The majority of the sources in Pulotu were books; thus, we created search criteria to determine whether or not the source was relevant for our search on games. The following were the search criteria for Pulotu:
• A general social aspect in the title, for example, "Life in...", "People of...".

• If the source was unavailable as an electronic source (i.e., paper books, older PDFs):
  - Chapters on games, amusements, and childhood activities were searched for in the table of contents. If there was a possibility that games might be mentioned, the source was searched through by hand.
  - The sources were searched through based on:
    * the relevant chapter,
    * if there was no term index or clearly relevant chapter, the source was hand-searched for the following keywords: game, play, child(ren), amuse(ment), fun, sport(s).
Each source was subsequently examined for passages on rule-based games.
American Anthropologist. The American Anthropologist journal (Thomas, n.d.) is one of the oldest existing journals in anthropology and publishes research articles on all aspects of anthropology. A total of 413 sources were searched using the following search criteria (final search date: October 2017):
• "Game", AND
Each source was subsequently examined for passages on rule-based games.
The Journal of the Polynesian Society. The Journal of the Polynesian Society (Allen, n.d.) is a valuable resource due to the geographic focus of the journal. A total of 374 sources were returned for the search criterion "game" (final search date: January 2018). Each source was subsequently examined for passages on rule-based games.
Additional sources. Twelve additional sources were not systematically obtained. These sources were either found in two local libraries (the Max Planck Institute for Evolutionary Anthropology and the Leipzig University libraries) or given to the first author by colleagues.

Data records
All data and code are available on Zenodo (see Leisterer-Peoples et al., 2021) and on GitHub: https://github.com/ccp-eva/AustroGames. In addition to the raw game descriptions and coding (.csv), other files include a list of the sources from which passages on games were obtained, and an R (R Core Team, 2020) package to automatically load the data and conduct optional filtering steps. We provide cultural group codes from various databases, i.e., Pulotu (Watts et al., 2015), eHRAF (Murdock, 1983), Glottolog (Hammarström et al., 2020), ABVD (Greenhill et al., 2008), and D-Place (Kirby et al., 2016), in the database, allowing researchers to cross-reference with other databases. In addition to the raw data files (.csv), we provide a metadata file (.json) to create a Cross-Linguistic Data Format (CLDF; Forkel et al., 2018) dataset. The CLDF offers a standardized and comparable format for linguistic and cultural datasets, and can be used in Python (van Rossum and de Boer, 1991). The raw game descriptions are available upon request due to copyright laws.
Variable definitions. Tables 1-4 list the variable names, as indicated in the data files (.csv), and provide a description of each variable. Each row in the "Games" data corresponds to a unique game in a cultural group. Each row in the "Cultures" data corresponds to a unique Austronesian Basic Vocabulary Database (ABVD; Greenhill et al., 2008) code. Other language identifiers are also provided, i.e., ISO 639-3 (SIL International, 2020) and Glottolog (Hammarström et al., 2020). Each row in the "Descriptions" data corresponds to a unique description of a game, as mentioned in the original source. Each row in the "Sources" data corresponds to a unique publication describing a game. If multiple descriptions of a game in one cultural group were available, they were linked (see the section "Record linkage"). For example, if a ball game played by Hawaiians was described by two sources, the "Game_ID" is listed twice in the "Descriptions" table, once for each description. If a description mentioned multiple games, then each corresponding "Game_ID" is listed in that row of the "Descriptions" file.
Descriptive statistics of games. We collected information on a total of 907 games in ethnolinguistic groups in the Austronesian language family. Each game may occur multiple times if it was described as being played by several ethnolinguistic groups; however, the game only appears once for each ethnolinguistic group. For example, if a game of baseball was played by Hawaiians and by the Māori, baseball is listed once for each ethnolinguistic group and therefore occurs twice in our database. The exception to this rule is if a game with the same name was described with two different sets of rules (e.g., if two ethnographers described baseball in Hawaii played with different rules). In this case, both "versions" of the game of baseball would be listed as distinct games played in Hawaii. The number of games available for analysis will depend on the interests of each researcher. For example, a researcher interested in examining the goal structure of games in combination with the Austronesian language phylogeny (Gray et al., 2009) will acquire a total of 452 games from 55 ethnolinguistic groups after the necessary filtering steps (see Table 5 and the section "Filtering and coding of games" for optional filtering steps). The distribution of goal structures of games within each cultural group after these filtering steps is visualized in Fig. 2. One finding is evident in Fig. 2: the distribution of the cooperativeness of games varies across cultural groups. Competitive (n = 228) and cooperative group vs. cooperative group games (n = 121) are the most common types of games in this filtered sub-sample (n = 452).
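For orientation, the counts reported above can be converted into shares of the filtered sub-sample. The short Python sketch below uses only the three numbers given in the text (228, 121, and 452); the variable and function names are illustrative.

```python
# Counts reported in the text for the filtered sub-sample.
total_games = 452
counts = {
    "Competitive": 228,
    "Cooperative group vs. Cooperative group": 121,
}


def share(count: int, total: int) -> float:
    """Percentage of the sub-sample, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {label: share(n, total_games) for label, n in counts.items()}
combined = share(sum(counts.values()), total_games)

print(shares)
print(combined)
```

Competitive games make up about 50.4% of the sub-sample and cooperative group vs. cooperative group games about 26.8%, so the two most common types together account for roughly 77.2% of the 452 games.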

Technical validation
There were several steps involved in the preparation of the game data for research use. First, we assigned cultural group identifiers (i.e., language codes; see the section "Cultural group identifiers"). Then, we identified game descriptions within each cultural group that described the same game (see the section "Record linkage"). Additionally, we recommend filtering the games in several steps (see the section "Filtering and coding of games"). We provide reliable coding for most filtering steps. Depending on the interests of researchers and the usage of other databases in addition to the games data, researchers have the option to "turn off" or "turn on" each filtering step with the provided R (R Core Team, 2020) package.
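The idea of switching individual filtering steps on or off can be sketched in a few lines of Python. This is not the interface of the provided R package: the function `filter_games`, its flags, and the helper `is_missing` are hypothetical, the three sample rows are invented, and treating empty cells or the string "NA" as missing is an assumption. Only the column names (Game, ABVD_code, Goal_structure, Pulotu_time_ok_50) follow the Games data.

```python
import csv
import io

# Invented three-row excerpt; the IDs and values are hypothetical.
SAMPLE_CSV = """Game_ID,Game,ABVD_code,Goal_structure,Pulotu_time_ok_50
demo01,1,100,Competitive,1
demo02,1,,Solitary,0
demo03,0,100,NA,1
"""


def is_missing(value: str) -> bool:
    """Treat empty cells and the string 'NA' as missing (an assumption)."""
    return value.strip() in ("", "NA")


def filter_games(rows, require_game=True, require_abvd=True,
                 require_goal=True, require_time_50=False):
    """Keep the rows that pass every filtering step that is switched on."""
    kept = []
    for row in rows:
        if require_game and row["Game"] != "1":
            continue  # description does not qualify as a game
        if require_abvd and is_missing(row["ABVD_code"]):
            continue  # no ABVD code could be assigned
        if require_goal and is_missing(row["Goal_structure"]):
            continue  # no goal structure code
        if require_time_50 and row["Pulotu_time_ok_50"] != "1":
            continue  # time frame differs from Pulotu by more than 50 years
        kept.append(row)
    return kept


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print([row["Game_ID"] for row in filter_games(rows)])
print([row["Game_ID"] for row in filter_games(rows, require_abvd=False)])
```

Keeping each step behind its own flag means the sample size after any combination of steps can be recomputed from the same raw rows.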

Cultural group identifiers.
A cultural group is defined as an ethnolinguistic group, following Pulotu (Watts et al., 2015). Language codes from the Austronesian Basic Vocabulary Database (ABVD, Greenhill et al., 2008), Glottolog (Glottocodes, Hammarström et al., 2020) and ISO 639-3 database (SIL International, 2020) were assigned to each description using the geographic locations (i.e., city, town, country, coordinates) as mentioned in the original source.SMLP, JW, and SJG worked in collaboration to assign the language codes to games played by a A description can refer to multiple games and one game can be mentioned in multiple descriptions, as indicated in "Game_ID". Table 3 The variables and their definitions in the Games data.

Variable Definition
Game_ID Unique game identifier specific to cultural group as defined by ABVD Local_name Name(s) of the game as indicated by the original source(s) Common_name Common name(s) of the game Description_ID Refers to the Description_ID in Descriptions. csv Game Indicates whether the description qualifies as a game as defined earlier in this publication (1 = game, 0 = not a game) Game_uncertainty Uncertainty whether the description qualifies as a game Game_comments Comments regarding the game description or other aspects of the data ABVD_code Refers to the ABVD_code in Cultures.csv ABVD_uncertainty Uncertainty of the ABVD coding (1 = uncertainty) Goal_structure Indicates the goal structure of the game Goal_uncertainty Uncertainty in the goal structure coding (1 = uncertainty) Goal_comments Comments regarding the goal structure coding Introduced_keywords Indicates which keywords were found in the game description(s) Introduced_coding Whether the game description(s) indicate nonlocal origin (nonlocal local undetermined) Introduced_uncertainty Uncertainty in the introduced coding (1 = uncertainty) Introduced_comments Comments regarding the introduced coding Pulotu_time_ok_0 Indicates whether the 'traditional' time frame from Pulotu matches the time frame(s) from the game (1 = same time frame, 0 = different time frames) Pulotu_time_ok_50 Indicates whether the 'traditional' time frame from Pulotu matches the time frame(s) from the game ±50 years (1 = same time frame, 0 = different time frames) Each row in the Games table refers to a unique game played in a cultural group. given cultural group. Given the availability of ABVD codes in combination with the Austronesian language phylogeny (Greenhill et al., 2008) and the Pulotu database (Watts et al., 2015), ABVD codes were used in further validation steps. 
We also provide cultural group names as indicated in other cultural databases, namely Pulotu (Watts et al., 2015), eHRAF (Murdock, 1983), and D-Place (Kirby et al., 2016), for additional cross-referencing. Multiple language code assignments are separated by semicolons.
Record linkage. To prevent descriptions of the same game within one cultural group from being assigned multiple game IDs, the descriptions of the games were linked (whenever possible) according to the name of the game, details of its play, geographic location and cultural group identifiers. If there was not enough information in the game descriptions to determine whether two descriptions described the same game, the descriptions were not linked. If multiple descriptions from one cultural group did describe the same game, then each of the description IDs will appear in the "Games.csv" under the column "Description_ID". Additionally, if a game played by one ethnolinguistic group was referenced in two descriptions, the corresponding "Game_ID" in the "Games.csv" will be listed in two different rows in the "Descriptions.csv", once for each description of the game played by the ethnolinguistic group.
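As a minimal sketch of how this linkage could be navigated, the Python snippet below expands a multi-valued "Description_ID" cell into one entry per description and builds the reverse lookup from descriptions to games. The sample rows and identifiers are invented for illustration, and the semicolon separator for "Description_ID" is an assumption borrowed from the convention stated for language codes; it should be checked against the actual files.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows; only the column names come from Table 3.
# Assumption: several description IDs in one cell are separated by semicolons.
GAMES_CSV = """Game_ID,Local_name,Description_ID
G001,name_a,D010;D042
G002,name_b,D042
"""


def link_games_to_descriptions(games_file):
    """Return two lookups: game -> descriptions and description -> games."""
    game_to_desc = defaultdict(list)
    desc_to_game = defaultdict(list)
    for row in csv.DictReader(games_file):
        for desc_id in row["Description_ID"].split(";"):
            desc_id = desc_id.strip()
            if desc_id:  # skip empty fragments such as a trailing semicolon
                game_to_desc[row["Game_ID"]].append(desc_id)
                desc_to_game[desc_id].append(row["Game_ID"])
    return dict(game_to_desc), dict(desc_to_game)


game_to_desc, desc_to_game = link_games_to_descriptions(io.StringIO(GAMES_CSV))
```

Here one game (G001) is backed by two descriptions, and one description (D042) mentions two games, which mirrors the many-to-many relationship between "Games.csv" and "Descriptions.csv" described above.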
Filtering and coding of games. As mentioned in the "Technical validation" section, researchers have the option to "turn on" or "turn off" each filtering step with the provided R (R Core Team, 2020) package, thereby including or excluding certain games. Games can be filtered with the following optional steps (see Table 5 for the sample sizes after each filtering step):
• Combinations of descriptions that describe a rule-based game, as defined in this publication
• Games with a location that could be assigned to an ABVD code (Greenhill et al., 2008)
• Games with a goal structure code (see Fig. 1 and the section "Defining the goal structure of games" for codes)
• Games of local or non-local origins
• Games with ABVD codes corresponding with a cultural group in Pulotu (Watts et al., 2015)
• Games with time foci matching the time foci in Pulotu (Watts et al., 2015) (±0 or ±50 years)
• Games with ABVD codes corresponding to a language on the Austronesian language phylogeny (Gray et al., 2009)

Goal structure. The amount of information in each game description, and its ambiguity, varies considerably, making it difficult to consistently code the goal structure of games. A game was coded as "NA" in cases where the amount of information did not suffice to assign a goal structure code to a game. Additionally, the vocabulary used in the game descriptions varies from Early Modern English to Late Modern English, and occasionally German. All of the games were coded by the first author, whose mother tongue is English and who has fluency in German. Reliability coders (GC, SC) separately coded 15% and 25% of the game descriptions, respectively. Both are native speakers of German and fluent in English. All inter-rater reliabilities were calculated in R (R Core Team, 2020, Version 3.6.3) with the irr package (Gamer et al., 2019).
The reliability of the goal structure coding was very good (κ = 0.94; see Table 6). Three rounds of coding were conducted. Round one of the coding was conducted during data collection by an intern (GC) using a small subset of the final data. After round one, the descriptions of the goal structures were elaborated into the goal structure coding presented in this paper. Rounds two and three were conducted with another intern (SG) after data collection was complete. After round two, the first author marked the disagreements between SG and the first author, and the reliability coder (SG) was asked to first determine whether there was enough information about the game to code the goal structure, and then to code the goal structure only if there was enough information. SG was not told that the games marked were disagreements. The two coders (SG and the first author) then met to discuss questions regarding the English expressions used in the game descriptions, after which SG finished coding. Disagreements between the two coders, along with alternative goal structure codes, are noted in "Games-Goal_uncertainty" and "Games-Goal_comments".
Introduced games. We coded whether authors described the game as local or non-local to the cultural group of interest, as mentioned in the linked game descriptions. With this coding, researchers can include or exclude games that are described as being introduced to the cultural group (i.e., foreign origin). For example, researchers interested in understanding the core functions of games might wish to examine only the games that were introduced into the cultural groups in order to understand which components of these games are integrated and which are dropped during the process of cultural transmission. Alternatively, researchers interested in the relationship between games and psychological aspects of culture might want to exclude games of non-local origin, as they might not reflect the norms and cultural values of the focal cultural group.
There were two steps involved in coding the origins of each game. In the first step, the game descriptions were searched for keywords that might indicate the origin of the game. The keywords, listed in no particular order, were: origin, former(ly), past, introduce(d by), introduction (of), tradition(al), generation, ancient, historic(al), authentic(ity), convention, native, mission(ary/aries), custom(s), foreign(ers), import(ed), settlement, church, American, Japanese, English, Europe(an), Chinese, Spanish, British, Arab(ia), Dutch, French. These keywords did not necessarily indicate the traditional or foreign origin of the game.

Game and ABVD assigned to the game is on language phylogeny: 694, 63
Filters 1, 2, 3, 4a, 5a, 6: 53, 10
Filters 1, 2, 3, 4a, 5b, 6: 172, 27
The final two rows exemplify the sample sizes after applying multiple filters. The number column (No.) is for quick reference and is irrelevant for the order in which the filters are executed.
In the second step, the game descriptions with at least one keyword were coded to determine whether the games were of non-local or local origin. Only game descriptions with keywords were coded for their origin. A game was coded "nonlocal" if there was evidence that it was of non-local origin or had been introduced into the cultural group (e.g., by missionaries, neighbouring groups, etc.). The game was coded "local" if there was evidence that the game was created within the group (e.g., played for generations). If there was insufficient evidence to determine the origin of the game, "undetermined" was coded. All games that did not mention at least one keyword were coded with "NA".
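The two-step procedure can be illustrated with a short Python sketch. It screens a description for a subset of the keywords listed above and assigns "NA" when none is found; otherwise it passes through the code chosen by a human coder. The regular expressions, the case-insensitive whole-word matching, and the function names are assumptions made for illustration, since the text does not specify how the keyword search was carried out.

```python
import re

# A subset of the listed keywords, written as regular-expression stems.
# Assumption: case-insensitive whole-word matching.
KEYWORD_PATTERNS = {
    "origin": r"\borigin\b",
    "former(ly)": r"\bformer(?:ly)?\b",
    "introduce(d by)": r"\bintroduced?\b",
    "tradition(al)": r"\btradition(?:al)?\b",
    "mission(ary/aries)": r"\bmission(?:ary|aries)?\b",
    "foreign(ers)": r"\bforeign(?:ers?)?\b",
    "import(ed)": r"\bimport(?:ed)?\b",
}

VALID_CODES = {"nonlocal", "local", "undetermined"}


def find_keywords(description):
    """Step 1: return the keywords found, in the order defined above."""
    return [
        label
        for label, pattern in KEYWORD_PATTERNS.items()
        if re.search(pattern, description, flags=re.IGNORECASE)
    ]


def origin_code(description, manual_code=None):
    """Step 2: "NA" without keywords, otherwise the human coder's judgement."""
    if not find_keywords(description):
        return "NA"
    if manual_code not in VALID_CODES:
        raise ValueError("a description with keywords needs a manual code")
    return manual_code
```

The sketch deliberately leaves the judgement itself to the coder: a keyword hit only marks a description as a candidate for origin coding and does not determine the code.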
All of the combined game descriptions mentioning at least one keyword were coded by the first author, and a reliability coder (NL) coded 25% of these game descriptions. The inter-rater reliability was calculated in R (R Core Team, 2020, Version 3.6.3) with the irr package (Gamer et al., 2019). The reliability of the origin coding was low (κ = 0.487). However, of the 19 disagreements between the coders, only 5 involved a game coded as "nonlocal" by one coder and "undetermined" by the other, and there were no cases in which a game was considered "nonlocal" by one coder and "local" by the other. The remaining 14 disagreements therefore concerned the distinction between "local" and "undetermined". Thus, the majority of the disagreements among coders lay in distinguishing between games of local origin and game descriptions providing insufficient information on the origin of the game. To double-check this claim, the origin coding was re-coded into a binary format: "keep" (undetermined or local) and "exclude" (nonlocal). Reliability on the binary origin coding was good (κ = 0.808); thus, coders reliably identified when a game was not local, but not whether a game was described as being local or had insufficient information on its origin. The uncertainty in this coding and the disagreements between the coders are provided in the database (i.e., in "Games", see "Introduced_comments" and "Introduced_uncertainty").
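To make the effect of the binary re-coding concrete, the following Python sketch computes unweighted Cohen's kappa for two coders and then repeats the calculation after collapsing "local" and "undetermined" into "keep". It is not the irr-based R code used for the reported values, and the ratings are invented toy data chosen so that every disagreement is between "local" and "undetermined".

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters coding the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


def to_binary(code):
    """Collapse the three origin codes into the binary keep/exclude scheme."""
    return "exclude" if code == "nonlocal" else "keep"


# Invented toy ratings, not data from the study.
coder_1 = ["nonlocal", "nonlocal", "local", "local",
           "undetermined", "undetermined", "local", "undetermined"]
coder_2 = ["nonlocal", "nonlocal", "local", "undetermined",
           "undetermined", "local", "local", "undetermined"]

kappa_three = cohens_kappa(coder_1, coder_2)
kappa_binary = cohens_kappa([to_binary(c) for c in coder_1],
                            [to_binary(c) for c in coder_2])
```

In this toy example the binary kappa is higher than the three-category kappa, because the only disagreements fall within the merged "keep" category. This is the same pattern as the one reported above, although the magnitudes are not comparable.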
Cross-referencing with other databases. The ABVD code(s) assigned to each game in "Games-Game_ABVD_code" were matched with the ABVD codes in Pulotu (Watts et al., 2015), ABVD (Greenhill et al., 2008), Glottolog (Hammarström et al., 2020), eHRAF (Murdock and White, 1969), and D-Place (Kirby et al., 2016) for cross-referencing. If Pulotu provided multiple ABVD codes, we provide all of the ABVD codes that matched with the ABVD code assigned to a game (ABVD_code). For example, if

Fig. 2 The number of games with each goal structure found in each cultural group after applying several filters (Filters 1, 2, 3, and 6 in Table 5), mapped onto the pruned Austronesian language phylogeny (Gray et al., 2009) (n = 452). The colourful bar graphs represent the number of the goal structure of games found in each ethnolinguistic group. The tips on the phylogeny indicate the language associated with the ethnolinguistic group. We used the ape (Paradis and Schliep, 2018), ggtree (Yu, 2020; Yu et al., 2017, 2018), and ggplot2 (Wickham, 2016) packages in R (R Core Team, 2020) to create this graphic.

Table 6 Inter-rater reliability scores (Cohen's kappa (Cohen, 1960), unweighted) for the goal structure coding and the "introduced" coding of games.

Pulotu time frame. The original sources of game descriptions were searched through for information on the field dates of author visitation. The field dates were recorded in specific years or ranges of years. If field dates were not available, we searched for focus dates set by the author. For example, a publication from 2005 retrospectively writing about Hawaiian culture in 1898 would receive a "focus" date of 1898, although the author was not present (i.e., no field date) at the time. If a field date or a focus date was not mentioned in the source, and a brief search using search engines for information on the author's travels revealed no specific dates, the publication date was recorded. This time frame information is available in the dataset under "Time_frame".
Pulotu (Watts et al., 2015) provides "traditional time foci" which can be used to filter out games that were not played at the same time that cultural variables in Pulotu were described. We give researchers the option to match the time foci from the cultural variables with the time foci of the games to ensure that the games and the cultural variables were described at similar points in time. This reduces the possibility that the games were played at a much later time than the cultural variables of the cultural group, or vice versa. The time foci for the cultural variables were provided by the Pulotu database (Watts et al., 2015). Additionally, we give researchers the option to take the exact time foci from Pulotu (Pulotu_time_ok_0) or the time foci ±50 years (Pulotu_time_ok_50).
For example, researchers might not wish to assume that a game described in 1970 reflects cultural variables provided by Pulotu from 1830. To detect such issues, we matched the time foci of the games with the time foci of the cultural variables (with an optional buffer of ±50 years). Thus, if the cultural variables in Pulotu were from 1820 to 1850, a game that was described in 1810 would still be kept in the dataset (i.e., the game was within the ±50-year time frame: 1770-1900), while a game from 1970 would be excluded.
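The matching rule in this example can be written as a short Python sketch. It assumes that a game counts as matching when its time frame overlaps the Pulotu time focus once the focus has been widened by the buffer; the text does not state whether overlap or full containment was required, so this criterion and the function name are assumptions.

```python
def time_frames_match(game_start, game_end, pulotu_start, pulotu_end,
                      buffer_years=0):
    """Return True if the game's time frame overlaps the buffered Pulotu focus.

    All arguments are calendar years; a single-year time frame is passed
    with identical start and end values.
    """
    if game_start > game_end or pulotu_start > pulotu_end:
        raise ValueError("start year must not be later than end year")
    window_start = pulotu_start - buffer_years
    window_end = pulotu_end + buffer_years
    # Two closed intervals overlap when each starts before the other ends.
    return game_start <= window_end and game_end >= window_start


# The worked example from the text: Pulotu time focus 1820-1850.
kept_1810 = time_frames_match(1810, 1810, 1820, 1850, buffer_years=50)
kept_1970 = time_frames_match(1970, 1970, 1820, 1850, buffer_years=50)
exact_1810 = time_frames_match(1810, 1810, 1820, 1850, buffer_years=0)
```

With a buffer of 50 years the window becomes 1770-1900, so the 1810 game is retained and the 1970 game is not. With no buffer the 1810 game also falls outside the window, which corresponds to the difference between "Pulotu_time_ok_50" and "Pulotu_time_ok_0".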
Austronesian language phylogeny. As mentioned in a previous section (see the section "Cultural group identifiers"), each description was assigned an ABVD code, if possible. In a subsequent step, we matched these ABVD codes to the ABVD codes on the constructed Austronesian language phylogeny (Gray et al., 2009). The Austronesian language phylogeny used in this study was constructed by Gray et al. (2009) using 210 basic vocabulary items. Only some of the languages in the Austronesian language family correspond to languages on the Austronesian language phylogeny used in this study (Gray et al., 2009); thus, in many cases, there was no match between games and the language phylogeny (i.e., the game's ABVD code did not correspond with any branch on the phylogeny). The sample size after applying this filter and others is provided in Table 5.
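The optional filtering steps described in the section "Filtering and coding of games" can be thought of as independent switches applied to each row of the Games table. The Python sketch below illustrates this idea using the column names from Table 3. It is not the interface of the provided R package: the function and parameter names are hypothetical, the encoding of missing values as an empty string or "NA" is an assumption, and the set of ABVD codes on the phylogeny is a placeholder that would have to be derived from the tree itself.

```python
MISSING = {"", "NA"}  # assumed encodings of a missing value


def split_codes(cell):
    """Split a semicolon-separated cell into its non-missing codes."""
    return [c.strip() for c in cell.split(";") if c.strip() not in MISSING]


def keep_game(row, *, require_game=True, require_abvd=True,
              require_goal=True, exclude_nonlocal=False,
              time_column=None, tree_codes=None):
    """Apply the switched-on filters to one row of the Games table."""
    codes = split_codes(row["ABVD_code"])
    if require_game and row["Game"] != "1":
        return False
    if require_abvd and not codes:
        return False
    if require_goal and row["Goal_structure"] in MISSING:
        return False
    if exclude_nonlocal and row["Introduced_coding"] == "nonlocal":
        return False
    # time_column is "Pulotu_time_ok_0" or "Pulotu_time_ok_50" when used.
    if time_column is not None and row[time_column] != "1":
        return False
    # tree_codes is a set of ABVD codes present on the language phylogeny.
    if tree_codes is not None and not any(c in tree_codes for c in codes):
        return False
    return True


# Invented rows for illustration; the values are not taken from the dataset.
rows = [
    {"Game_ID": "G001", "Game": "1", "ABVD_code": "100;101",
     "Goal_structure": "code_a", "Introduced_coding": "local",
     "Pulotu_time_ok_0": "0", "Pulotu_time_ok_50": "1"},
    {"Game_ID": "G002", "Game": "1", "ABVD_code": "200",
     "Goal_structure": "NA", "Introduced_coding": "NA",
     "Pulotu_time_ok_0": "1", "Pulotu_time_ok_50": "1"},
    {"Game_ID": "G003", "Game": "1", "ABVD_code": "300",
     "Goal_structure": "code_b", "Introduced_coding": "nonlocal",
     "Pulotu_time_ok_0": "1", "Pulotu_time_ok_50": "1"},
]

default_ids = [r["Game_ID"] for r in rows if keep_game(r)]
strict_ids = [
    r["Game_ID"] for r in rows
    if keep_game(r, exclude_nonlocal=True,
                 time_column="Pulotu_time_ok_50", tree_codes={"101"})
]
```

Because each switch is evaluated independently, the order in which the filters are listed does not affect which games are retained, only how many remain once several filters are combined.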

Research opportunities
This dataset contains rich information on games and play from Austronesian-speaking cultural groups. Cultural anthropologists, psychologists, and those interested in comparative research can use the data to generate large-scale examinations of games and play across cultures. This is a unique dataset, as no other large-scale examination of games across cultures has made its data available in a codeable format (e.g., Murdock and White, 1969; Roberts et al., 1959).
Researchers coding new aspects of this dataset are asked to consider forking and merging their own coding back to the main dataset hosted on GitHub. This will allow interested researchers to help grow our cumulative knowledge of games. This dataset provides researchers with opportunities to examine relationships between cultural variables and games, as well as study cultural change and diversity.
Researchers are encouraged to code other aspects of these games, such as the type of game (i.e., strategy, physical skill, chance; Roberts et al., 1959), the psychological interdependence of players (Eifermann, 1970), the ages of players, or the objects used in the games. For example, researchers could examine the role games might play in children's social learning across cultures (Boyette, 2016b), or whether the distribution of games relates to other (cultural) variables such as social interaction patterns (e.g., Barry and Roberts, 1972; Khouri, 1976), political stratification (e.g., Peregrine, 2008; Silver, 1978), or child socialization (Roberts and Sutton-Smith, 1962, 1966).
In addition to the research questions mentioned throughout this paper, researchers can use the dataset to answer questions about cultural evolution, human child development, and the role of games in cultural groups. Researchers can use the data to run phylogenetically informed analyses, such as ancestral state inference for certain games or game goal structures, the coevolution of game traits and traits of cultures, or the spread of games across Oceania. Researchers should keep in mind that the dataset provided here is not a complete collection of all games played by these ethnolinguistic groups, but it provides a solid starting point for researchers interested in games.

Data availability
The R code (R Core Team, 2020, Version 4.0.3) for data filtering and the AustroGames dataset can be found on GitHub: https://github.com/ccp-eva/AustroGames and on Zenodo: https://zenodo.org/record/4675217. The data are in the .csv format to ease human coding of new aspects of games. Additionally, users interested in a machine-friendly version of the data are encouraged to create a Cross-Linguistic Data Format (CLDF, Forkel et al., 2018) by using the .json file provided. Additional information on the CLDF (Forkel et al., 2018) and reading CLDF into R (R Core Team, 2020) can be found here: https://github.com/cldf and https://github.com/SimonGreenhill/rcldf. The code used to pre-process the data published here and to create Fig. 2 is available upon request. The raw game descriptions are available upon request. Users of the dataset or code are asked to cite this publication and the data (Leisterer-Peoples et al., 2021).