To the Editor: Collaborative and creative communities are more equitable when all contributions to a project are acknowledged. Equitable communities are, in turn, more transparent, more accessible to newcomers, and more encouraging of innovation, so we should foster these communities, starting with proper attribution of credit. However, to date, no standard and comprehensive contribution acknowledgement system exists in open source, not just for software development but for the broader ecosystems of conferences, organization and outreach efforts, and technical knowledge. Furthermore, both closed and open source projects are built on a complex web of open source dependencies, and we lack a nuanced understanding of who creates and maintains these projects1. As a result, large sums of money and substantial effort go to open source software projects without anyone knowing whom the investments support and where they have impact2.

Academia faces a similar recognition problem. Attribution is often collapsed to ‘authorship’, yet increases in the size and complexity of scientific teams are colliding with this narrow definition of contribution3. Focusing only on authorship hides much of the work necessary to publish research4. Since this hidden work is performed disproportionately by people from underrepresented communities, the full picture of who is doing the work is not accurately represented5. Fortunately, new models of recognition are gaining ground. One example is the CRediT framework, a taxonomy created for ‘contributorship, not authorship’, to more fully represent the roles that people play in creating research outputs3. Contributor roles are categorized by tasks and stages of the research process6, allowing multiple people to perform the same role or the same person to perform multiple roles. Since its inception, the CRediT taxonomy has been widely adopted (by 33 major publishers so far), in part because it can contribute to equity within a project and can also potentially provide the major benefit of standardizing credit across projects and communities7.

Open source ecosystems need to follow suit and adopt a standard taxonomy of contributor roles. As in academic research, modern open source is a highly collaborative and complex task environment, where overly broad or poorly defined roles easily obfuscate the work of many2. For example, a broadly defined role of ‘code contributor’ fails to distinguish between specific tasks such as adding features, fixing bugs or taking other actions that directly edit the source code. Likewise, other substantial contributions, such as organizing meetings, providing outreach or performing other activities that leave no visible trace within the code, are often neglected. Indeed, some important contributions occur entirely outside of common open source development platforms such as GitHub and often go unrecognized8. To succeed, a taxonomy of roles should be simple and comprehensive, use clearly defined, non-overlapping categories, represent different types of contributions equally, and avoid favoring specific platforms.

Existing efforts to recognize contributions to open source are laudable, but gaps remain. For instance, many open source projects include ad hoc attribution lists like ‘credit files’, without consistent attribution categories. Some approaches propose taxonomies (for example, All Contributors) but do not include clear guidelines on how to apply the proposed taxonomy to various contributions, or they miss entire categories. Any confusion around the interpretation of the taxonomy, where different communities could interpret the categories as they wish, negates the benefit of standardizing credit across projects and communities.

Meanwhile, platform-specific metrics like GitHub’s ‘contributor count’ are some of the most visible contribution indicators but are limited in scope and do not generalize to contributions outside the platform. Data-driven efforts that extract contributions automatically from source code, version control records, or other platform data are further limited to only those activities explicitly recorded within the data (see, for example, octohatrack8 and name-your-contributors). Despite their importance for open source, efforts to recognize contributions more broadly have yet to be widely adopted.

What obstacles have prevented the adoption of a broader, standard recognition model? If CRediT can teach us anything, it is that standards should emerge from the community, undergo many iterations and rounds of feedback, and receive buy-in from major relevant institutions and involved parties. The CRediT taxonomy resulted from a long categorization effort and is a prime example of a working contributor taxonomy. Research leveraging CRediT data is only ramping up7, yet adoption of the framework continues to rise through community support. A successful taxonomy for open source should develop through a similar community peer review. Just as academic institutions and publishers are embracing the CRediT model, open source contributions need the same attention.

A standard taxonomy of recognized contributions will benefit all levels of open source. Contributors will gain credit beyond the code, giving the community a clearer signal of their work. Projects will be able to measure their growth and evaluate their culture. Structural biases will be brought to light, helping to foster more equitable open source communities. Everyone will better understand the interconnected structure of skills, projects and contributions across the broader ecosystems of open source.