Collaborations in the digital humanities

What a “digital humanities project” is can be easily identified since definitions are widely available. We can retrieve one quickly upon opening the webpage: https://whatisdigitalhumanities.com/. The problem is that this helpful page will provide a new definition almost every time the user refreshes it. Altogether, at the time of writing, it rotates through 817 different definitions. This richness of descriptions and the debates behind them are somewhat characteristic of the widely discussed field of digital humanities. However, some common elements appear in most definitions. Among these, we find large datasets that are approached with tools to create a representation of the past. There is also an emphasis on reflecting on these practices and understanding how digital methods change the research of historians. And most definitions emphasize that such cross-disciplinary collaborations not only use available digital datasets and tools but also develop them (Kemman, 2021, p. 9). All these and many other elements of the definitions apply to the project studied in this article, entitled “DECRYPT: Decryption of historical manuscripts,” supported by the Swedish Research Council (2018–2024, grant 2018-06074).

Although the term “digital humanities” was only coined in 2004, digital humanities collaborations have a long history reaching back to the aftermath of the Second World War, when the argument was first made that historians and other scholars in the humanities should make better use of computers. In the following seven decades, there have been many different collaborations between IT experts and software developers on the one hand and historians, librarians, literary scholars, and archivists on the other. These collaborations were of varying nature and often unrelated: microfilm projects, which included source selection in the archives and libraries since the 1960s, the quantitative turn in history in the mid-1980s, developing library catalogs and databases since the 1980s, the Google Books (2004) and the Google Scholar (2004) projects, etc. They included negotiations between the two sides, between whom there was often some kind of power asymmetry. To use a technical term to be introduced below, these projects became “trading zones” between the librarians, historians, and other scholars in the humanities on the one hand and the technologists on the other (Kemman, 2021; Fickers, 2012; Foka et al., 2018; Milligan, 2019; Rosenzweig, 2003; Webster, 2017; Zundert van, 2016; Zundert van and Dekker, 2017).

Analyzing digital humanities collaborations poses a particular challenge because, within the humanities, the myth of the lonely scholar who goes to archives alone and writes their single-authored monograph in the solitude of their study room is still prevalent. As has been pointed out, it is precisely in digital humanities that a certain bridge is constructed between the two cultures famously described long ago by Charles Percy Snow (1959). In digital humanities teamwork, different skills and sets of expertise are joined, and common goals and research techniques are negotiated not only at the project’s beginning but continuously throughout its course. Participants often find themselves outside of their comfort zone. They must adopt new vocabularies, practices, and methods and cooperate with scholars who speak different languages and live in a separate disciplinary culture (Kemman, 2021, p. 60).

Many exciting analyses have been published on collaborative research across disciplinary boundaries (Cummings and Kiesler, 2005; Walsh and Maloney, 2007; Siemens, 2009; Tsai et al., 2016; Lastilla et al., 2022). However, the challenge posed by the collaboration of participants who were socialized in different disciplines, follow different methodologies, and use different terminologies is particularly well met by the intellectual toolkit of another research field, Science and Technology Studies (also called Social Studies of Science), which will be applied in this article.

The STS approach to studying collaborations

The terminology and methods developed in Science and Technology Studies have proved particularly fruitful in analyzing highly cross-disciplinary interactions. Harry Collins and his co-authors expanded on the notion of trading zones in a short but classic article (Collins et al., 2007), and Max Kemman offered a book-length study to analyze the negotiations and practices between historians and computational experts (Kemman, 2021). In these and other studies, which serve as models for our present project, such notions as the “trading zones” (originally introduced by Galison, 1996), “boundary objects” (first discussed by Star and Griesemer, 1989), “interactional” and “referred expertise” (Collins and Evans, 2007; Collins and Sanders, 2007), and “the ambassadorial model” (Collins et al., 2017) serve to show how cooperation is possible between scholars socialized in different scientific milieus (in other words: epistemic cultures, to use the expression of Karin Knorr Cetina) who, by interacting and thereby creating knowledge together, determine how we know what we know (Knorr Cetina, 1999). Our goal below is to apply a similar methodology to an even more heterogeneous cross-disciplinary collaboration, with several epistemic cultures trying to collaborate.

Peter Galison originally put forward the expression of trading zones to explain local coordination of practices between two communities that do not cooperate on a global scale. His original study examined how experimental and theoretical physicists interacted and negotiated a joint enterprise. He defined the trading zone as “an arena in which radically different activities could be locally, but not globally, coordinated” (Galison, 1996, p. 119, emphasis in the original) and aimed to explain how communication is accomplished when there is a degree of incommensurability between communities “talking different languages.” The term “trading zone” is not understood as the place of an economic transaction in which goods are exchanged but rather as a negotiating zone, which provides a “local understanding of an entity without sharing the full apparatus of meanings, symbols, and values in which each of us might embed it” (Galison, 1996).

Building on his terminology, Harry Collins, Robert Evans, and Mike Gorman offer a dynamic trading zone typology, which is illustrated in Fig. 1.

Fig. 1: Trading zone typology. Reproduced from Collins et al. (2007).

The authors differentiate, on the one hand, between interactions of homogeneous and heterogeneous epistemic cultures and, on the other, between coercive and collaborative projects (where power relations of the collaborating parties are asymmetric and symmetric, respectively). The upper left category of the chart hosts collaborative homogeneous interactions, best exemplified by fields that were formed between two different disciplines but became independent endeavors with a full-fledged language of their own, such as biochemistry and nanoscience. A coercive homogeneous (in other words: subversive) case is when one party or community has control over the other party or community. Examples of this case given by the authors are the company McDonald’s, Einsteinian physics, or the Windows system, each of which colonizes the totality of a particular territory (even if there are traces of resistance). In the coercive heterogeneous case, the two communities remain distinct; the dominant community protects its expertise against the subordinate community without being interested in learning from it. This happens when historians or librarians simply order a project from the computational experts for their own purposes, without getting acquainted with the perspective of the latter community, or vice versa, when computational practices end up replacing the historical approach, and programming takes the central role at the expense of reading. Digital humanities projects with asymmetric power relations can be classified in this corner of the chart.

And finally, in the collaborative heterogeneous corner, we find well-functioning digital humanities interactions, sharing practices in two ways: related to common objects, the so-called boundary objects, and communicating through various epistemic cultures with the help of interactional expertise. Because these two notions are so central to cross-disciplinary cooperation, we explain them before proceeding.

Interactional expertise is a relatively newly coined term by Harry Collins and his colleagues (Collins and Evans, 2007; Collins and Sanders, 2007; Collins and Evans, 2015). The easiest way to understand it starts by looking at the periodic table of expertise constructed by the same authors (Fig. 2). At the right end of the line of specialist expertise, we find the so-called contributory expertise, which is the type of knowledge at work when a scholar, a scientist, or an expert contributes effectively to a domain of practices: doing experiments, publishing articles, and being quoted by other experts (see also: Collins et al., 2016). This is the ability to carry out research in a given field. This is not a new notion; we have always understood a high level of scholarly expertise in that way. To its left in the table, however, we find interactional expertise. Interactional expertise differs from contributory expertise in that those mastering it do not contribute to the given field of practice and can neither handle experimental settings nor do research. However, due to successful linguistic socialization, they know how to speak a common language with the scientists (Collins et al., 2007, p. 661). This ability might reach the level at which a scholar having interactional expertise cannot even be recognized in (well-defined, usually written) interactions as someone not having proper contributory expertise; in other words, the scholar passes the imitation game (Collins and Evans, 2015). The phenomenon was recognized by Harry Collins, who had attended conferences in gravitational-wave physics and engaged in scholarly debates with the contributory experts of that field for so long that he learned how to speak the common language with them. As Collins writes: “While acquisition of interactional expertise does not provide full grasp of the strange form of life—it provides no access to the other parties material culture except in so far as that material culture is represented in discourse—it is surprising how much can be done, is done, and, indeed, must be done, with the language fraction alone” (Evans and Collins, 2010, p. 661).

Fig. 2: Different types of expertise. Image reproduced from Collins (2018).

Interactional expertise is particularly relevant in cross-disciplinary cooperations. Gaining contributory expertise in each other’s field is neither possible nor needed in such a collaboration. Still, due to the interactions within the project and the social immersion in the team members’ fields, the participants gain a high level of knowledge, typically interactional expertise, in fields other than their own. Managers and principal investigators (PIs) of cross-disciplinary projects are the chief examples of interactional experts.

However, interactional expertise is not the only thing a PI might master. In a different article, Harry Collins and Gary Sanders elaborate on a special kind of expertise that appears at the right end of the meta-expertises in the periodic table: referred expertise (Collins and Sanders, 2007). Carrying out interviews with two managers of scientific projects, they document how referred expertise comes into play when interacting with the team members. This kind of expertise covers skills “that have been learned in one scientific area are indirectly applied to another” (Collins and Sanders, 2007, p. 622). The notion is coined by analogy with referred pain, and it can be easily grasped through an example given in the appendix of the article (co-authored by Jeff Shrager). Imagine that a professor of cultural studies and an electrician have to become plumbers. Having contributory expertise in two fields different from plumbing, they both must learn entirely new skills. However, the electrician has a much larger basis from which to indirectly apply some of his expertise to the new task. “These might include knowing what it is like to work in a manual job; knowing what it is to learn a manual skill and how to go about it efficiently—and that includes knowing how hard it is to learn such a skill and therefore how much effort and practice is needed; knowing how to deal with customers; knowing how much to charge; knowing how to extract money…” (Collins and Sanders, 2007, p. 640). Similarly, the PI or the manager of a cross-disciplinary project has her own research field, where she has contributory knowledge, and what she knows about it (how to do scientific research, what counts as evidence, how to write a paper, how to publish an article) can often be indirectly extended to the fields involved in the cooperation.

A last category related to the cross-disciplinary expeditions of PIs and project managers is the ambassadorial model. In contrast to the referred experts, who learn their discipline in their home community and apply it to another, the “ambassadors undertake an expedition to someone else’s community to learn a new trade” (Collins et al., 2017, p. 5).

Besides interactional expertise, one more technical term is indicated in the collaborative heterogeneous corner of the chart drawn by Collins and his co-authors: boundary objects. This expression became famous thanks to a classic article by Susan Leigh Star and James R. Griesemer, entitled “Institutional Ecology, ‘Translations’ and Boundary Objects: Amateurs and Professionals in Berkeley’s Museum of Vertebrate Zoology, 1907–39.” They call boundary objects those “objects which are both plastic enough to adapt to local needs and the constraints of the several parties employing them, yet robust enough to maintain a common identity across sites” (Star and Griesemer, 1989). These are objects (understood in a broad sense of the word) that are interpreted as different things by different communities. A typical example is the cowry shell, which, while remaining the same tangible object, bears a completely different meaning to the biologist studying it, to the indigenous community who pays with it, to the member of another tribe for whom it contains the soul of their ancestors, and to someone from a third tribe, who uses it as a piece of jewelry. A perhaps closer example is a manuscript letter from the archive that is a source on which to build a narrative for the historian and a piece of evidence to use in training a language model for a computational linguist (Kemman, 2021, p. 54). Star and Griesemer’s examples are the specimens, the field notes, the maps, and the museum itself, which are entities bearing different meanings to the sponsors, the theorists, the amateurs, the museum directors, the administrators, and the functionaries, all those people who collaborate around these boundary objects.

In sum, interactional expertise and boundary objects are cited by Collins, Evans, and Gorman to show how collaboration is possible between disciplinary fields that coordinate practices, communicate with each other, and cooperate around the same objects, but remain distinct (heterogeneous) fields that do not merge into one another.

One could argue that the typology depicted in Fig. 1 lacks dynamics: it describes collaborations as stable phenomena that do not have histories and do not evolve from one state to another. While Collins and his co-authors do not wish to grasp the dynamics of interdisciplinary collaborations, in the last part of the article they describe an interesting trajectory (possible, but by no means compulsory) that a collaboration might follow between the various sections of the table. Real-life cooperations might find themselves in basically all segments of the chart in different phases of their evolution, moving from one to another (Collins et al., 2007).

Further STS terms can be well applied to study scientific cooperations (Zemplén, 2019). For our purpose, it is enough to cite one detailed case. In his monograph, Max Kemman expands on the notion and typology of trading zones as a site of the shallow sharing and exchanging of concepts and practices in different local settings. As he phrases his project: “Historians can share local understandings of concepts from computer science that are relevant to a task, without needing to understand the entirety of computer science or become computer scientists themselves.” He argues that digital history is by definition a meeting point of actors coming from particularly distant positions, the humanities and hard sciences (Kemman, 2021, p. 175).

Kemman uses and further develops the above-mentioned model by Collins and his colleagues, introducing a line that crosses the table diagonally from bottom-left to top-right and divides it into two parts: the connected and the disconnected. This improvement further differentiates the heterogeneous collaborative and the subversive segments.

The subversive corner hosts communities that become homogeneous through one community shaping the practices of the other. A typical example is when historians, as end-users, commission a digital tool from IT experts for research but are not interested in the disciplinary culture of their collaborators. Next to that, in the lower right corner, we find the asymmetric-heterogeneous (enforced) trading zone, where the two communities remain distinct while one community shapes the practices of the other. Kemman argues that this is the bad scenario in digital humanities collaborations: the cooperating disciplines do not merge into one another, but one of them dominates the common practices.

The top right corner hosts the symmetric-heterogeneous (fractioned) trading zones, where the communities deliberately remain distinct while interacting. This is the proper place for a well-functioning digital humanities cooperation. There are two sub-segments here. In the disconnected one, there is a greater distance between the interacting scholars. Each develops their own perspective on the objects under investigation, which is why boundary objects appear in this segment. In the connected one, the scholars, particularly the PI or the “broker,” learn enough about the interacting communities to understand their practices. They do not simply commission tools from the communities; instead, they become capable of talking the language of each community while, of course, not turning into contributory experts. A typical example is a historian who participates in a digital humanities collaboration to the extent that they might learn to read and discuss publications from computer science without the ability to publish computer science work themselves.

This model proved fruitful in analyzing digital humanities collaborations in Kemman’s monograph. The present study takes a further step: we use the same framework for a fairly similar purpose, but the digital humanities collaboration studied here embraces more than two distinct fields, that is, different communities that collaborate around boundary objects and develop interactional and referred expertise in the course of the meetings as a result of social immersion in each other’s fields.

Our research question can be formulated as follows: “Do the STS methodologies to analyze cross-disciplinary cooperations function in multi-party/multi-disciplinary contexts in the same way as they do in two-party/duo-disciplinary contexts?” In order to answer the question, we first describe the project and its cross-disciplinary nature.

The DECRYPT project: aims and tools

The long-term goal of the DECRYPT project is to establish a new cross-disciplinary subject of historical cryptology to shed light on the usage, content, and development of historical ciphers throughout the centuries in Europe. In order to do so, the project aims at building a research infrastructure for historical cryptology to collect, digitize, process, and decrypt historical encrypted sources and release these through a web service with information about their provenance and other facts of relevance (Megyesi et al., 2020). To achieve this rather ambitious goal, resources in the form of ciphers and keys, along with historical non-encrypted sources, are collected and processed for analysis, and tools for transcription and decryption are developed for that purpose.

Hitherto, many scholars and scientists from different and complementary areas, such as history, linguistics, philology, computer science, cryptology, and computational linguistics, have worked on cracking single ciphers in an uncoordinated fashion, each with their own point of view, purpose, and methods. Some people focus on cracking and interpreting single ciphers, while others are interested in developing tools to allow others to process encrypted sources. However, no matter which scientific fields the interests come from, they usually encounter the same or similar problems when confronted with encrypted documents. By bringing the expertise of the various disciplines together to collect and digitize encrypted sources and develop software tools for automatic or semi-automatic decryption, the project can establish the new scientific subject of historical cryptology, release data of encrypted sources, and make tools publicly available to help users in transcribing and decrypting the manuscripts. The project is financed by the Swedish Research Council between 2018 and 2024 and is part of a special call to support high-quality cross-disciplinary research in Sweden. The project received 29.5 million SEK (ca. 3 million euros) over 6 years.

Since the project started, the cross-disciplinary team has collected over 7000 encrypted sources from 13 countries in Europe originating from early modern times and released them in a publicly available database (Megyesi et al., 2019; Héder and Megyesi, 2022). To study historical ciphers, historical corpora have been collected and released as part of the HistCorp collection for 17 European languages along with language models and tools for spelling normalization (Pettersson and Megyesi, 2018). For the processing and analysis of ciphertexts, the team is developing an interactive transcription tool allowing the use of AI in the form of handwritten text recognition models developed for various symbol systems (Chen et al., 2021; Szigeti and Héder, 2022; Souibgui et al., 2022). In order to decrypt the ciphers, a (semi-)automatic decryption tool, CrypTool, has been developed that helps the user break ciphers of various types.

The DECRYPT project: collaboration structure and management

The project brings together experts from different fields, notably computational linguistics, computer science (including image processing), cryptology, linguistics, and history. Each of these disciplines has its own input in this cross-disciplinary project and benefits from the cooperation. Historians and linguists benefit from the decoded documents, leading to new knowledge and better understanding of our history and historical languages; computer scientists, cryptologists, and computational linguists working on developing methods for automatic decryption of various types of ciphers get access to a heterogeneous collection of ciphertexts and code material from linguists and historians, which in turn can lead to new methodological insights in language technology applications. Librarians and archivists get a correct identification and description of the encrypted documents that lie hidden in the collections they are guarding (Megyesi et al., 2020).

The project is led by a professor of computational linguistics with a strong cross-disciplinary background, with one foot in the humanities, especially linguistics and language studies, and the other in computer science, with specialization in AI applied to natural language processing. Her research has dealt with various themes within digital humanities to develop and provide language technology tools for the humanities and social sciences. The project leader has previously gained experience in the management of research and teaching in higher education in her role as the head of the department.

The DECRYPT project was planned by the PI and six co-applicants: two researchers in cryptanalysis in Germany, an expert in image processing in Spain, a historian from Hungary, and a computational linguist and a linguist/philologist in Sweden. The researchers in the project also cooperate with other teams internationally, not only within language technology and cryptology, but also with historians, historical linguists, librarians, and codicologists.

In addition, the project has recruited ten people in various areas and at various levels, both junior and more senior members: a senior computer scientist for working on the implementation of the models and tools, a cryptologist, a historian, three research assistants for data management, one PhD student in history and one in image processing, and two postdocs/researchers in cryptology. The project also involves a handful of associated partners, historians and cryptologists, who contribute to some part of the work for a shorter or longer period of time, for example, in the collection of encrypted sources or their decryption. New members, mostly thesis students, join the project each year under the guidance of senior members of the team. The project cooperates with over 100 users who give feedback on the developed resources and tools.

The project builds heavily upon the cooperation of the involved disciplines. Since day one, the group has planned, carried out, discussed, and followed up on all parts of the project in project meetings (physical and digital) in various constellations. All members of the group meet on a regular basis, three or four times per year, for two to three days in general meetings. In 2019 and early 2020, three general meetings took place at the start of the project. Throughout the COVID-19 pandemic, all general meetings were conducted online over two days. Since 2022, in-person general project meetings have resumed, with two held each year. Each general meeting has a given structure so that participants can prepare well in advance. A preliminary agenda is sent out two weeks before the meeting with the opportunity for everyone to make suggestions for presentations, discussions, information, etc. A final agenda is announced on the project wiki one week before the meeting, and the presenters are strongly encouraged to upload their presentations in connection with the meeting.

Crucial elements of cooperation are cross-disciplinary subgroups created around certain topics. These theme groups meet (more or less) frequently to discuss their work, which they present at the general meetings. Such teams typically create a tool, crack a cipher, or answer certain research questions. In order to do so efficiently, they hold regular meetings in which they report on the work the members have been carrying out, discuss the problems and results, and plan and coordinate the next steps. If the team is larger, they typically appoint a group leader, a convener whose task is to lead the meetings and schedule new ones. Members usually take turns taking notes.

The highly cross-disciplinary nature of the research group requires an open climate where basic and advanced questions and comments are welcome. During the first year of the project, the team devoted much time to basic lectures on various topics to learn about each other’s area and to develop a common terminology (see more below). A challenge is to efficiently integrate new project members into the work; these are mainly assistants and/or thesis students who are new to the field and do not know the team. They are assigned a supervisor who also serves as a mentor for daily communication. They are also given the opportunity to present their work at a general meeting, which allows them to practice academic presentation and to have their research discussed by the cross-disciplinary team. The project thereby raises a new generation of early-career academics with insight into cross-disciplinary projects.

To keep everybody informed about important project-related issues, wiki pages were created with information about ongoing papers, publications, talks, references, and information about the project meetings with meeting notes from each. Bigger theme groups also take minutes of their meetings and publish them on the project pages to make these accessible to the whole team. In addition, the team has an email list to make communication smooth among its members.

The project team has produced over 60 scientific publications during the first four years of the project (see the publication list at de-crypt.org). Team members published in journals and conference proceedings devoted to the involved disciplines including cryptology, computational linguistics, image processing, history, and linguistics. However, submitting papers with a cross-disciplinary topic to “single-subject” areas has sometimes proved to be a challenge.

The content of the papers reflects the various research questions that the team members and the authors of the papers addressed in a cross-disciplinary manner. Below, we give some examples of common topics on which the team has published.

Data collection and description are carried out by historians, cryptologists, and computational linguists associated with the project, with the aim of obtaining a representative sample from various time periods and areas in Europe. The historical texts are collected and processed by philologists and computational linguists. From the start, project members made an attempt to meet the demands of all disciplines to provide a great variety of sources so that various image processing tools and automated codebreaking methods could be tested and developed. Standardized data were created for training, validation, and testing of various algorithms to make results comparable and progress measurable over time.

Development of the transcription algorithms requires manually transcribed data and guidelines to systematize the transcription across symbol sets, which have been taken care of by computational linguists and linguists. The automatic recognition and analysis of symbols, including binarization of images, line and character segmentation, and models for transcription of various symbol sets and handwriting styles, along with algorithm development, require not only expertise in computer vision but also competence in philology and computational linguistics.

For codebreaking and decryption, participants of the project started by investigating cipher keys to see what type of linguistic entities could be expected to be encrypted. In parallel, existing ciphertext and plaintext documents were mapped to recover the keys. On the basis of the keys, more educated guesses could be made about the structure of the remaining ciphers, and automated codebreaking algorithms could be developed for their decryption (Lasry et al., 2021). The new algorithms were implemented in the decipherment software CrypTool. New ciphers that were uploaded into the database by historians and transcribed by computational linguists were analyzed by cryptologists. The application of the new, improved decipherment algorithms led successfully to the decipherment of several encrypted sources (e.g. Kopal and Waldispühl, 2022). Moreover, the deciphered texts were contextualized linguistically and historically (Kopal and Waldispühl, 2022; Lasry et al., 2023).
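To make the step of mapping ciphertext to plaintext to recover a key more concrete, the following minimal Python sketch may help. It is an illustration under strong assumptions, not the project’s method and not CrypTool’s interface: the function names `recover_key` and `decrypt` are hypothetical, the cipher is assumed to be a simple substitution in which each transcribed symbol stands for exactly one plaintext letter (several symbols may share a letter, as in homophonic ciphers), and the symbols and words are invented toy data.

```python
def recover_key(cipher_symbols, plaintext):
    """Build a symbol -> letter table from an aligned ciphertext/plaintext pair.

    Assumption: one transcribed symbol corresponds to exactly one letter.
    """
    if len(cipher_symbols) != len(plaintext):
        raise ValueError("ciphertext and plaintext must be aligned one-to-one")
    key = {}
    for symbol, letter in zip(cipher_symbols, plaintext):
        # setdefault stores the letter on first sight and returns the stored one
        if key.setdefault(symbol, letter) != letter:
            raise ValueError(f"symbol {symbol!r} maps to more than one letter")
    return key


def decrypt(cipher_symbols, key, unknown="?"):
    """Apply a recovered key; symbols missing from the key are marked."""
    return "".join(key.get(symbol, unknown) for symbol in cipher_symbols)


# Invented toy data: "7" and "19" are two homophones for the letter "e".
known_cipher = ["12", "7", "33", "40", "19"]
known_plain = "regne"
key = recover_key(known_cipher, known_plain)

# A second, unsolved message in the same (hypothetical) cipher.
new_cipher = ["40", "19", "12", "7", "88"]
partial = decrypt(new_cipher, key)
print(partial)
```

Symbols that remain unresolved (marked with `?` here) are the kind of gaps where, as described above, educated guesses about the structure of the cipher and automated codebreaking algorithms come in.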

Members of the team have been inspired by and learned a great deal from each other, discovering new perspectives and new methods. For example, computational linguists intend to apply unsupervised deep-learning methods to the automatic detection of plaintext sequences in keys and ciphertexts, methods learned from the image processing experiments. Image processing experts, on the other hand, test transfer learning, which is commonly used at the time of writing in computational linguistics, to adapt transcription to particular handwriting styles. The image processing experts also emphasize that they use philological standards for transcriptions, something that they had not planned originally. Cryptologists include historical language models from various time periods to improve cryptanalysis, something they had not considered before. In the future, the project members aim to further strengthen the cooperation between the specific areas, especially between the image processing and decryption parts; this is the natural next step as they connect the interactive transcription and decryption tools into one single, user-friendly web service.

Methodology

To analyze the cross-disciplinary nature, strengths, and challenges of the DECRYPT project, the authors employed a mixed methodology, combining observations from project participants collected in a survey along with project meeting notes to study communication techniques and organizational structures. To gather empirical data, the authors sourced information from three primary channels.

In the first place, both authors are active participants in the project, who have closely followed its evolution not only from the inception of the funded project but also for several years before the collaborative endeavor was officially initiated. Their approach could best be described as participant observation. This approach offers numerous advantages, such as easy access to information and experiences. However, it also introduces potential complications, particularly in dealing with sensitive matters where results obtained from close colleagues might be influenced by real or perceived expectations (discussed further below).

The second source utilized for this reflexive analysis comprised written materials. Notes have been systematically recorded during the 2–3 day-long project meetings since the project’s commencement. These meeting notes serve as valuable documentation of the discussions, complementing other materials like articles and PowerPoint presentations that participants have uploaded to the project’s shared wiki page. The meeting notes contain detailed documentation of individual participants’ views and the group’s priorities and strategies. However, in this work, we only considered the documentation that reflected communication and organizational issues and best practices employed throughout the collaborative process.

Thirdly, and constituting the most abundant source for this analysis, a structured questionnaire was distributed via the LimeSurvey platform. The purpose and the plan of the survey were discussed in detail in a meeting to ensure that it covered all important aspects from the various disciplines. The survey was then refined and sent out to all 14 core members of the project, all of whom completed it. It is worth noting that the questionnaire was not distributed to recently enrolled MA students, who usually participate only for a few weeks or months. Nor was the survey distributed to external users of the project’s various resources and tools; it was exclusively sent to members who regularly attend project meetings and subgroups, i.e. to researchers who have gained experience in cross-disciplinary cooperation.

While questionnaires are conventionally used for quantitative research, in this instance, they serve as the foundation for a qualitative analysis. The survey encompassed 30 open-ended and closed-ended questions grouped into five categories: participation, objectives, disciplinary backgrounds, collaboration, and possible boundary objects. Respondents typically spent around 30 min completing the survey, though some participants reported that they had deliberated on their responses for several days. The closed-ended questions yielded the data for the diagrams presented below. Conversely, the open-ended questions functioned as written interview queries, the responses to which are also outlined below.

The questionnaire was preceded and followed by personal, unstructured interviews, as well as informal, focus group-like discussions in the presence of the entire group, including the Principal Investigator (PI). Prior discussions assisted in formulating the survey questions and their structure, while the subsequent discussions further elaborated on topics raised by participants in the questionnaire. The rationale behind this approach was to initially explore individual feedback from participants independently, without influence from other participants, and only subsequently engage in a brainstorming phase akin to a focus group meeting. During the sessions, we explained to the group members the STS approach described above, as well as the key concepts (interaction expertise, boundary objects, etc.), stressing that they did not need to use them in their answers.

The foundation of this examination rests upon the framework established by the survey’s structure, encompassing aspects such as participation in the project, professional aspirations, diverse disciplinary backgrounds, epistemic cultures, modes of collaboration, challenges related to terminology, and potential boundary objects. Further insights are gleaned from responses to open-ended questions, with particular emphasis placed on outcomes derived from interviews conducted with the Principal Investigator. Their unique position involves shaping the collaborative framework, addressing terminological ambiguities, and cultivating interactional expertise in domains beyond their own.

Before proceeding to the results, it’s important to underscore two crucial observations. Firstly, the authors’ participant observation approach, while advantageous, is not without its drawbacks. Analyzing one’s colleagues, with their knowledge of being scrutinized, can potentially sway results. Secondly, the project group is relatively small, and despite the questionnaire’s anonymity, respondents were aware that their responses could be traced back to their identifiable disciplinary backgrounds.

These two factors evidently exert influence over the responses provided by colleagues. As a consequence, the authors endeavored to critically evaluate the validity of certain responses. Instances of this challenge include participants’ satisfaction with project management and the power dynamics within participating disciplines. Another potentially problematic issue, the power structures among participating individuals, remained outside the research’s purview, as it did not align with the research focus; accordingly, it was intentionally omitted from the investigation. Note that we do not perceive a problem in this regard; quite the contrary. However, we recognize that our impression could be subject to distortion, and an alternative methodology would have been required for an analysis of personal power relations. Direct observation and inquiries would not have been effective in this context.

Analysis and discussion of the results

Participation and goals

The questionnaire was completed by the 14 core members, who joined at different times. Three of them had already belonged to the group of researchers who applied unsuccessfully for funding during the years 2012 and 2014: (i) to COST, to create a scientific network for historical cryptology, and (ii) to the ERC, for a grant to study historical encrypted sources. One person joined when the successful (DECODE) grant application was submitted to build infrastructural resources for historical cryptology (2015). The great majority joined just before or during the current project: four when the application for this DECRYPT project was prepared, and six while the project was running (2019–2023). The distribution of joining dates is illustrated in Fig. 3.

Fig. 3: Joining the team.

No. of respondents: 14.

The participants’ primary research foci within the project are fairly evenly distributed (several answers could be marked) (see Fig. 4). Three of them participate in source collection (either in the archives or ordering copies), two supply metadata of the sources (either in DECODE or in Excel files), two provide transcriptions (either manual or semi-automatic), one deals with historical analysis, three with decryption, five develop decryption tools, two are involved in linguistic analysis, one concentrates on the historical language corpus, two build the DECODE database, two develop models for image processing, another two develop the transcription tool, and one supports the PI as a research assistant.

Fig. 4: Primary research foci: contributory expertise.

Several answers could be marked. No. of respondents: 14.

In each case, this primary focus is supplemented by secondary research foci where the given participant considered that they still had contributory expertise (see Fig. 5). The most typical secondary fields with contributory expertise were the transcriptions, the decryptions, the metadata research, and the development of the transcription tool.

Fig. 5: Secondary, but still contributory expertises.

Several answers could be marked. No. of respondents: 14.

It is certainly a measure of the success of this collaboration that a large number of further fields were marked in which the participants “do not contribute directly to the results but can follow the discussions and participate in the debates”. Each of the above fields received at least four answers, and the more popular ones seven or eight (see Fig. 6). Together with the contributory experts in the same fields, this high number indicates that a large part of the group can participate in the often technical discussions to at least some degree. The number of marked fields of expertise rises from 26 and 28 in Figs. 4 and 5 to 63 in Fig. 6, indicating how participants in the project have broadened their intellectual horizons.

Fig. 6: Non-contributory expertise.

Several answers could be marked. No. of respondents: 14.

To what extent can we call this type of expertise interactional? The question was formulated like this: “Which are those fields within the project where you do not contribute directly to the results but where you can follow the discussions and participate in the debates?” We asked the respondents to judge whether they could follow the discussions. To rightly call that interactional expertise, we should have tested whether they pass the imitation game, that is, whether the real contributory experts of the given field would not—in well-defined and written communication—recognize their lack of contributory expertise (Collins and Evans, 2015). We did not carry out this test, and we are not even sure each of them (us) would pass the game; the best we can say is that the participants made an essential step towards developing interactional expertise in these fields. Whether or not they were entirely successful, this is an achievement.

The pattern of the various types of expertise is probably the most complex in the case of the principal investigator of such cross-disciplinary projects. This is the reason why, beyond the above questions, she was interviewed separately. We quote her answers at greater length before trying to answer to what extent the concepts of interactional expertise, referred expertise, and the ambassadorial model are relevant in her case.

The disciplinary background of the PI of DECRYPT is suitable for developing further kinds of expertise: being a computational linguist working in the field of natural language processing, she has insight into both science and the humanities. When interviewed (in written form), she reported the challenges of planning and designing a project structure that would best serve the project goals, given the wide range of competencies and diverse backgrounds of the group members with respect to scientific discipline, scholarly experience, origin, age, and gender.

“From the start, it was important to understand the different scientific fields with their particular goals, interests, and methods and create a platform to allow all to meet on equal basis and to identify common research questions, possible solutions and expected results. It has been of great importance not only to be well-prepared, knowledgeable, and well-organized but also open-minded to all involved. The prioritized aspects that had to be considered from the project start included to ensure:

  • Sharing goals: To bring all members—core, affiliated and temporary members—around the same goals, both general ones to have an understanding of the entire project and goals of a specific topic(s) related to the subgroup’s work.

  • Organization: To ensure that everybody follows the same main track, define subgroups with various expertise to work on relevant topics with clear research questions, and keep the timeline with the agreed deadlines and publications of results.

  • Efficient communication: To create an effective meeting culture—physical or virtual, written or oral—to be able to discuss and report about ongoing and planned work so that subgroups can work smoothly and the team can be updated to plan and discuss the next steps and priorities.

  • Mutual respect and understanding: To keep a respectful tone and meaningful discussions, encourage questions, constructive criticism, and debates in “high to the ceiling”—atmosphere, allowing various (scientific) views and opinions, as well as various levels of experience from beginners to world-leading experts in the field.

  • Outcome and Inclusiveness: To promote sharing of ideas, knowledge, and data in the team, and to be generous with time and effort as well as being inclusive with respect to participation in subgroups around topics of mutual interest including joint publications as a result of successful cooperation.

  • Interpersonal aspects: To manage group dynamics, including differences in points of view and difficult situations.

Despite the above mentioned challenges that many research projects face, the hardest thing is to understand the typical scientific workflow of each discipline involved, including the underlying theories, the methods as well as career requirements and publication traditions.”

Her response regarding the challenges of developing expertise in the participants’ respective fields unfolded as follows:

“Being curious, listening carefully and trying to understand the various viewpoints are not enough; “learning by doing” is the only way to get expertise. In the beginning, before the project started, a pilot study was carried out where the ciphers Copiale and the Borg codices were transcribed, analyzed, deciphered, and contextualized with the help of various experts in the fields. Then, writing the project proposal has been indispensable for the identification of the research questions which the experts from the various disciplines could gather around. The PI’s role is the “spider in the net” who leads the discussions, and at the same time translates between the disciplines in order to be able to arrive at some common understanding. Once consensus is reached about the goals and subtasks, smaller groups are created to work towards particular well-defined goals with specific research questions. At the same time, a white paper has been written by the permanent members of the project which allowed the group to revisit the goals as stated in the application and define or fine-tune the various tasks. Over time, the subgroups around specific themes became larger and larger involving more diversified competence. The output in terms of scientific publications as well as the creation of data, models and tools has been serving as the overarching goal which members of the subgroup as well as the entire team could endeavor. Having left 2/3 of the project time behind the main challenge is still the large variation of the individual interests and expectations of what the project could “give back” to the team members; a master’s or a PhD thesis, scientific publications for career paths, a pile of encrypted sources to be deciphered and published possibly to become as world-wide news, tools allowing scientific experiments in large-scale, or a release of professional, user-friendly, fully working tools implemented as an industrial application.”

In addition to the overarching project management difficulties, this response incorporates elements from all three models, even if not explicitly denoting them. The acknowledgment of the referred expertise may be discerned in her emphasis on the significance of her background as a computational linguist, a discipline situated at the intersection of the humanities and computer science. Evidently, interactional expertise is imperative for guiding the discussions. Finally, the ambassadorial model could elucidate the necessity for the PI to engage in a subgroup with expertise distinct from her own, such as in the field of computer vision and image processing, and subsequently convey their findings to the entire group.

Let us return now to the questions addressed to the whole group. Depending on their expertise, participants are involved in various subgroups within the project (named after the main implementation objective of the given group): data collection, historical language corpus, CrypTool2, transcription tool, the evolution of keys, and the impact of historical language models in decryption. Participants could choose several answers and the outcome is illustrated in Fig. 7. The PI participates in each of them.

Fig. 7: Subgroup participation.

Several answers could be marked. No. of respondents: 14.

A separate group of questions addressed the complicated network of personal and project goals. When requested to provide a short description of the goal of the DECRYPT project, respondents gave remarkably unanimous answers.

“Providing a reliable infrastructure for the collection, transcription and decryption of historical encrypted manuscripts that is freely available to the general public.” (Respondent No. 11)

Many of these responses would pass the cocktail party test, in which one must explain one’s research so comprehensibly that even a moderately interested conversational partner at a noisy social gathering grasps its essence. To quote two more of these:

“To develop tools for recognition and decryption of historical documents. The tools should make it possible for researchers with no cryptographic and very limited IT education to work with encrypted historical documents.” (No. 8)

“The project aims at creating a complete toolchain that is end-to-end web based, thus easily accessible for the humanities researchers when dealing with historic documents that involve cryptography somehow. This is supposed to generate synergies between several research fields, and to make advances that would be unattainable without an interdisciplinary team.” (No. 4)

In the following part of the questionnaire, the respondents were asked to define those project goals that are specifically related to their participation. Answers included specific research targets (decryption, metadata, transcription) as well as more abstract goals: “Produce novel research” (No. 13); “Support other members with their topics, e.g., support linguistic analysis, transcription, etc.” (No. 14).

When questioned about whether the DECRYPT project has thus far attained its objectives successfully, all respondents answered in the affirmative. Several respondents commented on the importance of team spirit, the high level of cooperation, and the connectedness of the community. Some added, however, that some tools require optimization to have stable versions, and one noted that “there seems to be more focus on academic achievements (i.e., writing papers and conducting novel research) than on creating usable and robust tools” (No. 1). This is the first indication that the epistemic cultures of the respondents (academics vs developers) diverge.

When asked about risks where the project might not achieve its goals, everyone answered optimistically, though some marked one specific subgoal as very challenging: the high-quality automatic transcription of the manuscripts. In this field, as one participant noted, “many novel models were provided and some of them were designed specifically for the recognition of historical ciphered images. We have the goal of end-to-end decryption of images (from ciphered text to image) that we will work on it before the end of the project, I think it can be achieved.” (No. 3).

In addition to the general project goals, the particular sub-goals, and the prospective risks, the individual motivations of the participants for joining the DECRYPT project, as well as their anticipated personal or career-related gains, were also subject to inquiry. Besides general answers, professional development, personal hobby, publications, cross-disciplinary cooperation, and the expansion of the research network were named:

“Professional challenge, a possibility to work with great representatives of the profession; an exciting and meaningful way to earn money.” (No. 6)

“Love for ciphers and especially the original historical material.” (No. 14)

“The project helps me to “learn” this field deeper and deeper; it keeps me updated with the development of the field; it helps me in my academic progress with giving feedbacks and advices and also possibilities to join in writing papers discussing the very interesting questions of the field.” (No. 6)

“I strive for constant personal improvement and working with the project has given me many opportunities for personal and professional growth.” (No. 11)

“Perform research and gain expertise within the field that I like. Learn new knowledge from different disciplines (the meetings and discussions that we are having). Collaborating with other researchers within the project from different fields.” (No. 13)

Finally, when asked about the results of their participation after the collaboration had ended, several respondents mentioned a specific tool to whose development they contributed (the database, the decryption tool, etc.), but also continued and broadened expertise (including in the field of digital humanities) and possible further collaboration.

A variety of disciplinary backgrounds and publication patterns

The aim of the subsequent series of questions was to chart the diversity of disciplinary backgrounds among the participants. The initial question straightforwardly requested them to specify the field in which they received their academic training (their major at university, the domain in which they publish articles, and participate in conferences). The responses were not limited to a single choice; respondents had the option to select multiple disciplines and even add additional choices beyond the predefined possibilities. The highly cross-disciplinary character of the intellectual backgrounds is evident, as illustrated in Fig. 8.

Fig. 8: Disciplinary backgrounds.

Several answers could be marked. No. of respondents: 14.

Following on from this, we examined variations in publication preferences. It was evident in both project discussions and publication plans that participants, influenced by their diverse backgrounds, held differing views on preferred publication formats. For instance, single-authored monographs are the norm and highly valued in the field of history, where the first book serves as a crucial entry point into the field. Conversely, in the natural sciences, multi-authored journal papers and contributions to international peer-reviewed conference volumes are customary and essential for establishing one’s reputation in the field.

The responses, illustrated in Figs. 9 and 10, affirm this observation. However, it is important to note that these responses do not present an objective depiction of typical publication patterns in different disciplines. Rather, they reflect the perspectives of DECRYPT project participants. These divergent views are consequential when it comes to deciding how to disseminate specific findings and through which platforms to do so.

Fig. 9: Publication preferences.

No. of respondents: 14.

Fig. 10: Multi vs. single-authored publications.

No. of respondents: 14.

Do these disparities indicate the presence of distinct epistemic cultures? It might be premature to label them as such at this stage. However, considering the earlier quotation regarding the distinct goals of academics and software developers, along with the variations in publication preferences, disciplinary backgrounds, and the dissimilar terminology and research methods explored below, the arguments pointing in this direction are accumulating.

Collaboration—trading zones and terminology issues

A primary objective of the current research was to determine the DECRYPT collaboration’s classification within the trading zone framework outlined by Collins et al. as previously described. Specifically, the focus was on understanding how participants perceive the nature of collaboration. They were interviewed to gain insight into the positioning of their respective disciplines, whether the collaboration leans towards symmetry (where both scientific communities collaborate on an equal footing) or asymmetry (where one community commissions tools for its specific needs or exerts dominance over the other). The responses can be categorized into two distinct clusters.

Many respondents found the collaboration symmetric and added:

“Expert knowledge in all disciplines is equally required to achieve the goals.” (No. 8)

“We use inputs from all other disciplines: transcriptions, language expertises, historical expertises, language corpora, etc. Also, we deliver outputs to other disciplines: software as well as decryptions. We have to cooperate with many others to achieve our goals.” (No. 14)

“There are no dominant and minor actors. Everybody communicates with everybody, everybody can help everybody—sometimes help or ideas come from very surprising professions/actors. The participants give feedbacks to each other even if it is a hardcore professional remark or an external/layman’s observation. Both kinds of remarks are very useful in our work.” (No. 6)

Nonetheless, an alternative interpretation of the question (more aligned with Collins et al. and Kemman’s conception of symmetry and asymmetry) emerged, with several participants highlighting a degree of asymmetry in the collaboration (without blaming the project for this).

“I would say as a web developer I am serving the research groups, so my discipline has an asymmetric relation to the others. But I should say that the relationship between the image processing and linguistic/historic groups seems to be quite symmetric, what I find very commendable.” (No. 1)

One respondent provided a particularly informative explanation:

“I think the setup is somewhat asymmetric as the agenda is set by the needs of the humanities for the toolmakers. However, I would not go as far as domination, rather, I’d say humanities have a natural monopoly on knowing what are the interesting research problems. Toolmakers could support almost any kind of research to some extent but only if they knew what are the worthy goals and methods. And so, the leadership role of the humanities is the result of them adopting new tools into their old domain, and therefore setting the house rules as the host.” (No. 4)

This implies that, in several respects, the collaboration falls towards the upper right corner of the typology. However, in certain instances (such as the humanities seeking tools from IT experts), it aligns more with the lower left corner.

To gain a deeper understanding of the participants’ perspectives on the collaboration, and the differences in their epistemic cultures, they were asked—both in the questionnaire and in the group talks—to provide examples in which they had to exert additional effort due to their colleagues coming from diverse disciplines. These examples might have been related to how they approached a research problem, analyzed it, cooperated in writing an article, or any other practical or scholarly issue. Participants were encouraged to share both successful and frustrating examples, with the aim of identifying challenges they encountered in the process.

Some examples were related to the difficulties of understanding how another discipline works.

“The “history part” had to explain how libraries, archives work, how collection thus happens. Without explaining this, many project members would not understand our possibilities, the time requirement of different work-flows, we could not have asked for their help or remarks.” (No. 6)

“Quantitative (big[ger] data) analyses: are unusual in my field, I learned much about data processing and data handling from the computer linguists and cryptologists; even this knowledge I can transfer to my own work” (No. 12)

“At design, I took extra care for creating clear diagrams and other visualizations to communicate some key decisions. As IT operations experience is not at all common in the team, I took the initiative for some infrastructure work like geographically independent backup strategy, domain name layouts, etc.—these are things I normally not do anymore.” (No. 4)

Some answers were related to the practicalities and difficulties of writing co-authored articles:

“A colleague from a different discipline was struggling with using online document sharing platforms (such as the Google suite). We ended up having to switch to offline versions of the documents for them and then either me or one of my colleagues would update the online files after we got our colleague’s work back via email. It was a bit of a workaround but it still worked out in the end.” (No. 11)

“It has to be accepted that historians write longer sentences and much more detailed than computer scientists. As long as there is no redundancy that is fine to me. On the other hand, computer scientists are too often and too early happy if they have a mathematical formula and running code.” (No. 7)

“Cooperation in writing: The program Overleaf was new to me and I had to invest to learn the LateX-code—I’ve become a big fan of the program and have in the meanwhile used it in another collaboration with a colleague from my own field.” (No. 12)

Positive experiences were also related to common article writing:

“We (the image processing) did a collaboration with a member of the project from Historical Linguistics. The goal was to analyze the transcription models from a user perspective. We collaborate in a good way and we end up writing an article that was accepted (…) For writing the article, each member wrote their own parts in the paper. The introduction, discussion, and conclusion parts were written together by all members.” (No. 13)

Further questions were focused on terminology concerns. In interdisciplinary collaborations, it is common for terms to hold distinct meanings within the participating disciplines, necessitating collaborators to initially recognize disparities in their shared vocabulary and subsequently establish common definitions. A frequently cited example of this challenge is the term “metadata” (Kemman, 2021, p. 42), which holds different meanings for historians/librarians and IT experts. In the DECRYPT collaboration, the term “metadata” presented a similar challenge:

“‘metadata’ which I used as technical descriptors of data files, while here it is broader, both as it includes stuff that is just primary data (i.e. contents of a record describing a document).” (No. 4)

However, there have been numerous other instances involving many of the crucial terms employed in this project.

“Before developing a common terminology everybody used different terms to everything (cipher text, plaintext, cleartext, code character, etc.), everything was quite messy. But we fixed the terminology in the first year or so. Since that time only methodology-related terms have to be clarified sometimes. E.g. clustering refers to different things in connection with image processing/clustering of homophonic substitution ciphers/IT development, cryptanalysis.” (No. 6)

“I don’t remember exactly but I don’t think I used to differentiate between the words “plaintext” and “cleartext”, while now I would never use them interchangeably.” (No. 11)

“Transcription, cipher, key, plaintext vs. ciphertext, nomenclator/nomenclature, etc. →I had a different understanding/idea based more or less on modern cryptography, morphemes or linguistic terms in general” (No. 14)

“Also that cryptology was used correctly in linguistics, while in computer science cryptology and cryptography is often mixed.” (No. 7)

“Boundaries of historical periods like the early and late modern and the late middle ages (what now I know as early modern I thought of as late middle age).” (No. 4)

“Once I was talking about hierarchical clustering without explaining what this term exactly refers to. After a while it turned out that nobody knows what I am talking about. (It was rather my fault.)” (No. 6).

Continuing with the matter of distinct epistemic cultures, participants were asked whether they had acquired new research methods or adopted unfamiliar approaches during their involvement in the project. The responses varied significantly, depending on the disciplinary background of the respondent.

“Since I am not an IT specialist it was quite interesting from the beginning to hear about hill climbing or simulated annealing with fixed temperature, or following the evolution of the image processing methodology… And since I am not a linguist either I have met very interesting questions, methods from that field too. Almost everything was new for me from these disciplines, so yes, I have learned a lot since I am participating this project.” (No. 6)

“I did not know how to collect and approach archives as well as how to deal with historical texts and languages. I feel much more confident but still need colleagues doing/helping with these.” (No. 14)

“I am now much more proficient in Old French, after working on many documents in that language.” (No. 9)

“The general transcription-decryption of historical manuscripts methodology was new to me. Briefly: start from an initial guess on the historical circumstances, possible content, and encryption method of the manuscript; transcribe easy parts of it; try to decrypt it (with cryptool for example); from the decryption attempt get some new ideas how to improve the transcription; go on with the transcription; iterate previous steps until the manuscript is decrypted; show it to historians. (I know it does not always work like this, and sometimes decryption is not even successful, but this is my impression how it works in general).” (No. 1)

“Using AI in research of historical documents; Quantitative analysis, e.g. using pivot tables in Excel; Using CrypTool2 for frequency analysis, cryptanalysis.” (No. 12)

And finally two general but telling answers:

“I think I was more used to individual research before joining the project, so I learned a lot about collaborating with fellow researchers.” (No. 11)

“But I think I learned for me novel ways of seeing the world.” (No. 4).

Considering the extensive diversity and sometimes conflicting interpretations of terminology, professional objectives, research methodologies (and places), collaboration dynamics, and more, it is not an overstatement to suggest that participants in the DECRYPT project had to engage in negotiations due to their distinct and significantly varied epistemic cultures.

Offline vs. online

Because the cooperation started a year before the COVID-19 pandemic put a stop to research trips, the DECRYPT project allowed participants to compare in-person and virtual gatherings. When describing the advantages of the face-to-face meetings within this project, they underscored the importance of effective communication, building strong relationships, and improving results (e.g., “brainstorming on the problems, which is more efficient offline”, quoted from No. 8). In short: “physically meeting people is good for the human mind” (No. 2). Although the responses converge, a few are worth citing, as they show how much individuals value personal meetings, even in light of the challenges associated with travel.

“Discussing a question, solving a problem is always much more effective if the members are physically in one place. (Coffee breaks, common lunches and dinners are also good places for professional discussions.)” (No. 6)

“At least at the beginning of a project they are necessary. They let you learn to know your colleagues as humans. The colleagues are normally fully concentrated on the topic and do not work/read something else.” (No. 7)

“Meeting with the others face to face is invaluable, without it no meaningful relationship could be formed. They aren’t anymore distant faces, but living people. Also just discussing things with them in one to one conversations or in small groups is very helpful, as some things are hard to bring up before the entire team in the online meeting. Online cooperation also becomes way more successful after the people got to know each other in person.” (No. 1)

“Stronger commitments, smoother communication, whole day is assigned to the meeting, so more focus, traveling is fun and there are social elements.” (No. 4)

Among the drawbacks of in-person meetings, participants unsurprisingly cited travel duration, expenses, and, occasionally, concerns about COVID-19 risks. Nevertheless, some respondents, possibly influenced by the withdrawal symptoms induced by the COVID-19 restrictions, did not perceive these challenges as real problems and considered them integral aspects of our work. One respondent added the “negative climate effect of traveling by flight” (No. 12).

Regarding the advantages of online meetings within this project, numerous participants highlighted their time efficiency and cost-effectiveness. They underscored that these virtual gatherings are simpler to coordinate, facilitate greater ease in bringing together individuals, and provide a convenient means to receive updates from other teams. And let us finally quote a fairly positive answer:

“It comes close to the real thing, which is incredible given the distances and number of countries involved. We can function as a team while not being in the same room also, mostly people are disciplined regarding time and preparation so the schedule works.” (No. 4)

One participant added: “The same as with physical ones, minus the human connection.” (No. 2). This brings us to the disadvantages of online meetings, where once again, the responses were unanimous:

“no chatting during “coffee break”, staying in the own world.” (No. 2)

“No eye-contact, no body language makes communication less efficient; Sometimes technical issues; Harder to stay focused 100% of time.” (No. 8)

“Even if they’re shorter, they somehow become more tiring” (No. 11)

“In (long) online meetings, concentration is lost much faster (at least for me) than in real meetings. The social aspects are missing, like having a nice coffee or a beer/wine together.” (No. 14)

“They are good for update of the work, but not for deep discussions.” (No. 14)

Although the respondents were unaware of the outcomes of a substantially larger similar study conducted by Collins and his colleagues (2023), their responses echoed similar sentiments. They expressed a consensus that in-person meetings play an indispensable role in community building. Substituting them with virtual meetings, despite their logistical conveniences, would overlook a pivotal aspect: physical gatherings provide a platform for scholars to immerse themselves socially in a specific field. The members of the DECRYPT collaboration are all relatively new to their shared domain of historical cryptology, precisely because this field has been established recently. Consequently, they can be viewed as newcomers in the discipline. Building trust, establishing a common terminology, and finding ways to collaborate hold particular significance for them. In their context, scientific communication transcends mere information exchange; it becomes “a process of socialization into overlapping and mutually embedded scientific domains” (Collins et al., 2023, p. 1).

Boundary objects

Project participants were informed about the meaning and significance of boundary objects in cross-disciplinary collaborations. They were introduced to the definition put forth by Star and Griesemer (1989), who described boundary objects as “objects which are both plastic enough to adapt to local needs and the constraints of the several parties employing them, yet robust enough to maintain a common identity across sites.” It was explained to them that these objects might be perceived differently by distinct scholarly communities. As examples, they were presented with specimens (Star and Griesemer, 1989), a manuscript letter (see above, Kemman, 2021, p. 54), and a grant application (Collins et al., 2007, p. 663).

To identify potential items that could serve as “tangible” shared objects but are interpreted differently by various communities, we conducted a pre-selection process during our initial group discussions. Based on our preliminary assumptions, beyond the grant application, several items within the DECRYPT project could conceivably be considered boundary objects. These included the cipher keys, encrypted messages, unsolved ciphers, the DECODE database, transcription guidelines, historical language corpora, the TranscripTool, and the CrypTool 2 software. Notably, when participants were asked about the purpose they assigned to these objects, the responses diverged depending on their individual research goals and personal perspectives.

When asked in the questionnaire about their purposes for using the grant application, participants first gave quite predictable responses. Unsurprisingly, they mentioned that the application served to secure the grant funding, to support project presentations, and to compose the initial white paper published at the project’s inception. For one participant who joined the project at a later stage, the grant application provided an essential introduction that helped her understand the nature of the collaboration, and she used specific sections to clarify what was expected of her. Others employed the grant application for validation, cross-referencing the project’s commitments while working on software and process design. One respondent logically stated that the grant application presently underpins their salary. Additionally, for the project’s web page developer, the grant application served as a text resource from which paragraphs were borrowed.

A common thread of diverse purposes emerged when examining the applications of cipher keys, encrypted messages, and unsolved ciphers. These usages often align with the specific roles and responsibilities held by project participants.

Keys, messages, and unsolved ciphers are used by several members as sources to be transcribed and by others as material for cryptanalysis. Some members are responsible for collecting, uploading, and annotating them (without necessarily transcribing or analyzing them). Some do historical research on cipher keys, another implements the descriptor and metadata fields and carries out data transformation and mass data upload, and yet another consults them to see what can be expected to appear in keys from various time periods. Encrypted messages are also used for testing the decryption tools (e.g., CrypTool 2, NCID) and the TranscripTool, and unsolved ciphers for testing the decipherment pipeline. One group member draws a particularly sharp distinction between solved and unsolved ciphers because he is motivated to break the unsolved ones.

The DECODE database is a different kind of working area for different group members, who variously upload images, edit the records, add metadata, check its usability, implement and test it, look for fresh sources for historical research, or focus on solving unsolved messages.

The transcription guidelines are developed and written by some; others use them as a reference when making the TranscripTool adhere to them; others (the real addressees of the guidelines) look up specific rules on how to transcribe the sources; and still others evaluate them, contributing from a philological perspective.

The language corpora are built and enriched by some participants; they serve others in creating historical language models; others use them in the decryption tools; and someone tests them for the decoder and similarity analyzer.

The TranscripTool is developed, validated, and tested by some group members, while others create the models for the new transcription tool. These models draw on various fields of AI, such as machine learning, natural language processing, and computer vision, all represented in the same project.

The CrypTool 2 software is obviously used for cryptanalysis and deciphering unsolved ciphers. Still, it is also used for teaching purposes and presenting results, while those who check it and give guidance for improvement certainly have a different perspective.

To what extent can these items be called boundary objects? They possess a tangible quality that imparts meaning to all participants, yet the interpretation of this meaning varies among individuals due to their distinct research objectives and motivations. These items are inherently complex, and the degree to which their various components are perceived, or even noticed at all, and subsequently evaluated by users, is intricately tied to these individual objectives and motivations.

The debate surrounding what qualifies as a boundary object remains ongoing. However, considering the manuscript letter mentioned above (which serves different purposes and carries distinct meanings for historians and linguists), it is reasonable to view the examples we have provided as strong contenders for the classification of boundary objects. Their capacity to accommodate different interpretations and meanings for diverse participants within the project suggests their potential as boundary objects in facilitating interdisciplinary collaboration.

Conclusions

The research presented in this paper had several key objectives. Its immediate aim was to conduct a reflexive analysis of the cross-disciplinary DECRYPT project, which seeks to establish infrastructural support for historical cryptology through collaboration among researchers in diverse fields such as history, linguistics, natural language processing, cryptology, computer science, and computer vision. Specifically, the study aimed to explore how the partnership between artificial intelligence (AI) and the humanities had evolved over the four years following the project’s inception.

To assess the project, we adopted the Science and Technology Studies (STS) approach, seeking to examine whether and how fruitfully it can be applied to study a digital humanities project in which many different disciplines work together and where the different types of expertise and boundary objects play crucial roles.

Additionally, the study aimed to investigate the feedback provided by project participants regarding the challenges they encountered when dealing with data structures, research methodologies, terminology, and publication strategies from different fields. We sought to uncover the difficulties that participants faced when co-authoring articles across disciplines and reconciling individual interests with common project goals.

Of particular relevance are the challenges the PI faces with regard to interactional and referred expertise. The study also explored the tensions that arose between academic researchers focused on publishing research papers and industry participants seeking practical and expedient knowledge.

The adoption of Science and Technology Studies (STS) terminology proved beneficial not only for the authors of the study in their analysis of the project but also for the participants themselves, as it facilitated a clearer understanding of the challenges they encountered. What participants had previously approached intuitively and without explicit reference now became more apparent. The use of tools like the trading zone chart allowed them to visualize how the project navigates between the top right and bottom left corners. Highlighting the range of expertise required beyond their individual contributory expertise, and the varied interpretations of boundary objects among different participants, enhanced awareness of the collaborative process.

In response to the initial question posed, it can be asserted that the STS methodology is well-suited for describing multifaceted collaborations. Furthermore, its relevance appears to increase in proportion to the complexity of the collaborative endeavor, making it a valuable analytical framework for examining and comprehending the dynamics of such projects.

Finally, we believe that our research can be a valuable resource for other digital humanities collaborations whose managers and Principal Investigators lack a comprehensive handbook or guide on how to initiate and manage such complex projects successfully. Our aim is to help these projects avoid the common pitfalls and challenges associated with launching, planning, executing, and sustaining highly cross-disciplinary initiatives. Through our analysis of best practices (particularly in the section “The DECRYPT project: collaboration structure and management”), identification of disciplinary tensions, examination of expertise types, and exploration of boundary objects, we endeavor to offer practical guidance.

Moreover, we hope that our work can foster meta-level cooperation among similar projects, providing a platform for discussing these shared challenges and problems explicitly. By facilitating the exchange of insights and strategies, we aim to contribute to the overall advancement of collaborative endeavors in the digital humanities and enhance the chances of success for future initiatives in this domain.

Disclaimer of ethical issues

This paper presents the findings of an evaluation of a cross-disciplinary project aimed at identifying and addressing the challenges of running such a project efficiently. The evaluation primarily relied on surveys completed by project participants, who were informed about the purpose of the research and its potential implications. All respondents provided their informed consent, ensuring they were aware of their rights to confidentiality and anonymity. We took several measures to uphold the highest ethical standards throughout the research process. Firstly, all participant data collected through the surveys have been anonymized. Secondly, the study was conducted in accordance with ethical guidelines relevant to cross-disciplinary research, ensuring that the welfare and rights of the participants were protected at all times. No conflicts of interest have been identified in the conduct of this study. All data sources, including survey responses, were collected and processed with the utmost care to ensure integrity, accuracy, and respect for the participants’ contributions. This research acknowledges the importance of ethical considerations in conducting evaluations involving human subjects. We are committed to the principles of ethical research and have taken all necessary steps to ensure that this study adheres to these principles.