Introduction

The double helix of DNA and the structure of virus particles are by now familiar images that circulate widely. By contrast, the shape of humanity’s natural languages and their high-dimensional form largely remain to be explored and modelled as visual material. Contemporary software and methods for digital design allow us to visualise natural forms of human language. This is a multi-disciplinary effort, which combines the study of language with expertise in architecture and design.

The structure of language in its elementary form unfolds along a timeline, as one phrase succeeds another. Scholars of spoken discourse record speech and later produce it in a written format as a transcript. Conventions for transcription vary (Bucholtz, 2000), but generally the lines of the transcript are made to resemble the structure of speech. A brief pause at the end of a phrase, as it were, punctuates speech. When producing a written record of spoken language, it is important to visualise such patterns on the page, moving to the next line after a pause in the utterance. Likewise, the intonation of speech is another element that leads to the organisation of lines on the page (Hymes, 2003, p. 96). When written down, speech therefore does not look like prose. The structure of spoken language is reflected in each transcript. The organisation of the lines of a transcript is an implicit patterning within language (Hymes, 1994). In spoken language, the units that correspond to sentences are called “verses”. People speak in verses not prose, and discovering the lines of a transcript leads to an understanding of the structure of language that is otherwise not possible. It is this elementary structure reflected in a transcript that forms the basis of our study, as we began to model natural language as a three-dimensional (3D) shape (see Fig. 1).

Fig. 1: Illustration of a transcript of recorded speech in an Amazonian language.

Provided in small script, without translation and arranged around a central axis only to reveal the elementary structure of a spoken language sample (Aikhenvald, 2003, pp. 630–635).

The research question we address here is how aspects of the high-dimensional form of natural language could be rendered in 3D. To answer it, we first present historical sources on the timeworn comparison between the grammar of language and the “geometry” of a figure, its shape or the arrangement of its parts. For the purpose of this study we selected one grammatical feature, evidentiality, and we discuss its historic roots. After this section on the historical background to our project, we continue with a review of the literature, which includes both references to the study of language and relevant material on computational 3D modelling. A method section follows with a detailed layout of the steps we took to construct a 3D digital image from a language sample. The paper’s main section presents the findings and the four prototypes we developed. Each one is based on a distinct language sample, from the Kurdish language, an Amazonian language, an ancient dialect of Neo-Assyrian, and American English. Finally, in the conclusion we reveal images of the samples of natural language printed in 3D, along with a discussion of the project’s contribution to both computational design and our understanding of natural language as a high-dimensional form.

Historical background

Besides the elementary form of spoken language revealed in a transcript, grammar is a structural element that equally determines the shape of language. Grammar is defined as the whole structure and system of a language, and has been compared to a geometrical order for centuries. Within mathematics, geometry stands for the study of the spatial relationships between the lines, angles, and surfaces of things. Flat shapes such as circles or squares describe boundaries in two dimensions. Their 3-dimensional equivalents, spheres and cubes, reveal an orientation and situation within 3-dimensional space. A thirteenth century Archbishop of Canterbury in England (Robert Kilwardby 1215–1279) had pointed out a correspondence between the fields of grammar and geometry. His work later prompted the Dutch philosopher Spinoza (1632–1677) to treat grammar in a geometric manner in the seventeenth century. Spinoza in turn influenced ideas by the linguist Whorf (Jakobson, 1968, p. 605), best known for the Whorf hypothesis. This is a principle suggesting that the structure of a language affects its speakers’ worldview, and that people’s perceptions are relative to the language they speak (Carroll et al., 2012). A less well-known aspect of Whorf’s oeuvre is his attention to “a geometry of form principles characteristic of each language” (Jakobson, 1981, pp. 76–77, original emphasis). This Whorfian notion informed the early stage of our research. But what does geometry mean in this instance?

The distinctive property of grammar lies in its abstractive power, as it abstracts itself from anything that is particular or concrete, such as the actual words or discourse. What is left are the abstract designs of the structure of phrases, the grammatical rules. In this way, grammar resembles geometry. Geometry is an abstract construction that presents an idealised formalisation of the size and shape of things. For instance, we know that the earth is not a perfect sphere, but it is represented as one on globes. What underlies both geometry and grammar is the abstractive power of human thought. We superimpose simple geometrical relations upon the particular objects we see, and we map out grammatical figures upon concrete instances of verbal expression. Jakobson notes that the relation between language and its grammatical features is like “the relation between physical and geometric bodies”. The physique of language subsumes words resembling a geometric body, that “mimic the abstract role of spatial coordinates” in a physical world (Jakobson, 1968, pp. 605–606). This time-honoured comparison between grammar and geometry forms the historical background to our study. The complexity of such “a geometry of form principles” in language meant that these principles were evoked throughout the ages but never visualised. Here, we explore whether contemporary developments in architecture and 3D design can bring about such a visualisation.

The familiar visual representations used by an astronomer or chemist represent spatial or geometric qualities of planets and molecules and convert them to a scale that is visible to the human eye without telescope or microscope. By contrast, natural language is commonly envisaged as a virtual entity, only seen by means of writing systems, from which we deduce the sound and meaning of words. Should we therefore conceive of Whorf’s notion of “a geometry of form principles” (Jakobson, 1981, pp. 76–77) that typifies a language as only a metaphor? Could this be an imaginative portrayal of natural language, as an immaterial entity, without spatial or geometric characteristics of its own? On the contrary, natural language often acts as a placeholder for material, observable entities, and its grammar reflects this potential. The category of words Jakobson brought to the fore in this respect were pronouns (Jakobson, 1968, pp. 605–606). The he, she or you of conversation or discourse anchor language into its surrounding reality in a particular manner. In each utterance such pronouns can stand for a different person, shifting from context to context. Language is held in place by the concrete fillers of any given sequence of pronouns. This process does not limit itself to people or social reality, but also concerns inanimate features. For example, the English language features this for nearby entities, and that for things that are further away. The principle remains the same, as a generic and relatively abstract term can be used as a “receptor” that ties language into reality, as a web of meaning with definite spatial characteristics. In each utterance one can find ephemeral “placeholders”, with concrete people or things that stand for you, she or that. The spatial nature of language is thereby not only a metaphor, but is central to the study of language in its spoken, interactive form.

There are other features of natural language that also manifest a concrete orientation towards social and material realities. One example is evidentiality, which ties language to elements of reality that have been observed or witnessed. The term evidence reflects the relevance of sight, from the Latin “to see” (videre), but evidence can of course also be based on hearing or other senses. All languages have the means for establishing evidence for what one is saying, and feature evidentiality. Evidentiality portrays the way verbal expression is tied into perception, and indicates the source of the information, as seen, heard or otherwise sensed. This involves the grounding of an utterance in a body of facts and first-hand experiences, a set of events witnessed in a particular place. Evidentiality acts as a momentary placeholder providing coordinates of perception as speech unfolds. It belongs to a wider group of language features that act as “placeholders” for material and observable entities. Such “receptors” or “placeholders” are cornerstones of a language’s “geometry of form principles” (Jakobson, 1981, pp. 76–77) at the meeting point between an utterance and its concrete, physical situation.

These features of natural language stood out as we began to respond to our research question. Could contemporary developments in architecture and 3D design, and the modelling software developed for the production of film and video animation become relevant for the visualisation of natural language? Natural languages are high-dimensional forms, and this paper only offers an initial foray into a largely unexplored terrain. For the purpose of creating a prototype we opted for a well-defined and single grammatical feature as a specific dimension to model language in 3D: evidentiality. It has received much scholarly attention across the fields of both linguistics and anthropology, which has generated a wealth of data.

Much like the comparison of geometry and language, a preoccupation with evidentiality has a long history. It was Franz Boas who first set out the linguistic method of anthropology, the study of languages and cultures, in Science Magazine in the nineteenth century (1899, p. 95). His oeuvre brought home a critical awareness of the evidential strength in Native American languages, such as Kwakiutl. Later, in 1942, he remarked that “we would read our newspapers with much greater satisfaction if, in the same way as Kwakiutl, our language, too, would compel journalists to state whether their reports were based on self-experience, on inference, or on hearsay” (Jakobson, 1971a, p. 483). It was the linguist Jakobson who later coined the term evidential and portrayed the systematic distinction between “vouched for” and other events (1971b, p. 135). Evidentials operate along a continuum from facts based on first-hand experience, information that was heard, seen or otherwise sensed, to information that was reported by somebody else, to hearsay, or indeed speculative thought void of tangible sense.

The hierarchy of evidential values depends on a cultural context, for example on whether seeing or hearing carries more evidential weight. Likewise, the importance of the function of evidentiality is language-specific, and is an aspect of linguistic relativity (Hymes, 1966; Jakobson, 1985). Our perception of the world and our ways of thinking about it are deeply influenced by the structure of the languages we speak. Evidentiality is a prominent aspect of such relativity as it concerns the way language is tied into perception and reflects a hierarchy of evidential value. In general, ideas about language structure and its use vary from one culture and social context to another (Hymes, 1966; Schieffelin et al., 1998). Likewise, verbal expression is not a single assumed phenomenon, “language” as we know it; understandings of its existence, and of the kind of entity it is, vary across cultures (Demuro and Gurney, 2021). In view of this complexity, we limit this study to an elementary grammatical feature that characterises moments in verbal expression that encode perception: evidentiality. Such markers of perceptible reality are conceived as a set of coordinates, reflecting the manner in which language is tied into perception.

Relevant literature

In approximately a quarter of the world’s languages grammar dictates that the evidence for what is said must be marked. Such obligatory grammatical evidentiality has been documented most extensively for Native American, Amazonian, and Australian Aboriginal languages. Further categories of inference, assumption, hearsay or indeed speculative thought void of tangible sense, are marked by contrast to evidential precision (Boas, 1911; Aikhenvald, 2018). The existing literature on evidentiality is extensive, and includes work written by anthropologists who pay attention to the social context of evidential language in spoken forms (Hill and Irvine, 1992; Kuipers, 2013). However, it was not possible to include social factors within the modelling process, which is solely based on the recorded speech events themselves. To first conceive of language in 3D we relied on existing concepts within the literature, the notion that narratives can be studied in terms of their pattern (Hymes, 1982) and their texture (Johnstone, 1990). Such seminal work allowed us to develop the notion of an arrangement of evidentials across an utterance as an aspect of the shape of language now to be visualised in 3D.

The research process that led to the modelling and visualisation of natural language in 3D was also guided by a body of literature on the history of design. The software that is currently used for the production of film and video animation seemed apposite to begin to work with the complex and dynamic shapes of natural language. This is also known as “spline-based” software. A “spline” or flexible curve is defined in mathematics as a curve constructed so that it passes through a set of points. Such curves then form the basis for the modelling of spline-based 3D surfaces that make video animation possible. For the visualisation of language in 3D, it was necessary to first disaggregate taken-for-granted aspects of the historic development of such spline-based design tools. We turned to the historical roots of parametric design to discern elementary tools for our project.

Spline surface geometries have been used for centuries in crafts such as ship building. Historically, splines were typically drawn on a horizontal surface using a flexible strip arranged as a curve between a series of carefully positioned lead weights, known as “ducks”. Likewise, the Catalan architect Gaudí used weights on a string to design the iconic model of the church crypt of the Colònia Güell. The parameters for this design are the string length, the weights, and the nodes where the string is attached, while the outcome of the model automatically derives from gravity (Davis, 2013). Gaudí’s architecture contained the main features of what would become “parametric design” in a digital age, conceived in terms of parameters that vary or can be adjusted. He could generate different versions of his model through simple physical adjustment of strings and weights. This hands-on approach led to him being called Gaudí “the geometer” (e.g., Català, 2007), and he was known as an architect especially skilled in geometry. This model for design and architecture seemed well-suited to begin to work with “the geometry of form principles” characteristic of natural languages (Jakobson, 1981, pp. 76–77). Without insights from Gaudí’s oeuvre we would not have been able to conceive of the perceptual anchor points of language as weights, along an abstract gravitational axis, reflecting the relative evidence they provide. We developed a form of parametric design with parameters derived from the elementary structure of spoken language as well as its grammar. Each marker of evidentiality is conceived as a “weight of evidence” that influences what is said in its immediate vicinity.

The history of spline-based architecture includes a further element that proved essential for our design process. We did not wish to construct evidential weights that pointed away from the viewer. This would be counterintuitive. It made sense to depict the evidential bases of the utterance as close-by, with the more speculative aspects of language use placed at a distance. The segments where language tied into perception for the original speaker now point towards the viewer. This kind of inversion is already sedimented into the history of parametric design, and we developed our project in parallel to historical design processes. Within architecture, models created with string and weights were inverted and turned upwards to be built as arches for chapels or churches. The Catalan architect Gaudí did not have to manually calculate architectural outputs, but could derive the shape of the curves through the force of gravity of the weights acting on strings (Burry, 2011, pp. 152–170; Davis, 2013). This type of modelling goes back to seventeenth-century philosophy and architecture, in particular Robert Hooke’s saying “as hangs the flexible line, so but inverted will stand the rigid arch” (Hooke, 1675, p. 3; Heyman, 1995, p. 7 quoted in Davis, 2013). In other words, strings with weights settle into a shape, and can then be inverted to point upwards as an architectural model. Gaudí used this principle to design the Colònia Güell chapel by creating an inverted model using strings weighed down with birdshot lead pellets (Burry, 2007). Our method adheres to this tradition but concerns natural language. String-based methods of design have been around for centuries, including the method of inversion, which we applied to 3D language models too. We frame evidential coordinates as exerting a gravitational pull reflecting the weight of evidence. Afterwards we inverted the model with evidentiality and closeness to perceptible reality pointing upwards towards the viewer, and experienced as nearby. Perceptual closeness for the original speaker was mimicked for the viewer faced with the model’s protruding sections.

Our design process is equally indebted to literature on classical antiquity. The act of visualising language as a virtual material logic is timeworn and sedimented into our vocabulary. The written word is a visual representation and notation of acoustic shapes in language, yet also inhabits a figurative space as “text”, derived from the Latin for “woven” (textus). In ancient Greece, the verbal arts of song making and epic narration of Homeric poetry were visualised as the craft of pattern-weaving and sewing (expressed by the verbs huphaínein “weave”, or rháptein “sew”) (Fanfani et al., 2016; Nagy, 2017). Such textile picturing of language lives on in our language as “text”, “hymn”, or “rhapsody”. Likewise, our models evoke a figurative (Allen, 2000, pp. 32–33, 40) texture of language. Elements of language interlace with observable reality, and together they are conceived as a digitally woven fabric. Comparable to a “text” as a figurative rendering of language, the geometric texture of natural language is also figurative, but is conceptualised as “woven” in 3D.

Methods

Natural language is a high-dimensional form. We ought not be deceived into seeing language in the form employed by our writing system, as words, and sentences (Port, 2010, p. 304). Nor can its visualisation be limited to 2-dimensional “texts”, or a homogeneous plane of language laid out on paper and conceived as “superimposed upon” a solid surface (cf. Ingold, 2016, p. 59; Ingold, 2017, p. 54, 132). A model of language in 3D is based on a standard set of coordinates along the x, y and z-axis of Cartesian geometry. This is the method for making 3D objects commonly used in 3D modelling software. We mapped specific language features onto this system of coordinates to create a 3D representation of language (see Fig. 2). This process depends on the selection of language features, and the extraction of the form of language from its meaning-based aura. What we deliberately left out here is the word structure or semantic organisation of language. We selected and extracted the basic structure of language, its progression along a timeline, as a sequence of “syllables”. Syllables are the units of pronunciation including one vowel sound, which are the building blocks of words. This approach depends on the study of language as sound, an approach pioneered by Steven Feld and Don Brenneis (e.g., 2004). The design process was also based on our previous work in experimental architecture (Matthews, 2017), and the spatial conceptualisation of language (Pillen, 2017). This led to a portrayal of language as a frozen, abstracted, syllabic, sonic structure woven into perceptible reality by means of evidentiality.

Fig. 2: The Cartesian coordinate system used to present language in 3D.

The parameters of a transcript mapped out onto x, y and z-axes.

The 3D models emerged from a translation of linguistic data into a visual programming language (Grasshopper integrated with Rhino’s 3D modelling tools). A detailed breakdown of the 3D design process follows. First, a transcript was manually formatted, indicating the relative weight of each evidential as a number. This is the point where linguistic units are given a numerical value. For the sake of brevity, we demonstrate this principle through one example, the relative weighting of evidentials in an Amazonian language, Tariana. On the basis of research in linguistics combined with ethnography, a hierarchy of preferred evidentials can be discerned within a speech community. In other words, certain evidentials are more highly valued. In the Tariana language speakers are grammatically obliged to indicate the nature of the evidence they convey. Such evidence falls into five categories. The first is a visual evidential for information obtained through direct visual observation, and the second a non-visual evidential for evidence based on other senses. Further categories indicate an absence of direct visual or non-visual evidence. The third concerns inferred evidence based on an observed event or state, and the fourth a reported evidential or the repetition of information related by someone else. Finally, there is a fifth category of assumed evidential referring to information obtained through reasoning or common sense without visual or non-visual experience (Aikhenvald, 2003, pp. 287–323). Such a hierarchy of evidential categories is language- and culture-specific, hence the need for rigorous research including methods from linguistics and ethnography to determine the value of each evidential. Based on this existing research, we gave numerical values to each evidential category in a language sample, thereby coding their evidential weight (see Fig. 3).

Fig. 3: A brief example of coded evidentiality in the transcript of an Amazonian language sample.

Evidentials are marked in bold with their relative weight indicated as a numerical value in brackets.
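The coding step can be illustrated with a minimal Python sketch. It is not the procedure used for the prototypes, which relied on manual formatting. The numerical weights, their ordering, the function name `annotate` and the placeholder words are all assumptions made for illustration; the actual values depend on the linguistic and ethnographic research for each language.

```python
# Hypothetical weights for the five Tariana evidential categories.
# Both the numbers and their ranking are assumptions for illustration;
# the values used for the prototypes are not reproduced here.
EVIDENTIAL_WEIGHTS = {
    "visual": 5,
    "non-visual": 4,
    "inferred": 3,
    "reported": 2,
    "assumed": 1,
}


def annotate(tokens):
    """Format one transcript line, appending the weight in square
    brackets to every word that carries an evidential.

    tokens: list of (word, category) pairs; category is None for
    words without an evidential marker.
    """
    parts = []
    for word, category in tokens:
        if category is None:
            parts.append(word)
        else:
            parts.append(f"{word}[{EVIDENTIAL_WEIGHTS[category]}]")
    return " ".join(parts)


# Placeholder strings, not real Tariana words.
line = annotate([("nana", None), ("kapa", "visual"), ("mita", None)])
print(line)  # nana kapa[5] mita
```

Keeping the weights in a single table means that a different hierarchy, for another language or speech community, only requires changing that table.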

The formatted transcript is then entered into an Excel Spreadsheet, which separates out each line of the transcript on its own row and lists the relative evidential weightings within the body of the text. The Excel spreadsheet is then read by the Grasshopper script, which determines the number of syllables in each line of the transcript. The software also searches for square brackets to create a list with the numbers of the evidential weightings and their locations within the structure of the transcript. Data from the language samples are then organised by a familiar Cartesian system of axes x, y and z (see Fig. 2 above) through an initial digital threading of sequences of syllables on a timeline. The x-axis depicts the timeline of speech as one phrase succeeds another, with the y-axis indicating the number of syllables in each line of a transcript. The x- and y-axis can thereby represent the elementary structure of language reflected in a transcript. By now the model is a digitally woven pattern with two parameters that are given numerical values, the length of the “warp” defined by the length of the x-axis or timeline, and length of the “weft” (filling yarn) being the y-axis or number of syllables in a line of transcript. At this point we have the warp and the weft of a digital fabric based on the elementary structure of language discussed in the introduction, but it is flat.
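The reading step described here can be approximated in a few lines of Python. This is a sketch under stated assumptions rather than a reconstruction of the Grasshopper script: the function names are hypothetical, the transcript is passed as plain text instead of a spreadsheet, and syllables are counted crudely as groups of the vowels a, e, i, o, u, which will miscount many real languages and scripts.

```python
import re

# A weight in square brackets, e.g. "[3]".
WEIGHT = re.compile(r"\[(\d+)\]")
# Crude stand-in for a syllable: one run of plain vowels (assumption).
VOWEL_GROUP = re.compile(r"[aeiou]+", re.IGNORECASE)


def count_syllables(text):
    """Approximate the number of syllables as the number of vowel groups."""
    return len(VOWEL_GROUP.findall(text))


def parse_line(line):
    """Return (syllable_count, markers) for one transcript line.

    markers is a list of (syllable_position, weight) pairs, where the
    position is the number of syllables up to and including the word
    that carries the bracketed weight.
    """
    markers = []
    position = 0
    last = 0
    for match in WEIGHT.finditer(line):
        position += count_syllables(line[last:match.start()])
        markers.append((position, int(match.group(1))))
        last = match.end()
    total = position + count_syllables(line[last:])
    return total, markers


def parse_transcript(text):
    """Map each non-empty line to its place on the timeline (x),
    its length in syllables (y extent) and its evidential markers."""
    rows = []
    lines = [l for l in text.splitlines() if l.strip()]
    for x, line in enumerate(lines):
        total, markers = parse_line(line)
        rows.append({"x": x, "syllables": total, "markers": markers})
    return rows


# Placeholder strings, not real language data.
sample = "nana kapa[3] mita\ntaka[5]"
for row in parse_transcript(sample):
    print(row)
```

Each row corresponds to one thread of the flat weave: `x` is the position on the timeline and `syllables` the length of the line, while `markers` records where the third dimension will later be applied.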

Notation alone, whether in the form of written language or as a drawing in two-dimensional (2D) space, is too reductive to have the capacity to communicate complex, implicit aspects of natural language. Such knowledge can be invoked through other visual and physically tangible means. Here, we propose a 3D image of a linguistic connectedness to evidence. This method of converting a linguistic, evidential value to a numeric one is the point at which the image obtains its parametric and figurative character. The lines of a transcript or text are suspended between points in 3D space, and no longer feature as notation drawn on a surface. Digital curves were constructed by locating control points in the woven structure (Jabi, 2013), which can be thought of as virtual evidential weights. The third dimension or z-axis indicates such relative evidential weight within a language sample, at points where language is tied into the source of information, the evidence. Evidential markers thereby take on a virtual weight along a digital spline, or flexible curve. In this way a selected geometric “wireframe” of natural language was designed, a still skeletal 3D model along the x, y and z axes.
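To make the role of the control points concrete, the following Python sketch places one control point per syllable along a single line of transcript, raises it by the evidential weight where a marker occurs, and samples a curve through those points. A uniform Catmull-Rom spline is substituted here because it is a simple curve that passes through its control points; it is not the curve type produced by Rhino or Grasshopper. The function names are hypothetical, and z is taken as already inverted, so that heavier evidence points towards the viewer.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Uniform Catmull-Rom interpolation between p1 (t=0) and p2 (t=1)."""
    t2, t3 = t * t, t * t * t
    return 0.5 * (
        2 * p1
        + (-p0 + p2) * t
        + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t2
        + (-p0 + 3 * p1 - 3 * p2 + p3) * t3
    )


def line_control_points(x, syllables, markers):
    """One (x, y, z) control point per syllable position 0..syllables.

    z equals the evidential weight at marker positions and 0 elsewhere.
    markers: list of (syllable_position, weight) pairs.
    """
    z_at = dict(markers)
    return [
        (float(x), float(y), float(z_at.get(y, 0)))
        for y in range(syllables + 1)
    ]


def sample_curve(points, steps=4):
    """Sample a curve passing through every control point.

    The first and last points are duplicated so that the curve
    starts and ends exactly on the outer control points.
    """
    padded = [points[0]] + list(points) + [points[-1]]
    curve = []
    for i in range(1, len(padded) - 2):
        p0, p1, p2, p3 = padded[i - 1:i + 3]
        for s in range(steps):
            t = s / steps
            curve.append(tuple(
                catmull_rom(a, b, c, d, t)
                for a, b, c, d in zip(p0, p1, p2, p3)
            ))
    curve.append(points[-1])
    return curve


# A line of four syllables with one evidential of weight 3
# on the second syllable (illustrative values only).
controls = line_control_points(x=0, syllables=4, markers=[(2, 3)])
wire = sample_curve(controls, steps=4)
print(len(wire), "sampled points")
```

An interpolating spline of this kind can dip slightly below zero on either side of a crest, so a production model might clamp or smooth the result. The sketch leaves the values untouched to keep the relation between weight and curve visible.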

Once this set of instructions for the computer—the algorithm—is created, any number of transcripts can be imported for 3D rendering. The elementary structure of the transcript and the evidential weightings or “input” change from one language sample to another. The main advantage of using a parametric, 3D scripting tool is that it allows for the output wireframe geometry to rebuild itself according to any imported transcript, making it possible to compare many different transcripts, without having to manually rebuild the geometry from scratch. When the software or Grasshopper script has detected all the data in the transcript it can start generating the 3D geometry (see Fig. 4).

Fig. 4: A visual representation of the “algorithm” or sequence of instructions for Rhino’s 3D modelling software.

A set of rules for importing and processing the transcripts, in addition to creating the 3D geometry.

The software is understood to have a “morphogenetic” function as it creates the 3D shape. “Morphogenesis” is a term borrowed from biology, where organisms develop a shape or body plan starting from an initial formless collection of cells. Software for 3D design allows for a comparable “genesis of form” based on imported numerical data. The evidential weights are conceived as virtual material forces that give shape to the wireframe. To portray natural language as a 3D object, the digital weave of warp and weft (filling yarn) is virtually completed to appear as a smooth, woven undulating surface as opposed to a loosely woven series of individual threads. This is the moment the digital fabric has been realised (see Fig. 5). This final element of the 3D design process is determined by the measured coordinates of the wireframe, but also has a series of material characteristics such as the ability to drape, or fold. Computational, parametric design thus makes possible a qualitative reading of an otherwise abstract, numeric dataset distilled from natural language.

Fig. 5: A first image of natural language in 3D.

Formulated as a digitally woven fabric.
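The step from wireframe to surface can be sketched in Python as a simple loft: neighbouring curves are joined by four-sided faces, and the result is written in the Wavefront OBJ text format, which most 3D modelling programs can open. This is only an approximation of the idea. It yields a faceted mesh rather than the smooth surface generated in Rhino, it assumes every curve has already been resampled to the same number of points, and the function names are hypothetical.

```python
def loft(curves):
    """Join adjacent curves into a quad mesh.

    curves: list of curves, each a list of (x, y, z) points, all of
    the same length. Returns (vertices, faces), where each face is a
    tuple of four zero-based vertex indices.
    """
    n = len(curves[0])
    if any(len(c) != n for c in curves):
        raise ValueError("all curves must have the same number of samples")
    vertices = [point for curve in curves for point in curve]
    faces = []
    for i in range(len(curves) - 1):
        for j in range(n - 1):
            a = i * n + j  # index of the point at curve i, sample j
            faces.append((a, a + 1, a + n + 1, a + n))
    return vertices, faces


def to_obj(vertices, faces):
    """Serialise the mesh as Wavefront OBJ text (indices are 1-based)."""
    lines = [f"v {x} {y} {z}" for x, y, z in vertices]
    lines += ["f " + " ".join(str(i + 1) for i in face) for face in faces]
    return "\n".join(lines) + "\n"


# Two illustrative threads of three points each; the second has a crest.
flat = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0)]
crest = [(1.0, 0.0, 0.0), (1.0, 1.0, 3.0), (1.0, 2.0, 0.0)]
vertices, faces = loft([flat, crest])
print(to_obj(vertices, faces))
```

Because the mesh is derived entirely from the sampled curves, importing a different transcript changes the surface without any manual remodelling, in keeping with the parametric approach described in this section.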

Prototypes

We selected language samples for the purpose of creating exploratory prototypes. The project began with a single prototype of a spoken Kurdish language sample. Kurdish is the language that is central to the research of one of the co-authors. Initially this research project only included one language. Once we had developed a method and a first visualisation emerged, we were curious about the potential of other datasets. To this effect we developed a dialogue with a seminal author within the study of evidentiality (Aikhenvald, 2018). Amazonian languages are known to have the richest array of evidentials that are currently documented. We chose a transcript in Tariana as a second sample, as Tariana has the most extensive system of evidentials of all the Arawak languages in Northwest Amazonia (Aikhenvald, 2003). The reason for the inclusion of this language was twofold. The first was to visualise a highly complex form of evidentiality in a language few people are familiar with. The second was that this example, like many others, concerns an endangered language, for which 3D visualisation could complement existing language documentation for future archives.

However, it is important not to give the impression that complex evidential systems can only be found in the small-scale indigenous groups in the Amazon, Australia or Native America. This led to the decision to juxtapose the 3D model of the Amazonian language sample with models based on two further languages. Indeed, the study of evidentiality is not restricted to the small class of languages that obligatorily encode the source of the information but concerns related phenomena in all languages (Fox, 2001, p. 168). Based on expertise available within our wider research team, we questioned whether data from an ancient language would allow us to model its evidential dynamics in 3D. The third selected language was spoken in the Neo-Assyrian period. This led to a model based on records of an argument in Akkadian extracted from a cuneiform clay tablet from the eighth or seventh century BC. Finally, colleagues who had been shown the models urged us to include a sample of English, which led to a fourth and final model.

Our focus was on the 3D design process itself, and the way in which it brings the evidential character of each language sample to the surface. The selected languages are unrelated and reflect our research trajectory, with the aim of testing the possibility of a few initial 3D prototypes. We relied on the elementary structure of language described in our introduction as a point of departure. The transcript of each language sample reveals the lines, as well as the length of each line. For the spoken samples in Kurdish, an Amazonian language and American English, the lines are made to resemble the structure of speech governed by pauses and intonation patterns. The elementary structure of the written Akkadian sample was determined by the lines on the cuneiform clay tablet from which it was extracted.

The model of evidentiality in Kurdish is based on a transcript of oral discourse in Kurdish by a co-author of this paper. The transcript for this image is part of her corpus of narratives in Kurmancî Kurdish about a lost homeland in Turkey and Syria, broadcast via Kurdish media. Data on Kurdish, spoken by millions in the Middle East and Europe’s cosmopolitan cities, document large-scale evidential finesse. The coding of this sample was based on known markers of evidentiality documented by linguists (Johanson and Utas, 2000; Bulut, 2000). One notable evidential category is “direct reported speech”, a word-for-word quotation of a witnessed utterance, and representation of the voices of others. The blue highlights in the digitally frozen language excerpts indicate such overt references to the source of quoted material. Crests that protrude towards the viewer indicate the explicit linguistic ties to perceptible reality when originally uttered (see Fig. 6).

Fig. 6: Prototype of sample in Kurdish (Kurmancî or Northern Kurdish).

Blue highlights indicate word-for-word quotation.

The next model depicts a transcript in Tariana, an Amazonian language. The selected sample is part of a corpus of stories of varied genres recorded since 1991 and features a total of 120 evidentials over 33 lines. It is an account entitled “The Yanomami” about an encounter with a neighbouring people. The first 10 lines of this language sample were presented as an excerpt earlier to illustrate a transcript of recorded speech (see Fig. 1). Here, we add an English translation, presented as prose to convey the meaning of this excerpt in a nutshell, and to offer a moment of respite from geometric abstraction: “Here is another (story) again. I will tell about what’s their name, Howler Monkey people, what I saw myself. I will tell (about) what I saw myself. There, in Venezuela, what’s its name, out of the Orinoco river, comes out another (river) called Kunukunuma. Then our boss was (there), then we went upstream, and arrived at the mouth of that river. He said, “There are Howler Monkey People (Yanomami) who owe me,” they say he took a gun, and they took a teacher from there. “So I will pay you with manioc flour,” he said to me, “so go and get (what they owe me)”. “OK,” we said, four of us went, (including) one from the mouth of this river, who spoke the language, we went. Oh! Then we went far. It was very early, four o’clock. We went and entered (the river) and went until it was midday, then we arrived. We arrived, it was very dangerous, in the nice area, then we arrived at a real plain at the hill, said to be “Descent of a hill of a deer and an evil spirit” called “Peak”, then, if one looks up, there was a green lofty mountain standing (there). We arrived close to this mountain”. Evidentials are genre-specific, and this narrative relies on visual, non-visual and reported evidentials. 
Such obligatory grammatical markers indicate whether the information was obtained through direct observation, whether the evidence is based on other senses, or whether the information was related by someone else and is reported here (Aikhenvald, 2003, pp. 287–323, 630–637). The crests featuring in nearly every section of this 3D language model are a visual hint for non-Tariana speakers of how the linguistic threads are tied into the source of the information they encode (see Fig. 7).

Fig. 7: Prototype of sample in an Amazonian language (Tariana).

Tariana has the most extensive system of evidentials amongst the Arawak languages.

A further prototype is based on an ancient language and data extracted from a cuneiform tablet of Neo-Assyrian origin (Parpola, 1978). The analysis of the material is derived from a wider study of the evidence-based language designs that underpinned the construction of an Assyrian empire (Kanchan, 2018, pp. 30–65). The data have been extracted from a cuneiform tablet and letter entitled “The Crimes of Guzana”. This is a letter with complaints addressed to the king about the unruly city of Guzana. Here, we add an English translation of a noteworthy segment of the sample, conveyed in prose for the sake of brevity: “Their other crime: In the reign of the father of the king, my lord, they wrote the silver quota of the shepherds on an Assyrian document and on an Aramaic document and sealed the amount of silver with the neck seals of the treasurer Nabû-qati-ṣabat, the village manager, and the scribe, with their neck seals (and) with the (royal) stamp seal, saying: “If they don’t pay this year, they will die!” But when a bribe was made, they cut off the stamp seals and their neck seals, and threw them away. Did they not cut them off arbitrarily? Qurdî, the chariot driver of the treasury horses, is treading on the authority of the palace. He has laid his hands on the cone of Ištar, saying: “Strike me! Let’s see (what happens)! Bring me an iron knife, so I can cut it off and stick it in the governor’s ass!” I am unable to tell “what else” he has said about everybody” (Luukko, 2012). Rendered in 3D, our image’s highly textured sections reflect the evidentials (Kanchan, personal communication) and evidence used to support the complaints (see Fig. 8).

Fig. 8: Prototype of sample in the Neo-Assyrian dialect of Akkadian.

Data sourced from cuneiform clay tablet (ca. 911–612 BC) in the British Museum Collection (Kanchan, personal communication).

English as a global, modern language has its own repertoire of evidential devices. Users of English regard knowledge as factual much of the time, expressing it without overt evidential qualification (Chafe, 1986, p. 271). Such a stance is reflected in the evidential value of an indicative sentence. A vast majority of sentences are in the “indicative mood”, and include a verb that makes a statement of fact. Likewise, an evidential value is displayed in phrases that mirror knowledge based on belief or opinion (Frajzyngier, 1985). But the grammatical encoding of evidentiality is not obligatory in English, unlike in the Native American, Amazonian, and Australian Aboriginal languages for which it has been documented most extensively. For English, scholars have used the term “evidentiality” in its broadest sense, as an attitude towards evidence and knowledge (Chafe, 1986, p. 262) distinct from compulsory evidentiality dictated by a language’s grammar. A final prototype is based on a sample of American English, a section of a 2004 US presidential debate analysed within the literature on linguistic anthropology (Lempert and Silverstein, 2012, pp. 147–160). The resulting image is characterised by the gentle undulation of its vaulted geometry. In the first three images above, the citation of information related by others via reported speech markers is shown as a blue highlight. The monochrome finish of this final model reflects the absence of direct reported speech in this particular sample (see Fig. 9).

Fig. 9: Prototype of sample in American English.

Evidentiality is not considered a grammatical category in English and is optional.

The sample for each language was picked from amongst countless possible others. Each model is unique and the chosen language samples belong to a particular genre within a language, including narrative, a letter to a king, and political rhetoric. Numerous other samples would have been equally valuable for this research. A guiding principle was the availability of a language sample in a published or public format. We made sure our coding relied on analyses by experts (Johanson and Utas, 2000; Bulut, 2000; Aikhenvald, 2003; Kanchan, personal communication; Lempert and Silverstein, 2012). Our research question specifically addresses the possibility of modelling natural language in 3D. Debates about comparative grammar, and the comparison of evidential systems across languages, belong to a different discipline, cast in the language of linguists. Our models are tangential to such debates and ought to be appraised in their own right for their capacity to render a particular language sample in 3D.

“Orthographic” projection is the most common method used to display 3D objects, as 2D drawings of a top view, a front view and a side view. This is the classic method architects use to draw plans. An orthographic view is the usual projection of choice for displaying visual representations of numeric datasets too (Tufte, 2001, pp. 9, 39, 43). We are used to seeing 2D graphs along an x- and y-axis, be it in weather or stock market charts. As we assigned numerical values to the evidential weight of linguistic elements, we did not wish to simply add a graph or an abstract 2D image to the existing literature on evidentiality. Instead, we visualised evidential weight as a three-dimensional geometric distortion, by assigning “different levels of value” on the z-axis (Romanyshyn, 1989, pp. 41, 43, 53). Moving away from a 2D representation of language means that a dominant image is challenged: language as text laid out on a page. In the same vein, language no longer features as a linguistic trace superimposed upon the surface of reality. In lieu of reality being “a solid globe upon the outer surface of which” (cf. Ingold, 2017, p. 63) languages unfold, a “reality-effect” is digitally woven in 3D. Barthes coined this term for modern textual devices that establish literary texts as realistic (Barthes, 1989, pp. 141–148). We borrow the term, but here “reality-effect” stands for language’s 3D shape and interplay with perceptible reality, as strands of language dock into perceptual anchor points. Meanwhile “placeholders” within discourse, such as evidentials, act as receptors for observable phenomena in the social and material world. The conglomerate of a particular language sample and perceptible reality appears textile. 3D design reveals language “threaded through” or woven into observable reality as “an all-enveloping infusion” (cf. Ingold, 2017, pp. 53, 36), and as a “reality-effect”.
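The mapping of evidential weight onto the z-axis described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm used for the prototypes: the names `height_grid` and `vertices`, the sample data, and the choice of one grid column per syllable are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical input format: one entry per transcript line, given as
# (number of syllables, {syllable index: evidential weight}).
Line = Tuple[int, Dict[int, float]]


def height_grid(lines: List[Line]) -> List[List[float]]:
    """Return one row of z-values per transcript line, one column per syllable.

    Rows are padded with 0.0 up to the length of the longest line, so the
    result is a rectangular grid (an assumption made for simplicity).
    """
    width = max(count for count, _ in lines)
    grid = []
    for count, markers in lines:
        row = [0.0] * width
        for pos, weight in markers.items():
            if not 0 <= pos < count:
                raise ValueError(f"marker at {pos} lies outside a line of {count} syllables")
            row[pos] = weight  # the evidential weight becomes the height
        grid.append(row)
    return grid


def vertices(grid: List[List[float]]) -> List[Tuple[int, int, float]]:
    """Flatten the grid into (x, y, z) points: x = syllable, y = line, z = weight."""
    return [(x, y, z) for y, row in enumerate(grid) for x, z in enumerate(row)]


# Invented three-line sample; the weights carry no linguistic claim.
sample: List[Line] = [
    (5, {1: 1.0}),
    (3, {}),
    (6, {0: 0.5, 4: 1.0}),
]
grid = height_grid(sample)
points = vertices(grid)
```

A flat 2D chart would plot only x and y; here each coded element additionally lifts its point off the plane, which is the sense in which the text speaks of a geometric distortion.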

The results of this study are four prototypes, each documenting a brief instance of language use. The shape of the model based on the Kurdish sample allows a viewer to observe the regular occurrence of undulations throughout the model. This 3D pattern indicates a sustained presence of evidentials, whilst the equally prevalent blue highlights reveal reported speech, or directly cited information. In the model of the Amazonian language there are crests in nearly every section of the 3D discursive structure. This shape reflects the fact that evidentials are obligatory in the grammar of this language and occur in all the lines of the transcript. The characteristics of the geometry of the sample from an ancient language, Akkadian, are distinct too. Here, the middle section of the narrative on the crimes of the citizens of Guzana is highly textured, with the blue highlights indicating a reliance on reported information. One can see how language is threaded through the perceptual evidence for the complaints addressed to the king. Finally, the model based on American English stands out for its monochrome undulating waves and almost uniform shape. This result adheres to expectations since the transcript did not include directly quoted material, which would have generated blue highlights. Also, unlike speakers of Amazonian languages, English speakers are not obliged by grammatical rules to indicate the source of their information at all times. Hence the less dramatic peaks and vales in this final model. The significant contrast between the 3D models based on the Amazonian and English language samples is a reflection of this grammatical divergence.

Each 3D shape and model is of course unique, linked to a specific and brief manifestation of a language. We would encourage our readers to look at the models again, whilst envisaging them as a “reality-effect”, a momentary fusion of language and observable reality. The unique shape of each prototype offers a reminder of the diversity of reality-effects across languages and peoples. We often have the impression we can learn another language through the principle of translation. The intricate nature of evidential systems is often lost in translation though, especially if translation is based on the question “what does this mean?”, or “how can I make myself understood?” The observation of the visual clues in our prototypes goes some way towards answering a wider question about living a reality through another language. Briefly entertaining the possibility of seeing oneself speaking Kurdish, Akkadian, an Amazonian language or English now includes a rare visual awareness of the reality-effect of each language. Language as an all-enveloping infusion isn’t just a concept. Our 3D models allow us to reveal the role of a distinct language as a mould for configurations of the observable. This is the central message our study conveys.

Conclusions: natural language printed in 3D

Our research question addressed the possibility of modelling an aspect of natural language in 3D. This project reveals that it is the development of advanced software for video animation that allows a digital image of natural language to be woven in 3D. The prototypes are both communicative and inquisitive. Their form was uncertain until they were created, at which point they revealed the shape of each language sample designed in 3D. For us the “guide rails into the blindness of an as yet un-realised dimension” (Evans, 1997, p. 173) were made of two elements. The first element was the conversion of a hierarchy of evidential weights into a set of numbers by co-author Alex Pillen. As a second element, co-author Emma-Kate Matthews developed the algorithm, or the instructions for the 3D software programme. Later the resulting 3D image was printed as a 3D object. A first 3D print of a stretch of language was made in strong and flexible Nylon Plastic (PA12, Polyamide) to accentuate the textility of the model (see Fig. 10a). To 3D print in nylon plastic, a laser sinters a bed of nylon powder layer by layer, solidifying the powder through a process called “Selective Laser Sintering”. A second model was produced in a silver alloy selected as a softer, malleable metal and chosen for its ability to showcase intricate details (see Fig. 10b). A later model was commissioned to be made in a rubber-like material (not shown here) to further accentuate the fabric-like nature of the prints. We await the invention of new materials to be able to print further digitally frozen linguistic shapes, in which the thickness of the individual “linguistic fibres” can be reduced further.

Fig. 10: 3D Prints of a Kurdish language sample.

In nylon plastic (a), and a silver alloy placed on a reflective surface (b).
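The first of the two elements mentioned in the conclusions, the conversion of a hierarchy of evidential weights into a set of numbers, might look roughly as follows. The sketch assumes that the hierarchy is a simple ranked list and that the weights are evenly spaced; both are illustrative assumptions, as are the function name `weights_from_hierarchy` and the example ordering of categories, which is not the coding scheme used in the study.

```python
from typing import Dict, List


def weights_from_hierarchy(hierarchy: List[str]) -> Dict[str, float]:
    """Map categories, ordered from heaviest to lightest, to evenly spaced weights.

    The first category receives 1.0 and the last receives 1/n, so every
    category keeps a non-zero height in the model.
    """
    n = len(hierarchy)
    if n == 0:
        raise ValueError("the hierarchy must contain at least one category")
    return {name: (n - rank) / n for rank, name in enumerate(hierarchy)}


# Illustrative ordering only (hypothetical, not taken from the study).
order = ["visual", "non-visual", "reported"]
weights = weights_from_hierarchy(order)
```

Any other monotonic scheme would serve equally well for this purpose; the point is only that an ordinal judgement by the analyst becomes a number the 3D software can read.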

This project builds upon existing work within the history of parametric design and the study of evidentiality within linguistics and the anthropology of language. To lay out our contribution to each of these fields, we begin with design, notably the work of Frei Otto at the Institute for Lightweight Structures at the University of Stuttgart, who extrapolated the method of the architect Gaudí. Designing with parametric models became a way of “form finding”, and Otto no longer relied on strings or weights. His method of form finding included, amongst other things, minimal surfaces derived from soap films. If two soap bubbles come into contact, they merge and a thin film is created in between them, a delicate layer of liquid surrounded by air. Such soap films have been used as a model for 3D design (Otto and Rash, 1996). Spline-based computation or 3D modelling has thus become a search space for the projection of new forms (Hight, 2013, pp. 419–420). Our project builds upon this legacy of form finding and would be hard to imagine without references to Otto’s oeuvre. Natural forms can be transposed into a digital format, a computer-generated 3D image, which often leads to valuable new material objects or buildings. Our research extends this remit to include source material more ethereal than soap films, an ephemeral utterance or fleeting speech event (see Fig. 11). An age-old comparison of geometry and natural language became the starting point for our parametric reasoning. Natural language is conceived as an abstract force that brings visual and auditory observations together into a particular configuration. We can now set up a formatted transcript as a space through which algorithms search to project its geometric shape in 3D.

Fig. 11: A fabric-like digital image of natural spoken language.

As a 3D “text”, derived from the Latin for “woven” (textus).

Secondly, we offer a critical reflection on our contribution to the study of language and evidentiality. The prototypes’ vaulted geometries are a new mode of exploration of the time-honoured comparison between grammar and geometry. Linguistic relativity and the language-specific role (Hymes, 1966; Jakobson, 1985) of an evidential function become concrete. The most elaborate and well-documented evidential systems belong to endangered languages in small-scale societies. In an era of rapid loss of linguistic diversity (Walsh, 2005), language visualisation in 3D concerns languages that will not leave behind a written record, and whose evidential systems are often lost in translation. Besides, analyses by linguists document evidentiality for a given language, but are unlikely to reveal the pattern and dynamic deployment of an evidential function in real time as an utterance unfolds in situ.

One could question why we chose to focus on the history of parametric design as a starting point for our project. Indeed, other modelling techniques within architecture and design may be relevant to the study of the architecture of natural language. The rationale for choosing parametric design methods is the following. Parametric design, be it in Gaudí’s architecture or contemporary software programmes, is defined by “parameters” that vary and can be easily adjusted. Parametric design underlies the development of contemporary software for video animation and film. Likewise, language is a fluid system that is in constant motion. Its grammatical features change from minute to minute. In our study, a grammatical marker becomes a “parameter” that affects the other aspects of language in its immediate vicinity. The position of grammatical elements in the elementary structure of language (along the lines of the transcript) varies, as does their effect. A grammatical feature such as evidentiality is thereby defined as one “parameter” or variable that can be edited according to the data-input, and given an evidential value. The other parameter that varies is the number of syllables in the line of a transcript. Within parametric design the relationship between elements is used to inform the design of complex structures. Here, this concerns the dynamic relation between a grammatical element and the surrounding structure of a language sample, which alters the resulting geometry. The “parametric” software we used thus allowed us to both generate and visualise the “geometry of form principles” (Jakobson, 1981, pp. 76–77) inherent in a transcript as language unfolds. For now, as this is uncharted territory, we have opted for displaying the language samples as frozen in time, and limited our data-input to a brief transcript for each language. The capacity of parametric design to generate moving images on the basis of rapidly changing parameters remains relevant though.
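The idea that a grammatical marker acts as a parameter affecting its immediate vicinity can be illustrated with a small Python sketch. The Gaussian falloff, the `spread` parameter and the function name `line_profile` are assumptions introduced purely for illustration; the text does not specify how the software used for the prototypes propagates a marker's effect.

```python
import math
from typing import Dict, List


def line_profile(syllables: int, markers: Dict[int, float], spread: float = 1.0) -> List[float]:
    """Return a z-profile for one transcript line.

    Two parameters shape the profile: the number of syllables in the line,
    and the evidential value of each marker. Every marker raises its own
    syllable by its full value and neighbouring syllables by a value that
    decays with distance (Gaussian falloff, an assumed choice).
    """
    if spread <= 0:
        raise ValueError("spread must be positive")
    profile = []
    for x in range(syllables):
        z = sum(
            weight * math.exp(-((x - pos) ** 2) / (2 * spread ** 2))
            for pos, weight in markers.items()
        )
        profile.append(z)
    return profile


# Editing one parameter changes the geometry of the whole line:
# the same marker with a narrow and a wide zone of influence.
narrow = line_profile(7, {3: 1.0}, spread=0.5)
wide = line_profile(7, {3: 1.0}, spread=2.0)
```

With the narrow setting the marker produces a sharp crest at its own position; with the wide setting the same marker yields a gentler undulation across the line, which is the kind of adjustability that makes a design parametric.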

What our project entails for research in the not too distant future remains an open question. We anticipate that larger datasets could be imported, and that moving images of evolving shapes in natural language could be produced. Parametric design in architecture eventually led to contemporary 3D animation for videos, games, and films. Likewise, our research trajectory leaves room for future visualisations of the spatial dynamics of natural languages. We began with a single and concise grammatical category, evidentiality, but look forward to other 3D explorations of distinct languages. As mentioned earlier, evidentiality is part of a wider category of ephemeral “placeholders” that constitute the spatial nature of language. Visualising the use of pronouns, of you, she or we across spoken languages, or the equivalents of this and that in English, with contemporary software for video animation would equally be fascinating, its complexity and scope leading to open questions for future research. We anticipate that the development of algorithms on the basis of other grammatical features, and use of large datasets, could shed light on language dynamics that remain overlooked or underresearched.

We also ought to question how such 3D visualisations contribute to our “knowledge” of the recorded language samples. Sometimes visible patterns in nature can be modelled mathematically. Let’s think for example about a spiralling shell of a snail, the structure of the pineapple, or indeed bubbles and foam in the sea. High-dimensional forms in language and the patterns of grammar have evolved naturally in humans through use and repetition over millennia. We designed a parametric image of the “reality-effect”, which the flow of language generates by means of evidentiality. It is the history of mathematics that has shown us the significance of the physical representation of concepts. It seemed difficult to develop a correct imagination of complex multi-dimensional surfaces, such as Boy’s surface (Weisstein, 2021), without having a model at hand which could be touched and rotated to understand its multiple faces simultaneously. For such essential geometries, a model, be it realised and observed or only vividly imagined, was not a means to a pedagogical end, but played a role in the development of mathematics (Ording, 2011). In comparison, our task is limited, but we equally ask what kind of “knowledge” our prototypes bring about.

Mathematical objects, an astronomer’s orrery, or the chemist’s molecular model all engender “tacit knowledge”. Such knowledge is often a valid anticipation of their yet indeterminate implications, of further understandings arrived at in the end (Polanyi, 2009, p. 24). As we tried to grasp evidentiality in its figurative form based on spatial intuition, we kept in mind this idea of “tacit knowledge”. The models’ effect could not be produced by means of a typical drawing, a diagram, or notation on a page. The prototypes lead to additional knowledge in material form, an unspoken form of knowledge, which can be presented visually (see Fig. 12). To bring about such knowledge, 3D printing of the models seemed essential. This was the most challenging and arguably least successful aspect of our study. The majority of our models were rejected by the manufacturer. Their structure was too complex for current 3D printing techniques or the individual linguistic fibres were too thin. They would have simply melted together in the printer. Each 3D print we present here was preceded by half a dozen unsuccessful attempts. This process points to the limits of the scale of our study, and the level of detail that can be meaningfully observed once the datasets are enlarged. For now, we opted for short transcripts, but we are of course curious about the potential visualisation of larger and more diverse samples. It remains to be assessed whether a novel printing technique with new materials or video animation will be most effective to reveal the shape of large natural language samples.

Fig. 12: A nylon prototype made via the latest industrial type of 3D printing (Multi Jet Fusion).

Exhibits the fine details of this Amazonian language sample designed in 3D.

Many aspects of languages around the world can’t be translated into English because the equivalent meaning and cultural framework do not exist. Likewise, it is not easy to translate evidential finesse from one language to another, unrelated language. Evidentiality features as a topic in comparative grammar and linguistics, but is often secluded within technical treatises. By contrast we are all familiar with the way fabrics behave, regardless of the languages we speak. The woven nature of the prototypes builds upon such familiarity and allows a tacit, non-explicit encounter with diverse language samples. The models and 3D printed objects do not possess a particularly simple form, but their shape follows a general law that can be discerned, both by means of grammar and spatial intuition. Contemplating the 3D shape of a natural language sample differs from a linguist’s exercise in analytical lucidity, and constitutes a form of tacit knowing of languages we may not speak.