Browsing through sealed historical manuscripts by using 3-D computed tomography with low-brilliance X-ray sources

Stromer, Daniel; Christlein, Vincent; Martindale, Christine; Zippert, Patrick; Haltenberger, Eric; Hausotte, Tino; Maier, Andreas

doi:10.1038/s41598-018-33685-4

Article
Open access
Published: 18 October 2018

Daniel Stromer¹,
Vincent Christlein ORCID: orcid.org/0000-0003-0455-3799¹,
Christine Martindale²,
Patrick Zippert³,
Eric Haltenberger³,
Tino Hausotte ORCID: orcid.org/0000-0002-2923-3217³ &
…
Andreas Maier¹

Scientific Reports volume 8, Article number: 15335 (2018)

Abstract

Severely damaged historical documents are extremely fragile. In many cases, their secrets remain concealed beneath their cover. Recently, non-invasive digitization approaches based on 3-D scanning have demonstrated the ability to recover single pages or letters without the need to open the manuscripts. This can even be achieved using conventional micro-CTs without the need for synchrotron hardware. However, not all manuscripts may be suited for such techniques due to their material and X-ray properties. In order to recommend which manuscripts and which inks are best suited for such a process, we investigate six inks that were commonly used in ancient times: malachite, three types of iron gall, Tyrian purple, and buckthorn. Image contrast is explored over the complete pipeline, from the X-ray CT scan and page extraction to the virtual flattening of the page image. We demonstrate that all inks containing metallic particles are visible in the output, that decreasing the X-ray energy enhances readability, and that the visibility depends strongly on the X-ray attenuation of the ink’s metallic ingredients and their concentration. Based on these observations, we give recommendations on how to select the appropriate imaging parameters.

Introduction

Historical documents are relics of the past, containing information about long-forgotten times. Due to aging or external influences, many of these cultural assets cannot be investigated further because they are too fragile to open or page-turn, so their valuable contents remain hidden. In 1750, Karl Weber discovered the ‘Villa dei Papiri’ near Herculaneum, where more than 1800 papyri were found carbonized by the eruption of Mount Vesuvius in 79 AD^1,2. Although the volcanic eruption preserved the scrolls, it is impossible to unroll them without causing damage. There are also less extreme examples, such as in the Germanisches Nationalmuseum (Nuremberg, Germany), where pages in the book fold area are stuck together due to aging. In addition, external influences such as floods, fire or war may continue to generate more and more sealed documents. A recent example is the Duchess Anna Amalia Library fire in 2004 (Weimar, Germany), where 62000 volumes were severely damaged, not only by fire but also by the water used during fire-fighting^3,4.

Historical books basically consist of three parts: the cover, the pages and the ink(s). Book covers are diverse as they were hand-made and sometimes very luxurious, using materials ranging from leather or wood to ivory, gold or silver^5,6. Parchment, papyrus or handmade paper was used as the writing medium, the last of which became established in the Middle Ages⁷. Since the Roman Empire, iron gall ink has been widely used for writing^8,9,10. Thomas Jefferson’s ‘Declaration of Independence’, Goethe’s ‘Faust’, Mozart’s ‘The Magic Flute’ and even some of Rembrandt’s sketches¹¹ are just a few examples which were written or illustrated with iron gall ink. Due to its indelible nature, it is even partly still in use today. The Germanisches Nationalmuseum has a collection of historical works consisting of 3380 manuscripts, dating from the early Middle Ages to the early 20th century, about 95 percent of which were written with iron gall ink. The main iron gall ink ingredients are tannic acid, gum arabic and iron salt (FeSO₄)^12,13. Since this ink could only be used to write in black, other inks were invented. Malachite ink is based on metallic particles, with its greenish color caused by Cu₂CO₃(OH)₂^14,15. Tyrian purple ink (also called ‘Royal purple’) used the mucous secretion of sea snails containing bromine¹⁶ as dye, whereas buckthorn berries were used to achieve yellow and ocher inks¹⁷.

In 2007, Bergmann et al.¹⁸ used X-ray fluorescence imaging to reveal hidden writings written with iron ink on pages of the Archimedes palimpsest. The same method was used to differentiate between two inks of the Qur'ān palimpsest¹⁹, where the erased ink had a different material composition to that of the newer ink. However, this method can only be applied directly to a specific page and requires the book to be opened. Where opening the book is not possible, 3-D X-ray CT imaging can be employed, as in this work. This method has been used not only in human medicine but also for archaeological purposes such as scanning cultural heritage objects^20,21,22. Previous works show that conventional micro-CT systems can deliver good results by exploiting the various material characteristics of historical ink and paper^23,24,25,26,27,28. If the ink composition has a higher attenuation than the paper’s cellulose, the ink is recognizable in the volume of an X-ray CT scan²³, enabling the contents of a book to be known without the need to open a fragile manuscript. This is most effective in cases where metallic particles are present in the ink. Even ink that has been erased and overwritten can still be recognized if some particles that penetrated deeper layers of the paper are still present^29,30. With its high resolution, 3-D X-ray CT is well suited to digitize historical documents. A disadvantage of this method is the X-ray radiation, which could accelerate the aging process of cellulose when performing a scan^31,32. However, the impact of this X-ray radiation on the dry paper and ink is still very limited³³. The dose can also be kept to a minimum while preserving the relevant information, as shown in our previous work³⁴.

One method which does not expose the document to ionizing radiation is 3-D Terahertz imaging^35,36. A major drawback of this technique is its limited penetration depth, allowing only a few pages to be digitized. Until now, Terahertz imaging has not been evaluated on real historical documents, so the effect of the metallic particles in the ink reflecting the Terahertz waves is unknown. A second technique is X-ray phase contrast imaging, which is capable of scanning manuscripts, books or archaeological relics^37,38,39. However, this technique is neither widely accessible nor mobile. Libraries would have to transport their valuable books and documents to the measurement centers, which is costly for security and insurance reasons. Micro-CT systems, in contrast, could be installed in libraries because they are more mobile than phase contrast hardware.

This paper describes the complete pipeline for an X-ray CT digitization of a book which cannot be opened anymore. Six different inks (metallic and non-metallic) were used to generate writings and evaluate their visibility. First, we performed three 3-D X-ray micro-CT scans with different parameter sets and used a fast and accurate 3-D reconstruction technique leading to 3-D volumes of the book. Next, we applied an algorithm to these volumes to extract and map all individual pages into 2-D⁴⁰ for further investigation. Finally, we compared the three different scans with regard to the visibility of the writings. To the best of our knowledge, the entire digitization pipeline for such books has not been analyzed in detail before. The process contains many variable options and parameters: the book’s components, 3-D scan parameters, 3-D reconstruction approaches, page extraction and 2-D mapping. In this work, we provide details on each step and expose the limitations of each proposed technique. We also highlight the new challenges that arise from the resulting 2-D pages of the X-ray volumes, as this output fundamentally differs from the state-of-the-art high-resolution camera images currently used within the field of document digitization.

Material

Before working with real historical documents, every process within the digitization pipeline must be optimized. This ensures that only one scan is performed with optimal parameters, thus minimizing radiation exposure to the document. Therefore, we made use of a self-made book with the following dimensions: 17 × 13 × 3 cm (L × W × H), as shown in Fig. 1(a). It consists of a buffalo leather cover and 56 pages of handmade paper with a page thickness of about 150 μm. The focus of this study is to investigate the visibility of six different inks in the X-ray CT volume: three different iron gall inks, malachite ink, Tyrian purple ink and buckthorn ink. It should be mentioned that we used two different iron gall inks and additionally produced a third version by adding more FeSO₄ to one of the original inks. This enables the assessment of the effect of the amount of iron particles on the visibility in the output. Malachite and Tyrian purple ink were chosen because they consist of different metallic components, while buckthorn ink is only cellulose-based.

Initially, we performed energy-dispersive X-ray spectroscopy (EDS) in the range of [0, 8] keV for each ink to analyze the materials present, as shown in Fig. 2. EDS is based on X-ray fluorescence and is capable of obtaining the elemental composition of materials, from which the ink’s X-ray characteristics can be derived. Different elements have varying X-ray attenuations at different X-ray energies, allowing the optimal scanning energy to be determined. Furthermore, performing an EDS of the ink in advance may give insight into whether a CT scan of the given document is suitable. In a real setup, where the document cannot be opened anymore, such a measurement cannot be acquired. However, we wanted to ensure that the materials used were not contaminated by other materials affecting the attenuation. The EDS revealed that malachite ink consists of Copper (Cu), Carbon (C) and Oxygen (O) and all iron gall inks consist of Iron (Fe), Oxygen (O) and Sulfur (S). The analysis of Tyrian purple ink showed that Carbon (C), Oxygen (O), Aluminum (Al), Sulfur (S), Chlorine (Cl) and Tin (Sn) were present. It should be mentioned that while this ink was formerly made from the secretions of sea snails, producers currently substitute cheaper ingredients. In comparison to the previously mentioned inks that consist of metallic particles, we also evaluated buckthorn ink, where the EDS detected only Carbon (C) and Oxygen (O).

As X-rays are used to digitize the document, the X-ray attenuation coefficients μ′ = μ/ρ were calculated for all used inks based on the NIST XCOM database⁴¹, as illustrated in Fig. 1(c). The energy range was chosen based on previous work²⁴ in which we showed that the higher the attenuation difference between paper and ink, the better the visibility of writings in the output volume. One can observe that malachite has the highest attenuation, followed by iron gall ink and Tyrian purple. Buckthorn ink, like paper, consists only of cellulose, resulting in the same X-ray attenuation coefficient. Further increasing the X-ray energy leads to only slight differences between the materials’ X-ray attenuation coefficients, so the writings would vanish in the output.

Methods

The volumetric scan

The scans were performed on a 3-D X-ray micro-CT using cone-beam geometry. The calculations were made by using the CONRAD framework⁴². Figure 1(b) shows the schematic structure of the scan. The book is placed on a turntable, where a projection image is acquired at each angular step. From the resulting set of projections, a 3-D volume is calculated by applying an appropriate reconstruction algorithm. The book’s front cover lies orthogonal to the rotation axis z, aiming to achieve a balanced penetration length of the X-ray beams through the object²³ for artifact reduction. We performed three 360° scans with 1800 projections each. As mentioned above, the X-ray energy should be minimized and therefore we chose 30 kV, 40 kV and 50 kV, where 30 kV constitutes the scanner’s lowest configurable energy. For the 50 kV scan, we used an additional copper pre-filtration of 0.25 mm to narrow the polychromatic X-ray spectrum⁴³. The drawback of such low X-ray energies is increased noise, so a trade-off between noise and ink visibility is necessary. By using 30–50 kV, a tube current of 3 mA and an exposure time of 2 s, the signal-to-noise ratio (SNR) is still acceptable. The ratio of the source-to-object distance (SOD, 710 mm) to the source-to-detector distance (SID, 1377 mm), combined with the detector pixel size of 0.2 × 0.2 mm², results in a voxel size of 103 × 103 × 103 μm³.
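The voxel size quoted above can be reproduced with a short calculation. The following is a minimal sketch, assuming the usual cone-beam relation in which the detector pixel size is divided by the geometric magnification SID/SOD; the function name `voxel_size_um` is hypothetical and not taken from the CONRAD framework.

```python
def voxel_size_um(pixel_mm: float, sod_mm: float, sid_mm: float) -> float:
    """Voxel edge length in micrometres for a cone-beam geometry.

    Assumes the detector pixel is scaled down by the magnification SID / SOD.
    """
    magnification = sid_mm / sod_mm
    return pixel_mm / magnification * 1000.0


# Values as stated in the text: 0.2 mm pixels, SOD = 710 mm, SID = 1377 mm.
size = voxel_size_um(pixel_mm=0.2, sod_mm=710.0, sid_mm=1377.0)
print(f"voxel edge length: {size:.0f} um")
```

With the stated geometry this evaluates to roughly 103 μm, in line with the reported voxel size.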

3-D X-ray CT reconstruction

The state-of-the-art 3-D reconstruction approach in cone-beam CT is the Feldkamp, Davis and Kress (FDK) algorithm⁴⁴, where the projection images are cosine weighted, ramp filtered and back projected to obtain the final 3-D volume. According to Tuy’s condition⁴⁵, the FDK algorithm delivers exact results only in the central plane. The greater the cone-angle γ and the further the slices are from the centre, the more the result suffers from cone-beam artifacts due to the insufficient circular orbit sampling^46,47,48. We tried to minimize these artifacts with a horizontal book placement, reducing the effective worst-case cone-angle $\gamma_{max} = \arctan(0.5 \cdot H_{book} \cdot d_{min}^{-1})$ to a minimum. H_book denotes the book’s height and d_min denotes the minimum distance between the X-ray source and book, calculated by d_min = SOD − 0.5 · l_book. Here, l_book denotes the book’s diagonal length (214 mm). This yields γ_max ≈ 1.43° and γ_min = arctan (0.5 · H_book · SOD⁻¹) ≈ 1.21° at the rotation center. Such small cone angles result in more than 99 percent of the Radon sphere being sampled⁴⁹, enabling a very high reconstruction accuracy. In addition, the ink is dispersed along the plane of the pages, whereas perpendicular to that plane it lies in regions where adjacent pages are very close to each other. This makes a horizontal document placement in the scanner more favorable for artifact reduction.
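The two cone angles can be checked numerically. This is a small sketch under the assumption that H_book is the 30 mm book height given in the Material section and that l_book is the diagonal of the 170 mm × 130 mm cover; the helper name `cone_angle_deg` is illustrative only.

```python
import math


def cone_angle_deg(height_mm: float, distance_mm: float) -> float:
    """Half cone angle in degrees for an object of the given height at the given distance."""
    return math.degrees(math.atan(0.5 * height_mm / distance_mm))


H_BOOK = 30.0                      # book height in mm (3 cm)
SOD = 710.0                        # source-to-object distance in mm
L_BOOK = math.hypot(170.0, 130.0)  # cover diagonal in mm, about 214 mm

d_min = SOD - 0.5 * L_BOOK         # closest approach of the book to the source
gamma_max = cone_angle_deg(H_BOOK, d_min)
gamma_min = cone_angle_deg(H_BOOK, SOD)
print(f"gamma_max = {gamma_max:.2f} deg, gamma_min = {gamma_min:.2f} deg")
```

Both values land close to the 1.43° and 1.21° reported in the text.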

Page extraction and 2-D mapping

The greatest challenge after performing the 3-D X-ray micro-CT scan is to handle the 3-D volume. Due to the high resolution and the thin and wavy pages, which are also squeezed together, the separation of pages within the 3-D volume is not trivial. Each page has a thickness of around 150–200 μm resulting in around 1–3 voxels with the given voxel size. The rather low X-ray energies simultaneously increased the noise level. Figure 3(a) shows an exemplary xy-slice of a volume where multiple pages are present such that the investigation of a single page is not possible without virtually flattening the pages. As a manual segmentation is too time-consuming, a fully-automatic page-extraction method was developed⁴⁰ to extract and map the pages to 2-D. The pipeline of the complete algorithm is illustrated in Fig. 4 and can be broken down into three steps:

Volume binarization (blue boxes): Initially, the volume V is filtered using a guided filter⁵⁰ for edge-preserving smoothing. Because the pages appear like small vessels within the volume, as illustrated in Fig. 3(b), we utilized a vessel segmentation technique for volume binarization. Vesselness filtering as proposed by Frangi et al.⁵¹ is applied on every yz-slice, resulting in V_vessel. The Vesselness algorithm first filters an image I at multiple scales s_i using a Gaussian filter. Then, the eigenvalues λ₁, λ₂ (|λ₁| ≥ |λ₂|) of the filtered image’s Hessian matrices are calculated for each pixel. The eigenvalues describe the principal curvatures of the image’s second-order structure, with the smallest curvature along the page. The blobness measure $R = \lambda_{2} \cdot \lambda_{1}^{-1}$ is calculated and is close to 0 when a page is present. Next, $S = \sqrt{\lambda_{1}^{2} + \lambda_{2}^{2}}$ is computed. When a pixel is part of a page, S becomes larger since at least one of the eigenvalues is large. A page is enhanced by evaluating
$$\boldsymbol{V}_{\mathrm{vessel}} = \begin{cases} 0, & \text{if } \lambda_{1} > 0 \\ \exp\left(-\frac{R^{2}}{2\beta^{2}}\right)\left(1 - \exp\left(-\frac{S^{2}}{2\gamma^{2}}\right)\right), & \text{otherwise,} \end{cases}$$
(1)
where β and γ are control parameters depending on the grayscale intensity of the image. The result of Eq. (1) yields a probability map for the likelihood of a certain voxel being part of a page (V_vessel → 1) or air (V_vessel → 0), and subsequent global thresholding binarizes the volume (voxels with V_vessel ≥ 0.1 are set to 1, all others to 0). The result of this step is shown in Fig. 3(c). Subsequently, the total number of pages N and the mean page thickness $\bar{p}$ are estimated by counting all pages and calculating their thicknesses for all rows in every yz-slice. Given $\bar{p}$, overlapping pages are separated by splitting pages that are thicker than $1.5\cdot \bar{p}$ at their center. Fig. 3(d) shows the result of the entire binarization and separation process.
Page segmentation (green boxes): Within this process, two steps are repeated until a stop criterion is reached. First, the actual number of detected pages N_y is counted for all rows Y of a yz-slice; if N_y differs from N, the corrupted row number y is stored in a list. After performing this step for all yz-slices, the corrupted rows are replaced by the nearest uncorrupted row with the same y indices. The total number of corrupted rows is stored as ε. Second, the xy-slices of the volume are median-filtered to remove outliers and to close gaps. These steps are repeated until the newly calculated ε is equal to or greater than the old one, i.e. until no further enhancement is possible because the number of smoothed rows does not change.
Texturing and 2-D mapping (orange boxes): The final step is to sample the page at each point (x, y) along z-direction and store the maximum value in a 2-D image. The maximum is considered because the writings appear brighter in the image than air or paper. As the pages can be wavy, their curvature has to be considered. Therefore, sampling along the line perpendicular to the page’s central line considers the waviness of the pages and yields more exact results than straight sampling.
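Two pieces of the binarization step lend themselves to a compact illustration: the per-pixel response of Eq. (1) and the splitting of runs that are thicker than 1.5 times the mean page thickness. The sketch below is not the authors' implementation. It works on a single pixel and a single binarized row, omits the guided filter and the multi-scale Gaussian filtering, and uses placeholder values for β and γ; all function names are hypothetical.

```python
import math


def hessian_eigenvalues(hxx, hxy, hyy):
    """Eigenvalues of a symmetric 2x2 Hessian, ordered so that |l1| >= |l2|."""
    mean = 0.5 * (hxx + hyy)
    root = math.sqrt((0.5 * (hxx - hyy)) ** 2 + hxy ** 2)
    a, b = mean + root, mean - root
    return (a, b) if abs(a) >= abs(b) else (b, a)


def vesselness(l1, l2, beta=0.5, gamma=1.0):
    """Eq. (1) for one pixel; beta and gamma are placeholder control parameters."""
    if l1 >= 0:  # l1 > 0 as in Eq. (1); l1 == 0 is also guarded to avoid dividing by zero
        return 0.0
    r = l2 / l1                       # blobness measure R
    s = math.sqrt(l1 ** 2 + l2 ** 2)  # structure measure S
    return math.exp(-r ** 2 / (2 * beta ** 2)) * (1 - math.exp(-s ** 2 / (2 * gamma ** 2)))


def find_runs(row):
    """Return (start, end) pairs of consecutive foreground pixels, end exclusive."""
    runs, start = [], None
    for i, v in enumerate(list(row) + [0]):
        if v and start is None:
            start = i
        elif not v and start is not None:
            runs.append((start, i))
            start = None
    return runs


def split_thick_runs(row, mean_thickness, factor=1.5):
    """Clear the centre pixel of every run thicker than factor * mean_thickness."""
    out = list(row)
    for s, e in find_runs(row):
        if e - s > factor * mean_thickness:
            out[(s + e) // 2] = 0
    return out


# A bright ridge (page-like) versus a bright blob, followed by thresholding at 0.1.
v_ridge = vesselness(*hessian_eigenvalues(-4.0, 0.0, 0.0))
v_blob = vesselness(*hessian_eigenvalues(-4.0, 0.0, -4.0))
is_page = v_ridge >= 0.1

# One binarized row in which two pages have merged into a run of six pixels.
row = [0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0]
runs = find_runs(row)
mean_thickness = sum(e - s for s, e in runs) / len(runs)
separated = split_thick_runs(row, mean_thickness)
```

In the text the mean thickness is estimated over all rows of every yz-slice; computing it from a single row here is a simplification chosen to keep the example small.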

Contrast ratio estimation

Before measuring objects with an X-ray CT scan, the expected contrast C of the given materials compared to vacuum can be estimated from the scan parameters and the mass absorption coefficients $\mu'_{m}$ of each material m (cf. Fig. 1(c))⁵². C is given by

$$\mathrm{C} = 1 - \exp(-p_{m}) = 1 - \exp(-\mu'_{m}\,\rho_{m}\,d_{m}),$$

(2)

where p_m denotes the attenuation of material m, ρ_m its density, and d_m its thickness. In a probabilistic sense, C is equivalent to the fraction of photons that will be absorbed within the material. In the multi-material case, the individual material attenuations are summed up over all materials M according to

$$\mathrm{C}_{\mathrm{mm}} = 1 - \exp(-p_{mm}) = 1 - \exp\left(-\sum_{i}^{M} p_{i}\right).$$

(3)

To obtain the final contrast ratio, we divide the multi-material C_mm of ink and paper by the paper-only case:

$$\mathrm{CR}_{\mathrm{est}} = \frac{\mathrm{C}_{\mathrm{mm}}}{\mathrm{C}_{\mathrm{paper}}}.$$

(4)

The higher CR_est, the better the contrast of the ink. This is only an approximate estimate that neglects the effects of noise. Under almost noise-free conditions, it indicates whether an X-ray scan will lead to reasonable results.
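Equations (2) to (4) can be sketched in a few lines. The mass attenuation coefficients and densities below are placeholder numbers chosen only for illustration; they are not the NIST XCOM values used in the paper, and the function names are hypothetical. The layer thicknesses of 200 μm for paper and 2 μm for ink are the ones stated in the Results section.

```python
import math


def contrast(layers):
    """Eq. (2)/(3): layers is a list of (mu_prime [cm^2/g], rho [g/cm^3], d [cm]) tuples."""
    p = sum(mu_prime * rho * d for mu_prime, rho, d in layers)
    return 1.0 - math.exp(-p)


def estimated_contrast_ratio(ink, paper):
    """Eq. (4): contrast of ink plus paper divided by the paper-only contrast."""
    return contrast([ink, paper]) / contrast([paper])


# Placeholder material values (assumptions, not measured data).
paper = (0.4, 0.8, 0.02)    # 200 um of paper
ink = (10.0, 3.0, 0.0002)   # 2 um of a strongly attenuating ink

cr_est = estimated_contrast_ratio(ink, paper)
print(f"CR_est = {cr_est:.2f}")
```

A ratio above 1 means the inked region absorbs more photons than bare paper; an ink layer that adds no attenuation yields exactly 1.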

Results

Each scan took about 2 h. Because the 3-D reconstruction was performed on-line, the total scan and reconstruction time was about 2 h 10 min. The runtime of the extraction algorithm was approximately 15 min for each volume. For the three scans, the resulting 2-D mapped pages were compared with regard to the visibility of the inks. Before extracting the pages, the book cover was deleted manually within every volume to enhance the algorithm’s output.

Book placement

First, the influence of the book placement is evaluated. We performed a scan with a horizontal book placement, setting the cover orthogonal to the rotation axis, and compared it to a scan with a vertical placement. Figure 5 shows the results for an exemplary central xz-slice. Figure 5(a) depicts the xz-slice of the scan with a vertical book placement. The varying penetration beam length caused metal artifacts (orange boxes) where the ink is located. As those artifacts only appear in beam direction, the separability of the pages is negatively affected. With the horizontal book placement scan, the artifacts do not appear, cf. Fig. 5(b). The increased noise is due to the fact that the vertical scan was a large volume scan⁵³ with a voxel size of around 59 × 59 × 59 μm³.

Qualitative analysis

The top row of Fig. 6 shows photographs of the original pages written with (a) malachite ink, (b–d) the three iron gall inks, and (e) Tyrian purple ink. The center row of Fig. 6 shows the reconstructed and 2-D mapped pages of the closed book for the 50 kV scan with the copper pre-filter. The bottom row shows the 30 kV scan output after page extraction.

Most of the malachite ink writings can be identified within all performed scans. The result for the malachite ink shows some artifacts in the center of the output. This is due to the fact that the malachite ink page was placed next to the cover, which touched the page in the central region, resulting in the obvious dark areas. However, this does not affect the visibility of the writings themselves. The same is true for the iron gall inks, where the writings, numbers and symbols are visible on the extracted pages, except for the central regions where they are less visible. When comparing the three iron gall inks, we observe that iron gall ink 3 (the ink with additional FeSO₄) has brighter intensities than the original inks. This confirms our assumption that an increased amount of metallic particles increases the writings’ visibility. The Tyrian purple ink writings are only slightly visible, which is due to the lower X-ray attenuation coefficient. Here, only a few symbols can be identified. Buckthorn ink could not be differentiated from the paper because the attenuation is the same according to Fig. 1(c). Hence, we omitted the output results.

For calculating CR_est, the paper thickness d was set to 200 μm, referring to the page thickness, while the ink thickness was set to 2 μm. The estimated results for all inks show that the lower the selected tube energy, the better the contrast between ink and paper. Next, we measured the intensity difference between the inks and the paper. To this end, we manually segmented the Chinese symbols on the bottom of the page, calculated the mean intensity of this area m₁ and subtracted the mean of the surrounding paper-only area m₂. From these values we determined the measured contrast-to-noise ratio CR_msr = |m₁ − m₂| · $\sigma_{0}^{-1}$, where σ₀ denotes the standard deviation of the pure image noise. The results for all scans are shown in Table 1. The 50 kV scan has the lowest values, followed by the 40 kV scan. The 30 kV scan shows the best results for all inks. Furthermore, we can observe that iron gall ink 3 (IG3) has the best contrast measure, caused by the additional iron(II) sulfate. The copper-based malachite ink has a higher intensity difference than the original iron gall inks (IG1, IG2), and Tyrian purple has the lowest visibility of all inks. The measured CR_msr values are generally consistent with the estimated CR_est values for all inks.
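The measured ratio CR_msr = |m₁ − m₂| / σ₀ is straightforward to compute once the three pixel regions are available. The sketch below uses made-up intensity values and assumes that σ₀ is taken as the population standard deviation of a homogeneous noise-only region; the text does not state how that region was chosen, so this is an assumption, as is the function name.

```python
from statistics import mean, pstdev


def measured_contrast(ink_pixels, paper_pixels, noise_pixels):
    """CR_msr: absolute difference of the region means divided by the noise std."""
    m1 = mean(ink_pixels)            # mean intensity of the segmented writing
    m2 = mean(paper_pixels)          # mean intensity of the surrounding paper
    sigma0 = pstdev(noise_pixels)    # noise level from a homogeneous region (assumed)
    return abs(m1 - m2) / sigma0


# Hypothetical intensities, for illustration only.
ink_region = [0.80, 0.90, 0.85]
paper_region = [0.40, 0.50, 0.45]
noise_region = [0.40, 0.50, 0.40, 0.50]

cr_msr = measured_contrast(ink_region, paper_region, noise_region)
print(f"CR_msr = {cr_msr:.1f}")
```

Using `stdev` instead of `pstdev` would give the sample standard deviation; for the large regions found in real page images the difference is negligible.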

Table 1 Ink contrast evaluation for the three performed 3-D X-ray CT scans.

Chromatographic effect of ink

The top row of Fig. 7 shows snippets of the letter ‘C’ for malachite ink (Fig. 7(a)), iron gall ink 1 (Fig. 7(b)) and iron gall ink 3 (Fig. 7(c)) scanned with 30 kV. We can observe that the X-ray attenuation increases towards the edges of the letter. This can be explained by the chromatographic effect. Unlike papyrus or parchment, handmade paper absorbs most of the ink. When the ink spreads out in the paper, the liquid portion is transported faster, while the metallic particles slow down until they stop completely and accumulate in a certain region. This results in a higher X-ray attenuation in the border regions, as emphasized by the plots along the orange lines shown in the bottom row of Fig. 7. The intensities around the border regions are two to three times higher than the background, whereas in the central region the intensities are about 1.5 times higher. Additionally, the appearance of the letters differs from ink to ink. While the iron gall ink writings look smooth at the borders, the malachite ink has a more grainy structure.

Discussion

The scans of the self-made book, consisting of 56 pages of handmade paper, a buffalo leather cover and six different inks, showed very promising results. We showed that the book placement plays an important role for the quality of the output. We proposed a horizontal book placement such that the cover is orthogonal to the system’s rotation axis. This improved the image quality compared to a vertical setup, which showed severe artifacts, and is furthermore easier to mount for fragile documents.

In order to expand these results, the effect of different sizes of books needs to be addressed. For larger books, the scan parameters need to be adapted because the X-ray attenuation might be too high to detect a signal. The voxel size will also increase with regard to the scanner’s flat-panel detector; however, this can be compensated by configuring a smaller source-to-object distance. Initially, we tested the algorithms on a small book with dimensions of 4 × 4 × 0.5 cm³ ⁴⁰. We were able to reduce the voxel size to around 30 × 30 × 30 μm³, whereas the pages had a thickness of approximately 150 μm, resulting in 4–6 voxels per page. With the larger book presented in this study, the voxel size increased to 103 × 103 × 103 μm³ such that a page is covered by only 1–3 voxels. This is the minimal requirement to separate pages in the output volume. By using a large volume scan or with future improved scanner resolutions, the output quality will improve too.
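The relationship between page thickness, voxel size and source-to-object distance can be made concrete with a short calculation. This sketch assumes the same idealized magnification model as the scan geometry (voxel size = pixel size · SOD / SID) and reuses the detector values of the larger-book scan for illustration; the small book may have been scanned with a different setup, and mechanical limits on how close an object can be placed to the source are ignored. Function names are hypothetical.

```python
def voxels_per_page(page_um: float, voxel_um: float) -> float:
    """Number of voxels spanning one page along its thickness."""
    return page_um / voxel_um


def sod_for_voxel_mm(target_voxel_mm: float, pixel_mm: float, sid_mm: float) -> float:
    """Source-to-object distance that yields the target voxel size (idealized)."""
    return target_voxel_mm * sid_mm / pixel_mm


small_book = voxels_per_page(page_um=150.0, voxel_um=30.0)
large_book = voxels_per_page(page_um=150.0, voxel_um=103.0)
sod_30um = sod_for_voxel_mm(target_voxel_mm=0.03, pixel_mm=0.2, sid_mm=1377.0)

print(f"small book: {small_book:.1f} voxels per page")
print(f"large book: {large_book:.1f} voxels per page")
print(f"SOD for 30 um voxels: {sod_30um:.1f} mm")
```

At 30 μm a 150 μm page spans about five voxels, whereas at 103 μm it spans fewer than two, which is why page separation becomes the limiting step for the larger book.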

We showed that reducing the X-ray energy of the scan improved the visibility of the writings but simultaneously increased noise. This noise could be reduced by averaging over multiple projections, with the drawback of an increased radiation dose. To counter this, other reconstruction approaches, such as a total-variation-regularized reconstruction, can be used. We showed in an earlier work³⁴ that, for the aforementioned small book, the number of projections can be reduced to a minimum using a short scan trajectory instead of a full-circle scan, combined with iterative reconstruction techniques.

So far, we have created a database consisting of thirty 3-D X-ray book scans. We used three different scanners, six different books and varying scan parameters. The data was processed by the page extraction algorithm and will provide a database for training-based algorithms to increase the accuracy of the segmentation, e.g. for an automated separation of the cover and pages or for handling even noisier data due to different document dimensions or scan parameters. So far we have not tested any double-sided pages.

We showed that ink has higher attenuations at the script edges due to the chromatographic effect of the ink in handmade paper. In the central areas of the writings, higher grayscale intensities compared to the paper’s were also obtained. The ink visibility highly depends on the ink’s metallic ingredients and the quantity of metal particles. The higher the concentration of the metal and the higher the X-ray attenuation of the element, the better the visibility. The CRs of the combined materials are in general consistent with the measured CRs. We were able to separate the original iron gall inks from the malachite ink when comparing the mean intensity of a certain letter, which could help to distinguish between different inks. For example, the first letter of a new chapter was often ornamented and highlighted with a different color than the rest of the chapter. With the varying intensity in the X-ray volume, we can separate the inks and highlight other inks, too.

Based on our experiments, we recommend using a tube current of ≥3 mA with an exposure time of ≥2 s (6 mAs). Lower mAs levels reduced the CR and thus the visibility of the ink. The higher the concentration of metallic particles in the ink, the higher the X-ray energy that can be selected, which simultaneously suppresses the noise level. An energy range of [20, 40] kV proved to be a good trade-off between signal and noise in the output.

In comparison to a synchrotron setup, there are several disadvantages of standard 3-D X-ray CT systems. Due to brighter and more monochromatic X-rays, the synchrotron setup achieves an improved SNR, sensitivity and spatial resolution. This can be useful for an improved page separation and for small and degraded writings appearing in realistic documents. While standard CT systems often have energy limitations, the X-ray energy of a synchrotron can be adjusted more easily, allowing the optimization of the scan parameters. As the proposed estimation of the contrast is based on ideal conditions (e.g. monochromatic X-ray source, noise-free), the synchrotron setup’s contrast should shift towards the calculated values and hence improve the results. Conversely, conventional X-ray CT systems are more mobile than synchrotron systems, as they can be brought to libraries while delivering sufficient results for a digitization.

We are aware that the book is not a real historical document that suffers from aging. Furthermore, there is a wide range of materials that were used to build books, such as wooden covers or parchment, which might behave differently from our materials. The proposed digitization pipeline is intended to provide a basis for future research in this area. A practical example where this digitization process could be implemented is the Germanisches Nationalmuseum in Nuremberg, Germany. One ongoing case is a book where pages are stuck together at the area of the book fold. In this case, the restorer has to trade off the possible damage to the writings within the manual process against the damage from the CT. As the manuscript is written with iron gall ink, the conservator could consider digitizing the manuscript by applying an initial 3-D X-ray micro-CT scan, allowing any parts of the book’s writings which may be destroyed by the manual conservation process to be restored by using the 3-D X-ray CT volume.

Summary

In this work we presented a non-invasive approach capable of recovering information from manuscripts or books that cannot be opened or page-turned anymore.

Instead of using immobile synchrotron or phase contrast hardware, an X-ray micro-CT system was used to scan the document. We showed that book placement plays an important role for improving the output quality, such as reducing metal or cone-beam artifacts. Furthermore, we make a recommendation for a set of scan parameters, based on our experiments, to enhance the ink visibility. As the 3-D volume with its high resolution and wavy, overlapping pages cannot be investigated precisely with the naked eye, a fully automatic page extraction and 2-D mapping algorithm is presented allowing one to browse through the book’s pages virtually.

To the best of our knowledge, this work is the first to analyze the complete digitization pipeline from scanning to 2-D mapping of the pages. The process involves many variable parameters that require careful consideration. To create a realistic simulation, we employed handmade paper and investigated six commonly used historical inks made of different materials with varying X-ray properties. An EDS measurement revealed that most inks contained the expected metallic elements. The measurements of the inks' visibility were generally consistent with the estimations and could be further improved with a synchrotron setup.

This study opens up many possibilities for research on the digitization of historical documents, allowing permanently closed books to be read and long-forgotten information from past times to be revealed. Further studies must be conducted to investigate more inks that are visible in X-ray scans, to compare this modality with a synchrotron setup, to use the segmented data as a basis for machine-learning approaches to page extraction and 2-D mapping, and to refine the scan parameters for books of greater dimensions. Additionally, our group is working on reducing the applied radiation dose to a minimum while preserving the written information. The data, the source code and detailed calculations are publicly available at https://www5.cs.fau.de/~stromer/.

Acknowledgements

The authors want to thank G. Riedel for performing the EDS measurements of the inks and H. Hoffmann for helpful discussions.

Author information

Authors and Affiliations

Pattern Recognition Lab, Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
Daniel Stromer, Vincent Christlein & Andreas Maier
Machine Learning and Data Analytics Lab, Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
Christine Martindale
Institute of Manufacturing Metrology, Mechanical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
Patrick Zippert, Eric Haltenberger & Tino Hausotte

Contributions

D.S. built the book and developed the page extraction algorithm. P.Z., E.H. and T.H. performed the X-ray scans. D.S. and V.C. wrote the main part of the manuscript. V.C., C.M., T.H. and A.M. provided expertise through intense discussions. All authors reviewed the manuscript.

Corresponding author

Correspondence to Daniel Stromer.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Stromer, D., Christlein, V., Martindale, C. et al. Browsing through sealed historical manuscripts by using 3-D computed tomography with low-brilliance X-ray sources. Sci Rep 8, 15335 (2018). https://doi.org/10.1038/s41598-018-33685-4

Received: 31 May 2018
Accepted: 03 October 2018
Published: 18 October 2018
DOI: https://doi.org/10.1038/s41598-018-33685-4
