Abstract
Recently, historical and conservation studies have attached increasing importance to investigating the materials used in historic documents. In particular, the identification of the animal species from which parchments are made is of high importance and is currently performed by either genetic or proteomic methods. Here, we introduce an innovative, noninvasive optical method for identifying animal species based on light-parchment interaction. The method relies on conservation of light energy through reflection, transmission and absorption from the sample, as well as on statistical processing of the collected optical data. Measurements are performed from ultraviolet (UV) to near-infrared (NIR) spectral ranges by a standard spectrophotometer and data are processed by Principal Component Analysis (PCA). PCA data from modern parchments, made of sheep, calf and goat skins, are used as a database for PCA analysis of historical parchments. Using only the first two principal components (PCs), the method confirmed visual diagnostics about parchment appearance and aging, and was able to recognise the origin species of historical parchments among database clusters. Furthermore, taking into account the whole set of PCs, species identification was achieved, with all results matching perfectly their proteomic counterparts used for method assessment. The validated method compares favourably with genetic and proteomic methods used for the same purpose. In addition to animals’ proteomic and genetic signatures, a unique “optical fingerprint” of the parchments’ origin species is revealed here. This new method is noninvasive, straightforward to implement, potentially cheap and accessible to scholars and conservators, with minimal training. In the context of cultural heritage, the method could help solve questions related to parchment production and, more generally, medieval writing production.
Introduction
The knowledge of the animal origin of parchment folios is a question of great interest in codicology: a discipline that, traditionally, studies the materiality of manuscript books, i.e. codices^{1}. In a broader context, the identification of the origin species of parchments allows historians to confirm theories about medieval manuscript production, as well as parchment production and trade. Accurate species identification can help scholars address longstanding controversies on, for instance, the origin of uterine vellum used in 13th century codices^{2,3,4,5}. In view of cultural heritage preservation, determination of the origin species of parchment, by means other than visual inspection of e.g. faded traces of hair follicle patterns, could also help conservators adapt storage conditions of historical parchments. Parchment is made from the dermis of animal skins, treated in such a way that chemical processes responsible for the decomposition of organic compounds stop^{6}. The most frequently used animals are goat, calf and sheep^{7}. At a macroscopic scale, both sides of the parchment are distinguishable (Fig. 1a), at least in good quality parchments^{8,9}; on the grain side, hair follicles are visible, whereas, on the flesh side, the surface looks more homogeneous. At a microscopic scale, parchment is mainly composed of type I collagen fibres oriented parallel to the surface and organised in a network architecture (Fig. 1a). Each fibre is made of fibrils, themselves composed of several thousands of triple helices of collagen proteins, consisting of species-specific sequences of amino acids^{10,11}. Hair follicle patterns and skin fat content, which influence, respectively, the appearance and texture of the surface, are characteristic of the animal species^{12,13}. However, successive treatments of the animal skin during parchment manufacture and surface degradation over time make determining the species through visual inspection and palpation difficult.
Therefore, developing reliable methods to identify species correctly is very desirable.
The noninvasive “ZooMS” proteomic method^{2} has been shown to be successful in identifying animal species that are the most relevant for parchment studies^{3}. Recently, a nondestructive DNA method was developed based on the same sampling protocol^{2}, providing additional information, such as the animal gender^{4}. Invasive DNA methods have also been used^{14,15}. These techniques are reliable but expensive and time-consuming, due to sample preparation^{16}. On the other hand, optical methods are intrinsically noninvasive and require no preparation. They are already employed to evaluate the stage of degradation of parchments^{17,18} or to distinguish between modern, historical and artificially aged parchments^{19}. The use of optics is possible because the collagen and its structural organisation provide parchment with birefringence, nonlinear optical properties and fluorescence^{17,19,20}. Due to its structure and composition, parchment scatters and absorbs light, so that diffuse reflection and transmission (as well as absorption) can be observed. However, no optical method has been proposed so far to determine the animal origin of parchments. Here, we introduce a novel method for the identification of a parchment’s origin species, which combines optics and statistics. Basically, absorption and scattering properties are measured over a broad spectral range (UV-visible-NIR), using standard photospectrometry equipment (Fig. 1b). The collected optical data (Fig. 1c,d) are treated using principal component analysis (PCA) (Fig. 1e), a well-known statistical method previously used for characterization of parchment degradation^{21,22,23}. The proposed method can lead to accurate species identification, even for closely related species (goat versus sheep), whose collagen proteins differ by only a few amino acids.
It is noteworthy that, recently, a method based on invasive time-of-flight secondary ion mass spectrometry measurements combined with PCA showed promising potential for species recognition in parchments^{24}. By contrast, our noninvasive method can be made portable by using a fibre optic photospectrometer and standard optical accessories. Furthermore, it can be easily used by trained people in libraries and museums. Those advantages are very attractive in cultural heritage science, where fragile and precious manuscripts need to be examined under stringent conditions, in order to preserve them.
Results
In order to show the consistency of the method, we measured the following quantities on a set of modern parchments (21 samples) and historical parchments (20 samples): diffuse reflectance and transmittance (R_{d}, T_{d}), hemispherical (total) reflectance and transmittance (R, T) and absorbance (A) over a broad wavelength (λ) range, from UV (200 nm) to NIR (2350 nm), using a standard spectrophotometer equipped with an integrating sphere (Fig. 1b). It is noteworthy that sample absorption (a) is related to measured absorbance (see details in Methods). As far as light-matter interaction is concerned, total reflectance, total transmittance and absorption are entangled spectral quantities, i.e. they obey the energy conservation law: \(a(\lambda )+R(\lambda )+T(\lambda )=1,\,\forall \lambda \). The method uses PCA in order to eliminate redundancy and noise from measurements^{25}. In order to distinguish measured quantities from their PCA representation, we use, respectively, capital Roman letters (A, T, T_{d}, R, R_{d}) and script letters (\(\pmb{\mathscr{A}}\), \(\pmb{\mathscr{T}}\), \({\pmb{\mathscr{T}}}_{{\boldsymbol{d}}}\), \({\boldsymbol{ {\mathcal R} }}\), \({{\boldsymbol{ {\mathcal R} }}}_{{\boldsymbol{d}}}\)).
Recovery of underlying light-matter interaction
In order to prove that this method is able to recover the underlying physics of light-parchment interaction, spectral data collected from all modern and historical parchments were organized in a data matrix X of size M × N, where M is the number of observations (measurements) and N is the number of variables (wavelengths). It is crucial to note that, in what follows, the number of measurements that is taken into account in X is varied on purpose, according to the goals of our analyses. Specifically, a matrix X is built for every parchment sample under test, which is composed of submatrices X_{α} of size p × N, where α identifies a parchment sample, indexed as α = 1, …, n, with n the total number of samples (see details in Methods). Each submatrix X_{α} contains the five types of spectral measurements (p = 5). For instance, a modern sheep parchment from Cowley, UK (n = 5 samples) is associated with a matrix X of M = n × p = 25 observations (measurements) and N = 2150 variables (wavelengths). PCA representation of the data is shown for a modern parchment and a historical parchment (Fig. 2a,c). Sample data clusters (\(\pmb{\mathscr{A}}\), \(\pmb{\mathscr{T}}\), \({\pmb{\mathscr{T}}}_{{\boldsymbol{d}}}\), \({\boldsymbol{ {\mathcal R} }}\), \({{\boldsymbol{ {\mathcal R} }}}_{{\boldsymbol{d}}}\)) associated with each type of measurement are characterized by a centroid vector, and Pearson’s coefficient (−1 ≤ r ≤ 1) expresses the degree of correlation between those vectors. Alternatively, any angle θ (\({{\boldsymbol{\theta }}}_{\pmb{\mathscr{A}}\pmb{\mathscr{T}}}\), \({{\boldsymbol{\theta }}}_{\pmb{\mathscr{T}}{\boldsymbol{ {\mathcal R} }}}\), \({{\boldsymbol{\theta }}}_{\pmb{\mathscr{A}}{\boldsymbol{ {\mathcal R} }}}\)) between vectors can be used since r = cos θ. Parallel vectors (θ = 0) and antiparallel vectors (θ = π) correspond to positively correlated (r = 1) and negatively correlated (r = −1) quantities, respectively.
Since PC1 and PC2 form an orthonormal basis set^{25}, vectors aligned along PC1 and PC2 axes (θ = π/2) correspond to no correlation.
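The relation r = cos θ between centroid vectors can be illustrated with a minimal sketch. The helper names (`centroid`, `cos_angle`, `angle_deg`) and the toy vectors are hypothetical and only serve the illustration; they are not part of the original analysis code, and the vectors are not measured data.

```python
import math

def centroid(points):
    """Mean (PC1, PC2) position of a cluster of score points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def cos_angle(u, v):
    """Cosine of the angle between two PC1-PC2 vectors (read as r in the text)."""
    dot = u[0] * v[0] + u[1] * v[1]
    return dot / (math.hypot(u[0], u[1]) * math.hypot(v[0], v[1]))

def angle_deg(u, v):
    """Angle theta between two vectors, in degrees."""
    c = max(-1.0, min(1.0, cos_angle(u, v)))  # guard against rounding drift
    return math.degrees(math.acos(c))

# Toy centroid vectors (assumed values, for illustration only)
parallel = cos_angle((1.0, 0.0), (2.0, 0.0))       # theta = 0
antiparallel = cos_angle((1.0, 0.0), (-3.0, 0.0))  # theta = pi
orthogonal = cos_angle((1.0, 0.0), (0.0, 5.0))     # theta = pi/2
```

In this sketch, parallel, antiparallel and orthogonal vectors give cosines of 1, −1 and 0 respectively, matching the three limiting cases described above.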
In both modern and historical sheep parchments, PCA representations of R and R_{d} data, as well as of T and T_{d} data, are positively correlated, i.e. the corresponding centroid vectors are close together (Fig. 2a,c). This is expected from the fact that the parchment surface is essentially diffusive, i.e. it scatters light in all directions, so that the specular contribution to reflection is small, hence R_{d} ≅ R (and therefore \({{\boldsymbol{ {\mathcal R} }}}_{{\boldsymbol{d}}}\) ≅ \({\boldsymbol{ {\mathcal R} }}\)) and T_{d} ≅ T (and therefore \({\pmb{\mathscr{T}}}_{{\boldsymbol{d}}}\) ≅ \(\pmb{\mathscr{T}}\)). Moreover, PCA representations of R, R_{d}, T, T_{d} and A are negatively correlated to each other (Fig. 2a,c). These results, which were also found for modern and historical calf parchments (Supplementary Figures S1–S7), are consistent with energy conservation and with the underlying physics of light-parchment interaction in the presence of scattering and absorption. Indeed, looking at spectral variations of the measured quantities (Fig. 2b,d), it turns out that, for both modern and historical parchments, variations of R and A are of opposite signs over the entire spectrum. The same trend holds between variations of R and T. On the other hand, variations of A and T are of identical signs over the entire spectrum. Those observations are fully consistent with the relative positions of vectors (θ angles) in PCA representation (Fig. 2a,c), which show that \({\boldsymbol{ {\mathcal R} }}\) and \(\pmb{\mathscr{A}}\) as well as \({\boldsymbol{ {\mathcal R} }}\) and \(\pmb{\mathscr{T}}\) tend to be negatively correlated (Pearson coefficients for ms1482: r = −0.9999 and r = −1.0000, respectively; see values for other samples in Table 1), whereas \(\pmb{\mathscr{A}}\) and \(\pmb{\mathscr{T}}\) tend to be positively correlated (r = 0.9998). Moreover, energy conservation imposes a constraint on the problem: variations in A, R and T must balance each other.
The quantity, \(\frac{1}{3}[A(\lambda )+R(\lambda )+T(\lambda )]\), however, is not exactly the expected one (1/3), simply because absorbance is measured, rather than absorption. If A(λ) values are converted into a(λ) values, an almost perfect balance is achieved over the whole spectrum, i.e. \(\frac{1}{3}[a(\lambda )+R(\lambda )+T(\lambda )]\cong \frac{1}{3},\,\forall \lambda \) (Fig. 2b,d). It is noteworthy that this balance holds for any number of arbitrary samples included in the X matrix. Imperfect balance tends to exist in spectral regions where parchment absorption is strong (below 500 nm and above 1300 nm), whereas both absorption and absorbance are almost the same in the 500–1300 nm range where parchment is more transparent (small absorption).
Confirmation of visual diagnostics on parchments
Scholars and conservators have long developed ways to characterize parchment material status from visual inspection. We checked our method is consistent with their diagnostics as far as parchment appearance and aging are concerned.
Parchment appearance may vary from dark to light according to the animal species, the manufacture process and natural aging. For instance, modern calf parchment from Cowley, UK (Fig. 3a) appears dark, whereas modern calf parchment from Schmedt, DE appears light (Fig. 3b). PCA representations of R and T (Fig. 3a,b) show a higher Pearson coefficient for dark parchment (r = 0.8095) than for light parchment (r = −0.8656), i.e. the angle between the R and T vectors is smaller for the former than for the latter (\({{\boldsymbol{\theta }}}_{\pmb{\mathscr{T}}{\boldsymbol{ {\mathcal R} }}}\) = 35.96° versus \({{\boldsymbol{\theta }}}_{\pmb{\mathscr{T}}{\boldsymbol{ {\mathcal R} }}}\) = 149.95°). In terms of light scattering, in the dark parchment, strong absorption of light imposes both low reflectance and transmittance (Fig. 3c), as required by energy conservation, which produces an \(\pmb{\mathscr{A}}\) vector opposed to both \({\boldsymbol{ {\mathcal R} }}\) and \(\pmb{\mathscr{T}}\) vectors in PCA representation (Fig. 3a). In the light parchment, high reflectance and low absorption are responsible for low transmittance (Fig. 3c), thus explaining the relative positions of the corresponding vectors, i.e. the \({\boldsymbol{ {\mathcal R} }}\) vector opposed to both \(\pmb{\mathscr{A}}\) and \(\pmb{\mathscr{T}}\) vectors (Fig. 3b). Those results are consistent with visual diagnostics about parchment appearance. Noticeably, the method allows us to distinguish light and dark parchments, confirming the basic visual inspection.
All historical parchments under test were found to exhibit a strong positive correlation between PCA representation of A and T, whereas, in all modern parchments, a significantly lesser degree of correlation (Pearson’s coefficient) was found (Table 1). The origin of strong positive correlation in historical parchments can be traced back to spectral variations of A and T, which closely follow each other over the whole spectrum (Fig. 2d). This difference in behaviour between modern and historical parchments does not depend on species. Rather, it is likely to be related to parchment’s natural aging. Aging is known to cause parchment gelatinization^{20,26,27}, which leads to spreading of the UV absorption band into the visible range (Fig. 1c,d), hence changing a(λ), R(λ) and T(λ) spectra. Therefore, differences appear between optical spectra of modern and historical parchments: a(λ) increases in historical samples compared with modern samples; by contrast, R(λ) is roughly the same over the whole spectrum (Fig. 1c,d). Thus, by virtue of light energy conservation, transmittance decreases compared to pristine (modern) sample. Interestingly, the distinction between modern and historical parchments, which may be inferred from visual inspection by a trained person, is also possible thanks to our method.
Recognition and identification of parchment origin species
Using a two-dimensional PCA representation, we first discovered that recognition of parchment’s origin species could be achieved using solely PC1 and PC2, thus unveiling the dynamics hidden in entangled optical measurements.
For each historical parchment, an X matrix was built. The first rows contained A, T, T_{d}, R, R_{d} data of all modern parchments, grain and flesh sides included, and the last rows contained data from the historical parchment under test. In practice, the X matrix consisted of N = 2150 variables and M = 5 × 46 = 230 observations, i.e. p = 5 spectral measurements and n = 46 samples (2 × 21 modern flesh side and grain side samples, 2 historical samples taken from the same folio on both sides).
In the PC1-versus-PC2 two-dimensional representation, clusters are observed in modern parchment data for different species and skin sides (Fig. 4a). Clusters for flesh and grain sides are well separated for the most representative PC1 values, i.e. those associated with R and R_{d}. On the other hand, data close to the centre of the plot are weakly representative for the first PC components. Therefore, T and T_{d} measurements do not bring significant contribution to PCA and can be discarded from matrix entries without affecting the recognition process. To confirm this, a new data matrix was built from 3 spectral measurements (A, R, R_{d}) for the same samples (M = 184 and N = 2150) and PCA was recalculated (Fig. 4b). Data of the historical parchment under test are located around clusters of different species, according to the side of the sample (green circle in Fig. 4a), which hampers possible recognition of the species. In order to circumvent this problem, recognition is performed from the most representative data in the PC1-PC2 representation, i.e. those which have the highest scores (see details in Methods). In order to quantify confidence in the species recognition performed, the norm d of the difference between the historical sample vector and the modern sample cluster centroid vector (Fig. 4b) is calculated. A proximity percentage is defined as: \(100\, \% \times (1-d)\). For instance, the species of origin of historical (ms1482) parchment was found to be close to Cowley (UK) sheep parchment in our species database (Table 2, 3^{rd} column).
In order to assess this new recognition method, proteomics tests were performed on all historical samples (Table 2, 2^{nd} column). The method was found to match up to 50% of the species. The reason for discrepancies is the fact that only PC1 and PC2 scores were used for recognition: a choice that turned out to be too restrictive to catch the full information contained in optical measurements. The recognition method failed in those cases where the PC1 percentage variability explained (PVE) was lower than 96.7%, as determined by examination of all our results.
Trying to solve the unmatched cases, we discovered that perfect matching, i.e. species identification and not simply species recognition, could be achieved using full PC scores (matrix size: (M − 1), see details in Methods), since they only fully account for the complex interaction of light with parchment when considered together. Contrary to the previous recognition approach, separate X matrices were built for the historical parchment under test and all modern parchments, considering grain side and flesh side separately. The historical sample matrix consisted of N = 2150 variables and M = 5 × 2 = 10 observations (p = 5 spectral measurements and n = 2 samples measured on only one side). Each modern parchment matrix (for either grain or flesh side) had a size that varied according to the number of samples available for each species. Identification works as follows. Given \({{\rm{Y}}}_{known}^{{\rm{S}}}\) and Y_{unknown}, the full PC scores of modern and historical observations respectively, the Euclidean distance between the historical parchment under test and the four available modern parchment sources (either flesh side or grain side) is calculated, i.e. \({d}_{s}=\Vert {{\rm{Y}}}_{known}^{{\rm{S}}}-{{\rm{Y}}}_{unknown}\Vert \). The smallest distance is regarded as the criterion for species identification. Moreover, a proximity percentage is defined for each calculated distance, \( \% {\rm{P}}=100-({d}_{s}\times 100/{\sum }_{m=1}^{4}{d}_{m})\), where \({\sum }_{m=1}^{4}{d}_{m}\) is the sum of historical parchment distances to the four modern parchment sources (S = {sheep UK, calf UK, goat UK, calf DE}). The proximity percentage ranges from 100% to 60.7%. The 100% value is the trivial case of the self-distance of a parchment and the 60.7% threshold value is the smallest value among distances between flesh side and grain side of modern parchments. Animal species identification is considered as successful only if the calculated proximity value is above that threshold.
The number of samples for each modern parchment was greater than the number of samples available for each historical parchment. However, the former had to be the same as the latter in order to calculate the norm. We therefore selected randomly sets of \({[A,T,{T}_{d},R,{R}_{d}]}_{\alpha }\) within \({{\rm{Y}}}_{known}^{{\rm{S}}}\), in order to match the size of Y_{unknown}, so that the same number of PCs was obtained in both cases. Remarkably, the method provided 100% of correspondence with proteomics results (Table 2, 4^{th} column).
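The identification rule described above (smallest distance, then a proximity percentage checked against the 60.7% threshold) can be sketched as follows. This is a minimal illustration, assuming the proximity percentage reads %P = 100 − 100·d_s/Σd_m and that the norm is the Euclidean (Frobenius) norm of the score difference; the function names and the distance values are hypothetical, not results from the study.

```python
import math

def score_distance(Y_known, Y_unknown):
    """Euclidean (Frobenius) norm of the difference of two equal-size score matrices."""
    return math.sqrt(sum((a - b) ** 2
                         for row_k, row_u in zip(Y_known, Y_unknown)
                         for a, b in zip(row_k, row_u)))

def proximity_percentages(distances):
    """Map each source to 100 - 100 * d_s / sum(d_m) (assumed form of %P)."""
    total = sum(distances.values())
    return {src: 100.0 - 100.0 * d / total for src, d in distances.items()}

def identify(distances, threshold=60.7):
    """Return (closest source, its proximity %, whether it exceeds the threshold)."""
    best = min(distances, key=distances.get)
    pct = proximity_percentages(distances)[best]
    return best, pct, pct > threshold

# Hypothetical distances from one historical sample to the four modern sources
demo_distances = {"sheep UK": 1.0, "calf UK": 3.0, "goat UK": 2.0, "calf DE": 4.0}
best_source, best_pct, accepted = identify(demo_distances)
```

With these made-up distances the closest source is "sheep UK", and its proximity value lies above the threshold, so the identification would be accepted under the stated criterion.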
Furthermore, in order to test the robustness of the proposed method, instead of doing the PCA using all samples for each modern parchment source, as above, we reduced the number of modern samples (random choice of 2 among several ones) to match that of the historical parchment. Thereafter, we recalculated \({{\rm{Y}}}_{known}^{{\rm{S}}}\) (M = 10 and N = 2150). The results were found to be almost identical to those obtained with the previous procedure (Table 2, 5^{th} column): an outstanding outcome if we consider that random selection affects PC scores differently in both procedures. This supports the robustness of the proposed method. As a matter of fact, using all PCs instead of only the first two ensures that all principal characteristics of the interaction of light with parchment were taken into account, allowing accurate species identification in 20 historical parchment samples, dating from the 12^{th} century to the 16^{th} century.
Discussion
Herein, we present, for the first time, a dimensionality reduction method applied to a set of observations made on parchments, involving spectral measurements of optical properties such as transmittance, reflectance and absorbance. Those measurements are noninvasive and provide relevant information in the frequency (wavelength) domain. Spectral information is organized and processed by principal component analysis, leading to an innovative method for animal origin species identification in parchments. Optical radiation scattered by the sample is captured by an integrating sphere. The transmittance, reflectance and absorbance spectra recorded from a sample are a fingerprint of both the animal skin and the parchment manufacture. However, direct comparison of spectra from known parchment sources with those of historical parchments is a difficult task, since aging modifies material properties according to the parchments’ history, leading to spectral modifications. Thus, retrieving the principal characteristics of parchments from their optical properties is a fundamental step towards achieving species recognition and identification. Data processing is primarily based on the analysis of spectral variations. For this purpose, spectra are organised in a data matrix X and the corresponding covariance matrix S_{X} is used to transform the data into an uncorrelated data set via the principal components matrix P, yielding a score matrix Y = PX.
Overall, the results depend on the observations inserted into the X matrix. For instance, species recognition is achieved by calculating the norm between centroid vectors of clusters, the components of which are the first two PCs of the X matrix built from absorbance, total reflectance and diffuse reflectance measurements of all modern parchments and of the historical parchment under test. This procedure does not produce full success in species recognition since it involves only the variances of the most relevant principal characteristics of the interaction of light with parchment. Species identification, on the other hand, is performed based on the shortest Euclidean distance between all PC scores from a specific element of the species database and those of the historical parchment under test. The specific interaction of light with parchment enables species identification through the proposed method. Animal species does not solely determine the optical properties of a parchment: manufacture also matters greatly. Indeed, calf parchment supplied by Cowley (UK) is distinguishable from calf parchment supplied by Schmedt (DE), either by visual inspection and optical measurements or through PCA representation (Fig. 3). Actually, our method enables species identification with additional discrimination based on parchment manufacture method. By contrast, proteomics and genetics work at the molecular level, hence are unable to distinguish between two manufacturing processes applied to the same animal skin.
In conclusion, the results obtained by our method are reliable. That reliability comes from energy conservation in light scattering phenomena and, in particular, from distinctive features brought by animal origin species in those phenomena. The method described above compares favourably with genetics and proteomics, providing a unique optical fingerprint of the parchment’s species of origin. It is noninvasive, straightforward to implement, potentially cheap and accessible to scholars and conservators, with minimal training. Finally, it could help solve questions related to parchment production and, more generally, medieval writing production.
Methods
Parchment samples
Both modern and historical parchments were provided by the Moretus Plantin University Library (BUMP) of the University of Namur, Belgium. Modern parchments came from two suppliers: Cowley Co., United Kingdom (hereafter named Cowley, UK) and Schmedt GmbH & Co., Germany (hereafter named Schmedt, DE). They were made from three animal species: Capra hircus (goat), Ovis aries (sheep) and Bos taurus (calf). There were 21 samples in total: 6 goat parchment samples, 6 calf parchment samples and 5 sheep parchment samples from Cowley and 4 calf parchment samples from Schmedt. Ten historical parchments were also provided by BUMP for the present study. We were allowed to take 2 samples from each parchment (20 samples in total). The oldest parchment dated from twelfth century (ms499) and the most recent one from sixteenth century (ms1593).
Optical measurements
A commercial double-beam spectrophotometer (Perkin Elmer, LAMBDA 750) equipped with an integrating sphere was used for optical measurements. The role of the sphere was to spatially integrate radiant flux coming from the parchment sample. The largest available sphere diameter (150 mm) was used because it provided better light integration and was less affected by measurement errors due to radiant flux losses. Diffuse standard coating covered the inner wall of the sphere as well as the elements used to block the ports. Measurements were performed across a wide spectral range (200 nm to 2350 nm) covering UV, visible and near-infrared regions, at a scan speed of 283.50 nm/min. Parchment samples were measured at a single position on both sides, indistinguishable for historical parchments but identifiable as flesh and grain sides for modern parchments. Total transmittance or reflectance measurements included both specular and diffuse components of the radiation scattered by the sample, whereas diffuse transmittance or reflectance measurements excluded the specular component. The sphere had 3 ports: entrance port (P1), exit port (P2) and off-axis port (P3). For measuring total transmittance (T), the sample was placed in front of P1, and P2 and P3 were closed (Fig. 1b). For measuring diffuse transmittance (T_{d}), P2 was open so that the specular component was trapped on exiting the sphere. For measuring total reflectance (R), the sample was placed at the rear of the sphere (P2) and P3 was closed (Fig. 1b). For measuring diffuse reflectance (R_{d}), P3 was open so that the specular component exited the sphere at a tilt angle of 8 degrees. All these measurements were noninvasive as they only required placing the sample in contact with either P1 or P2. For measuring absorbance (A), the sample was placed on a holder inside the sphere, with P2 and P3 closed. Absorbance is related to sample absorption (a) through: \(A=-\,lo{g}_{10}(1-a)\).
For that measurement, small pieces (size: 7.5 cm × 5.0 cm for modern parchments and 3 cm × 2 cm for historical parchments) had to be cut from parchments in order to fit within the sphere. However, absorbance measurements were not strictly necessary since absorption could always be deduced from T and R measurements, by virtue of the energy conservation law: a = 1 − (T + R). In the present study, they were performed for comparison purposes only. In total, 210 (200) measurements were done for modern (historical) parchments, respectively.
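The two routes to the sample absorption can be sketched in a few lines. This is an illustrative sketch only, assuming the absorbance relation reads A = −log10(1 − a) and the energy balance reads a = 1 − (T + R); the function names and numerical values are hypothetical and do not come from the measured spectra.

```python
import math

def absorption_from_absorbance(A):
    """Invert A = -log10(1 - a) to recover the absorption a (assumed relation)."""
    return 1.0 - 10.0 ** (-A)

def absorbance_from_absorption(a):
    """Absorbance corresponding to an absorption a, with 0 <= a < 1."""
    return -math.log10(1.0 - a)

def absorption_from_balance(R, T):
    """Absorption deduced from energy conservation, a = 1 - (T + R)."""
    return 1.0 - (T + R)

# Toy values at a single wavelength (not measured data)
a_from_A = absorption_from_absorbance(1.0)      # absorbance of 1
a_from_RT = absorption_from_balance(0.6, 0.3)   # R = 0.6, T = 0.3
```

Applied wavelength by wavelength, the first function converts a measured absorbance spectrum into an absorption spectrum that can then be checked against the balance a + R + T = 1.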
Principal Component Analysis (PCA)
Spectral measurements recorded from parchments are organized in a data matrix X of size M × N (M: number of measurements, N: number of wavelengths). The number of measurements has to be defined by the user, according to the targeted analyses. X matrix elements are organized by groups using submatrices noted X_{α} (size p × N) where α identifies any parchment sample, indexed as α = 1, …, n (n: total number of samples).
Here, each row \({{\rm{q}}}_{{\rm{p}}}^{{\rm{\alpha }}}\) of X_{α} represents any of the five available types of measurements (1 ≤ p ≤ 5), namely A, T, T_{d}, R, R_{d}. The number of rows of X is therefore given by n × p = M. PCA works on the covariance matrix of the correlated variables of those observations, \({S}_{X}=\frac{1}{N-1}X{X}^{t}\) (t denotes matrix transposition), in order to produce uncorrelated variables, i.e. to eliminate data redundancy. For that purpose, a linear transformation P is applied on the original data matrix, i.e. PX = Y. This leads to a score matrix Y, which is a new representation of the original data using P as an orthonormal basis set formed by the principal components. Each principal component (PC) is associated with a variance σ^{2} which is the corresponding diagonal element of the covariance matrix \({S}_{Y}=\frac{1}{N-1}Y{Y}^{t}\).
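As a short worked step, the link between S_{X} and S_{Y} follows directly from Y = PX. The final equality assumes, as is standard in PCA, that the rows of P are orthonormal eigenvectors of S_{X}, with the variances σ_{i}^{2} as the corresponding eigenvalues; the indexing i = 1, 2, … is introduced here only for illustration.

\[
{S}_{Y}=\frac{1}{N-1}Y{Y}^{t}=\frac{1}{N-1}(PX){(PX)}^{t}=P\left(\frac{1}{N-1}X{X}^{t}\right){P}^{t}=P{S}_{X}{P}^{t}={\rm{diag}}({\sigma }_{1}^{2},{\sigma }_{2}^{2},\ldots )
\]

The off-diagonal elements of S_{Y} vanish, which is what is meant by the scores being uncorrelated.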
Software used for PCA
The program used for computing PCA is the pca function (pca.m) provided with MATLAB®. By default, the PCA code centers raw data of the input matrix X and computes the principal component coefficients matrix P (“coeff”), known as loadings, using singular value decomposition (SVD). The rows of X are the observations (here optical measurements) and the columns correspond to the variables (here wavelengths). The PCA code also returns a new representation of the input matrix in the principal components space, i.e. the principal component scores matrix Y (“score”) as well as the principal component variances (“latent”). The latter are the eigenvalues of the covariance matrix S_{X}, each of which is associated with a percentage of the total variance (“explained”). Principal components are ordered by decreasing variances. Each row of the score matrix is plotted in the two-dimensional PC1-PC2 space as a point using the bicaplot.m code, a modified version of the original biplot.m code by MATLAB®. The “bicaplot” code was developed for a better visualization of the data.
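For readers without MATLAB, the steps described above (centering, SVD, loadings, scores, variances and explained percentages) can be approximated with NumPy. This is a sketch under stated assumptions, not the code used in the study: the function name `pca_svd` and the random demonstration matrix are hypothetical, and keeping at most M − 1 components is an assumption made here because centering M observations removes one degree of freedom.

```python
import numpy as np

def pca_svd(X):
    """PCA by SVD; rows of X are observations, columns are variables.

    Returns (coeff, score, latent, explained), named after the quantities
    described in the text.
    """
    X = np.asarray(X, dtype=float)
    m, n = X.shape
    Xc = X - X.mean(axis=0)                 # center each variable (wavelength)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = min(m - 1, n)                       # assumed number of retained PCs
    coeff = Vt[:k].T                        # loadings, one column per PC
    score = U[:, :k] * s[:k]                # observations in the PC basis
    latent = s[:k] ** 2 / (m - 1)           # variance carried by each PC
    explained = 100.0 * latent / latent.sum()
    return coeff, score, latent, explained

# Synthetic stand-in for a 10 x N spectral matrix (not measured spectra)
rng = np.random.default_rng(0)
X_demo = rng.random((10, 50))
coeff, score, latent, explained = pca_svd(X_demo)
```

NumPy's SVD returns singular values in decreasing order, so the components come out ordered by decreasing variance, and the first two columns of `score` are the coordinates that would be plotted in a PC1-PC2 diagram.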
Proteomic analyses
Samples were prepared following the ZooMS method. The parchment surface was gently rubbed with an eraser, and eraser crumbs containing collagen molecules detached from the surface were placed into tubes. Collagen was then extracted in solution and digested with trypsin. After those preparation steps, samples were analysed using liquid chromatography (UltiMate 3000, ThermoFisher) coupled to electrospray tandem mass spectrometry (maXis Impact UHR-TOF, Bruker) (LC-MS/MS). Peptides from protein digestion were separated by reverse-phase liquid chromatography using a 75 µm × 150 mm column (Acclaim PepMap 100 C18). Mobile phase A was composed of 95% H_{2}O, 5% ACN, 0.1% formic acid. Mobile phase B was composed of 80% ACN, 20% H_{2}O, 0.1% formic acid. After injection of the peptide digest, the gradient started linearly from 5% B to 40% B in 15 min and from 40% B to 100% B in 5 min. The column was directly connected to a Captive Spray source (Bruker). In survey scans, mass spectrometry (MS) spectra were acquired for 0.5 s in the m/z range between 50 and 2200. The most intense peptides (2^{+} or 3^{+} ions) were sequenced during a cycle time of 3 seconds. The collision-induced dissociation (CID) energy was automatically set according to the mass-to-charge (m/z) ratio and charge state of the precursor ion. MaXis and Ultimate systems were piloted by Compass HyStar 3.2 (Bruker). Peak lists for all samples were created using DataAnalysis 4.0 (Bruker) and saved as mgf files for use with ProteinScape 3.1 (Bruker). Mascot 2.4 (Matrix Science) was used as the search engine for the protein identification. Enzyme specificity was set to trypsin, and the maximum number of missed cleavages per peptide was set to one. Hydroxylation (KP) and oxidation (M) were allowed as variable modifications. Mass tolerance for the monoisotopic peptide window was 7 ppm and the MS/MS tolerance window was set to 0.05 Da.
For protein identification, Mascot searched an in-house collagen protein database and a contaminant protein database. Scaffold software (Proteome Software) was used to validate protein and peptide identifications and to search for species marker peptides. The in-house species marker database contained specific peptides that allowed us to differentiate between Capra hircus (goat), Ovis aries (sheep) and Bos taurus (calf).
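Two of the numerical settings described above, the two-step linear gradient and the 7 ppm peptide mass tolerance, can be made concrete with a short Python sketch. The function names are hypothetical, the masses in the example are arbitrary illustration values rather than measured data, and holding the composition constant before injection and after 20 min is an assumption, since the text does not describe those periods.

```python
def percent_b(t_min):
    """Mobile phase B (%) at t_min minutes after injection."""
    if t_min <= 0:
        return 5.0                      # assumed: held at the starting value
    if t_min <= 15:
        return 5.0 + (40.0 - 5.0) * t_min / 15.0            # 5% -> 40% B in 15 min
    if t_min <= 20:
        return 40.0 + (100.0 - 40.0) * (t_min - 15.0) / 5.0  # 40% -> 100% B in 5 min
    return 100.0                        # assumed: held at the final value

def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

def within_tolerance(observed, theoretical, tol_ppm=7.0):
    """True if the observed mass lies within tol_ppm of the theoretical mass."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm

# Arbitrary illustration values (not measured data)
mid_gradient = percent_b(7.5)                  # halfway through the first ramp
accepted = within_tolerance(1000.005, 1000.0)  # about 5 ppm off
rejected = within_tolerance(1000.010, 1000.0)  # about 10 ppm off
```

Expressing the tolerance in ppm makes the accepted mass window scale with the peptide mass, whereas the 0.05 Da MS/MS tolerance is an absolute window.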
Acknowledgements
We thank Dr. Victoria L. Welch for language revision. We thank Prof. Étienne Renard and Dr. Emilie Mineo for helpful discussions on historical aspects. This work was supported by the Pergamenum21 project of the Namur Transdisciplinary Impulse Programme.
Author information
Contributions
O.D. conceived the study. A.M.F.A. performed optical experiments and data analysis. M.D. and J.B. performed proteomics analyses. C.C. selected historical parchments and performed parchment sampling. J.B. took microscope images of parchments. A.M.F.A. and O.D. wrote the manuscript. All authors discussed the data and agreed on the final manuscript.
Ethics declarations
Competing Interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Alvarez, A.M.F., Bouhy, J., Dieu, M. et al. Animal species identification in parchments by light. Sci Rep 9, 1825 (2019). https://doi.org/10.1038/s41598-019-38492-z