Scientists have long sought to understand the beauty of paintings in their own languages. As the digital acquisition of paintings has progressed rapidly, it has become possible to perform statistical analysis on a large-scale database of paintings and thereby build a bridge between art and science. Using digital image processing techniques, we investigate three quantitative measures of images: the usage of individual colors, the variety of colors, and the roughness of the brightness. We find a difference in color usage between classical paintings and photographs, and a significantly low color variety in the medieval period. Moreover, the roughness exponent increases as painting techniques such as chiaroscuro and sfumato advance, which is consistent with historical circumstances.
Humans have expressed physical experiences and abstract ideas in artistic paintings such as cave paintings, frescos in cathedrals, and even graffiti on city walls. Such paintings, to convey intended messages, consist of three fundamental building blocks: points, lines, and planes. Recent studies have shed light on interesting mathematical patterns between these building blocks in paintings.
Artistic styles have been analyzed with various statistical techniques, such as fractal analysis1, a wavelet-based technique2, the multi-resolution hidden Markov method3, a Fisher-kernel-based approach4, and the sparse coding model5,6. Recently, these methods have also been applied to other forms of cultural heritage, such as literature7,8,9,10 and music11,12,13,14. Such quantitative analysis is called “stylometry,” a term that originates from the analysis of literature to identify characteristic literary styles9.
In this study, we add a new dimension to the body of stylometry studies by analyzing a large-scale database of artistic paintings. With digital image processing techniques, we quantify the change in the variety of painted colors and their spatial structures over ten historical periods of Western painting (medieval, early renaissance, northern renaissance, high renaissance, mannerism, baroque, rococo, neoclassicism, romanticism, and realism), from the 11th century to the mid-19th century. Digital images of the paintings were obtained from the Web Gallery of Art15, a searchable database of European paintings and sculptures that consists of over 29,000 pieces dating from 1000 to 1850. Most of the identifiable images are annotated with school, period, and artist, and their resolution is high enough for statistical analysis.
Here we focus on three quantities: the usage of each color, the variety of painted colors, and the roughness of the brightness of images. First, we count how often a certain color appears in a painting for each period. From the frequency histogram, we find a clear difference between classical paintings and photographs. Next, we measure the fractal dimension of the painted colors of each period in a color space, which can be regarded, by analogy, as reflecting the color ‘palette’ of that period. Interestingly, the fractal dimension of the medieval period is lower than that of the other periods. The detailed results and our inferences are discussed below. Last, we consider how rough or smooth an image is in terms of its brightness. To quantify the roughness of brightness, we apply the roughness exponent, a well-known measure in statistical physics. We find that the roughness exponent increases gradually over the 10 periods, which is consistent with historical circumstances such as the birth of new painting techniques like chiaroscuro and sfumato16,17. (Chiaroscuro and sfumato are major painting techniques developed and widely used during the Renaissance period. The compound word chiaroscuro is formed from the Italian words chiaro (light) and oscuro (dark), and refers to an artistic technique that delineates tonal contrasts and voluminous objects with a dramatic use of light. Precursors of chiaroscuro are Leonardo da Vinci (1452–1519) and Michelangelo Merisi da Caravaggio (1571–1610), and Rembrandt van Rijn (1606–1669) is a representative artist well known for his use of chiaroscuro. The Italian word sfumato is derived from the Italian term fumo, which literally means “smoke”. Leonardo da Vinci described sfumato as a blending of colors without lines or borders, in the manner of smoke or beyond the focus plane. In other words, sfumato is a painting technique that expresses a gradual fade-out between object and background while avoiding harsh outlines.)
By analyzing these three properties, we propose new approaches to the quantitative analysis of a large-scale database of paintings. Applying our method to Jackson Pollock's controversial drip paintings, we infer that they are quite different from the works of other painters.
First, we investigate how many different colors appear in a painting and how often each color is painted, in analogy with Zipf's plot for word frequencies in literature18. We call this analysis “chromo-spectroscopy”; a color is to a painter what a word is to a writer. As an example of chromo-spectroscopy, Fig. 1a displays the fraction of each color used in a painting in descending rank order. If each color were chosen from a palette uniformly at random, the frequency of each color would follow a binomial distribution (see the supplement for details). Because a rank plot is the inverse of the cumulative distribution function, the rank plot of such a random process would follow the inverse of the regularized incomplete beta function19 (see black dots in Fig. 1a). Interestingly, however, the rank-ordered color-usage distribution (RCD) shows a long tail, which differs from the inverse of the regularized incomplete beta function (see Fig. 1a).
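The counting step behind chromo-spectroscopy can be sketched in a few lines. This is a minimal illustration rather than the authors' code: it assumes that a painting is already loaded as an (H, W, 3) array of 8-bit RGB values, and the function name `rank_ordered_color_distribution` is hypothetical.

```python
import numpy as np

def rank_ordered_color_distribution(image):
    """Return the fraction of pixels per distinct color, in descending order.

    `image` is assumed to be an (H, W, 3) array of 8-bit RGB values.
    """
    pixels = np.asarray(image, dtype=np.uint8).reshape(-1, 3).astype(np.uint32)
    # Pack each RGB triple into one integer so that np.unique can count colors.
    codes = (pixels[:, 0] << 16) | (pixels[:, 1] << 8) | pixels[:, 2]
    _, counts = np.unique(codes, return_counts=True)
    # Rank 1 is the most frequently used color.
    return np.sort(counts)[::-1] / codes.size

# Toy 2 x 3 "painting": three red pixels, two blue pixels, one white pixel.
toy = np.array([[[255, 0, 0], [255, 0, 0], [0, 0, 255]],
                [[255, 0, 0], [0, 0, 255], [255, 255, 255]]], dtype=np.uint8)
rcd = rank_ordered_color_distribution(toy)
print(rcd)
```

Plotting the returned fractions against their rank on double logarithmic axes gives a rank plot of the kind described for Fig. 1a.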
Figure 1b shows RCDs for the 10 periods of European art history and for photographs. The RCD of a period represents how many colors are used and how often a specific color appears during that period. All periods of painting show a universal distribution curve, although the rank of each color differs from period to period. The RCD of photographs is similar to that of paintings at the beginning of the power-law part, but its exponential tail deviates significantly from that of paintings, as shown in Fig. 1b. To clarify the difference in the tail section of RCDs between paintings and photographs, we analyze RCDs of photographs after applying several painting filters from popular software. The tail of the distribution changes clearly only when the oil painting filter is applied. An oil painting filter usually has two parameters, range and level, which are related to the size of the paint brush and the smearing intensity. These two parameters appear to influence the shape of the exponential tail of the RCD. Another interesting fact is that there is no clear difference between the RCDs of photographs and those of hyper-realism paintings, which are drawn in extremely fine detail with a microscope and are hard to distinguish from photographs with the unaided eye (see Figure S4b in the supplement). This suggests that paintings are quantitatively distinguished from photographs only by the tail section of the RCD, which represents the frequency of noisy colors or the level of detail in the image.
Fractal pattern and color palette
RCDs for all periods of paintings show quite universal distribution curves. However, the most commonly painted color is different for each period. To characterize the variety of colors more quantitatively, while ignoring their individual frequencies, we investigate the fractal pattern of the painted colors in the RGB color space for each period.
To examine the fractal characteristics of the painted colors of each period, we measure the box-counting dimension20 of the paintings in the RGB color space and compare the results with those of two iconoclastic artists: Pieter Bruegel the Elder and Jackson Pollock. Each color used in a painting is plotted as a point in the RGB color space. Following the definition of the box-counting dimension, we vary the box side length ε from ε = 1 to ε = 32 and count the number of non-empty boxes. A non-empty box indicates that at least one color within the box is used in the painting. If the colors are distributed homogeneously in the color space, the box-counting dimension is 3. If the box-counting dimension is less than 3, the distribution in the color space is heterogeneous and fractal, which means that some axes are preferred or that the distribution is composed of a preferred color scheme. In this sense, the box-counting dimension quantifies the spatial uniformity or fractality of the painted colors of each artistic period.
Figure 2a shows that the box-counting dimensions of paintings from the 10 historical periods lie between 2.6 and 2.8, except for the medieval period. As Fig. 2b shows, only the box-counting dimension of the medieval period is close to that of Jackson Pollock's drip paintings (below 2.4), in which he intentionally used a limited set of colors. In addition, the box-counting dimension of the paintings of Pieter Bruegel the Elder is approximately 2.55. The low box-counting dimension indicates a strong preference for a small number of selected colors in the medieval period. That is, the color palette of the medieval period is significantly different from those of the other periods.
The reason why the box-counting dimensions of the medieval period and Jackson Pollock differ from the others can be found in historical facts. First, in the medieval period, specific rare pigments were preferred for political and religious reasons despite their high cost. Second, physical mixing of different pure colors was not practiced in that period because of a tendency to emphasize the purity of colors and materials themselves. Instead, artists in the Middle Ages recoated a colored canvas to represent various colors. The drip paintings of Jackson Pollock are likewise formed by layering single-color dripping patterns on top of one another, and the number of colors used is smaller than in other Western paintings before the 20th century. Furthermore, oil colors and color mixing techniques were not fully developed until the Renaissance. The introduction of new expressive tools, such as pastels and fingers, and of painting techniques, such as chiaroscuro and sfumato, made much more colorful and natural expression possible after the Renaissance period21. The difference in fractal dimension between the medieval and other periods may therefore quantitatively reflect these historical facts and the differences in painting technique throughout art history.
Spatial renormalization and fixed point analysis
In the RGB color space, each painting has its own set of scattered color points. To analyze the characteristics of color usage while taking the variety of colors in a painting into account, we define three representative points in the RGB color space. The first is the center of usage frequency (CM), which is analogous to the center of mass in physics: it is calculated from the usage frequency and position of each color in the color space, just as the center of mass of a physical object is calculated from masses and positions. The second is the fixed point of the painting (FP), a concept borrowed from real-space renormalization in physics: when a painting is resized repeatedly, it eventually becomes a single pixel, and that pixel is the FP. The third is the fixed point of the randomized painting (SFP), which is obtained in the same way as the FP except that the pixels of the painting are shuffled first. If the spatial arrangement of the colors is irrelevant, FP and SFP should not differ significantly. Note that the center of usage frequency of a shuffled image (SCM) is the same as the original CM. The two vectors d1 and d2, which point from CM to FP and from CM to SFP, respectively, can then be compared to quantify the randomness of the spatial arrangement of the colors in a painting. If d1 and d2 are similar, either the colors used in the painting are not diverse or their spatial arrangement is close to random. Figure 3c suggests that the color arrangement of Jackson Pollock's drip paintings is quite different from that of other paintings: Pollock's work is quite random, especially in the spatial arrangement of colors. In contrast, the two fixed points of Pieter Bruegel the Elder's paintings are far from each other.
Surface roughness and brightness contrast
Although the first two subsections focus mainly on the usage of colors and ignore their spatial arrangement, the spatial correlation of colors is also important for understanding the artistic style of a painting, as the preceding renormalization analysis shows, because a painting is a composition of colors in their proper places. The spatial arrangement of colors makes various artistic effects possible. For example, contrast is an important element for expressing shape and space in two-dimensional fine art. Among the various types of contrast, brightness contrast is the most important in art history because of the cultural background of Europe, where the contrast between light and darkness is commonly adopted as a metaphorical expression. In this subsection, taking both the color information of pixels and their spatial arrangement into account, we examine the prevalence of brightness contrast in European paintings over the 10 artistic periods.
To quantify brightness contrast, we use the two-point height difference correlation (HDC) and its roughness exponent α, which is obtained from the slope of the HDC curve in a double logarithmic plot, as in surface growth models in statistical physics22. First, we obtain the grey-scale brightness from the RGB color information through a weighted transformation (see Methods) and define a “brightness surface” of an image by taking the brightness of each pixel as the height at that position, as shown in Fig. 4a and b. A three-dimensional surface, like a deep-pile carpet, is thus obtained from the two-dimensional painting, and the HDC is calculated on it as a function of distance r. This method is widely used in condensed matter and statistical physics to analyze the roughness of a growing surface, for example a semiconductor surface grown by chemical deposition22. For comparison, we also analyze a shuffled image, in which the positions of the pixels are randomly rearranged.
As shown in Fig. 4a and b, since the brightness of a point is defined as its height, the height difference between two points represents their brightness difference. The two-point HDC of a randomly shuffled painting is displayed as blue dots in Fig. 4c and d for comparison. The roughness exponent α of randomized images is 0 because no spatial correlation remains. Figure 4d shows an example of a Jackson Pollock drip painting, which is hard to distinguish from a randomly shuffled painting when only the spatial correlation is considered. The roughness exponent of Jackson Pollock's drip paintings is very small compared with that of other European paintings.
Since the HDC describes the spatial correlation between color pixels on a surface as a function of distance, the roughness exponent α, obtained from the slope of the HDC function, characterizes how the average brightness difference changes with distance, that is, the contrast effect. Figure 5a shows that α gradually increases over the 10 artistic periods, which is consistent with historical circumstances. First, the increasing tendency of α is related to changes in painting techniques and genres, such as the shift from portraits to landscapes. In the history of Western art, many new painting techniques were developed and spread during the Renaissance period. For example, chiaroscuro, one of the canonical painting modes of the Renaissance period16, is characterized by strong contrasts between light and shade. The roughness exponent and the HDC capture both the level of brightness and the relative spatial position. Hence, the roughness exponent α of a painting could serve as a quantitative indicator of the chiaroscuro technique, and its increasing tendency over the artistic periods reflects the spread of the technique across the continent21. In addition, the Renaissance art movement made painting genres more diverse, so that more portraits and landscape paintings were encouraged. Large objects in paintings, such as a torso (the upper body in portraits) or mountains and sky in landscapes, decrease the brightness difference at short distances but make the HDC grow faster as the distance increases21. Therefore, the historical innovation in painting techniques and the diversification of painting genres are clearly captured by the increasing tendency of the roughness exponent α.
As another example, sfumato is a major painting mode developed in the Renaissance period to express vanishing or shading around objects in a painting17. Smoothing the edges of objects decreases the variance of brightness because it does not allow abrupt changes at boundaries. In this case, image entropy23, which reflects the variance of brightness in a local neighborhood, would be a good measure of the sfumato technique. Since the variance is inversely related to homogeneity, the image entropy describes the level of local homogeneity of brightness in a painting.
Figure 5b shows that the image entropy H increases up to Neoclassicism and then decreases. This behavior differs somewhat from that of the roughness exponent, because the image entropy considers only the local complexity of the color gradient around a pixel, whereas the roughness exponent also takes into account brightness differences over long distances. We think that the different behaviors of these two measures may reflect a tendency for the chiaroscuro technique to keep developing while sfumato declines, possibly because mysterious expression was rejected in favor of realistic expression.
From the analysis of a large-scale archive of European painting images, we show that the chromo-spectroscopy of 10 art historical periods yields a universal distribution curve that distinguishes paintings from photographs. Additionally, fractal analysis allows us to rediscover the expansion of the color palette after the medieval period, which is consistent with the fact that the color palette of the medieval period was relatively narrow compared with other periods because of historical circumstances. Furthermore, we measure the roughness exponent and image entropy of brightness surfaces over the 10 art historical periods. We find that these mathematical measurements quantitatively describe the birth of new painting techniques and their increasing use. Our approaches thus provide quantitative indicators that reflect historical developments of artistic styles. Applying them, we deduce that Jackson Pollock's drip paintings are not typical artworks, although they of course remain controversial in the art world.
Our approaches have several limitations, and we offer suggestions for future work. First, although the database is quite large, our dataset does not cover all paintings of the 10 art historical periods. For this reason, our results may contain sampling bias that we have not yet identified. For better statistics, analyzing much larger, higher-resolution images, such as those of the Google Art Project24, would give more concrete insight into artistic style. Another possible source of error is unintended color distortion in the conversion of original paintings into digital images, which may cause loss of or bias in color information. Even though we have checked that our results do not change significantly under artificial reductions of color quality, we could not account for all possible distortion effects. It is also true that the present colors of the paintings differ from the original colors at the time of completion. Old paintings are hard to preserve and usually suffer from degradation of their physical materials, such as oxidation and corrosion. These are major open issues not only for this study but for all stylometric analyses in the arts. Nonetheless, we expect that our quantitative study will help bridge the gap between art and science.
Source of dataset and statistics of paintings
In this study, we analyzed the digital images of European paintings in the Web Gallery of Art, which exhibits artworks ranging from the 11th century to the mid-19th century15. The European paintings are classified into 10 art historical periods: medieval, early renaissance, northern renaissance, high renaissance, mannerism, baroque, rococo, neoclassicism, romanticism, and realism. We filtered out non-painting images, such as sculptures, miniatures, illustrations, architecture, pottery, glass paintings, and wares. The number of remaining images for each period is summarized in SI Table S1. In total, we analyzed 8,798 paintings. As shown in Fig. S1, over 94% of the images are larger than 700 × 700 pixels, and the largest is 1350 × 1533 pixels. Therefore, the quality of the images is sufficient for statistical analysis. Furthermore, to discuss the difference between paintings and photographs, we collected two more kinds of data: hyper-realism paintings and photographs. We collected 105 hyper-realism images, the largest of which is 2974 × 1954 pixels, from hyper-realism artists' web sites25,26,27,28,29,30,31, and two sets of photographs from the official Instagram site of National Geographic32 and the online photo gallery of a Korean portal site33.
In order to investigate the fractal patterns of painted colors in the RGB color space, we measured box-counting dimensions20. The box-counting dimension is defined as

$$d_{\mathrm{box}} = \lim_{\varepsilon \to 0} \frac{\log N(\varepsilon)}{\log (1/\varepsilon)},$$

where N(ε) is the number of non-empty boxes and ε is the side length of each box. The value of ε represents the color quality in digitized units: for example, ε = 1 corresponds to the 256^3 possible colors of the 24-bit RGB color system, and ε = 32 corresponds to the 8^3 possible colors of a 9-bit RGB color system. In general, each ε corresponds to a log2[(256/ε)^3]-bit RGB color system. Varying ε over 32, 16, 8, 4, 2, and 1 (see Figure S6 in the supplement) and examining N(ε) for each ε, we measured d_box(ε).
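The box-counting measurement can be sketched as follows. This is a minimal illustration under stated assumptions: the input is an (H, W, 3) array of 8-bit RGB values, and the dimension is estimated as the least-squares slope of log N(ε) against log(1/ε) over the six box sizes listed above. The fitting choice and the function names `box_counts` and `box_counting_dimension` are assumptions, not taken from the source.

```python
import numpy as np

def box_counts(image, sizes=(32, 16, 8, 4, 2, 1)):
    """Count non-empty boxes N(eps) in RGB space for each box side length eps."""
    pixels = np.asarray(image, dtype=np.uint8).reshape(-1, 3).astype(np.int64)
    counts = []
    for eps in sizes:
        boxes = pixels // eps          # box index of every pixel along R, G, B
        n_side = 256 // eps            # number of boxes along each axis
        # Flatten the three box indices into one code per pixel.
        codes = (boxes[:, 0] * n_side + boxes[:, 1]) * n_side + boxes[:, 2]
        counts.append(np.unique(codes).size)
    return np.array(sizes), np.array(counts)

def box_counting_dimension(sizes, counts):
    """Estimate the dimension as the slope of log N(eps) versus log(1/eps)."""
    slope, _ = np.polyfit(np.log(1.0 / sizes), np.log(counts), 1)
    return slope

# Synthetic check: colors that fill only the R-G plane (B = 0) of the cube.
plane = np.zeros((256, 256, 3), dtype=np.uint8)
plane[..., 0] = np.arange(256)[:, None]
plane[..., 1] = np.arange(256)[None, :]
sizes, counts = box_counts(plane)
d_plane = box_counting_dimension(sizes, counts)
print(counts, d_plane)
```

For the synthetic plane the estimate is 2, because N(ε) = (256/ε)^2 exactly. A set of colors that filled the whole RGB cube would give 3, which is the homogeneous case described in the text.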
To construct the brightness surfaces of images, we converted digital color images into gray-scale images using the following weighted filter:

$$I_{\text{gray-scale}} = w_R R + w_G G + w_B B,$$

where R, G, and B are the red, green, and blue intensities of a pixel, w_R, w_G, and w_B are the corresponding weights, and I_gray-scale is the brightness of a color, which is interpreted as a height on the image. The weights differ from one another because of the color sensitivity of the human eye34, and several other weighting filters for the R, G, and B intensities exist for specific purposes. However, the results did not differ significantly between filters.
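As an illustration, the conversion can be written as a weighted sum over the color channels. The weights below are the common ITU-R BT.601 luma coefficients, used here as an assumption; the values used in the study may differ. The function name `brightness_surface` is hypothetical.

```python
import numpy as np

# Assumed weights (ITU-R BT.601 luma coefficients), not confirmed by the source.
WEIGHTS = np.array([0.299, 0.587, 0.114])

def brightness_surface(image, weights=WEIGHTS):
    """Map an (H, W, 3) RGB array to an (H, W) array of brightness heights."""
    rgb = np.asarray(image, dtype=np.float64)
    return rgb @ weights

# One white pixel and one pure green pixel.
sample = np.array([[[255, 255, 255], [0, 255, 0]]], dtype=np.uint8)
heights = brightness_surface(sample)
print(heights)
```

Because the assumed weights sum to 1, the heights stay within the range 0 to 255 of the input intensities.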
Two-point height difference correlation function
To measure the roughness exponents of brightness (height) surfaces, a two-point height difference correlation (HDC) function is calculated22. It is defined as

$$G(r) = \overline{\left[h(\mathbf{x}) - h(\mathbf{x}')\right]^2} = \frac{1}{N_r} \sum_{|\mathbf{x} - \mathbf{x}'| = r} \left[h(\mathbf{x}) - h(\mathbf{x}')\right]^2,$$

which follows the simple scaling form G(r) ~ r^{2α} for small r. Here r is the distance between two pixel points, the over-bar represents the spatial average over all possible pairs of points at a fixed distance r, N_r is the number of possible pairs at distance r, h(x) is the height at a point x (0 ≤ h(x) ≤ 255), and α is the roughness exponent. The roughness exponent was measured in a double-logarithmic plot of G versus r, with a fitting range from r_a = 10 to r_b, where the HDC saturates to the same value for both the original and randomized paintings. The upper limit r_b corresponds approximately to 30% of the image width, or the square root of 9% of the image area.
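A simplified sketch of this measurement is given below. It makes two assumptions that are not in the source: pairs of pixels are taken only along the horizontal and vertical directions, and α is estimated as half of the least-squares slope of log G against log r over the supplied distances. The function names are hypothetical.

```python
import numpy as np

def height_difference_correlation(h, r_values):
    """G(r) averaged over horizontal and vertical pixel pairs at separation r."""
    h = np.asarray(h, dtype=np.float64)
    g = []
    for r in r_values:
        dx = (h[:, r:] - h[:, :-r]) ** 2   # pairs separated by r along rows
        dy = (h[r:, :] - h[:-r, :]) ** 2   # pairs separated by r along columns
        g.append((dx.sum() + dy.sum()) / (dx.size + dy.size))
    return np.array(g)

def roughness_exponent(r_values, g):
    """Fit G(r) ~ r**(2 * alpha) on log-log axes and return alpha."""
    slope, _ = np.polyfit(np.log(r_values), np.log(g), 1)
    return slope / 2.0

r_values = np.arange(1, 17)

# A linear brightness ramp is perfectly correlated: G(r) grows as r**2.
ramp = np.tile(np.arange(64.0), (64, 1))
alpha_ramp = roughness_exponent(r_values, height_difference_correlation(ramp, r_values))

# Shuffling the pixels destroys the spatial correlation: G(r) becomes flat.
rng = np.random.default_rng(0)
shuffled = rng.permutation(ramp.ravel()).reshape(ramp.shape)
alpha_shuffled = roughness_exponent(
    r_values, height_difference_correlation(shuffled, r_values))
print(alpha_ramp, alpha_shuffled)
```

The ramp gives α = 1, and the shuffled copy gives a value close to 0, which mirrors the comparison between original and randomized paintings. For real images the fit would be restricted to the range from r_a to r_b stated above.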
The entropy of a gray-scale image23 is given by

$$H = -\sum_{\mathbf{x}} m(\mathbf{x})\, p(\mathbf{x}) \log p(\mathbf{x}),$$

where p(x) = h(x)/S, h(x) is the height at a point x of the brightness surface (0 ≤ h(x) ≤ 255), and S is the sum of all height values in the image, used for normalization. The weighting factor m(x) is given by m(x) = 1 + σ^2(x), where the local height variance σ^2(x) is calculated over the pixel at position x and its surrounding neighbor pixels. Since this image entropy depends on the image size, all images are resized to 500 × 500 pixels with the Lanczos algorithm before the image entropy is measured.
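The sketch below illustrates one way to compute such a variance-weighted entropy. It rests on several assumptions that the text does not confirm: the entropy is taken in the weighted Shannon form H = -Σ m(x) p(x) log p(x) with the natural logarithm, the neighborhood is the 3 × 3 window centered on each pixel, and image borders are padded by reflection. The resizing step is omitted, and the function names are hypothetical.

```python
import numpy as np

def local_variance(h):
    """Variance over each pixel's 3 x 3 window (assumed neighborhood size)."""
    padded = np.pad(h, 1, mode="reflect")
    rows, cols = h.shape
    # Stack the nine shifted copies that together form every 3 x 3 window.
    windows = np.stack([padded[i:i + rows, j:j + cols]
                        for i in range(3) for j in range(3)])
    return windows.var(axis=0)

def weighted_image_entropy(h):
    """Assumed form: H = -sum m(x) p(x) log p(x), with m(x) = 1 + variance."""
    h = np.asarray(h, dtype=np.float64)
    p = h / h.sum()
    m = 1.0 + local_variance(h)
    nonzero = p > 0                     # treat 0 * log(0) as 0
    return -np.sum(m[nonzero] * p[nonzero] * np.log(p[nonzero]))

# A perfectly flat 10 x 10 image has zero local variance, so m(x) = 1 everywhere
# and the entropy reduces to the Shannon entropy of a uniform distribution.
flat = np.full((10, 10), 100.0)
h_flat = weighted_image_entropy(flat)
print(h_flat)
```

For the flat image the result is log(100), the entropy of a uniform distribution over 100 pixels. Any local variation in brightness raises m(x) above 1 and increases the weight of the affected pixels.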
This work was supported by the National Research Foundation of Korea (NRF) Grant funded by the Ministry of Science, ICT & Future Planning (No. 2011-0028908).