Morphometrics of eight Chinese cavefish species

Chinese cavefishes are an unusual and interesting vertebrate group that has received relatively little research attention. China holds the highest global cavefish diversity, accounting for about one-third of known species. Sinocyclocheilus is the largest genus of cavefishes in the world and is endemic to southern China. The distribution of Sinocyclocheilus species is very narrow, and some inhabit just a single cave, which increases their vulnerability to extinction. With this study we provide the first comprehensive dataset on the morphometrics of eight Sinocyclocheilus species. In addition to enhancing our knowledge of these poorly known species, we aim to provide a dataset useful for future comparative analyses that seek to better understand the adaptive ability of cavefishes.

www.nature.com/scientificdata
... will contribute to improving species knowledge, an important step towards species protection 29. To do that, we started by sharing information on the specimens held in the collection of the Institute of Zoology of the Chinese Academy of Sciences in Beijing (China), which holds the largest collection of Chinese cavefishes. Specimens from different species and populations are present in this collection, and they have often been used in taxonomic and phylogenetic studies.

Methods
Experimental design. We examined specimens belonging to 8 species of Chinese cavefishes from the collection of the Institute of Zoology of the Chinese Academy of Sciences in Beijing (China). The examined species inhabit groundwater environments in the Provinces of Guangxi (N species = 6) and Yunnan (N species = 2) (Fig. 1). We built a large database including the date and locality of fish collection, a description of their body organs, and their morphometrics. When precise coordinates were available, we provide a specific code (species initials + a number) to distinguish between different populations (Table 1). Following the standard methodology used to record fish morphology 32, we identified multiple landmarks from which measurements were taken (Fig. 2).
These points correspond to the following body parts: A (snout tip); B (nostril); C (eye); D (top end of the head); E (farthest backward end of the head); F (beginning of the forward pectoral fin base); G (end of the forward pectoral fin base); H (farthest end of the forward pectoral fin lobe); I (beginning of dorsal fin base); J (beginning of the backward pectoral fin base); K (end of dorsal fin base); L (farthest end of the backward pectoral fin lobe); M (farthest end of the dorsal fin lobe); N (beginning of the anal fin base); O (end of the anal fin base); P (farthest end of the anal fin lobe); Q (top beginning of caudal fin); R (low beginning of caudal fin); S (middle point between Q and R); T (median end of the caudal fin lobe); U (farthest end of the top caudal fin lobe); V (farthest end of the low caudal fin lobe); W (end of the backward pectoral fin base). Alongside measurements involving the above listed points (see below), we also recorded data from additional parts of fishes' body (identified with dashed lines, Fig. 2): Snout (distance between the mouth tip and the beginning of the eye); Eye (eye diameter); Eyeball (eyeball diameter); Mouth width (length between the two mouth angles); Mouth length (length of the lower jaw).
Specimen sampling. We first described the shape of three body organs: the eye, the mouth and the caudal fin. For the eye, we considered three categories according to the degree of development of the eyeball: "Developed" when it is fully developed; "Reduced" when it is small and poorly developed; "Absent" when the fish lacks eyes (Fig. 3). We then described the mouth position according to where the opening occurs: "Terminal" if it opens at the tip of the head; "Subterminal" if it opens close to the tip of the head but downward; "Inferior" if it opens downward; "Superior" if it opens upward (Fig. 3). Finally, we described the caudal fin according to its shape, using five categories: "Rounded", "Truncate", "Emarginate", "Forked", "Lunate" (Fig. 3).
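The three categorical descriptors above lend themselves to a simple consistency check when the dataset is reused. The following is a minimal sketch, not part of the original study: the field names `eye_shape`, `mouth_position` and `caudal_fin` are hypothetical, and accepting "NA" as a valid entry is an assumption based on the dataset's use of NA for missing data.

```python
# Allowed categories, taken from the descriptions in the text.
EYE_SHAPES = {"Developed", "Reduced", "Absent"}
MOUTH_POSITIONS = {"Terminal", "Subterminal", "Inferior", "Superior"}
CAUDAL_FIN_SHAPES = {"Rounded", "Truncate", "Emarginate", "Forked", "Lunate"}

# Hypothetical field names; the real column names may differ.
ALLOWED = {
    "eye_shape": EYE_SHAPES,
    "mouth_position": MOUTH_POSITIONS,
    "caudal_fin": CAUDAL_FIN_SHAPES,
}


def invalid_fields(record, missing_marker="NA"):
    """Return the names of fields whose value is not an allowed category.

    A value equal to `missing_marker` is accepted (assumption: NA marks
    missing data in the categorical columns too).
    """
    bad = []
    for field, allowed in ALLOWED.items():
        value = record.get(field, missing_marker)
        if value != missing_marker and value not in allowed:
            bad.append(field)
    return bad


example = {"eye_shape": "Reduced", "mouth_position": "Terminal", "caudal_fin": "Forked"}
problems = invalid_fields(example)
```

A check of this kind mainly catches spelling variants (for example "Sub-terminal") before the categories are used as grouping factors.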
After this first descriptive part, we recorded measurements of the fishes' body parts. Measurements were taken using a digital calliper and by analysing pictures of the specimens. The digital calliper was used to record measurements that were hardly visible from pictures; in the following list, morphometrics recorded using this method are indicated with the symbol "*". Pictures were taken using a digital camera, placing fishes on a light background with a ruler as a scale. Files were then analysed with the software ImageJ. Once the scale was set, the distance between two points (Fig. 2) was measured with a straight line; the same method was used to evaluate the length of the dashed lines (Fig. 2).
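The picture-based procedure described above (set a scale from the ruler, then measure straight-line distances between landmarks) can be sketched in a few lines. This is an illustration only, not the ImageJ workflow used in the study: the function names and all pixel coordinates are hypothetical.

```python
import math


def pixels_per_mm(ruler_start, ruler_end, known_length_mm):
    """Scale factor from two pixel points on the ruler a known distance apart."""
    return math.dist(ruler_start, ruler_end) / known_length_mm


def landmark_distance_mm(point_a, point_b, scale_px_per_mm):
    """Straight-line distance between two landmarks, converted to millimetres."""
    return math.dist(point_a, point_b) / scale_px_per_mm


# Hypothetical pixel coordinates read from a photograph.
scale = pixels_per_mm((100, 50), (600, 50), known_length_mm=50.0)  # 500 px over 50 mm
landmark_a = (120, 300)  # e.g. A, snout tip
landmark_d = (420, 700)  # e.g. D, top end of the head
ad_mm = landmark_distance_mm(landmark_a, landmark_d, scale)  # 500 px at 10 px/mm
```

Because every distance is divided by the same scale factor, an error in placing the two ruler points propagates proportionally to all measurements taken from that picture.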
The recorded measures were the following:
• Eye*;
• Eye_ball*;
• Snout*;
• Mouth width*;
• Mouth length*;
• AD: linear distance between the snout tip and the top end of the head;
• B_height: head height measured at the nostril;
• C_height: head height measured at the eye;
• D_height: head height measured at the upper end;
• DI: linear distance between the top end of the head and the beginning of the dorsal fin;
• AE: maximum head length, measured from the snout tip to the farthest backward end of the head;
• FG: length of the forward pectoral fin base;
• FH: maximum extension of the forward pectoral fin;
• IM: maximum extension of the dorsal fin;

[Table 1 header: Column | Data description | Typology of data]

Besides the above-mentioned standard fish lengths, we recorded the measurement of a specific body part characterizing Chinese cavefishes: the humpback area 8. This peculiar structure develops on the fish's back, between the head and the dorsal fin (Fig. 2), and is used to store energy, a practical adaptation to food-deprived environments 8,9. The humpback area (DID) is located above the DI segment (shaded area in Fig. 2a) and was delimited by connecting I back to D along the outline of the animal's back.

3. Measurements of 28 fish body parts (27 in four species because their eye diameter equals the eyeball diameter).
4. NA means that no data are available. Preserved specimens were not always intact, and in some cases their original shape was not well preserved after fixation in alcohol. NA was also used in the "Eye" category when the eye diameter equals the eyeball diameter. Furthermore, NA was used in the "Population" column to indicate that precise coordinates were not available.
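The humpback area (DID) described above is the region enclosed by the straight DI segment and the dorsal outline traced from I back to D. The text does not state how the area itself was computed, so the following is only a sketch under the assumption that the outline is digitised as a sequence of points and the enclosed area is obtained with the shoelace formula; the function names and coordinates are hypothetical.

```python
def polygon_area(points):
    """Area of a simple polygon from its vertices (shoelace formula).

    The polygon is closed implicitly: the last vertex connects to the first.
    """
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def humpback_area_mm2(outline_px, scale_px_per_mm):
    """Convert the outlined area from square pixels to square millimetres.

    `outline_px` runs from landmark D along the back to landmark I; the
    closing edge from I to D is the straight DI segment.
    """
    return polygon_area(outline_px) / scale_px_per_mm ** 2


# Hypothetical outline: D, one point on the hump, I (a triangle for simplicity).
outline = [(0, 0), (10, 20), (40, 0)]
area_px = polygon_area(outline)
area_mm2 = humpback_area_mm2(outline, scale_px_per_mm=10.0)
```

Note that an area scales with the square of the pixel-to-millimetre factor, whereas the linear measures scale with the factor itself.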
A detailed explanation of the dataset "Morphometrics of eight Chinese cavefishes" 33 is given in Table 1.

Technical Validation
The studied specimens belong to the fish collection of the National Zoological Museum, Institute of Zoology, Chinese Academy of Sciences (ASIZB) 34; upon appropriate request, the same specimens can be studied further. Blinded fish measurements were performed to further reduce any possible bias 35. The whole dataset was double-checked for errors. Outliers were identified in two ways: first by visual check (i.e., plotting the data), and then using three times the standard deviation from the data mean (+/−) as a cut-off. Subsequently, the corresponding measurement was taken again to check whether the outlier was due to a measurement mistake.
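The second outlier check described above (values beyond the mean ± three standard deviations) can be sketched as follows. This is an illustration, not the authors' procedure: the function name is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not specify which estimator was used.

```python
import statistics


def flag_outliers(values, n_sd=3.0):
    """Return the indices of values farther than n_sd standard deviations from the mean.

    Missing measurements are passed as None and ignored.
    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    present = [v for v in values if v is not None]
    mean = statistics.mean(present)
    sd = statistics.stdev(present)
    lower, upper = mean - n_sd * sd, mean + n_sd * sd
    return [
        i for i, v in enumerate(values)
        if v is not None and (v < lower or v > upper)
    ]


# Hypothetical measurements (mm): twenty similar values and one suspicious entry.
sample = [10.1, 9.9] * 10 + [100.0]
suspect_indices = flag_outliers(sample)
```

Flagged values are candidates for re-measurement, as in the text, rather than automatic removal. With very few specimens per group this rule has little power, because a single extreme value inflates the standard deviation it is compared against.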

Usage Notes
The dataset is provided in CSV format, ready to be used with statistical software such as R (http://www.R-project.org/) and PAST. Precise coordinates of collection points are withheld to protect the species 36. Data were collected with instruments allowing high precision (0.01 mm). Prior to any analysis, we suggest log-transforming the measures to improve linearity and reduce skewness.
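As an illustration of the suggested preprocessing, the sketch below reads CSV-formatted records, treats "NA" as missing, and log-transforms selected measures. It is a minimal example under stated assumptions: the column names, the species label, the population code "SS1" and the sample values are hypothetical, and base-10 logarithms are an arbitrary choice because the text does not specify a base.

```python
import csv
import io
import math

# Hypothetical excerpt; the real file's columns and values may differ.
SAMPLE_CSV = """Species,Population,AD,Eye
Sinocyclocheilus sp.,NA,12.50,NA
Sinocyclocheilus sp.,SS1,10.00,2.10
"""


def parse_measure(text):
    """Convert a CSV cell to float, mapping empty cells and 'NA' to None."""
    text = text.strip()
    if text in ("", "NA"):
        return None
    return float(text)


def log_transform(value):
    """Base-10 logarithm; missing or non-positive values stay missing."""
    if value is None or value <= 0:
        return None
    return math.log10(value)


def load_log_measures(handle, measure_columns):
    """Read records and replace the listed measure columns with their logs."""
    rows = []
    for record in csv.DictReader(handle):
        row = dict(record)
        for column in measure_columns:
            row[column] = log_transform(parse_measure(record[column]))
        rows.append(row)
    return rows


rows = load_log_measures(io.StringIO(SAMPLE_CSV), ["AD", "Eye"])
```

To read the actual file, replace the `io.StringIO` object with an open file handle, for example `open(path, newline="")`. Non-measure columns such as the population code are left as text, so an "NA" there remains distinguishable from a missing measurement.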

Code availability
No code was used in this study.