Serial scanning electron microscopy of anti-PKHD1L1 immuno-gold labeled mouse hair cell stereocilia bundles

Serial electron microscopy techniques have proven to be a powerful tool in biology. Unfortunately, the data sets they generate lack robust and accurate automated segmentation algorithms. In this data descriptor publication, we introduce a serial focused ion beam scanning electron microscopy (FIB-SEM) dataset consisting of six outer hair cell (OHC) stereocilia bundles, and the supranuclear part of the hair cell bodies. Also presented are the manual segmentations of stereocilia bundles and the gold bead labeling of PKHD1L1, a coat protein of hair cell stereocilia important for hearing in mice. This depository includes all original data and several intermediate steps of the manual analysis, as well as the MATLAB algorithm used to generate a three-dimensional distribution map of gold labels. They serve as a reference dataset, and they enable reproduction of our analysis, evaluation and improvement of current methods of protein localization, and training of algorithms for accurate automated segmentation. Measurement(s) outer hair cell • cochlear hair cell Technology Type(s) scanning electron micrograph Factor Type(s) anti-PKHD1L1 labeling distribution Sample Characteristic - Organism Mus musculus Measurement(s) outer hair cell • cochlear hair cell Technology Type(s) scanning electron micrograph Factor Type(s) anti-PKHD1L1 labeling distribution Sample Characteristic - Organism Mus musculus Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.12210158

resembling a very similar appearance to microvilli, but over time a subset of these microvilli closest to the kinocilium transform into stereocilia: they elongate, thicken, and their actin filaments get densely packed, while the rest of the microvilli further away from the kinocilium are resorbed over time 25 . Upon completion of this process just three rows of stereocilia are left at the apical surface of the hair cell, as seen in adult mouse hair cells. The functional significance of the shorter stereocilia (or microvilli) past the third row in developing bundles is unknown. While Ca 2+ -imaging experiments conducted on rat and mouse hair cell stereocilia bundles show MET activity in the second and third rows of stereocilia 24,26 , no convincing evidence currently exists suggesting presence of MET activity in shorter rows of stereocilia (and/or microvilli) past the third row in developing stereocilia bundles. Because of their transient nature, these structures are often considered of no major significance to the function of adult stereocilia bundles.
In a recent study, we showed that one type of stereocilia surface specialization, the stereociliary coat, is formed at least in part by a newly identified stereociliary protein, Polycystic Kidney and Hepatic Disease 1-Like 1 (PKHD1L1) 17 . This study was carried out using serial scanning electron microscopy of hair cell bundles immunolabeled with anti-PKHD1L1 and gold bead conjugated antibodies. We showed that PKHD1L1 is located at the tips of stereocilia, and quantified the gold bead distribution in different regions of the surfaces of stereocilia. PKHD1L1-deficient mice lack the surface coat at the upper but not lower regions of stereocilia, confirming that PKHD1L1 is a component of the surface coat, and they develop progressive hearing loss, showing that PKHD1L1 is required for normal hearing in mice.
To quantitatively evaluate the distribution of PKHD1L1 and its labeling density at the surface of stereocilia, we used FIB-SEM to collect serial electron microscopy data sets of stereocilia bundles from postnatal day 4 (P4) outer hair cells (OHCs). FIB-SEM uses a focused gallium-ion beam to etch the surface in defined steps of just a few nanometers thick, and SEM with backscatter detection to image the heavy metal counterstain and 10-nm gold beads at each freshly etched surface. Using this sequential milling and imaging approach, we generated a number of serial EM data sets, which were then segmented and three dimensionally reconstructed using Dragonfly and Amira software packages. Although a number of organ of Corti samples were carefully evaluated using TEM, only one sample was chosen for further FIB-SEM imaging. The specificity of anti-PKHD1L1 labeling has been evaluated as part of the original study using light microscopy, SEM and TEM 17 . The serial EM datasets in this submission are of six closely located OHCs, collected from the middle cochlear turn of a P4 mouse cochlea. An overview of the sample preparation, data collection, and analysis steps are provided in Fig. 1. In addition, movies illustrating 3D localization of PKHD1L1 in immunogold labeled outer hair cell stereocilia bundles using FIB-SEM (postnatal day 4) in a single OHC, and 3D cumulative distribution map of anti-PKHD1L1 immuno-gold beads from six mouse OHC stereocilia bundles (postnatal day 4) were deposited at figshare 27 . Both movies are published with permission from the original study 17 .
In our datasets, the pixel size varied between 1.95 and 2.75 nm, and the milling step varied between 10 and 20 nm; see Table 1 for details. The manual segmentation workflow implemented in this study produced stacks containing the kinocilium, three rows of stereocilia (manually assigned to the tallest, middle and short rows), and the gold beads.
All datasets referenced in this study were deposited at The Cell Image Library repository  , and can be accessed using the links provided in Table 2. This submission is accompanied by 48 datasets, six of which are the original FIB-SEM datasets of six stereocilia bundles 28,36,44,51,52,60 , one test dataset used to validate the performance of the MATLAB algorithm [72][73][74] , and one dataset deposition of the MATLAB scripts used to perform the analysis 75 . All remaining dataset depositions are the files of various intermediate steps of data analysis and segmentation performed in this study. The files are grouped into eight dataset groups, each accessible via a dedicated group link provided in Table 2, as well as via individual links. Dataset groups 1-6 consist of files related to cells 1-6 respectively, dataset group 7 consists of files related to the test cell, and dataset 8 consists of files composing the MATLAB script.
Because the terms 'stereocilia' and 'kinocilia' are used very often throughout the manuscript, and some steps of our analysis are common for both structures, we implemented a terminology as outlined below. When referring to the hair cell primary cilia only, we call them 'kinocilia', as commonly known in the field. When referring to stereocilia only, we call them stereocilia. When referring to kinocilia and stereocilia together (while describing the segmentation, and other steps of the analysis that don't distinguish between the two structures) we refer to them jointly as 'cilia'. While describing the details of the MATLAB script, designed to work with any tower-shaped object (cilia or not), we refer to these structures as 'towers'.

Methods
This section contains a more detailed description of the steps and procedures used to produce the results previously reported as an original study 17 . All experiments were carried out in compliance with ethical regulations and approved by the Animal Care Committees of the Harvard Medical School and Massachusetts Eye and Ear.
imaging area outlined by white dashed lines, also shown in h. Panel i shows the microscope chamber with the sample (green arrow) mounted on the microscope stage, tilted to 52°, and brought close to the FIB source (black asterisk) and the SEM source (blue asterisk). Images f, h were acquired by the SEM beam and are at 52° tilt from the image in panel g, acquired by the ion beam. (j-l), Image processing steps: image alignment (j), stereocilia and gold bead segmentation (k), and 3D reconstruction (l) using Dragonfly and Amira software packages. (m,n), Volume analysis using a custom MATLAB algorithm allowed to generate gold bead distribution maps (m) and quantify the gold beads per stereocilia surface area (n). Panels m-n are published with permission from the original study 17  www.nature.com/scientificdata www.nature.com/scientificdata/ Cochlear dissection, sample preparation, epoxy embedding, FIB-SEM image acquisition. Organ of Corti explants were dissected at P4 in L-15 medium (GIBCO), fixed with 4% formaldehyde in HBSS for 2 hours at room temperature and washed in HBSS (Fig. 1a). After the fixation, tectorial membrane overlaying the hair cells was pulled away to expose the sensory epithelium. Cochleas were immunolabeled as previously described 17 . Briefly, after nonspecific binding sites were blocked by 10% goat serum in Ca 2+ , Mg 2+ -free HBSS for 2 h at room temperature, samples were incubated at 4 °C with primary antibodies diluted in blocking solution 1:200 (rabbit anti-PKHD1L1 antibody, NovusBio #NBP2-13765) in a wet chamber overnight, followed by several rinses in Ca 2+ , Mg 2+ -free HBSS. Next, samples were incubated in blocking solution for 30 min at room temperature, and overnight at 4 °C with secondary antibody solution (gold Conjugate EM Goat F(ab′)2 anti-rabbit IgG: 10 nm gold (BB International # 14216), diluted 1:30 in blocking solution).
Following the secondary antibody application samples were rinsed in Ca 2+ , Mg 2+ -free HBSS (3×), postfixed with 2.5% glutaraldehyde in 0.1 M cacodylate buffer (pH 7.2) supplemented with 2 mM CaCl 2 containing 1% tannic acid for 1-2 hours at room temperature. Following a triple rinse in cacodylate buffer, they were postfixed and stained with 1% osmium tetroxide/1.5% potassium ferrocyanide in 0.1 M cacodylate buffer for 2 hours at room temperature in the dark. Explants were then washed three times in 0.1-M cacodylate buffer (pH 7.2), followed by a brief wash in distilled water. They were then dehydrated in an ascending series of ethanol solutions (50%, 70%, 95%, 100%), twice for each concentration, for 10 min each step. Samples were then equilibrated in propylene oxide 3 times for 15 min. Next, they were infiltrated and embedded in propylene oxide/epoxy resin mixtures (Araldite 502/Embed-812 embedding media, EMS) as follows: resin and propylene oxide (1:3) for 4-6 h; followed by resin and propylene oxide (1:1) for 4-8 h; then resin and propylene oxide (3:1) for 8 h. Tissue pieces were then positioned in molds (EMS, 70900) for 24 h and polymerized in the oven at 60 °C for 48 h (for more details on reagents and solutions, see Table 3).
The embedded specimen was exposed using a Reichert Ultracut S ultramicrotome equipped with diamond cutting and diamond trimming knives (Fig. 1c,d). Upon approaching the region of interest, the sample orientation exposing the rows of inner or outer hair cell stereocilia bundles was confirmed by viewing methylene blue-stained semithin sections on a light microscope, and by imaging ultrathin sections on TEM when necessary.
Blocks with the embedded tissue were then cut with a razor blade or a coping saw to a smaller size (between 1-5 mm tall) reducing their height upon mounting. Special care was given to protect the working surface with exposed hair cells and stereocilia bundles from the dust generated by cutting the block, typically by wrapping the block with a wide piece of parafilm, also ensuring the film does not contact the working surface either. The cut resin block was then mounted on aluminum stubs using conductive carbon adhesive tabs, with the working surface facing upwards. All surfaces of the resin block, except for the working surface, were carefully covered with colloidal silver (or carbon) paint to increase sample conductivity, minimizing its charging, then sputter-coated with at least 5 nm of platinum.
Samples were observed on a FEI Helios 660 FIB SEM microscope with an insertable CBS backscatter detector to identify the tissue within the block as seen on Fig. 1e, and locate the region of interest. The sample was brought to the eucentric point (where the electron beam and the ion beam coincide) at ~4 mm working distance to allow for the ion beam image (typically operated at 30 kV and 80 pA) to coincide with the electron beam image. Next, a thin layer of platinum or tungsten was locally deposited on the desired milling area (typically ~10 × 30 μm 2 ) using the electron beam (at ~13 nA) to protect the sample. Although inefficient, the SEM metal deposition was always used to deposit a thin protective layer of platinum before switching to ion beam deposition. Next, a more efficient, but more damaging ion beam (~0.8 nA) deposition procedure was carried out, aiming for a final deposition of ~1 μm of platinum. The deposited platinum is clearly visible as the overlaying thick white stripe of protective metal coat on Fig. 1f,g. A very similar procedure was used to prepare a small nearby area where the two fiducial markers were etched (arrows on Fig. 1f,g). The trenches exposing the cell-substrate interface in the cross-section were milled using the focused ion beam set to 30 kV and 2.5 nA.
After preparing the volume for sectioning, serial images of the block face were acquired by repeated cycles of sample milling and imaging using the Auto Slice & View G3 operating software (FEI) with a milling step set between 10-20 nm. During this process, the ion beam was set to 30 kV, 80 pA, and 1 μs dwell time. The electron beam images were usually acquired in an immersion mode by detecting backscattered electrons using the BSE-TLD detector, with an electron beam set to 3 kV at 1.6 nA and 5-10 μs dwell time. www.nature.com/scientificdata www.nature.com/scientificdata/ A total of six OHCs were imaged with FIB-SEM, and were used to create this dataset. Seven stacks of ~150-400 slices at the pixel size of 1.95-2.75 nm and 10-20 nm z-step size were collected using the Auto Slice and View function of the FEI acquisition software (see Table 1 for details outlining some of the acquisition parameters).
Image processing. Collected 8-bit serial TIFF images were combined in ImageJ software into a stack, then cropped to reduce the file size below 4 Gb in order to improve the performance of the 3D segmentation and rendering software. Figure 1j-n summarizes the steps of the data analysis described in detail in the sections below. All procedures were carried out on Dell T5810 workstation equipped with an 8-Core 3.4 GHz Intel Xeon E5-1680 v4 Processor, NVIDIA Quadro M5000 8 Gb graphics card, 64 Gb RAM and solid-state drives.
In one case (see Cell #4 51,52 , Table 1) the entire volume was collected through two consecutive Slice and View sessions, later combined for the analysis 53 . The voxel size was noted from the Slice and View software upon completion of the project, also confirmed from the 'Projectlog.log' file, in which 'targetthickness' defines the milling (Z) step, while the acquisition horizontal field width ('HFW' , μm) divided by the number of pixels from the image aspect ratio defines the pixel size. For example, for the Cell #1 28 (see Table 1) the acquisition HFW is 14.9 μm, and collected images were 6144 pixels wide, resulting in 14.9/6144 = 0.002425 μm, or 2.43 nm.
Alignment and filtering of the FIB-SEM stacks were carried out using Dragonfly (2017 version) image processing software. Upon importing the files into the software, X, Y and Z dimensions (in nm) were manually entered. If the data set representing an individual cell had been collected over several slice and view sessions, and was broken up into sections, the individual sections were combined at this stage of the analysis. The image stacks were then aligned using the Maximization of Mutual Information function.
Briefly, using the reference image, each subsequent image was shifted in X-Y against the reference image until the 'mutual information' score between the two images was at its maximum. A screen captured demonstration of FIB-SEM data alignment and filtering procedures using Dragonfly software package was deposited at figshare 27 . However, since the Z-step for all of our datasets was at 10-20 nm, which is significantly larger than the X-Y pixel size (1.95-2.75 nm) we cannot rule out the possibility of some remaining distortions following the alignment procedure.
If a cell originally had two or more sections (i.e. was a concatenated stack), the first reference slice (shown in blue in Dragonfly) was moved to align with the current slice (shown in red). Once the first slice of each section in the concatenated stack was manually aligned to the previous slice (of different stack), the abovementioned alignment steps were carried out. Once no horizontal lines were visible on either of the orthogonal views (indicating misalignment), the new data set was saved for further analysis.
Aligned stacks were then subsequently filtered using edge-preserving and smoothing filters to normalize the local contrast by simultaneously sharpening cellular membranes and gold beads as needed. A more detailed description is provided within the 'step-by-step instruction notes' section below. 3D volume segmentation. Segmentation of the image stacks was carried out in Amira 6.2.0, using the XY and XZ orthogonal slice views. Upon importing the image stack, the voxel dimensions were entered once again, this time in micrometers. Once the data were loaded, a LabelField is automatically generated, corresponding to  www.nature.com/scientificdata www.nature.com/scientificdata/ the imported stack in the 'Segmentation' tab. With a size-7 brush, the outline of each stereocilium was traced within the cell membrane on approximately 50 individual slices, each time skipping about 15-30 slices (150-300 nm) in between. Any electron dense materials outside of clearly visible stereociliary membranes (like the surface coats and occasional artifacts) were not included as part of stereocilia during segmentation. The outlines of the stereocilium made by the brush were then filled using the 'Fill holes -> All slices' function. Next, the selection was interpolated through the skipped slices using the 'Selection -> Interpolate' function ( Fig. 2). Once a selection was made, the voxels constituting each stereocilium were assigned to a separate 'material', i.e. a separate object within the volume, and the material order number reflects the ID number of each stereocilium, later used in the MATLAB analysis workflow. A screen captured demonstration of FIB-SEM data segmentation using Amira software package was deposited at figshare 27 .
The gold beads were segmented using a size-5 brush on the XY orthogonal view. This workflow resulted in fully segmented stereocilia, each with a unique ID number, typically corresponding to the order they were segmented during the previous step of the analysis. These ID numbers were later used to assign the row identity of each stereocilium for the MATLAB analysis, also used to specify whether any individual stereocilium is to be excluded from the analysis.
The Amira project can be saved in several ways containing more or less data, but only two files were essential for further MATLAB analysis: the 'materials' segmenting the volumes of each cilium (saved as 'Towers.am' file), and the volumes representing the gold beads (saved as 'GoldParticles.am'). Each file can be opened in ImageJ and viewed as a multi-page image document. Each pixel of this file has been assigned an 'index' value, ranging from '#0' for the background material (i.e. the unused pixels), increasing up to the number of segmented cilia within the bundle.
In this study, only three rows of stereocilia were segmented, even when additional shorter stereocilia were observed. The shorter stereocilia were excluded from our analysis because their functional significance at that developmental stage is not clear, and they are not present in mature hair cells. Thus, for each dataset, a segmented bundle consisted of (1) the kinocilium, (2) the tallest row stereocilia, (3) the middle row stereocilia immediately next to the tallest row stereocilia, and (4) the short row of stereocilia, defined as shorter stereocilia immediately next to the ones from the middle row. Any stereocilia not in immediate proximity to the middle row stereocilia were excluded from the study. Sometimes the analyzed serial EM volume did not include the entire length of the stereocilium (for example, see Cell #3 [44][45][46][47][48][49][50], in which case the stereocilia were still segmented, but excluded from the MATLAB analysis as described below.
Step-by-step Instructions: Pre-and Post-Reconstruction of 3D datasets I. Create a stack. Preferred software package -ImageJ.
A. Open the raw images. B. Drag files (in order) to ImageJ window. C. Once all images are open, go to: Image > Stacks > Images to Stack. D. Save stack as TIFF format into a desired folder. Note: If the data set representing an individual cell has been collected over several slice and view sessions, and is broken up into sections, make separate stacks for each session, then combine them: E. Crop the stack, ideally keeping the size below 4 Gb. Note: for deposition purposes, all data sets deposited with this submission were built into a stack to enable their quick access and view in the browser using the "detailed view" link to the repository's online image viewer window. II. Determine the voxel size and the Z step.
A. Voxel Size.
1. Open one of the images in the Notepad application, or in MS Word. 2. Scroll to the bottom of the file, search for "PixelSize" → X unit = "m" = "2.43E-09". The value reflects the pixel size in X and Y. Note: Alternatively, open the 'Projectlog.log' file, and find the acquisition horizontal field width ("HFW", μm), and divide by the number of pixels in the image aspect ratio to calculate the pixel size. For example, for Cell #1, the acquisition HFW is 14.9 μm, and collected images were 6144 pixels wide, resulting in 14.9/6144 = 0.002425 μm, or 2.43 nm (Table 1). All 'Projectlog.log' files were deposited with corresponding datasets, and can be found in the accompanying compressed (zipped) folders.
B. Z step.
1. For Z, open the 'Projectlog.log' file and search for "targetthickness". The value reflects the Z milling step size in meters: "1.5E-08" is 15 nm. III. Aligning and Filtering. Preferred software package -DragonFly.
A. Open Dragonfly application, and go File > Import Image Files > Add… > (choose stack) > Next. B. For Image Spacing (in nm), convert x, y, and z dimensions to nm, then click Finish www.nature.com/scientificdata www.nature.com/scientificdata/ C. Note: If the data set representing an individual cell has been collected over several slice and view sessions, and is broken up into sections, follow the steps for each individual section. D. Alignment. www.nature.com/scientificdata www.nature.com/scientificdata/ • Is a measure of mutual dependence between the two variables.
• Given a reference image, and a second image which needs to be put into the same coordinate system as the reference image, the image is rotated and translated (in X-Y) until the mutual information between it and the reference image is maximized. • Considers all voxels in the images to be registered to estimate the statistical dependence between corresponding voxel intensities.
Note. Maximum translation: • Determines the maximum translation that will be allowed between two slices.
• Small setting limits to 0.5% of the size of the image.
(b). If the cell originally had two or more sections (concatenated stack), first: Registration method > Manual → move reference slice (blue) to match the current slice (red). (1). Once the first slice of each section in the concatenated stack is manually aligned to the previous slice (of the different stack), follow Mutual info alignment steps as above (a).
(c). Go through the aligned stack, and make sure there are no horizontal lines on either orthogonal view (indicating misalignment).
(1). Once satisfied with the result, click Create New Data Set > OK.
E. Filtering. The Histogram balancing, Unsharp and Mean shift filters were found most useful in processing the datasets. Also could be considered the Slope map; CLAHE; Sobel; Median; Bilateral; Laplacian; and Gaussian filters.
1. To apply a filter, run Tools > Image Processing Toolbox > Import Image Files > (select stack/concatenated stack) > (input voxel dimensions in nm, as in step III). 2. If brightness/contrast varies between slices within the stack, consider using Histogram Balancing filter, and choose reference slice. Note: Histogram balancing filter spreads out the most frequent intensity values in an image. The equalized image has a roughly linear cumulative distribution function.
(a). Reference slice -a representative slice with ideal brightness/contrast ratio that will be applied to ALL slices. (b). Settings found useful: unit: 8 (default) > reference slice: __ (enter slice number).
3. If the stack appears blurry (slight ambiguity in details), consider using Unsharp filter Note: Unsharp filter sharpens an image by subtracting a blurred version of itself from the original.
(a). May make more fuzziness to stack, follow up with step 4 (below www.nature.com/scientificdata www.nature.com/scientificdata/ A. Start-up and Data Import. 1. Open Amira application, and go to Open Data > (select filtered stack) > Out-of-Core Data window will appear. (a). If 200 GB of hard drive space is available, select "Read complete volume into memory"; if not, select "Convert to Large Data Access (LDA) format".
2. Image Read Parameters window will appear, input voxel size dimensions in microns.
1. Once data has been loaded, select the "Segmentation" tab. This is the Segmentation Editor where the stack will be reconstructed. (a). A LabelField will automatically be generated that corresponds to the imported stack.
2. To add a material (which will be for one cilium), select "Add" and a material field will appear. C. Segmentation procedure.
1. From there, on the Selection window, press the paintbrush tool. This will be used for segmenting the cilia. 2. Change the size of the brush to 7 if working in 1:1 zoom. 3. To segment, trace the circumference/perimeter of the cilium and repeat every 2-3 slices (for XY view) or every 15-30 slices (for XZ view). 4. Add selection as a 'new material' . 5. Go to Segmentation > Fill holes > All slices to fill the area within the selection on all slices with outlined membranes. 6. Go to Selection > Interpolate to select the voxels in between the selected planes, and add selected voxels to the same 'material' . 7. OHC segmentation.
(a). For tall and medium stereocilia row, use XY orthogonal view. (b). For small stereocilia row, use XZ orthogonal view. Note: All stereocilia are within the same LabelField, but each cilium represents an individual material.
8. Gold beads. Note: only gold beads near the segmented cilia were segmented.
(a). Use XY orthogonal view with brush size 5 in 1:1 zoom. (b). Only identify the brightest/most obvious gold beads; skip repeats in following slices. Note: use different material than segmented cilia; each gold bead added to the SAME material. V. Post-segmentation -Saving data and tables.
A. To save the Amira project (LabelFields + stack), go File > Save Project As… > Select "Minimize Project Size" > (select folder to save) > Save data type as "Amira Project and data files [pack & go] (.hx)". B. If you wish to save only the LabelFields (segmented rows + gold), the only files required for the MATLAB code analysis, then on the Project tab select the LabelField desired > right-click > Save Data As > Save data type as "Amira RLE (.am)". C. After segmentation of stereocilia bundles and cuticular plate (including the gold beads) is complete, turn on the 3D volume rendering view on the Segmentation tab. D. Open an Excel file to specify which stereocilia are connected to each other.
1. Values to be placed in the excel spreadsheet are the "intensities" of the cilia, and reflect their ID numbers. (a). To find stereocilia ID numbers: (1). Open LabelField in Amira, and change Image Source (on Segmentation Editor) to "No Source".
(2). Hover over each cilium with the cursor, and look at the bottom-right corner "Selection" window to get the intensities. Note: All LabelField '.am' files can also be opened in ImageJ, in which case the ID number of each cilium is the "index" value displayed on the bottom line of the main window of ImageJ upon positioning the cursor over the cilium of interest. For all background voxels, the "index" value is set to zero (assigned to any material).
2. Order of rows within the spreadsheet is expected to be as follows: Kinocilium (identified as row 1 towers by the MATLAB scripts), tallest row stereocilia (row 2 towers), middle row stereocilia (row 3 towers), and shorter row stereocilia (row 4 towers). Instructions on how to prepare the two necessary (2020) (a binary  table excluding incomplete cilia as 0 and including the complete cilia as 1) as "TowerUsage.csv".
1. "LinkTable.csv" allows the MATLAB algorithm to determine the orientation of each cilium based on provided information on which cilium is likely connected with which neighboring one with a "tip link" to determine each stereocilia's mechanosensitive direction. The text below refers to the 'north' direction for each stereocilium, which is a vector pointing 180° away from the mechanosensitive direction. 2. "TowerUsage.csv" table allows one to exclude individual stereocilia from the analysis and was used to exclude some stereocilia that were only partially included in the dataset. For example, see Cell#350 and the test Dataset 774, described in a greater detail within the Technical Validation section below. 3. Save both tables as 'csv' files.
D. It is important to get the correct file names and formats for the script to successfully execute the MAT-LAB analysis step.
3D-structure reconstruction, surface rendering. An additional step was taken to generate the surfaces of the segmented volumes for demonstration purposes. This step has been carried out in Amira, by generating the surfaces of all objects selected for the illustration, and rendering a movie. The color coding used in the rendered movies was only intended for illustration purposes, presenting stereocilia of each row in a distinct shade of blue (Fig. 3). The row identities were kept consistent with the 'LinkTable.csv' file used in MATLAB analysis. In order to generate and edit the surface on the 'Project tab' the created LabelField was selected, then 'Generate Surface' module was connected to the LabelField. Executing the 'Generate Surface' function takes some time, depending on the size of the model. When creating the 'Generate Surface', the degree of smoothing applied to each surface was modified as needed. This is particularly important, as in some cases extensive smoothing could cause some small or thin structures to disappear. A smoothing type port 'uncontrasting smoothing', set to smooth level 5 was used to visualize the cell body. It created a smoothed surface with few constraints and some loss in details. For the LabelField 'Towers.am' containing the cilia segmentation, and the LabelField 'GoldParticles.am' containing the gold bead segmentation, no smoothing was applied. For these files, each surface appeared as a block rendering, preserving the highest level of detail as originally segmented. The'Surface View' module was then used to visualize the entire model in the Amira viewer window. For 'Towers.am' Draw Style'transparent' was applied to superimpose the surfaces using a'base transparency level' set to 0.    www.nature.com/scientificdata www.nature.com/scientificdata/ The output result is summarized in a movie illustrating 3D localization of PKHD1L1 in immunogold labeled outer hair cell stereocilia bundles using FIB-SEM deposited at figshare 27 , and published with permission from the original study 17 . These were saved as.hx 'Amira Project and data files [pack & go]' and are available for download as compressed (zip) folders as part of the submitted datasets 34,42,49,58,65,70 . Volume analysis. The post-segmentation analysis was performed using a custom MATLAB algorithm 75 , aiming to quantify the distribution of gold beads in 3D. The scripts are deposited with this submission, are available for download (https://doi.org/10.7295/w9cil50728), and a detailed, step-by-step description of the algorithm is outlined below.
Stereocilia registration and gold bead quantification were performed in MATLAB. Due to the tower-looking nature of stereocilia and kinocilia, we refer to them as towers for the volume analysis. Three scripts were generated to carry out the analysis: (1) 'towersOfGold'; (2)

'togDisplayResults'; and (3) 'togAggregateResults'.
These scripts expect a data organization with the folders as shown below: All_Cells Cell3 50 GoldParticles.am LinkTable.csv Towers.am TowerUsage.csv Cell4 59 GoldParticles.am LinkTable.csv Towers.am TowerUsage.csv As illustrated, input data for each cell consist of 4 files: 'GoldParticles.am', 'Towers.am', 'LinkTable.csv', and 'TowerUsage.csv'. The '.am' files are segmentation volumes obtained with Amira, and correspond to tower labels as well as gold bead masks. 'LinkTable.csv' contains annotations of the neighborhood relationships between towers. Because the hair cell stereocilia bundle is a polarized structure, and the MET axis varies between the pairs of stereocilia within the bundle, we reasoned that the individual position of each stereocilium with regards to the MET axis should be taken into account when determining the average distribution of the gold beads across the population of stereocilia. Although the position of the tip link mediating the MET is the best predictor of the MET axis for any given pair of stereocilia, our FIB-SEM datasets did not visualize the tip links well enough to rely on their position when determining the MET axis. Instead, we developed an alternative approach in which each stereocilium within the bundle was assigned its 'most likely' connection with stereocilia from the neighboring rows based on their position.
The process of assigning stereocilia within the 'LinkTable.csv' file is illustrated in Fig. 4 using a subset of stereocilia from Cell#3 50 , while the final result is presented in Table 4. In this example, tower 109 in row 1 (the kinocilium) is immediately behind tower 18 in row 2, which in turn is flanked by towers 17 and 19 (all tall row stereocilia). Zeroes in such table are used as placeholders, and do not correspond to any tower. The structure and content of Table 4 is similar to the corresponding 'LinkTable.csv' file, although the latter omits the column and row headings.
'TowerUsage.csv' is a binary table, of identical size as 'LinkTable.csv' , indicating which towers shall be considered during analysis. For example, a binary table corresponding to towers presented in Table 4 is summarized in Table 5. The structure and content of Table 5 is similar to the corresponding 'TowerUsage.csv' file, although the latter omits the column and row headings.
In the excerpts above, all towers are used, but there are cases where some are excluded due to inaccuracies in segmentation or data collection (for example see Cell#3 50 , used for the analysis below).
'towersOfGold' script outputs two types of data: figures and tables. After running the 'towersOfGold' script on All_Cells/Cell3, for example, two folders will be created in folder corresponding to Cell3, subsequently named ' Figures', and 'Tables'. In 'Figures' folder, live MATLAB plots (fig format) are saved, also shown on Figs. 5 and 6. The figures can be opened in MATLAB upon completion of the analysis. In 'Tables' folder, a number of tables are saved, containing: • row-aggregated gold bead coordinates (X, Y and Z), entitled AgrGPRow1.csv, AgrGPRow2.csv etc.; • row-aggregated tower convex hull coordinates (AgrTwrRow1.csv, AgrTwrRow2.csv, etc.); • heights and radii of every tower within the same row (HtRadRow1.csv, HtRadRow2.csv, etc.); • average heights and radii of towers for each row (AvgHeightAndRadius.xls).

Tables can be visualized by running 'togDisplayResults'.
Finally, 'togAggregateResults' script collects and aggregates results of 'towersOfGold', as well as further computing 'cylinder' and 'polar' histograms of gold beads. In this dataset example, 'togAggregareResults' applied to   www.nature.com/scientificdata www.nature.com/scientificdata/ minDistThr, default 100 (in micrometers). A threshold distance from the gold bead to the nearest tower. If positioned further than this threshold, the bead is excluded from analysis.
The 'towersOfGold' script performs the following steps: 1. Setup output folders. 2. Read and resize data, sample border of tower masks, and compute centroids of gold bead blobs display  www.nature.com/scientificdata www.nature.com/scientificdata/ the same number of elements each, which increases robustness of the mean-shift clustering algorithm in the next step. Display Fig. 5b upon completion of the step. 4. Determine 'upward' direction from the outputs of Fig. 4b; this is defined as the direction such that the minimum projection of the towers data has the smallest standard deviation; i.e., it is the direction opposite the one where the tower extremities are approximately aligned to a plane. Direction candidates are obtained by clustering of directions from step 3, which is performed using mean-shift in a sphere 76 . Mean-shift outputs two nearly opposite directions. Display Fig. 5c upon completion of the step. 5. Associate gold beads to towers using either 'nearest neighbors' or 'distance transform' , and display Fig. 5d. 6. Align towers and gold beads to upward direction, by rotating them around their center of mass; rotation of a point about a vector in 3D by a certain angle is done using the Rodrigues rotation formula 77 , then display Fig. 5e.  7. Transform data such that upward direction is [0 0 1], and the bottom plane is z = 0. Display Fig. 5f showing the top view of towers, along with tower IDs, after transformation. Figure 6a shows perpendicular views (right, front, top) of a single tower, as well as its associated gold beads. 8. Compute azimuth direction for individual towers, according to data from 'LinkTable.csv' then display Fig. 6b. 9. Compute azimuth transforms for individual towers / gold beads, so that 'ahead' direction is [0 1 0]. Display  Fig. 6d, showing the original azimuth direction (in red) and the final azimuth after transform in green. Figure 6c shows the azimuth transform for a single tower (top view); the green arrow points north before the transform, while the red arrow points to north after the transform. 10. Apply azimuth transform to all towers and associated with them gold beads, taking 'TowerUsage.csv' into account. Display Fig. 6e. 11. Compute average height/radius per row of towers. 12. Register tower rows to common (template) tower using the average height/radius for each row of towers then display Fig. 6f. 13. Compute convex hull of aggregated towers; save convex hull data and gold beads locations to disk.
www.nature.com/scientificdata www.nature.com/scientificdata/ segment for each row, which was in turn calculated as the average height of stereocilia of that row divided by the number of segments used. Normalized labeling density was then calculated as the gold count per segment area per cilium, for each stereociliary sector, and presented as shown on Fig. 1n. In addition, a movie illustrating the 3D cumulative distribution map of anti-PKHD1L1 immuno-gold beads from six mouse OHC stereocilia bundles  'togAggregateResults' script. (a,b) The output figures display the gold bead distribution within a defined number of segments, which were further subdivided into a defined number of radial sectors. For this illustration, five segments along the length of stereocilia were chosen, numbered from 1 to 5 from the proximal end of stereocilia to its distal end, respectively. The most distal segment (#5 in this figure) always represents the volume above the tip of stereocilium. Each segment was further subdivided to four radial sectors (azimuth bins), numbered I, II, III and IV (blue, green, yellow and red diamonds, respectively), as shown on schematic diagrams on the left of all panels. Both panels, a and b represent the same data, and are displayed as a heat map. In a, the surface of the cylinder is represented in a 'side view' as a rectangle of the lateral surface of the tower. In b however, the surface of a cylinder is represented as a 'top view' arrangement of concentrically positioned rings, where the outermost ring is of the most proximal segment, while the innermost ring is of the area above the tip of stereocilium. (c) A 3D cumulative distribution map of gold beads. Although a specific view is shown on this panel, the actual figure generated by MATLAB is a 3D plot, and can be manually positioned at any angle. The schematic diagram on the left summarizes the actual orientation of the cilia in reference to the panels a and b using the blue, green, yellow and red diamonds. Yellow dots represent gold beads, while gray triangles represent the 'average' ciliary surface triangulated from the cumulative collection of surface coordinates pooled from all stereocilia analyzed. (2020) 7:182 | https://doi.org/10.1038/s41597-020-0509-4 www.nature.com/scientificdata www.nature.com/scientificdata/ (postnatal day 4) was deposited at figshare 27 . The movie, and the Fig. 1n represent the same data, and both are published with permission from the original study 17 .

Data records
Each dataset represents a set of raw image files and all related intermediate data sets derived from it, and are associated with the same outer hair cell. Some of the main parameters for each cell and related dataset are summarized in Table 1. Although the FEI Slice and View software package generated individual TIFF files for each section of the serial EM data set, the images were combined into a single multipage TIFF file prior to deposition at The Cell Image Library repository. This was done to promote easy data access, and allow for their online preview within the browser window using the "detailed view" link embedded within each entry. Dataset 1. Includes the following files, all related to Cell #1 of the current study. The numbering order of the files reflects the order of operations performed during the analysis. Group link: http://cellimagelibrary.org/ groups/50680 1.1_Cell_1_and_Cell6_Raw_Image_Stack 28 is a multipage TIFF file consisting of 388 individual EM sections acquired at 15 nm milling step, resulting in 2.43 × 2.43 × 15 nm 3 voxel size. This file also contains the raw data related to Cell#6.
Note: The data acquisition session was interrupted after frame #161 to adjust the imaging area, resulting in a large shift on subsequent frame #162.
1.2_Cell_1_Stack_Aligned 29 is a multipage TIFF file containing aligned images of Cell#1. The file contains 387 images.
1.3_Cell_1_Stack_Cropped 30 is a multipage TIFF file containing only the relevant volume of Cell#1, cropped to reduce the file size. The file contains 387 images, as the plane #1 of the original stack was removed.
1.5_Cell_1_Stereocilia_Segmentation 32 is a multipage TIFF file illustrating stereocilia segmentation results. Same as 'Towers.am' file used for MATLAB volume analysis of this cell, but converted to TIFF to allow for its online preview. The file contains 348 images, as all the planes that contained no segmentation data were removed.
1.6_Cell_1_GoldParticle_Segmentation 33 is a multipage TIFF file illustrating the gold bead segmentation results. Same as 'GoldParticles.am' file used for MATLAB volume analysis of this cell, but converted to TIFF to allow for its online preview. The file contains 348 images, as all the planes that contained no segmentation data were removed.
1.7_Cell_1_Surface_Rendering 34 file is a compressed (zipped) folder containing all files required to render the surface of the bundle. Also included is the 'ProjectLog' file produced by Slice and View acquisition software.
1.8_Cell_1_MATLAB_Volume_Analysis_Results 35 file is a compressed (zipped) folder containing all files required to execute the MATLAB volume analysis. The folder also includes the resulting output files and figures. Dataset 2. Includes the following files, all related to Cell#2 of the current study. The numbering order of the files reflects the order of operations performed during the analysis. Group link: http://cellimagelibrary.org/ groups/50688.
2.2_Cell_2_Stack_Cropped 37 is a multipage TIFF file containing only the relevant volume of Cell#2, cropped to reduce the file size.
2.5_Cell_2_Stereocilia_Segmentation 40 is a multipage TIFF file illustrating stereocilia segmentation results. Same as 'Towers.am' file used for MATLAB volume analysis of this cell, but converted to TIFF to allow for its online preview.
2.6_Cell_2_GoldParticle_Segmentation 41 is a multipage TIFF file illustrating the gold bead segmentation results. Same as 'GoldParticles.am' file used for MATLAB volume analysis of this cell, but converted to TIFF to allow for its online preview.
2.7_Cell_2_Surface_Rendering 42 file is a compressed (zipped) folder containing all files required to render the surface of the bundle. Also included is the 'ProjectLog' file produced by Slice and View acquisition software.
3.4_Cell_3_Stereocilia_Segmentation 47 is a multipage TIFF file illustrating stereocilia segmentation results. Same as 'Towers.am' file used for MATLAB volume analysis of this cell, but converted to TIFF to allow for its online preview.
3.5_Cell_3_GoldParticle_Segmentation 48 is a multipage TIFF file illustrating the gold bead segmentation results. Same as 'GoldParticles.am' file used for MATLAB volume analysis of this cell, but converted to TIFF to allow for its online preview.
3.6_Cell_3_Surface_Rendering 49 file is a compressed (zipped) folder containing all files required to render the surface of the bundle. Also included is the 'ProjectLog' file produced by Slice and View acquisition software.
3.7_Cell_3_MATLAB_Volume_Analysis_Results 50 file is a compressed (zipped) folder containing all files required to execute the MATLAB volume analysis. Folder also includes the resulting output files and figures. Dataset 4. Includes the following files, all related to Cell #4 of the current study. The numbering order of the files reflects the order of operations performed during the analysis. Note: The data acquisition session was interrupted after frame #224, and later resumed as a separate imaging session. This resulted in two stack files that were subsequently concatenated into a single multipage TIFF file. 4.4_Cell_4_Stack_Cropped_Aligned 54 is a multipage TIFF file containing aligned images of Cell#4, which were cropped to contain only the relevant volume, and reduce the file size.
4.6_Cell_4_Stereocilia_Segmentation 56 is a multipage TIFF file illustrating stereocilia segmentation results. Same as 'Towers.am' file used for MATLAB volume analysis of this cell, but converted to TIFF to allow for its online preview. The file contains 362 images, as all the planes that contained no segmentation data were removed.
4.7_Cell_4_GoldParticle_Segmentation 57 is a multipage TIFF file illustrating the gold bead segmentation results. Same as 'GoldParticles.am' file used for MATLAB volume analysis of this cell, but converted to TIFF to allow for its online preview. The file contains 362 images, as all the planes that contained no segmentation data were removed.
4.8_Cell_4_Surface_Rendering 58 file is a compressed (zipped) folder containing all files required to render the surface of the bundle. Also included is the 'ProjectLog' file produced by Slice and View acquisition software.
4.9_Cell_4_MATLAB_Volume_Analysis_Results 59 file is a compressed (zipped) folder containing all files required to execute the MATLAB volume analysis. The folder also includes the resulting output files and figures. Dataset 5. Includes the following files, all related to Cell #5 of the current study. The numbering order of the files reflects the order of operations performed during the analysis. Group link: http://cellimagelibrary.org/ groups/50713. 5.1_Cell_5_Raw_Image_Stack 60 is a multipage TIFF file consisting of 233 individual EM sections acquired at 20 nm milling step, resulting in 2.75 × 2.75 × 20 nm 3 voxel size.
5.2_Cell_5_Stack_Cropped_Aligned 61 is a multipage TIFF file containing aligned images of Cell#5, which were cropped to contain only the relevant volume, and reduce the file size.
5.3_Cell_5_Stack_Filtered 62 is a multipage TIFF file filtered with unsharp filter (unit: 8 > kernel: 3D, 5 > standard deviation: 1.0 > unsharp factor: 1.0). 5.4_Cell_5_Stereocilia_Segmentation 63 is a multipage TIFF file illustrating stereocilia segmentation results. Same as 'Towers.am' file used for MATLAB volume analysis of this cell, but converted to TIFF to allow for its online preview.
5.5_Cell_5_GoldParticle_Segmentation 64 is a multipage TIFF file illustrating the gold bead segmentation results. Same as 'GoldParticles.am' file used for MATLAB volume analysis of this cell, but converted to TIFF to allow for its online preview.
5.6_Cell_5_Surface_Rendering 65 file is a compressed (zipped) folder containing all files required to render the surface of the bundle. Also included is the 'ProjectLog' file produced by Slice and View acquisition software.
5.7_Cell_5_MATLAB_Volume_Analysis_Results 66 file is a compressed (zipped) folder containing all files required to execute the MATLAB volume analysis. The folder also includes the resulting output files and figures. Dataset 6. Includes the following files, all related to Cell #6 of the current study. The numbering order of the files reflects the order of operations performed during the analysis. Group link: http://cellimagelibrary.org/ groups/50720. www.nature.com/scientificdata www.nature.com/scientificdata/ Note: The data acquisition session was interrupted after frame #161 to adjust the imaging area, resulting in a large shift on subsequent frame #162.
6.2_Cell_6_Stack_Aligned_Cropped 67 is a multipage TIFF file containing aligned, then cropped images of Cell#6, which were cropped to contain only the relevant volume, and reduce the file size. No filters were used to analyze this dataset. File contains 324 images, as all the planes that contained no relevant data were removed.
6.3_Cell_6_Stereocilia_Segmentation 68 is a multipage TIFF file illustrating stereocilia segmentation results. Same as 'Towers.am' file used for MATLAB volume analysis of this cell, but converted to TIFF to allow for its online preview. File contains 324 images, as all the planes that contained no segmentation data were removed.
6.4_Cell_6_GoldParticle_Segmentation 69 is a multipage TIFF file illustrating the gold bead segmentation results. Same as 'GoldParticles.am' file used for MATLAB volume analysis of this cell, but converted to TIFF to allow for its online preview. File contains 324 images, as all the planes that contained no segmentation data were removed.
6.5_Cell_6_Surface_Rendering 70 file is a compressed (zipped) folder containing all files required to render the surface of the bundle. Also included is the 'ProjectLog' file produced by Slice and View acquisition software.
6.6_Cell_6_MATLAB_Volume_Analysis_Results 71 file is a compressed (zipped) folder containing all files required to execute the MATLAB volume analysis. Folder also includes the resulting output files and figures. Also included is the "All Cells combined gold bead quantification results" table with all gold bead quantification results across all six cells used in this study. Dataset 7. Technical Validation dataset. Group link: http://cellimagelibrary.org/groups/50725. 7.1_Test_Cell_Stereocilia_Segmentation 72 is a multipage TIFF file illustrating stereocilia segmentation results of Cell#3. Same as 'Towers.am' file used for MATLAB volume analysis of this cell, but converted to TIFF to allow for online preview. An artificial dataset with manually placed gold beads (yellow dots) was used to access the accuracy of the MATLAB analysis workflow, experimentally validating its performance. All stereocilia excluded from the analysis were labeled with white 'X' marks. (c,d) A 3D cumulative distribution map of manually placed gold beads generated by the MATLAB script. Top and side views (c and d, respectively) are shown for the kinocilium (left) and three rows of stereocilia. Yellow asterisks represent gold beads; gray dots represent the convex hull points of stereociliary surface. Illustrated gold beads are tightly clustered within the locations where they were intentionally placed by the authors. The choice of gold bead placement location was unique for each row of stereocilia to control for potential errors of data processing such as stereocilia orientation, azimuth correction, or placement within the rows during MATLAB analysis. No beads were placed on the surface of the kinocilium.
www.nature.com/scientificdata www.nature.com/scientificdata/ 7.2_Test_Cell_GoldParticle_Segmentation 73 is a multipage TIFF file illustrating the manually placed gold beads in a pattern as described in the Technical Validation section. Same as 'GoldParticles.am' file used for MATLAB volume analysis of this cell, but converted to TIFF to allow for online preview.
7.3_Test_Cell_MATLAB_Volume_Analysis_Results 74 file is a compressed (zipped) folder containing all files required to execute the MATLAB volume analysis. The folder also includes the resulting output files and figures. 75 . MATLAB script is a compressed (zipped) folder containing the MATLAB script files required to perform the abovementioned analysis (https://doi.org/10.7295/w9cil50728).

Technical Validation
A manually generated test Dataset 7 [72][73][74] was used to evaluate the performance of the MATLAB algorithm, testing it for any potential errors. One of the existing stereocilia segmentation files was used, in which all gold beads were removed. Using this data set, we manually placed a single gold bead per stereocilium, but varied its location among the rows of stereocilia (Fig. 8a,b). For the tallest row, the bead was placed at the tip on the surface opposite from the middle row stereocilia. For the middle row, the bead was placed on the surface away from the tallest row cilia; and for the short row stereocilia it was placed at the lower end near the cuticular plate, on the surface opposite from the middle row stereocilia. No beads were placed on the surface of the kinocilium. This provided a control for script's performance in the following tasks: • every tower (stereocilium) is properly oriented in space (for example, was not positioned upside down when its principal directions were computed); • every tower (stereocilium) is properly identified within the rows, as assigned by the user in the 'LinkTable.csv' file. That is evident, as each row in the test set has its unique, row-specific placement of the gold bead. In the event of any stereocilia mistakenly assigned to the wrong row, any misplaced gold beads would be evident in the resulting gold bead distribution map (Fig. 8c,d); • all towers were properly oriented during the azimuth transformation step: although without a high precision, each gold bead was manually placed as outlined above, and all of them were tightly clustered within the proper segment of stereocilia following the azimuth transform step. • each gold bead is assigned to the surface of the nearest stereocilium. In the event of any gold bead mistakenly assigned to the wrong stereocilium, such misplaced gold beads would appear misplaced in the resulting gold bead distribution map (Fig. 8c,d).
As seen from the gold bead distribution map on Fig. 8c,d, all stereocilia and gold beads were identified and analyzed as expected, validating the performance of the MATLAB algorithm in performing all abovementioned tasks.

Code availability
This study was carried out using the following software packages: Dragonfly 2017 by Object Research Systems, free for non-commercial users, and available for download at https://www.theobjects.com/dragonfly/index.html. Amira 6.2.0 by Thermo Fisher Scientific. MATLAB 2017a by MathWorks. The MATLAB script is deposited with The Cell Image Library (https://doi.org/10.7295/w9cil50728), available for download as Dataset 8 75 of this submission.