Comparison between a deep-learning and a pixel-based approach for the automated quantification of HIV target cells in foreskin tissue

The availability of target cells expressing the HIV receptors CD4 and CCR5 in genital tissue is a critical determinant of HIV susceptibility during sexual transmission. Quantification of immune cells in genital tissue is therefore an important outcome for studies on HIV susceptibility and prevention. Immunofluorescence microscopy allows for precise visualization of immune cells in mucosal tissues; however, this technique is limited in clinical studies by the lack of an accurate, unbiased, high-throughput image analysis method. Current pixel-based thresholding methods for cell counting struggle in tissue regions with high cell density and autofluorescence, both of which are common features in genital tissue. We describe a deep-learning approach using the publicly available StarDist method to count cells in immunofluorescence microscopy images of foreskin stained for nuclei, CD3, CD4, and CCR5. The accuracy of the model was comparable to manual counting (gold standard) and surpassed the capability of a previously described pixel-based cell counting method. We show that the performance of our deep-learning model is robust in tissue regions with high cell density and high autofluorescence. Moreover, we show that this deep-learning analysis method is both easy to implement and to adapt for the identification of other cell types in genital mucosal tissue.

Sexual transmission of HIV, which occurs at the anogenital mucosa, accounts for the majority of new HIV infections worldwide 1 .The availability of target cells bearing the HIV co-receptors CD4 and CCR5 in genital tissue is a critical determinant of productive infection [2][3][4][5] .The inner foreskin of the penis is enriched in HIV-susceptible T cell populations and other cell types in the foreskin such as macrophages, dendritic cells, and Langerhans cells, which can contribute to the spread of HIV by direct infection or by disseminating viral particles to susceptible T cells 6,7 .The genital mucosa is exposed to sexually transmitted pathogens, commensal bacteria, and exogenous materials (hygiene products, genital secretions, etc.) that can induce local inflammation and increase HIV susceptibility [8][9][10][11][12][13] .Thus, it is essential to directly quantify immune cells in genital mucosal tissue when testing new prevention modalities or determining best practices to prevent HIV transmission.
High-parameter flow cytometry can provide detailed information on the phenotype and activation state of immune cell subsets, but is ill-suited for measuring immune cell density in tissue 14 .Disruption of tissue into single cell suspensions for flow cytometry reduces the viability of sprawling cells, such as dendritic cells and Langerhans cells, and results in the loss of information on the spatial distribution of cells within tissue.Furthermore, tissue disruption may alter the expression of cell surface markers relevant to viral entry.Immunofluorescent (IF) imaging overcomes the limitations imposed by tissue disruption but introduces new challenges in quantification.Structural proteins and collagen fibers in genital mucosal tissue exhibit high levels of autofluorescence that can obscure positive staining and make cell quantification challenging 15 .Manual counting remains the gold standard but is laborious and variability and bias may be introduced by the operators.To overcome these issues, significant effort has recently been made to automate cell identification (cell segmentation) and quantification [16][17][18][19][20] .

Immunofluorescence microscopy
Foreskin tissues used in this study (n = 40) were collected from adult males undergoing elective penile circumcision as part of a larger study on HIV prevention in Uganda 26 .Frozen tissue blocks were prepared in the Uganda Virus Research Institute-International AIDS Vaccine Initiative laboratory in Uganda immediately after circumcision and then shipped to the University of Western Ontario in Canada for immunofluorescent microscopy analysis.Frozen tissue blocks were sectioned at a thickness of 8 µm using a CM1850 cryostat (Leica, Wetzlar, Germany) and two sections from each tissue block were adhered to a glass slide for staining.Slides were stored in an air-tight box at − 80 °C for up to 1 month prior to staining.Slides were thawed and air-dried for 30 min, and tissue sections fixed by applying 100 µl of 3.7% formaldehyde in PIPES buffer (100 mM PIPES, 2 mM MgCl 2 , 1 mM EGTA in PBS, pH 6.8) for 5 min at room temperature.Slides were washed 3 times in 1 × PBS between all subsequent blocking and staining steps and all staining was performed in the dark at room temperature, unless otherwise stated.Each tissue section was blocked by applying 100 µl of a solution consisting of 10% normal donkey serum, 0.1% Triton X-100, and 0.01% sodium azide diluted in 1 × PBS (henceforth referred to as blocking solution) for 30 min.Each section was incubated with 100 µl of undiluted anti-human primary CD3 antibody (Clone: SP7) (Abcam, Cambridge, United Kingdom) for 1 h at 37 °C, followed by 30 min incubation with 100 µl of donkey anti-rabbit Alexa Fluor 488-conjugated secondary antibody (Polyclonal, diluted to 0.25% in blocking solution) (Abcam).Next, sections were each incubated with 100 µl of mouse anti-human CCR5 primary antibody (diluted to 5% in blocking buffer) (Monoclonal, a gift from Dr. Matthias Mack, University of Regensburg, Germany) for 1 h at 37 °C, followed by 30 min incubation with 100 µl of donkey anti-mouse Alexa Fluor 647-conjugated secondary antibody (Polyclonal, diluted to 0.25% in blocking solution) (Abcam).Finally, sections were each incubated with 100 µl goat anti-human CD4 primary antibody (Polyclonal, diluted to 5% in blocking buffer) (Abcam) overnight followed by 30 min incubation with 100 µl of donkey anti-goat Alexa Fluor 568-conjugated secondary antibody (Polyclonal, diluted to 0.25% in blocking buffer) (Abcam).Coverslips were mounted onto stained slides by applying 100 µl Fluoromount G mounting media with DAPI (Thermo Fisher Scientific, MA, USA) per slide.Slides were stored at 4 °C for up to one week prior to imaging.Tiled images for whole tissue sections were scanned with a DM5500B fluorescence microscope (Leica, Wetzlar, Germany) using the 20 × objective lens for CCR5 (Y5 filter set, referred to as far-red channel), CD4 (DSR filter set, referred to as red channel), CD3 (GFP filter set, referred to as green channel), and cell nuclei (CFP filter set, referred to as blue channel).Representative staining is depicted in Supplementary Fig. 1.Excitation and emission filters are listed in Table 1.

Manual counting of HIV target cells
All possible cell type combinations positive for CD3, CD4, CCR5, and nuclei were counted individually (i.e., all DAPI +, CD3 + /DAPI +, CD4 + /DAPI +, CCR5 + /DAPI +, CD3 + CD4 + /DAPI + CD3 + CCR5 + /DAPI +, CD4 + CCR5 + /DAPI +, and CD3 + CD4 + CCR5 + /DAPI +).Total cell count was determined by counting all positive stained nuclei (DAPI).Manual counting of each cell type was completed using composite images of nuclei staining and the marker(s) of interest only (e.g., manual counting of CD3 + cells was performed on a composite image created by merging the CD3 and DAPI channels).Counting was completed by a single trained lab member using the Cell Counter plugin for Fiji.A cell was considered positive for a marker when positive marker staining overlapped, surrounded, or was directly adjacent a positive nucleus (Supplementary Fig. 2).

Pixel-based counting using Fiji and CellProfiler
Pixel-based cell counting was performed using an algorithm based on the open-source image analysis programs Fiji and CellProfiler [27][28][29] .Image processing functions in Fiji were used to pre-process images.Both Fiji and Cell-Profiler were used for thresholding and cell segmentation after pre-processing 27,28 .The epidermis and dermis underwent pre-processing, thresholding and segmentation using independent parameters, as the epidermis and dermis are very different in terms of cell density and level of autofluorescence.
Prior to pre-processing, full tissue section scans were split into the dermis and epidermis via manual tracing in Fiji and saved as separate files.The epidermis and dermis images were then split into individual channel images (CD3, CD4, CCR5, nuclei) by applying the "Stack to Images" function in Fiji.All individual channel images were pre-processed by applying the "Subtract Background", "Brightness/Contrast", and "Minimum Filter" functions successively.The "Subtract Background" and "Minimum Filer" functions were applied using parameters listed in Table 2; the "Auto" setting was used for used for all "Brightness/Contrast" adjustments.
Fiji was then used to apply threshold levels to pre-processed images and to select for positive staining.Preprocessed images were converted to 8-bit format and the "Adaptive Thresholding" plugin for ImageJ was used to threshold images 27 .The "Fill Holes", "Despeckle", "Erode", and "Watershed" functions were then applied successively.Positive marker staining selection was completed by applying the "Analyze Particles" function with the options for "Display Results", "Clear Results" and "Add to Manager" checked, and the "Circularity" setting set from 0.00 to 1.00 to detect all objects.The regions-of-interest (ROIs) generated after applying the "Analyze Particles" function were saved and overlayed with a black 8-bit image with the same dimensions as the image being analyzed to create a binary image representing positive marker signal.The number of ROIs for nuclei staining represents the total cell count of the image.Parameter values for each function are listed in Table 3.
CellProfiler was then used to quantify cells of interest by determining the number of instances where each combination of positive marker staining overlapped with positive staining of nuclei.The binary black and white images of positive marker staining were imported into CellProfiler using the "ConvertImageToObjects" function.The "RelateObjects" function was then applied to determine positive marker staining for CD3 and/or CD4  www.nature.com/scientificreports/and/or CCR5 that overlaps with positive nuclei staining 27,28 .Scripts for high-throughput pre-processing and automated counting of images are available in the public domain (https:// github.com/ prodg erlab/ pixel-basedquant ifica tion).

Annotation of images for StarDist model training in Labkit
A total of 40 fields-of-view (FOVs) were randomly generated from both inner and outer whole tissue section scans (600 × 600 µm, 1500 × 1500 pixels) for manual annotation for StarDist model training.Prior to annotation, all FOVs were visually assessed for exclusion of areas with tissue folding (due to improper adherence of tissue sections to glass slides during cryosectioning).Each FOV contained the full thickness of the epidermis (both the apical and basal edges of the epidermis clearly identifiable) and at least half of the full thickness of the dermis.All possible marker combinations (described above, in manual counting) were annotated individually in each FOV using the Labkit plugin (version 0.3.7)for Fiji 27,30 .All cells were then manually traced and assigned an individual label with the override option selected to prevent overlapping cells (Supplementary Fig. 2C,F).Completed annotation images were saved as separate tiff files.Each annotated set used for training included 1 raw 4-channel FOV image and 8 annotated images (considered as ground truth for training).The scripts used for model training and export were adapted from example Jupyter notebooks in the StarDist repository, which provided instruction for 2D (U-Net-based) StarDist model training.Adjustments were made so that IF microscopy images with 4 channels can be used as input.Image augmentation functions were adapted to include random flipping and intensity modification of training images, to improve the accuracy of the StarDist model without the need for additional training images and annotation.Training was completed with the n_rays parameter for Config2D set to 32 and all other parameters were left to their default value in accordance with the Jupyter notebook example.The full dataset used for training is available in the public domain (https:// zenodo.org/ record/ 80919 14).Model training was performed for 400 epochs with 100 steps per epoch.Probability and overlap parameters for non-maximum suppression (NMS) post-processing were optimized and applied when using the model in the StarDist plugin (version 0.6.0)for Fiji 25,27 .

StarDist model training
When using the model for prediction, images fed into the model were first normalized (percentile range 1.7 to 99.8).The model was then run with the probability/score threshold set to 0.50, overlap threshold set to 0.70, number of tiles set to 16, and boundary exclusion set to 2. Our trained StarDist model with optimized parameter pre-loaded is available in the public domain (https:// zenodo.org/ record/ 80918 89).

StarDist model validation
We compared the performance of our StarDist model against manual counting and pixel-based automated cell counting using 10 randomly selected FOV images (different from training FOVs), containing images from both inner and outer foreskin tissues.Validation FOVs were 600 × 600 µm (1500 × 1500 pixels) in size, excluded areas with tissue folding, and included the full thickness of the epidermis and at least half of the thickness of the dermis.Manual counting of the validation images was completed by a single trained lab member.

Statistical analyses
For StarDist model validation, the difference in FOV cell counts between manual counting and (i) the pixel-based model, and (ii) the StarDist model, was determined and statistical significance (p < 0.05) was assessed using a paired t test.Using manual counting as the gold standard, the total number of true positive, true negative, false positive, and false negative counts were determined for the pixel-based and StarDist models, and used to calculate each models sensitivity, precision, false negative rate, and false discovery rate.
Performance of the StarDist and pixel-based models in areas of high autofluorescence was assessed by quantifying the number of high autofluorescence image crops with at least one instance of autofluorescence inaccurately counted as a cell, compared to manual counting 25,27,29 .Performance in areas of high cell density was assessed by counting the number of high cell density image crops with at least one instance of a cell being either falsely merged or falsely split, compared to manual counting.Differences in Cell Merging, Cell Splitting, and Autofluorescence Misidentification between the Pixel-Based method and StarDist method was assess using Fisher's exact tests.www.nature.com/scientificreports/All results are presented as mean ± SDs.Graphics were generated and statistical analyses performed using Prism 9 (GraphPad software, La Jolla, CA, USA).

Accuracy of the custom StarDist model rapidly improves with additional training images
The accuracy of the StarDist model in cell segmentation (compared to manual counting) was quantified for nuclei (all cells), CD3 + cells, CD4 + CCR5 + cells and CD3 + CD4 + CCR5 + cells after training with 10, 20, 30, and 40 images (Fig. 1).Pooled cell counts were generated from the 10 validation FOVs and percent difference from manual counts was used as a measure of accuracy.The accuracy of the StarDist model increased with each addition of 10 annotated images for training (Fig. 1).The percent difference of automated StarDist counts from manual counts decreased exponentially, with only small gains in accuracy made between 30 and 40 training FOVs (0.34% decrease for all cells (nuclei), 1.8% decrease for CD3 + cells, 5.2% decrease for CD4 + CCR5 + cells, and 4.6% decrease for CD3 + CD4 + CCR5 + cells).
Automated cell segmentation of HIV target cells in foreskin tissue using our custom StarDist model had higher sensitivity and precision than the pixel-based approach for all cell types.Sensitivity of the StarDist model was > 94.0% for all cell types analyzed and > 97.5% for CD3 + CCR5 + cell counts, CD4 + CCR5 + cell counts, and CD3 + CD4 + CCR5 + cell counts.Meanwhile the sensitivity of the pixel-based workflow was < 85.0% for all cell types except for counts for total cell count (93.6%) and CD3 + cells (91.1%).In particular, the sensitivity of the pixel-based model for CD3 + CD4 + CCR5 + cells was only 60.7%.Both the StarDist and the pixel-based models had > 90.0% precision for all cell types except for CD4 + CCR5 + cells and CD3 + CD4 + CCR5 + cells (StarDist: 87.9% and 85.2%, respectively; pixel-based: 69.8% and 78.0%, respectively) (Supplementary Table 1).The StarDist model also had a lower false negative rate and lower false discovery rate than the pixelbased model all cell types.The false negative rate using the StarDist model was < 5.2% for all cell types, while the false negative rate for the pixel-based model ranged from 6.4% (total cell count) to 39.3% (CD3 + CD4 + CCR5 + cells).The false discovery rate for the pixel-based model was 1.7-fold (CD3 + cells) to 16.4-fold (CD3 + CD4 + CCR5 + cells) higher than the StarDist model.The false discover rate of both methods was < 10.0% for all cell types, except for CD4 + CCR5 + cells and CD3 + CD4 + CCR5 + cells (StarDist: 12.1% and 14.8%, respectively; for, pixel-based: 30.2% and 22.0%, respectively) (Supplementary Table 1).

StarDist does not systematically over-or under-count in different tissue regions
The epidermis and dermis regions of the foreskin are very different in terms of cell density and autofluorescence (Supplementary Fig. 1).To overcome this challenge, many pixel-based workflows (including the one presented here) process the epidermis and dermis/lamina propria regions separately, using different image analysis parameters, which increases processing time and labour.To determine if our StarDist model is also affected by systematic counting errors in different tissue regions, the 10 validation FOVs were manually split into epidermis and dermis regions and cell counts were obtained by the three methods (manual, pixel-based, StarDist) for each (Fig. 4).

Automated cell segmentation with StarDist is robust in areas with high cell density or high autofluorescence
Automated cell quantification methods often perform poorly in areas with high cell density (cell splitting) or high autofluorescence (false positives), both of which are found in foreskin and other mucosal/epithelial tissues.
To assess performance specifically in areas of high cell density and autofluorescence, image crops (70 × 70 µm) were selected from each FOV (Fig. 5A) based on having high autofluorescence (4 crops per FOV, n = 40, denoted by "aF") or high cell density (4 crops per FOV, n = 40, denoted by "hD").The performance of manual counts (Fig. 5B), pixel-based methods (Fig. 5C), and StarDist model (Fig. 5D) was assessed for each crop within the same tissue region.For high autofluorescence image crops, the number of images where autofluorescent collagen fibers were misidentified as cells (compared to manual counting, Fig. 5B,aF) was counted for the pixel-based (Fig. 5C,aF) and StarDist models (Fig. 5D,aF).For high cell density image crops, the number of images where cell merging or splitting occurred during cell segmentation (compared to manual counting, Fig. 5B,hD) were counted for the pixel-based (Fig. 5C,hD) and StarDist models (Fig. 5D,hD).
In high cell density image crops, the pixel-based model merged cells significantly more frequently than the StarDist model in all cell types except CD3 + CD4 + cells, CD4 + CCR5 + cells, and CD3 + CD4 + CCR5 + cells, for which minimal merging was observed in both models (Supplementary Table 2).The pixel-based model also split cells more frequently, splitting nuclei, CD3 + cells, CD4 + cells, and CCR5 + cells significantly more frequently than the StarDist model.In general, the number of image crops where cell merging occurred was twofold (CD4 + CCR5 + cells) to sevenfold (CD4 + cells) higher when using the pixel-based model compared to the StarDist model.Image crops with cell merging or splitting were most common for the segmentation of nuclei, likely due to the high number of nuclei per crop.
Misidentification of autofluorescence as cells occurred significantly more frequently when using the pixel-based method, with misidentification observed for all cell types except CD4 + CCR5 + cells and CD3 + CD4 + CCR5 + cells (Supplementary Table 2).For both the pixel-based approach and StarDist, misidentification of autofluorescence as cells was most prevalent for CD3 + cell segmentation (35/40 image crops for pixel-based, 5/40 image crops for StarDist) (Supplementary Table 2).

Discussion
In this report, we described the development and validation of a deep-learning approach to quantify HIV target cells in foreskin tissues.Using the Snakemake workflow management system and the StarDist plugin for Fiji, we trained a custom StarDist deep-learning model that can be used to identify HIV target cells in multi-channel immunofluorescent microscopy images of foreskin tissue 25,27,31 .The accuracy of this deep-learning model was compared to manual counting and a conventional automated cell segmentation algorithm which utilizes a pixelbased approach 29 .The reliability of the deep-learning model was also further tested in tissue regions with high cell density or high autofluorescence.
Immunofluorescent microscopy is an excellent tool for understanding the spatial distribution of immune cells in genital mucosal tissues, however, quantification of cells in IF microscopy images can be difficult.The subjective and labour-intensive nature of manual counting makes this method impractical for clinical studies.Automated approaches for cell segmentation often use a pixel-based workflow, however, the reliability of this method is poor in regions with high cell density and high autofluorescence 17,32 .
Our deep-learning approach to quantify HIV target cells in the foreskin can replace conventional pixelbased methods for immune cell quantification in genital tissues and can be easily adapted to detect of a wide variety of immune cell subsets in different types of genital mucosal tissue.Our approach utilizes multi-channel IF microscopy images as input and enables the detection of multiple cell types (based on combinations of markers) simultaneously.We used StarDist and its plugin for Fiji as the backbone for the development because of its accessibility.StarDist is open source and can easily be incorporated into existing image analysis workflows 25,27 .Plugins that allow StarDist models to be run without any programming knowledge are available for popular open-source software such as ImageJ/Fiji, Napari, QuPath, Icy, and KNIME 27,[33][34][35] .Results from StarDist models are also highly reproducible since trained StarDist models and training datasets can be easily shared.
Our custom StarDist model exceeded conventional pixel-based cell segmentation in performance metrics including percent difference from manual counts, sensitivity, precision, false negative rate, and false discovery rate.Most notably, our custom StarDist model achieved over 94.0% sensitivity and over 85.0% precision for all cell types examined.There were no significant differences between StarDist counts and manual counts for all cell types examined.The percent difference of StarDist counts from manual counts was also less than 5.0% for all cell types except for CD4 + CCR5 + cells (11.3%) and CD3 + CD4 + CCR5 + cells (13.6%).These cell types were the rarest among all cell types we examined, thus fewer instances were available for training compared to other cell types.Detection of these cell types is also more complex, as image information from 3 channels are used to detect CD4 + CCR5 + cells and image information from 4 channels are used to detect CD3 + CD4 + CCR5 + cells.While the count accuracy for these cell types is likely to increase with more training, CD4 + CCR5 + cells are less likely to be T cells and more likely to be dendritic or Langerhans cells.A limitation of our StarDist model is that the star-convex polygons shape approximation strategy used may struggle to approximate the irregular shape of sprawling cells 36  www.nature.com/scientificreports/ in a method called SplineDist 37 .This method was shown to be capable of being incorporated into the StarDist architecture to accurately capture non-star-convex objects.
Cell detection using our StarDist model was robust in challenging tissue areas with high cell density or high levels of autofluorescence.Performance of conventional automated cell segmentation can struggle in both the epidermis and dermis region of foreskin tissue for different reasons.In the epidermis, thresholding can be difficult due to high cell density.In the dermis, thresholding can be difficult due to the high abundance of autofluorescent collagen fibers.In contrast to conventional automated cell segmentation that relies on thresholding, the StarDist model did not systematically overcount or undercount in the epidermis or dermis, and detected cells with high accuracy in areas of high cell density or high autofluorescence.Based on these results, we conclude that the performance of our StarDist model is comparable to manual counting and StarDist is superior to conventional pixel-based approaches for HIV target cell quantification.
A limitation of this study is that we did not validate performance of the StarDist model in other tissue types or on other microscopes.It is likely that additional training, using the Snakemake framework 25,27,31 , would be necessary to use this model, for example, on cervical tissues or at other sites.Furthermore, our StarDist model is trained using foreskin tissue stained for CD3, CD4, CCR5, and nuclei.While this marker set allows for quantification of all cells bearing the HIV co-receptors (CD4 and CCR5), additional training would be required to analyze tissues stained for different markers.Shared repositories of training sets with various tissue types may be used in the future to increase the generalizability of our StarDist model, without the requirement of new tissue collection from varied mucosae and denovo staining.However, existing repositories mainly contain IF images of only nuclei or histology images.Improving the diversity of training images in these repositories would greatly improve the adoptability of deep-learning methods for cell segmentation 38,39 .
An additional limitation of discrete cell segmentation methods, including our StarDist model, is that positive staining is only quantified if it is associated with a nucleus.The appendages of sprawling HIV target cells such as Langerhans cells and dendritic cells can often span through multiple tissue sections, resulting in cell membrane area that is positively stained for CD4 and CCR5 but unassociated with nuclei.Cell segmentation methods do not account for these appendages, and thus ignores some membrane surface area that may bind HIV virions.While pixel-based methods for image analysis are problematic for discrete cell segmentation, they are good for measuring the total amount of tissue area that is positively stained for a particular marker.Thus, pairing StarDist HIV target cell counting results with results from pixel-based quantification of CD4 + CCR5 + tissue area could be beneficial in terms of gaining a more holistic understanding of HIV susceptibility in foreskin tissue.
To our knowledge, this study is the first to describe a deep learning method for the quantification of HIV target cells in microscopy images of mucosal tissue.This is a highly relevant outcome for clinical trials aiming to reduce HIV susceptibility, as the availability of susceptible cells in the mucosa is directly linked to the likelihood that sexual exposure will lead to infection [40][41][42] .This is an important outcome not only for trials aimed at reducing target cell availability, but all trials of mucosally applied products, to ensure there are no off-target inflammatory effects that would unintentionally increase HIV-susceptibility 43,44 .Previous studies have extensively described deep-learning methods for segmentation of CD3 + cells but these methods focus on histology images 45,46 .Existing methods for deep-learning cell segmentation in IF microscopy images are not designed to enable simultaneous quantification of multiple cell types, limiting their utility to study the distribution and composition of immune cells.Implementation of these models also require a higher-level understanding of convolutional neural networks and software packages related to neural network construction 47,48 .
In conclusion, this StarDist deep learning approach is an accurate and time saving approach that can be used in a high-throughput manner to quantify immune cells in full mucosal tissue section scans.This model is more robust than conventional pixel-based methods in regions of high cell density and autofluorescence, and accurately quantifies cells in both the epidermal and dermal regions of the foreskin.This model can be easily applied and incorporated into existing ImageJ workflows, making it an appropriate replacement for conventional pixel-based methods for automated HIV target cell quantification that have previously been used in other studies.The high accuracy of this model enables high-throughput processing of samples from large patient cohorts and provides feasibility for the use of immunofluorescence imaging as reliable method for measuring HIV susceptibility in clinical trials.

Figure 1 .Figure 2 .
Figure 1.Training images rapidly improve the accuracy of a custom StarDist model used for the quantification of HIV target cells in foreskin tissue.Percent difference of automated counts generated by StarDist from manual counts for nuclei (all cells), CD3 + cells, CD4 + CCR5 + cells, and CD3 + CD4 + CCR5 + cells after training with 10 images, 20 images, 30 images, and 40 images.Automated counting was completed on 10 validation field-of-view images (600 × 600 µm) randomly generated from a set of 232 full foreskin tissue section scans stain for CD3, CD4, CCR5, and nuclei to identify HIV susceptible cells in the tissue.

Figure 3 .
Figure 3.Comparison of manual cell counts with automated cell counts generated using a pixel-based model and the StarDist model.Cell counting was completed on 10 validation field-of-view images (600 × 600 µm) randomly generated from a set of 232 full section scans (imaged at 200 × total magnification) of foreskin tissue stained for CD3, CD4, CCR5, and nuclei.Manual counts are presented in the center (black dots), automated counts from the pixel-based model on the left (pink dots), and automated counts from the StarDist model on the right (cyan dots).Gray lines connect counts acquired by each method from the same validation FOV.The black center line and box represent the mean and SD.Differences between manual counts and the two automated counting methods was assess using two-tailed paired T-tests.Significant differences are indicated by asterisks (*p ≤ 0.05, **p ≤ 0.01, ***p ≤ 0.001).

Figure 4 .
Figure 4. Automated cell segmentation using a pixel-based model systematically overcounts cells in the epidermis, and undercounts cells in the dermis.Cell counting was completed on 10 validation field-of-view images (600 × 600 µm) randomly selected from a set of 232 full section scans of foreskin tissues (inner and outer) stained for CD3, CD4, CCR5, and nuclei.Cell counting was completed on the full field-of-view images, and separately on the epidermis and dermis.The difference between automated counts and manual counts in (A) epidermis, (B) dermis, and (C) full tissue is displayed the percent difference from manual counts.Red dashed lines mark ± 10.0% difference from manual counts.

Table 1 .
List of microscope filters and their designations.LP (long pass filter), λ (wavelength in nanometers).*peak wavelength of light that passes through the filter and filter bandwidth.

Table 2 .
Input parameters for pre-processing workflow in Fiji for pixel-based cell counting.

Table 3 .
Input parameters for thresholding and segmentation in CellProfiler for pixel-based cell counting.Vol:.(1234567890)Scientific Reports | (2024) 14:1985 | https://doi.org/10.1038/s41598-024-52613-3 Training of the custom StarDist model was completed using a CUDA enabled Windows 10 workstation computer running a Window Subsystem for Linux with the Ubuntu (version 22.04) distribution of Linux installed.This workstation computer was equipped with an NVIDIA GeForce RTX 3090 graphics card to accelerate model training.The training environment was set up using the Mambaforge (version 0.24.0)distribution of Python 31ersion 3.10.5)forLinuxandTensorFlow (version 2.10.0)through a custom workflow created using Snakemake (version 7.12.0)31.The Snakemake workflow used to set up the training environment was adapted from instructions in the StarDist GitHub repository (https:// github.com/ stard ist/ stard ist) that originally described set-up of a Python training environment using the Anaconda distribution of Python.Our full Snakemake workflow, which includes all scripts used for model training and export, is available in the public domain (https:// github.com/ prodg erlab/ stard ist).