The SUSTech-SYSU dataset for automated exudate detection and diabetic retinopathy grading

Lin, Li; Li, Meng; Huang, Yijin; Cheng, Pujin; Xia, Honghui; Wang, Kai; Yuan, Jin; Tang, Xiaoying

doi:10.1038/s41597-020-00755-0

Download PDF

Data Descriptor
Open access
Published: 20 November 2020

The SUSTech-SYSU dataset for automated exudate detection and diabetic retinopathy grading

Li Lin ORCID: orcid.org/0000-0002-9789-0825^1,3^na1,
Meng Li²^na1,
Yijin Huang¹^na1,
Pujin Cheng¹,
Honghui Xia⁴,
Kai Wang³,
Jin Yuan² &
…
Xiaoying Tang¹

Scientific Data volume 7, Article number: 409 (2020) Cite this article

5653 Accesses
23 Citations
3 Altmetric
Metrics details

Subjects

Abstract

Automated detection of exudates from fundus images plays an important role in diabetic retinopathy (DR) screening and evaluation, for which supervised or semi-supervised learning methods are typically preferred. However, a potential limitation of supervised and semi-supervised learning based detection algorithms is that they depend substantially on the sample size of training data and the quality of annotations, which is the fundamental motivation of this work. In this study, we construct a dataset containing 1219 fundus images (from DR patients and healthy controls) with annotations of exudate lesions. In addition to exudate annotations, we also provide four additional labels for each image: left-versus-right eye label, DR grade (severity scale) from three different grading protocols, the bounding box of the optic disc (OD), and fovea location. This dataset provides a great opportunity to analyze the accuracy and reliability of different exudate detection, OD detection, fovea localization, and DR classification algorithms. Moreover, it will facilitate the development of such algorithms in the realm of supervised and semi-supervised learning.

Measurement(s)	diabetic retinopathy
Technology Type(s)	machine learning
Sample Characteristic - Organism	Homo sapiens
Sample Characteristic - Environment	hospital • laboratory environment
Sample Characteristic - Location	Guangdong Province

Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.13168106

Exudate identification in retinal fundus images using precise textural verifications

Article Open access 17 February 2023

Support vector machine and deep-learning object detection for localisation of hard exudates

Article Open access 06 August 2021

Automated feature-based grading and progression analysis of diabetic retinopathy

Article Open access 17 March 2021

Background & Summary

Diabetic retinopathy (DR) is one of the microvascular complications of diabetes mellitus and a leading cause of blindness among working-age adults in developed countries¹. It is estimated that currently 463 million adults in the age range of 20–79 years have diabetes, and this number will reach 700.2 million by 2045^2,3.

DR lesions include microaneurysms, hard exudates, soft exudates, hemorrhages, intraretinal microvascular abnormalities, neovascularization and so on, the most common ones of which are shown in Fig. 1. Hard exudates and soft exudates^4,5,6 typically manifest in an early stage of DR. Hard exudates are mainly composed of extracellular lipid, and are usually located in the outer layer of the retina. They can be either individual dots, continuous flaky spots, or circumferential lesions surrounding retinal edema or microaneurysm. Soft exudates are localized edema or infarcts in the nerve fiber layer. In fundus images, they appear white or pale yellow, having a round or elliptic shape, with fuzzy edges. Research has demonstrated that the area and amount of hard exudates can serve as potential discriminant indicators of the severity of DR⁷. And an increase in the number of hard exudates has been suggested to be associated with an increased risk of vision loss^8,9 as well as subretinal fibrosis in diabetic macular edema (DME)¹⁰.

In DR, an early detection and timely intervention is vital for protecting a patient’s visual function. Recent technological advancements in big data, computing power, and machine learning technologies have enabled fast and efficient computer-aided diagnoses of DR, wherein identification and quantization of exudates are essential components. During the past decade, various methods, which can be roughly divided into four categories (thresholding methods¹¹, region growing methods¹², morphology methods¹³, and machine learning methods¹⁴), have been developed for automatically detecting exudates. Machine learning methods, especially those with deep convolutional neural network architectures, have achieved overwhelming performance. Machine learning methods depend considerably on the sample size of training data and labels’ quality. Therefore, creating high-quality and large-scale training data has become a significant research direction in ophthalmic image analysis. For instance, the ORIGA^−light dataset was constructed for optic disc and optic cup segmentation^15,16. DRIVE and STARE are two classic fundus datasets for retinal vessel segmentation, and STARE also provides diagnostic information for a larger set of fundus images^17,18,19,20. In one of our previous works, we also developed a dataset containing 712 ocular staining images for corneal ulcer segmentation and classification^21,22. However, to the best of our knowledge, existing large-scale and well-annotated fundus image datasets with lesion annotations are relatively limited.

Segmentation and detection are the two most popular approaches for lesion identification. There are several differences between them: (1) Segmentation methods require pixel-level annotations, while the latter requires bounding boxes or contours; (2) Segmentation methods often require more computing resources and training-testing time; (3) The outputs of segmentation methods are often more precise. Although pixel-wise annotations have a higher labeling accuracy, the bounding or contouring approach for detection is more practically feasible and efficient. Clinically, both segmenting and detecting lesions are beneficial to quantify the severity of DR. Currently, there are several publicly-available datasets for exudate identification. The DIARETDB1_v2 dataset contains 46 fundus images with rough polygonal boundary annotations for exudates^23,24. HEI-MED^25,26, consisted of 169 samples, is constructed for detecting exudates in DME. They share common problems: the annotations are not precise enough for a segmentation purpose, and the sample sizes are relatively limited for training detection models (one fundus image is usually treated as one sample). The e-Ophtha EX and IDRiD datasets have more precise annotations on exudates at a pixel-level, but they are composed of only 47 and 81 fundus images^27,28,29,30.

In such a context, we develop a large-scale DR dataset, containing fundus images and the corresponding exudate detection annotations, left-versus-right eye labels, DR grades, the bounding boxes of OD, and fovea locations. This dataset will provide an excellent opportunity for developing and validating automated exudate detection algorithms, as well as DR classification algorithms. Furthermore, it can also be used for designing and testing OD identification and fovea localization pipelines. Overall, the dataset we construct in this paper provides a powerful resource for anatomical landmark detection, lesion detection, and DR classification based on fundus images.

Methods

Data collection

A total of 603 fundus images from DR patients and 631 fundus images from healthy people were collected from the Department of Ophthalmology, Gaoyao People’s Hospital and Zhongshan Ophthalmic Center, Sun Yat-sen University. All participants provided written informed consent complying with the approval requirements of the Medical Ethics Committee at Gaoyao People’s Hospital and Zhongshan Ophthalmic Center. This study followed the tenets of the Helsinki Declaration and was approved by the Medical Ethics Committee, Gaoyao People’s Hospital and Zhongshan Ophthalmic Center (2017KYPJ104).

DR patients with both type 1 diabetes and type 2 diabetes were included in this study. Diagnoses with diabetes were established according to the World Health Organization diagnostic criteria. Regular fundus photographs were taken from healthy people during their annual physical examinations. Exclusion criteria included: the refractive media were too cloudy to take a clear photograph; the diopter was greater than 6D; patients with systemic diseases other than diabetes that could also lead to ocular complications; patients with familial or hereditary ocular diseases; a history of ocular trauma; a history of medications that may cause ocular side effects (e.g., chloroquine, hydroxychloroquine, chlorpromazine, and rifampicin).

Before fundus photographing, participants would undertake slit-lamp and non-contact tonometer examinations. Tropicamide phenylephrine eye drops were applied for pupil dilation. When the pupil was dilated to be large enough (usually 8 × 8 mm²), a color fundus photograph would be taken for the participant using a fundus camera (Topcon, TRC-50DX, Japan). Images were saved in the JPG format (24-bit RGB), with a resolution of 2880 × 2136 pixels. Single-field central posterior 50° images, covering OD and macula, were analyzed in this study.

During the image quality control stage, we excluded 15 fundus images that are too blurry or of extremely large-area lesions (from the original selection). After image quality control, our dataset consists of 588 fundus images from DR patients and 631 fundus images from healthy people.

Image categorization

The grading of DR refers to the International Clinical DR Severity Scale³¹. The only difference was that we considered healthy fundus photographs without diabetes as stage 0 instead of “diabetes patients with no apparent retinopathy”. And considering that some patients may have been treated with retinal photocoagulation and that laser spots or scars may affect staging and detection, we grouped fundus photographs with laser spots or scars into a separate category. Typically, the presence of laser spots or scars on a fundus image indicates that the patient is of severe non-proliferative DR or proliferative DR (stage 3 or stage 4). Some lesions may disappear after receiving retinal photocoagulation, and thus the grade determined from the fundus image may be inconsistent with the patient’s actual DR severity grade, such as samples shown in Fig. 2. Three experienced ophthalmologists at Zhongshan Ophthalmic Centre of Sun Yat-sen University performed screening and grading of the fundus photographs. Specifically, every fundus photograph was read by two ophthalmologists independently, then a third ophthalmologist would re-annotate the ones with inconsistent annotations from the previous two ophthalmologists. The entire dataset was distributed as follows: 631 photographs were confirmed as normal healthy fundus; 24, 365, 73 and 58 photographs were respectively classified to be mild non-proliferative DR, moderate non-proliferative DR, severe non-proliferative DR, and proliferative DR; and 68 photographs were classified to be DR with laser spots or scars (Table 1). Representative examples in each categorization are shown in Fig. 3. Additionally, we also provided DR grading labels for each fundus image according to the protocol from the American Academy of Ophthalmology and the Scottish DR grading protocol to facilitate comparisons of our dataset with other existing datasets^32,33,34,35. Also, we provided DR grading labels for images in category 5 (fundus images with laser spots or scars) assessed by the three aforementioned protocols.

Table 1 Criteria of DR grading and the number of fundus photographs belonging to each category.

Full size table

Distinguishing whether a fundus image comes from a left eye or a right eye is one of the first steps in ophthalmic examinations. Generally, for most fundus images in the categories of stage 0 to stage 3, the left eye and right eye can be easily distinguished according to OD’s position and the direction of the retinal vessels, although there may be lesions existing. As shown in Fig. 4, in some cases of proliferative DR, the fundus images become blurry due to large-scale hemorrhages and exudates, and ODs become less prominent. In those cases, the left and right eyes can still be distinguished based on the residual blood vessel traces. Table 2 tabulates the numbers of left eye and right eye fundus images in each of the 5 stage categories as well as category 5 (DR with laser spots or scars). Overall, in terms of left-versus-right eye classification, our dataset is relatively balanced.

Table 2 he numbers of left eye and right eye fundus images within each of the 6 categories (stage 0 to stage 4 and category 5).

Full size table

Creation of annotations for exudate detection

As mentioned in the above subsection, fundus photographs of stage 0 are normal healthy fundus with no lesions and stage 1 fundus images contain only microaneurysms. Therefore, we only prepared the ground truth detection bounding boxes for exudates (including hard exudates and cotton wool spots) in fundus images of stage 2, stage 3, stage 4, as well as those with laser spots or scars, ending up with a total of 564 fundus images. In this work, we labeled the exudates according to the most common format in computer vision detection tasks, namely bounding boxes. As shown in Fig. 5, the entire annotation procedure went through the following four steps: (1) An experienced ophthalmologist from Zhongshan Ophthalmic Centre screened fundus images with exudates and identified them in the form of a coarse bounding circle, and then another ophthalmologist inspected the bounding circle and corrected if necessary, such as missing labels and incorrect labels; (2) Images identified to have exudates labels went through contrast limited adaptive histogram equalization (CLAHE) and adaptive gamma correction with weighting distribution (AGCWD) as preprocessing for the purpose of contrast enhancing and illumination correction^36,37; (3) A bounding box refining network (BBR-net) model (trained from the IDRiD dataset²⁸) was employed to refine coarse bounding boxes (generated from coarse bounding circles in step (1) into more precise bounding boxes (the four sides of the refined boxes were much closer to the boundary of each lesion area than the coarse ones); (4) A third ophthalmologist re-checked the output of the aforementioned model and made manual corrections again. Detailed information of step 2 and step 3 can be found in our previous work³⁸. Representative examples of exudate detection labels are shown in Fig. 6. All clinicians involved in exudate labeling followed the following criteria:

For relatively independent but still connected lesions, regardless of size and shape, in step (1), the boundary circle should include the entire area of the lesion. In step (4), the bounding box should be as close as possible to the edge of each exudate.
For a large and coarsely-connected lesion, there may be multiple smaller lesions inside. However, if the smaller sub-lesions are very close to each other and it is challenging to identify every single sub-lesion, they can be grouped and considered as one single lesion, as shown in exudates a and b in Fig. 7.
If the lesion label obtained from the above criterion 2 is very large such that there are a lot of background pixels included, the ophthalmologists separate it to be two exudate labels according to an appropriate boundary separation rule, as exudates b and c in Fig. 7 show.
Overlap between two exudate labels is allowed, such as exudates c and d in Fig. 7 show. The ophthalmologists only need to make sure that the boundary circle completely contains the exudate, and the bounding box contours the boundary of each exudate as close as possible.
In terms of other special cases, the ophthalmologists communicate with each other to reach consistent labeling criteria.

Creation of OD bounding box and fovea location annotations

Along with the annotations presented above, this dataset also provided center pixel locations of fovea (F_x,F_y) as well as bounding boxes of ODs (O_x1, O_y1, O_x2, O_y2) for all images. The procedure of creating those two labels consisted of the following two steps: automatic generation and manual correction. OD and fovea are two of the most important anatomical landmarks of fundus images. In one of our previous works³⁹, we trained a region proposal network and a cascaded network for automated OD detection in the form of a bounding box and fovea localization in the form of a pixel location identification. After that, an ophthalmologist visually examined the accuracy of the automatic results and performed manual corrections if necessary. The OD bounding box should be the smallest rectangle that bounds the OD and the fovea is defined to be the center of the macula. Figure 8 shows representative instances of the OD and fovea annotations.

Data Records

This dataset is publicly available at https://www.aiforeye.cn/ and https://doi.org/10.6084/m9.figshare.12570770.v1⁴⁰, which is stored as a zip file. In the unzipped folder, all the raw fundus images, the exudate annotations, the DR grading labels, and the OD and fovea location annotations are stored in three subfolders, namely “originalImages”, “exudateLabels”, and “odFoveaLabels”. In the “originalImages” folder, files are saved in the JPG format and named as “n.jpg”, with n ranging between 0001 and 1219 indicating the n^th sample. In that folder, we also provide a comma-separated-values (CSV) file named “drLabels.csv”, wherein the first column indicates the file name, the second column indicates the left-versus-right eye categories with 0 representing left eyes and 1 right eyes, the third column indicates the DR category assessed via the International Clinical DR Severity Scale (0 to 5, with 0 representing normal healthy, and 1 to 5 respectively representing mild non-proliferative DR, moderate non-proliferative DR, severe non-proliferative DR, proliferative DR, and DR with laser spots or scars), the fourth column indicates the DR grade assessed via the American Academy of Ophthalmology protocol, and the fifth column indicates the DR grade assessed via the Scottish DR grading protocol. Another CSV file named “c5_DR_reclassified.csv” provides the DR labels for images belonging to category 5 assessed via the three aforementioned protocols. The exudate detection labels, OD bounding box’s coordinates, as well as fovea location’s coordinates are saved in the XML format stored at the corresponding folders (namely “exudateLabels” and “odFoveaLabels”), following the same specifications as the Pascal Voc dataset⁴¹. Hard and soft exudates are labeled separately in this dataset. In the XML files, “ex” stands for hard exudates and “se” for soft exudates.

Technical Validation

It is worth mentioning that although some degree of automation was involved in generating all four types of labels provided in this work, expert verification was always performed as the last step to ensure the quality and correctness of the annotations.

For the OD bounding box and fovea location labels, they are relatively simple and had been labeled in a semi-automated manner. Specifically, automated OD bounding boxes and fovea locations were obtained from a deep learning model³⁹, the performance of which had been verified on a large set of fundus images. After that, one ophthalmologist checked the results and corrected if necessary. For the left-versus-right eye label, the definition is very straightforward, according to OD’s position and direction of the retinal vessels. Every fundus photograph was independently read by two ophthalmologists, and then a third one would re-annotate the ones with inconsistent judgments. For this label, the value of intra-class correlation coefficient (ICC)⁴² between the initial two ophthalmologists is 1 and thus the third ophthalmologist was not involved at all. For the DR grade label, the ICC between the initial two annotators is 0.91. The main difficulty lies in distinguishing between mild non-proliferative DR, moderate non-proliferative DR, and severe non-proliferative DR. For the exudate annotation, we calculate the Dice coefficient⁴³ between two exudate labels (boundary circle labels are transformed into binary masks, where the pixel value inside the circle is 1 and the pixel value outside the circle is 0) to assess the inter-rater agreement, and the mean Dice value between the initial two annotators is 0.89. In conclusion, for the four kinds of labels provided in our dataset, different annotators had high consistency/inter-rater agreement, ensuring the high quality of the annotations of our proposed SUSTech-SYSU dataset.

When constructing the exudate annotations, we also trained a BBR-Net model based on the exudate labels provided in the IDRiD dataset (combining soft exudates and hard exudates together). Evaluated on the IDRiD dataset, our BBR-Net can effectively refine coarse exudate annotations, with the average intersection-over-union (IoU)⁴⁴ being 0.8653 when compared with well-annotated bounding boxes (generated from the pixel-wise labels provided in IDRiD). Then, we applied the trained and validated BBR-Net to the automatic correction step in exudate label creation in this work. Additionally, experienced ophthalmologists have visually examined the quality of all 1219 fundus images used in this study to ensure adequate image quality. Our aiforeye platform also embedded a function of automated quality assessment for fundus images.

In order to quantify the relationship between lesion area and DR grade in the provided dataset, we calculate the total number, average number, total area, and average area of exudates contained in images belonging to each category of the provided dataset, which are demonstrated in Tables 3 and 4. Our entire dataset contains 15,652 exudates, and the total number of pixels inside all exudate bounding boxes are 212201128, accounting for 6.11% of the total area (3469547520). All these metrics were computed from the 564 fundus images with exudate annotations. It can be easily seen that the data in those two tables are in line with clinical knowledge. Many fundus images in the category of stage 4 had severe fibrous proliferation or severe vitreous hemorrhage, which obscured exudates. Therefore, the average area of exudates is the largest for images in the category of stage 3. The average area of either stage 2 or stage 4 is less than that of stage 3. After receiving retinal photocoagulation treatment, the number of exudates decreased and the average area is smaller than both stage 3 and stage 4.

Table 3 The total number of exudates contained in fundus images belonging to each category of this dataset.

Full size table

Table 4 The area (pixel numbers) of exudates contained in fundus images belonging to each category of this dataset.

Full size table

Even though the exudate detection labels were generated under the unanimous determination of three ophthalmologists, for exudates the edges of which are often not distinct or cover a large area, it is sometimes difficult to determine and justify which pixels should be included in a single bounding box. Therefore, there is still a certain degree of subjectivity in our exudate annotations. Finding a proper balance between pixel-level segmentation labeling and bounding box detection labeling is one of our future research directions.

Although our provided dataset is quality controlled, individual fundus images are relatively variable in terms of quality. Some are blurrier than others. With that being said, providing a large dataset containing both high-quality and relatively low-quality samples ensures more realistic model training so as to accommodate real clinical scenarios. In addition, this dataset may be also useful for advancing automated quality-enhancement techniques^45,46 for fundus images, especially in the context of DR screening.

Compared with the 81 samples in the IDRiD dataset and the 47 samples in the e-Ophtha EX dataset, the dataset we introduced in this paper has a relatively large sample size (564 samples in total) in terms of exudate detection tasks. However, in terms of DR classification and grading, this dataset is unbalanced to a certain extent, and the sample sizes of specific categories are relatively limited (mild non-proliferative DR, severe non-proliferative DR, and proliferative DR). In this case, training with machine learning, especially deep learning methodologies, may cause over-fitting problems. As such, in terms of the development of automated DR classification algorithms, this dataset may be more suitable for applying “Few-shot Learning” methods, the research topic of which has gradually received extensive attention and developments in the past few years⁴⁷. One of our future research efforts is to address this limitation.

Usage Notes

This dataset can be downloaded through the link mentioned above. Users of this dataset are expected to cite this paper in any research output generated from using this dataset as well as appropriately acknowledge the contributions of this dataset.

After copying all images from the “originalImages” folder to the “exudateLabels” and “odFoveaLabels” folders, users can directly open the provided fundus images and the corresponding exudate detection labels, OD bounding box’s coordinates, as well as fovea location’s coordinates using Labelimg⁴⁸ (a graphical image annotation tool, which can be accessed at https://github.com/tzutalin/labelImg). This tool provides functions of visualizations and modifications of annotations (according to research needs). Please note in order to display directly in Labelimg, the fovea location’s coordinates are transformed into a small box (F_x, F_y, F_x+1, F_y+1).

Code availability

In the process of constructing the dataset provided in this work, we used several automatic algorithms developed in our previous works^38,39. The source code for the bounding box refining network (BBR-net) can be accessed at https://github.com/YijinHuang/BBR-Net (or https://doi.org/10.5281/zenodo.4041331)⁴⁹ and code for OD detection and fovea localization are available upon request. Also, we have embedded all involved algorithms into a cloud platform that we developed. Users of this dataset can access the two automatic algorithms by visiting our website at https://www.aiforeye.cn/ and uploading fundus images for analysis. The functions provided by our platform include classification of left and right eyes, DR grading, lesion detection, identification of OD and fovea, as well as some additional functions such as retinal vessel segmentation and statistical analyses of vessel morphometrics and lesion abnormalities. Please note that our algorithms for segmenting and classifying corneal ulcers from ocular staining images (the dataset we published before)²¹ can also be accessed on this platform.

References

Yau, J. W. et al. Global prevalence and major risk factors of diabetic retinopathy. Diabetes care 35, 556–564 (2012).
Article Google Scholar
Saeedi, P. et al. Mortality attributable to diabetes in 20–79 years old adults, 2019 estimates: Results from the international diabetes federation diabetes atlas. Diabetes research and clinical practice 108086 (2020).
Sabanayagam, C. et al. Incidence and progression of diabetic retinopathy: a systematic review. The Lancet Diabetes & Endocrinology 7, 140–149 (2019).
Article Google Scholar
Akram, U. M. & Khan, S. A. Automated detection of dark and bright lesions in retinal images for early detection of diabetic retinopathy. Journal of medical systems 36, 3151–3162 (2012).
Article Google Scholar
Santhi, D., Manimegalai, D., Parvathi, S. & Karkuzhali, S. Segmentation and classification of bright lesions to diagnose diabetic retinopathy in retinal images. Biomedical Engineering/Biomedizinische Technik 61, 443–453 (2016).
Article CAS Google Scholar
Sidibé, D., Sadek, I. & Mériaudeau, F. Discrimination of retinal images containing bright lesions using sparse coded features and svm. Computers in biology and medicine 62, 175–184 (2015).
Article Google Scholar
Niu, S. et al. Multimodality analysis of hyper-reflective foci and hard exudates in patients with diabetic retinopathy. Scientific reports 7, 1–10 (2017).
Article ADS Google Scholar
Chew, E. Y. et al. Association of elevated serum lipid levels with retinal hard exudate in diabetic retinopathy: Early treatment diabetic retinopathy study (etdrs) report 22. Archives of ophthalmology 114, 1079–1084 (1996).
Article CAS Google Scholar
Lammer, J. et al. Detection and analysis of hard exudates by polarization-sensitive optical coherence tomography in patients with diabetic maculopathy. Investigative ophthalmology & visual science 55, 1564–1571 (2014).
Article Google Scholar
Fong, D. S. et al. Subretinal fibrosis in diabetic macular edema: Etdrs report 23. Archives of ophthalmology 115, 873–877 (1997).
Article CAS Google Scholar
Sánchez, C. I., Garca, M., Mayo, A., López, M. I. & Hornero, R. Retinal image analysis based on mixture models to detect hard exudates. Medical Image Analysis 13, 650–658 (2009).
Article Google Scholar
Li, H. & Chutatape, O. Automated feature extraction in color retinal images by a model based approach. IEEE transactions on biomedical engineering 51, 246–254 (2004).
Article Google Scholar
Sopharak, A., Uyyanonvara, B., Barman, S. & Williamson, T. H. Automatic detection of diabetic retinopathy exudates from non-dilated retinal images using mathematical morphology methods. Computerized medical imaging and graphics 32, 720–727 (2008).
Article Google Scholar
Khojasteh, P., Aliahmad, B. & Kumar, D. K. Fundus images analysis using deep features for detection of exudates, hemorrhages and microaneurysms. BMC ophthalmology 18, 1–13 (2018).
Article Google Scholar
Zhang, Z. et al. Origa-light: An online retinal fundus image database for glaucoma analysis and research. In 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology, 3065–3068 (IEEE, 2010).
Zhang, Z. et al. Origa-light: An online retinal fundus image database for glaucoma analysis and research. http://imed.nimte.ac.cn/resources.html (2010).
Staal, J., Abràmoff, M. D., Niemeijer, M., Viergever, M. A. & Van Ginneken, B. Ridge-based vessel segmentation in color images of the retina. IEEE transactions on medical imaging 23, 501–509 (2004).
Article Google Scholar
Hoover, A., Kouznetsova, V. & Goldbaum, M. Locating blood vessels in retinal images by piecewise threshold probing of a matched filter response. IEEE transactions on medical imaging 19, 203–210 (2000).
Article CAS Google Scholar
Drive: Digital retinal images for vessel extraction. grand-challenge https://drive.grand-challenge.org/DRIVE/ (2014).
Hoover, A., Kouznetsova, V. & Goldbaum, M. Stare: Structured analysis of the retina. http://cecas.clemson.edu/ahoover/stare/ (2000).
Deng, L. et al. The sustech-sysu dataset for automatically segmenting and classifying corneal ulcers. Scientific Data 7, 1–7 (2020).
Article MathSciNet Google Scholar
Deng, L. et al. The sustech-sysu dataset for automatically segmenting and classifying corneal ulcers. figshare https://doi.org/10.6084/m9.figshare.c.4526675 (2020).
Kälviäinen, R. & Uusitalo, H. Diaretdb1 diabetic retinopathy database and evaluation protocol. In Medical Image Understanding and Analysis, vol. 2007, 61 (Citeseer, 2007).
Tomi, K. et al. Diaretdb1-standard diabetic retinopathy database calibration level 1. https://www.it.lut.fi/project/imageret/diaretdb1/ (2007).
Giancardo, L. et al. Exudate-based diabetic macular edema detection in fundus images using publicly available datasets. Medical image analysis 16, 216–226 (2012).
Article Google Scholar
Giancardo, L. et al. The hamilton eye institute macular edema dataset (hei-med). GitHub https://github.com/lgiancaUTH/HEI-MED (2012).
Decencière, E. et al. Teleophta: Machine learning and image processing methods for teleophthalmology. Irbm 34, 196–203 (2013).
Article Google Scholar
Porwal, P. et al. Indian diabetic retinopathy image dataset (idrid): a database for diabetic retinopathy screening research. Data 3, 25 (2018).
Article Google Scholar
E-ophtha. http://www.adcis.net/en/third-party/e-ophtha/ (2013).
Porwal, P. et al. Indian diabetic retinopathy image dataset (idrid). IEEE Dataport https://ieee-dataport.org/open-access/indian-diabetic-retinopathy-image-dataset-idrid (2019).
Wilkinson, C. et al. Proposed international clinical diabetic retinopathy and diabetic macular edema disease severity scales. Ophthalmology 110, 1677–1682 (2003).
Article CAS Google Scholar
Kanski, J. J. Clinical ophthalmology: a synopsis (Elsevier Health Sciences, 2009).
Diabetic retinopathy (dr): management and referral. Community Eye Health 28, 70–71 (2015).
Google Scholar
Zachariah, S., Wykes, W. & Yorston, D. Grading diabetic retinopathy (dr) using the scottish grading protocol. Community eye health 28, 72 (2015).
PubMed PubMed Central Google Scholar
Flaxel, C. J. et al. Diabetic retinopathy preferred practice pattern. Ophthalmology 127, P66–P145 (2020).
Article Google Scholar
Setiawan, A. W., Mengko, T. R., Santoso, O. S. & Suksmono, A. B. Color retinal image enhancement using clahe. In International Conference on ICT for Smart Society, 1–3 (IEEE, 2013).
Huang, S.-C., Cheng, F.-C. & Chiu, Y.-S. Efficient contrast enhancement using adaptive gamma correction with weighting distribution. IEEE transactions on image processing 22, 1032–1041 (2012).
Article ADS MathSciNet Google Scholar
Huang, Y. et al. Automated hemorrhage detection from coarsely annotated fundus images in diabetic retinopathy. In 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI), 1369–1372 (IEEE, 2020).
Huang, Y., Zhong, Z., Yuan, J. & Tang, X. Efficient and robust optic disc detection and fovea localization using region proposal network and cascaded network. Biomedical Signal Processing and Control 60, 101939 (2020).
Article Google Scholar
Lin, L. et al. The sustech-sysu dataset for automated exudate detection and diabetic retinopathy grading. figshare https://doi.org/10.6084/m9.figshare.12570770.v1 (2020).
Everingham, M. et al. The pascal visual object classes challenge: A retrospective. International journal of computer vision 111, 98–136 (2015).
Article Google Scholar
Bartko, J. J. The intraclass correlation coefficient as a measure of reliability. Psychological reports 19, 3–11 (1966).
Article CAS Google Scholar
Kosman, E. & Leonard, K. Similarity coefficients for molecular markers in studies of genetic relationships between individuals for haploid, diploid, and polyploid species. Molecular ecology 14, 415–424 (2005).
Article CAS Google Scholar
Rezatofighi, H. et al. Generalized intersection over union: A metric and a loss for bounding box regression. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 658–666 (2019).
Zhu, J.-Y., Park, T., Isola, P. & Efros, A. A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE international conference on computer vision, 2223–2232 (2017).
Zhao, H., Yang, B., Cao, L. & Li, H. Data-driven enhancement of blurry retinal images via generative adversarial networks. In International Conference on Medical Image Computing and Computer-Assisted Intervention, 75–83 (Springer, 2019).
Snell, J., Swersky, K. & Zemel, R. Prototypical networks for few-shot learning. In Advances in neural information processing systems, 4077–4087 (2017).
Tzutalin. Labelimg. GitHub https://github.com/tzutalin/labelImg (2015).
Huang, Y. & Lin, L. Bbr-net: Bbr-net. Zenodo https://doi.org/10.5281/zenodo.4041331 (2020).

Download references

Acknowledgements

This work was supported by the Shenzhen Basic Research Program (JCYJ20190809120205578), the National Key R&D Program of China (2017YFC0112404), the High-level University Fund (G02236002), and the National Natural Science Foundation of China (81501546). We would like to acknowledge Ziqing Feng and Qian Wang from the Zhongshan Ophthalmic Centre at Sun Yat-sen University for their help with this work and all participants involved in this study.

Author information

These authors contributed equally: Li Lin, Meng Li, Yijin Huang.

Authors and Affiliations

Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, 5180000, China
Li Lin, Yijin Huang, Pujin Cheng & Xiaoying Tang
State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Centre, Sun Yat-sen University, Guangzhou, 510000, China
Meng Li & Jin Yuan
School of Electronics and Information Technology, Sun Yat-sen University, Guangzhou, 510000, China
Li Lin & Kai Wang
Department of Ophthalmology, Gaoyao People’s Hospital, Zhaoqing, 526000, China
Honghui Xia

Authors

Li Lin
View author publications
You can also search for this author in PubMed Google Scholar
Meng Li
View author publications
You can also search for this author in PubMed Google Scholar
Yijin Huang
View author publications
You can also search for this author in PubMed Google Scholar
Pujin Cheng
View author publications
You can also search for this author in PubMed Google Scholar
Honghui Xia
View author publications
You can also search for this author in PubMed Google Scholar
Kai Wang
View author publications
You can also search for this author in PubMed Google Scholar
Jin Yuan
View author publications
You can also search for this author in PubMed Google Scholar
Xiaoying Tang
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

All authors have contributed to writing the manuscript. L.L. and Y.H. contributed to automatic refining, format conversion of exudate detection labels, organization of the entire dataset and all the corresponding annotations. M.L., Z. Feng and Q. Wang (the two assisting clinicians) contributed to the creation and manual correction of exudate detection labels, image classifications into the six DR grading categories according to three different protocols, and also image classifications into the two left-versus-right eye categories. P.C. and M.L. performed automatic label creation and manual modification of OD’s bounding boxes and fovea’s coordinates. H.X., K.W. and J.Y. conducted fundus images collection, collation, and image quality control. X.T. and J.Y. are responsible for the design of the entire work.

Corresponding authors

Correspondence to Jin Yuan or Xiaoying Tang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.

Reprints and permissions

About this article

Cite this article

Lin, L., Li, M., Huang, Y. et al. The SUSTech-SYSU dataset for automated exudate detection and diabetic retinopathy grading. Sci Data 7, 409 (2020). https://doi.org/10.1038/s41597-020-00755-0

Download citation

Received: 06 July 2020
Accepted: 03 November 2020
Published: 20 November 2020
DOI: https://doi.org/10.1038/s41597-020-00755-0

This article is cited by

A fundus image dataset for intelligent retinopathy of prematurity system
- Xinyu Zhao
- Shaobin Chen
- Guoming Zhang
Scientific Data (2024)
Open Fundus Photograph Dataset with Pathologic Myopia Recognition and Anatomical Structure Annotation
- Huihui Fang
- Fei Li
- Yanwu Xu
Scientific Data (2024)
Image quality assessment of retinal fundus photographs for diabetic retinopathy in the machine learning era: a review
- Mariana Batista Gonçalves
- Luis Filipe Nakayama
- Rubens Belfort
Eye (2024)
A novel vessel extraction technique for a three-way classification of diabetic retinopathy using cascaded classifier
- Saad Ather
- Aamir Wali
- Sana Fatima
Multimedia Tools and Applications (2024)
Enhanced diabetic retinopathy detection and exudates segmentation using deep learning: A promising approach for early disease diagnosis
- G. Latha
- P. Aruna Priya
- V. K. Smitha
Multimedia Tools and Applications (2024)