PIC, a paediatric-specific intensive care database

Zeng, Xian; Yu, Gang; Lu, Yang; Tan, Linhua; Wu, Xiujing; Shi, Shanshan; Duan, Huilong; Shu, Qiang; Li, Haomin

doi:10.1038/s41597-020-0355-4

Download PDF

Data Descriptor
Open access
Published: 13 January 2020

PIC, a paediatric-specific intensive care database

Xian Zeng^1,2^na1,
Gang Yu¹^na1,
Yang Lu²^na1,
Linhua Tan¹,
Xiujing Wu¹,
Shanshan Shi¹,
Huilong Duan²,
Qiang Shu¹ &
…
Haomin Li ORCID: orcid.org/0000-0002-6420-7719¹

Scientific Data volume 7, Article number: 14 (2020) Cite this article

10k Accesses
55 Citations
43 Altmetric
Metrics details

Subjects

Abstract

PIC (Paediatric Intensive Care) is a large paediatric-specific, single-centre, bilingual database comprising information relating to children admitted to critical care units at a large children’s hospital in China. The database is deidentified and includes vital sign measurements, medications, laboratory measurements, fluid balance, diagnostic codes, length of hospital stays, survival data, and more. The data are publicly available after registration, which includes completion of a training course on research with human subjects and signing of a data use agreement mandating responsible handling of the data and adherence to the principle of collaborative research. Although the PIC can be considered an extension of the widely used MIMIC (Medical Information Mart for Intensive Care) database in the field of paediatric critical care, it has many unique characteristics and can support database-based academic and industrial applications such as machine learning algorithms, clinical decision support tools, quality improvement initiatives, and international data sharing.

Measurement(s)	Demographics • Vital Signs Measurement • Medication • clinical laboratory measurement • fluid balance • length of hospital stay • survival time • microbiological information
Technology Type(s)	digital curation
Sample Characteristic - Organism	Homo sapiens
Sample Characteristic - Environment	Intensive Care Unit
Sample Characteristic - Location	China

Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.11481810

Harnessing Big Data in Critical Care: Exploring a new European Dataset

Article Open access 28 March 2024

The use of machine learning and artificial intelligence within pediatric critical care

Article 14 November 2022

Establishment of a Chinese critical care database from electronic healthcare records in a tertiary care medical center

Article Open access 23 January 2023

Background & Summary

Over the past ten years, electronic health records (EHRs) have rapidly been adopted worldwide. In the US, the percentage of non-federal acute care hospitals that have adopted basic EHRs increased from 9.4% in 2008 to 83.8% in 2015¹. In China, the healthcare system has been undergoing the New Medical Reform that has stimulated several nation-wide initiatives to encourage and assess adoption and implementation of EHRs since 2009². The analysis of data contained in EHRs, which contain a large volume of structured and unstructured information regarding patient care, is a promising avenue of research for clinicians and data scientists^3,4,5. However, there are several important barriers to the widespread adoption of big data in health care, including no strong incentives for its use within groups of clinicians, considerable privacy concerns, and limited interoperability^6,7. Moreover, barriers to data sharing in healthcare have limited the usage of clinical data and largely prevented the reproducibility of published studies^8,9.

Intensive care units (ICUs) provide care for severely ill patients who require invasive life-saving treatment. Large amounts of data are routinely collected for these critical care patients. The widely used MIMIC (Medical Information Mart for Intensive Care) critical care database has been developed for more than a decade contains comprehensive clinical data of patients admitted to the Beth Israel Deaconess Medical Centre in Boston, Massachusetts^10,11. For many years, it has been the only freely accessible database in critical care medicine, and it supports a broad range of research areas, as evidenced by many publications^12,13,14,15. Building upon the success of MIMIC-III, Philips Healthcare developed a multi-centre intensive care unit database called the eICU Collaborative Research Database¹⁶. However, all these databases have an obvious age gap in their patients. Although MIMIC-III contains data on neonates in the NICU, comprehensive paediatric patient (age from 0 to 18 years old) data are not available in it. Children are not just ‘small adults’ and often have different diseases, developmental issues, and possibly differential responses to therapies and the recovery of function when compared with adult patients in critical care^17,18,19,20. There is vast variability in age-related differences that are all linked to the developmental stage and ability of children. Therefore, all clinical evidence should also be validated in children before being applied in paediatrics. Although there are some large-scale paediatric critical care research databases based on case registries, such as the Virtual PICU Systems (VPS) database in the US²¹ and the Paediatric Intensive Care Audit Network (PICANet) in the UK²², there is a lack of a freely accessible database that contains the fidelity and completeness of raw paediatric intensive care clinical data for researchers. A freely accessible paediatric-specific critical care database will stimulate more researchers to pay attention to children’s intensive care and improve the quality of critical care for children.

Here, we report the release of the PIC (Paediatric Intensive Care) database, a freely accessible paediatric-specific critical care database (http://pic.nbscn.org). Like the widely used MIMIC database, the PIC database integrates deidentified, comprehensive clinical data of paediatric patients admitted to the Children’s Hospital of Zhejiang University School of Medicine and makes it internationally accessible to researchers under a data use agreement (Fig. 1).

The PIC database is notable for the following reasons:

It is publicly and freely available to researchers worldwide as a paediatric critical care supplement to the MIMIC-III
It encompasses a very large population of paediatric patients and spans almost a decade;
It contains high temporal resolution and high-fidelity clinical data;
It is the first freely accessible English-Chinese bilingual clinical database.

Methods

The PIC database was populated with data that had been acquired during routine hospital care The Children’s Hospital, Zhejiang University School of Medicine. This children’s hospital, with more than 1900 beds, is the largest comprehensive paediatric medical centre in Zhejiang Province and the Chinese National Clinical Research Centre of Child Health. There are more than 3 million outpatient and emergency visits per year in this children’s only medical center. It also accepts critical paediatric patients who referred from lower level hospital across 11 cities of Zhejiang province. It has 119 critical care beds in 5 intensive care units: general ICU, paediatric ICU (PICU), surgical ICU (SICU), cardiac ICU (CICU) and neonatal ICU (NICU). The clinical data of patients admitted to any of the ICUs between 2010 and 2018 was used to construct the PIC database. This project was approved by the Institutional Review Board of the Children’s Hospital, Zhejiang University School of Medicine (Hangzhou, China). The requirement for individual patient consent was waived because the project did not impact clinical care, and all protected health information was deidentified.

Database Development

Data were extracted and downloaded from several information systems in the hospital, including the following:

Hospital electronic medical record system
Laboratory information system
Computerized physician order entry system
Nursing information system
Anaesthesia information management system
Reporting system of different examination departments (radiology, ultrasound, ECG, pathology, et al.).

The core database schema followed the classic MIMIC-III database with some adjustments to adapt it for local Chinese data. This will help researchers who have experience with MIMIC to easily understand and utilize the PIC and to compare it with the MIMIC database.

All patients who transferred to any of the ICUs and had records in the hospital medical record system between 2010 and 2018 were included in this project. Based on these data, three core tables were created to describe the patients (PATIENTS), admissions (ADMISSIONS) and ICU stays (ICUSTAYS). The primary key of the three core tables was used to index all other clinical data. SUBJECT_ID in the PATIENTS table is a unique identifier that specifies an individual patient, HADM_ID in the ADMISSIONS table is the encounter number that uniquely identifies a particular hospitalization for patients who might have been admitted multiple times, and ICUSTAY_ID in the ICUSTAYS table refers to a unique admission to an intensive care unit. Each SUBJECT_ID has one or more related HADM_IDs, and each HADM_ID can have one or more related ICUSTAY_ID.

After collecting the original data, we performed post-processing and integrated medical records to ensure that each patient had relatively complete data. Since the time spans of the data in each table from the different systems are inconsistent, we used SUBJECT_ID and HADM_ID, which appear in the ADMISSIONS, PATIENTS, and ICUSTAYS table for data integration and filtering. In parallel, we carefully checked for impossible data entries (for example, the year of the visit date was 1900), erroneous characters (for example, Chinese characters appearing as numbers), and extreme outliers in the collected data and removed or updated these data.

Following the MIMIC-III database schema, structured clinical data, which included patient demographics, medications, fluid balances, comprehensive laboratory results, and microbiological information (tests performed and sensitivities) from the patients’ entire hospital stay, not only periods in the ICU, were collected from different systems. Another specially structured type of data is time-series data, such as vital signs directly collected from monitoring devices, as there are no critical care information systems to store the high-frequency automatically collected vital signs. The PIC only used the physiological measurements manually recorded by nurses. However, the PIC integrated the vital signs collected during surgery every 5 minutes from the anaesthesia information management system. To help research to distinguish these vital data from MIMIC, we created a new table named SURGERY_VITAL_SIGNS.

To make this database widely used worldwide, the largest challenge is the language barrier. All the lab test items, medication names, examinations and diagnoses were recorded using Chinese in the original information systems. To address this problem, the PIC provided English and Chinese bilingual dictionary tables to make the data can be interpretable by most researchers. For terms with widely used standard codes, such as the ICD-10 codes for diagnostics, the English terms were based on the corresponding English version of the standard codes. However, the local ICD-10 version is an extended version with 7 characters compared to the standard 5-character ICD-10 code, and therefore, the Chinese diagnosis is more specific than the English diagnosis in the PIC. Except some of lab test items with LOINC code, other clinical terms without a standard code were translated to English and reviewed manually by authors (X.Z., Y.L. and H.L.). We keep the original Chinese terms in all the tables with corresponding English terms and will help researchers identify language issues if there are any. Such a bilingual Chinese-English clinical dictionary table also creates a data environment that can serve other potential international data sharing projects.

The free-text clinical documents and reports were also recorded in Chinese. Translating a large volume of narrative clinical documents and reports is not feasible, and automatically de-identification tools and algorithms for Chinese clinical documents is also not available. We could not release the original clinical documents at this stage. However, much important clinical information is embedded in the narrative of clinical documents, so common symptoms were extracted from clinical progress notes and discharge summaries using Natural Language Processing (NLP) technology²³. These symptom terms and their negation status (present symptom with “+” and absence symptom with “−”) data are combined with the document recording time and published as the EMR_SYMPTOMS table in the first release of the PIC. To initially evaluate the accuracy of the NLP extraction. We randomly selected 100 notes including admission note, discharge note, progress note, transferred note, and so on from the original notes, extracted symptoms from them using NLP technology, and finally resulted in 3410 symptoms. Two authors independently checked the extracted results. The average accuracy is 91.9% (91.3%, 92.4%). Several NLP typical errors should be mentioned. First, wrong segmentation of long Chinese phrases which do not have natural word space such as English. Second, falsely negative detection both in positive and negative. Third, the complicated sentence that talk about the family symptoms or the risk of some condition were also extracted as symptoms of patient. Taking into account the accuracy of NLP, we remind researchers to use these data with caution.

For many examinations and their reports, the PIC only records an examination event without the content of the reports in this release. We will try to standardize most examination results in the future and publish it in the release of the next iteration of the database.

Deidentification

Before data were integrated into the PIC database, it was fully anonymized by removing all HIPAA (Health Insurance Portability and Accountability Act) protected health information identifiers such as patient name, address, telephone numbers, and any other characteristic that could uniquely identify the individual in structured data sources. For free-text fields, protected health information was not extracted in accordance with HIPAA standards. When creating the dataset, patient IDs including SUBJECT_ID, HADM_ID and ICUSTAY_ID were randomly assigned a unique number, and the original identifier was not retained. As a result, the identifiers in the PIC cannot be linked back to the original, identifiable data. In addition, dates such as birth date and death date were shifted to the future in a consistent manner by the random offset (50–100 years) of each patient to maintain the interval, resulting in stays that occur sometimes between 2060 and 2120. Time of day, day of the week, and approximate seasonality (we divided 12 months into four seasons: spring: March, April, May; summer: June, July, August; autumn: September, October, November; winter: December, January, February) were conserved during the date transformation. Shift the time in such a wide range may limit the utilization of PIC in many time-sensitive studies. At the same time there are also algorithm to obscures date information to any desired granularity²⁴. While, the proposed approach definitely can reduce the risk of re-identification of death children. The ethnicity of patients from ethnic minorities with a small population was not specified in the PIC. As all patients allow to be accepted by this Children’s Hospital must under the age of 18, we do not need to handle the birthday of patients who were older than 90 years old to hide their real age as MIMIC.

Data Records

Data available in the PIC database includes laboratory measurements, charted observations during a patient’s stay, structured symptoms extracted from notes and vital signs recorded while a patient was present in the operating room. The PIC consists of 16 tables that are linked by unique identifiers²⁵. Tables with a ‘D’ prefix are dictionary tables and provide a definition of the identifiers. For example, each row of LABEVENTS is associated with an ITEMID that represents the concept of the measurement, but it does not contain the actual name of the measurement. By joining LABEVENTS and D_LABITEMS on the ITEMID, the concept can be identified represented by a given ITEMID.

Table 1 provides summary descriptions of the data tables, and more detail data schema and technique document about the data tables is available on the PIC website²⁶. Briefly, there are three tables used to define and track patient hospitalization: ADMISSIONS; PATIENTS; and ICUSTAYS. The three identifiers described earlier (SUBJECT_ID, HADM_ID, ICUSTAY_ID) are present in all three of the above tables. The other three tables are dictionaries for defining disease codes and items appearing in the PIC database. The remaining tables contain data related to patient care during the hospital stay, such as demographics, laboratory test results, medications, symptoms, vital sign measurements, and mortality. All the tables are distributed as a collection of comma separated value (CSV) files that can be loaded into many relational database systems.

Table 1 An overview of the data tables in the PIC database.

Full size table

Technical Validation

Paediatric-specific data features

The PIC encompasses 13,499 distinct hospital admissions from 12,881 distinct paediatric patients (aged 0–18 years) admitted to critical care unit between 2010 and 2018. Of those patients, 468 (3.6%) had multiple hospital admissions. The mean age of the patients was 2.5 years (Q1–Q3: 0.1–3.3), 57.5% of the patients were male, and the in-hospital mortality was 7.1%. The mean hospital stay was 17.6 days (Q1–Q3: 7.0–21.0), and the mean ICU stay was 9.3 days (Q1–Q3: 0.9–9.2). The mean ICU stay was longest in the neonatal ICU, 21.6 days (Q1–Q3: 2.5–32.8), and the shortest stay was in the surgical ICU, 2.3 days (Q1–Q3: 0.8–1.6). The summary statistics and detailed patient demographics of the paediatric population in the PIC stratified by care unit are listed in Table 2.

Table 2 Details of the PIC patient population by the first critical care unit on hospital admission.

Full size table

The primary diagnosis of patients in the PIC shows many unique features of paediatric populations, as shown in Online-only Table 1. The top three disease categories of the discharge codes were Q00–Q99, congenital malformations, deformations and chromosomal abnormalities (25.4%); P00–P96, certain conditions originating in the perinatal period (14.1%); and J00–J99, diseases of the respiratory system (10.3%). This is completely different from what is found in adult critical care patients and also confirms the uniqueness and necessity of the PIC.

Comparing the PIC with the MIMIC-III

Table 3 provides a comparison of the PIC with the MIMIC-III database, and Fig. 2 shows the difference in age distributions between patients in the PIC and MIMIC-III. The salient database features that are compared in Table 3 include the data source, number of records and record completeness (availability of different clinical data, physiological waveforms and outcomes). Although there are many limitations in the current release of the PIC, such as that the PIC is smaller than the MIMIC-III based on the number of ICU patients, there are no high-frequency bedside monitor data, and clinical notes are described in Chinese and are not suitable for direct release. The PIC is still notable as a supplement to the paediatric intensive care unit data lacking in the MIMIC-III and will definitely encourage more researchers to focus on paediatric critical care. Furthermore, the PIC database is under continuous development with a better critical care environment, and we hope to supplement the PIC with more and more additional data in the future, including vital signs and physiology waveforms collected at bedsides, as much structured information extracted from the clinical notes as possible, and images corresponding to any medical imaging performed during the patient’s stay.

Table 3 Differences of the PIC and the MIMIC-III.

Full size table

Data quality and completeness

To maintain data fidelity, the original Chinese data were kept, and most of the translated English terms were manually reviewed except the ICD-10 English terms that were considered high-quality. All data except the symptoms that were extracted from narrative documents by NLP technology had very little post-processing performed.

As different clinical information systems were implemented at different times which means the data from some information systems was not available before it implemented, the completeness of the different data tables is different. Furthermore, the vendors of nursing systems and EMR systems have changed during the time of data collected which make the data more complicated to integrated. Table 4 shows the completion of the different tables in the PIC. As a long-term development project, we are improving and upgrading various clinical information systems so that the PIC will have progressively better data quality and completeness in future versions. The source code in python to calculate the completion of different tables was released on the Github.

Table 4 Amount of complete data grouped by table.

Full size table

Best practice for scientific computing

Based on best practice for scientific computing²⁷, the code used to build PIC was version controlled and developed collaboratively within the laboratory. The code for community to utilize and visualize the PIC dataset was hosted on GitHub. Issue tracking is used to ensure that any issues about the code and dataset can be managed and maintained. This approach encouraged and facilitated sharing of readable code and document, as well as feedback from community.

Usage Notes

Data access

The PIC is provided as a collection of comma separated value (CSV) files, along with example scripts to help with importing the data into the database system MySQL and PostgreSQL. As the database contains detailed information regarding the clinical care of patients, it must be treated with appropriate care and respect. Following the MIMIC protocol, researchers are required to formally request access via a process documented on the PIC website²⁶ and the PhysioNet²⁵. There are two key steps that must be completed before access is granted:

the researcher must complete a recognized course in protecting human research participants that includes HIPAA requirements or obtain a GCP certification from a local institute in China.
the researcher must sign a data use agreement, which outlines appropriate data usage and security standards, and forbids efforts to identify individual patients.

Any researchers that have been approved to access the MIMIC-III will be familiar with this process and can easily get started with the PIC. Approval requires approximately three working days. Once an application has been approved, the researcher will receive emails containing instructions for downloading the database.

Collaborative documentation and visualization

The core aims for publicly developing the PIC database are to allow as much free access as possible to interested researchers and to promote secondary analysis of electronic health records. Detailed documentation is available on the PIC website and includes information regarding data access, table contents, and a schematic of the relationships between the tables in the data. The PIC is a large relational database with many different data types. This method of data storage does not visually indicate the patient’s hospitalization, nor does it quickly and easily retrieve information specific to an individual patient. It is a time-consuming process for new researchers to perform joint multiple table queries to fully understand which data are useful and available. For researchers who prefer to explore the data, we provide a website that can easily be used to explore the PIC and visualize clinical data at the patient level. As shown in Fig. 3a, admission to the hospital, surgery, transfer to the ICU and discharge from the ICU and hospital is clearly shown on a timeline. There are several data buttons for users to choose clinical data of interest. Clicking on the button will show corresponding data on a timeline. It is possible to zoom in and out on the timeline to help users better explore the details and overview of the data. We also developed a mechanism to synchronize different timelines on the webpage to help users identify relationships among different data. For example, the medicine used at a time point may explain the change in vital signs at the same time as shown in Fig. 3b. The source code for this visualization is also openly available.

To help user utilize the PIC dataset, a Jupyter Notebook was released to demonstrate how to use PIC to create a machine learning project.

Code availability

The code that was used to create the PIC database, calculate statistics of this paper, demonstrate a machine learning task and source code which underpins the PIC website and documentation is openly available, and contributions from the research community are encouraged: https://github.com/Healthink/PIC.

References

Henry, J., Pylypchuk, Y., Searcy, T. & Patel, V. Adoption of Electronic Health Record Systems among U.S. Non-Federal Acute Care Hospitals: 2008–2015. ONC Data Brief, no. 35. (The Office of the National Coordinator for Health Information Technology, 2016).
Shu, T. et al. EHR adoption across China’s tertiary hospitals: A cross-sectional observational study. Int. J. Med. Inform. 83, 113–21 (2014).
Article Google Scholar
MIT Critical Data. Secondary Analysis of Electronic Health Records. Secondary Analysis of Electronic Health Records (Springer International Publishing, 2016).
Sun, H. et al. Semantic processing of EHR data for clinical research. J. Biomed. Inform. 58, 247–259 (2015).
Article Google Scholar
Finlayson, S. G., LePendu, P. & Shah, N. H. Building the graph of medicine from millions of clinical narratives. Sci. Data 1, 140032 (2014).
Article Google Scholar
Murdoch, T. B. & Detsky, A. S. The inevitable application of big data to health care. JAMA - J. Am. Med. Assoc. 309, 1351–1352 (2013).
Article CAS Google Scholar
Vayena, E., Salathé, M., Madoff, L. C. & Brownstein, J. S. Ethical Challenges of Big Data in Public Health. PLoS Comput. Biol. 11, e1003904 (2015).
Article ADS Google Scholar
Collins, F. S. & Tabak, L. A. NIH plans to enhance reproducibility. Nature 505, 612–613 (2014).
Article Google Scholar
Baker, M. 1,500 scientists lift the lid on reproducibility. Nature 533, 452–454 (2016).
Article ADS CAS Google Scholar
Saeed, M. et al. Multiparameter intelligent monitoring in intensive care II: A public-access intensive care unit database. Crit. Care Med. 39, 952–960 (2011).
Article Google Scholar
Johnson, A. E. W. et al. MIMIC-III, a freely accessible critical care database. Sci. data 3, 160035 (2016).
Article CAS Google Scholar
Nemati, S. et al. An Interpretable Machine Learning Model for Accurate Prediction of Sepsis in the ICU. Crit. Care Med. 46, 547–553 (2018).
Article Google Scholar
Feng, M. et al. Transthoracic echocardiography and mortality in sepsis: analysis of the MIMIC-III database. Intensive Care Med. 44, 884–892 (2018).
Article CAS Google Scholar
Deliberato, R. O. et al. An Evaluation of the Influence of Body Mass Index on Severity Scoring. Crit. Care Med. 47, 247–253 (2019).
Article Google Scholar
Tyler, P. D. et al. Assessment of Intensive Care Unit Laboratory Values That Differ From Reference Ranges and Association With Patient Mortality and Length of Stay. JAMA Netw. Open 1, e184521 (2018).
Article Google Scholar
Pollard, T. J. et al. The eICU collaborative research database, a freely available multi-center database for critical care research. Sci. Data 5, 180178 (2018).
Article Google Scholar
Czaja, A. S. Children Are Not Just Little Adults. Pediatr. Crit. Care Med. 17, 178–180 (2016).
Article Google Scholar
Giza, C. C., Mink, R. B. & Madikians, A. Pediatric traumatic brain injury: Not just little adults. Curr. Opin. Crit. Care 13, 143–152 (2007).
Article Google Scholar
Goel, R., Cushing, M. M. & Tobian, A. A. R. Pediatric Patient Blood Management Programs: Not Just Transfusing Little Adults. Transfus. Med. Rev. 30, 235–241 (2016).
Article Google Scholar
Soleimani, T. et al. Pediatric pyoderma gangrenosum: is it just big wounds on little adults? J. Surg. Res. 206, 113–117 (2016).
Article Google Scholar
Wetzel, R. Pediatric Intensive Care Databases for Quality Improvement. J. Pediatr. Intensive Care 5, 81–88 (2016).
PubMed Google Scholar
Ramnarayan, P. et al. Interhospital Transport of Critically Ill Children to PICUs in the United Kingdom and Republic of Ireland: Analysis of an International Dataset. Pediatr. Crit. Care Med. 19, e300–3311 (2018).
Article Google Scholar
Li, H.-M., Li, Y., Duan, H.-L. & Lv, X.-D. Term extraction and negation detection method in Chinese clinical document. Chinese J. Biomed. Eng. 27, 716–721 (2008).
CAS Google Scholar
Hripcsak, G., Mirhaji, P., Low, A. F. H. & Malin, B. A. Preserving temporal relations in clinical data while maintaining privacy. J. Am. Med. Informatics Assoc. 23, 1040–1045 (2016).
Article Google Scholar
Li, H., Zeng, X. & Yu, G. Paediatric Intensive Care Database. PhysioNet. https://doi.org/10.13026/bcbd-4t11 (2019).
PIC database: documentation and website, http://pic.nbscn.org (2019).
Wilson, G. et al. Best Practices for Scientific Computing. PLoS Biol. 12, e1001745 (2014).
Article Google Scholar

Download references

Acknowledgements

This work was supported by the National Natural Science Foundation of China [81871456], the Chinese State’s Key Project of R&D Plan [2016YFC0901905,2016YFC0901703]; the Zhejiang Provincial Program for the Cultivation of High-level Innovative Health Talents [2016-6]; the Fundamental Research Funds for the Central Universities [2019XZZX003-16]. We also want to thank the MIMIC project and the PhysioNet especially thank Dr. Alistair E. W. Johnson and Dr. Tom J. Pollard; many of the protocols in this project were based on it.

Author information

These authors contributed equally: Xian Zeng, Gang Yu and Yang Lu.

Authors and Affiliations

The Children’s Hospital of Zhejiang University School of Medicine and National Clinical Research Center for Child Health, Hangzhou, China
Xian Zeng, Gang Yu, Linhua Tan, Xiujing Wu, Shanshan Shi, Qiang Shu & Haomin Li
The College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China
Xian Zeng, Yang Lu & Huilong Duan

Authors

Xian Zeng
View author publications
You can also search for this author in PubMed Google Scholar
Gang Yu
View author publications
You can also search for this author in PubMed Google Scholar
Yang Lu
View author publications
You can also search for this author in PubMed Google Scholar
Linhua Tan
View author publications
You can also search for this author in PubMed Google Scholar
Xiujing Wu
View author publications
You can also search for this author in PubMed Google Scholar
Shanshan Shi
View author publications
You can also search for this author in PubMed Google Scholar
Huilong Duan
View author publications
You can also search for this author in PubMed Google Scholar
Qiang Shu
View author publications
You can also search for this author in PubMed Google Scholar
Haomin Li
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

X.Z., G.Y., L.Y., Q.S. and H.L. built the PIC database. G.Y., L.T., J.W., S.S., Q.S. and H.L. collected the original clinical data. L.Y., X.Z., H.D. and H.L. developed the PIC website. All authors gave input during the database development process and contributed to writing the paper.

Corresponding authors

Correspondence to Qiang Shu or Haomin Li.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Online-only Table

Online-only Table 1 Distribution of primary diagnosis by care unit for patients in the PIC database.

Full size table

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.

Reprints and permissions

About this article

Cite this article

Zeng, X., Yu, G., Lu, Y. et al. PIC, a paediatric-specific intensive care database. Sci Data 7, 14 (2020). https://doi.org/10.1038/s41597-020-0355-4

Download citation

Received: 24 September 2019
Accepted: 20 December 2019
Published: 13 January 2020
DOI: https://doi.org/10.1038/s41597-020-0355-4

This article is cited by

Med-MGF: multi-level graph-based framework for handling medical data imbalance and representation
- Tuong Minh Nguyen
- Kim Leng Poh
- Jan Hau Lee
BMC Medical Informatics and Decision Making (2024)
Clinical characteristics and risk factors for poor outcomes of invasive pneumococcal disease in pediatric patients in China
- Yanan Fu
- Yingchun Wang
- Meng Li
BMC Infectious Diseases (2024)
Improved pediatric ICU mortality prediction for respiratory diseases: machine learning and data subdivision insights
- Johayra Prithula
- Muhammad E. H. Chowdhury
- Abdulrahman Alqahtani
Respiratory Research (2024)
Risk factors for venous thromboembolism in a single pediatric intensive care unit in China
- Jintuo Zhou
- Yanting Zhu
- Jinhua Zhang
Thrombosis Journal (2024)
U-shaped association between triglyceride-glucose index and all-cause mortality among critically ill pediatrics: a population-based retrospective cohort study
- Qi Gao
- Fan Luo
- Yanqin Li
Cardiovascular Diabetology (2024)

Subjects

Abstract

Similar content being viewed by others

Background & Summary

Methods

Database Development

Deidentification

Data Records

Technical Validation

Paediatric-specific data features

Comparing the PIC with the MIMIC-III

Data quality and completeness

Best practice for scientific computing

Usage Notes

Data access

Collaborative documentation and visualization

Code availability

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Competing interests

Additional information

Online-only Table

Rights and permissions

About this article

Cite this article

Share this article

This article is cited by

Search

Quick links