Abstract
Objective:
To assess candidate neonatology EPAs taken from separate but overlapping sets from two organizations.
Study design:
Using a Delphi process, we asked neonatology fellowship program directors to (1) assess the importance and scope of 19 candidate EPAs and (2) propose additional EPAs if necessary. In round 2, we sought clarification of first-round responses and evaluated the proposed additional EPAs.
Results:
Twenty program directors participated. In round 1, all EPAs were scored as important, but four were overly broad. In round 2, respondents rejected proposed subdivisions of one overly broad EPA, retaining it as originally proposed. Specification of entrustment criteria improved the scope of the three other broad EPAs. However, after specification, they were re-rated as insufficiently important and therefore rejected. Neither newly proposed EPA from round 1 was rated as sufficiently important.
Conclusion:
The Delphi process yielded 13 EPAs with which to assess capability to practice clinical neonatology.
Introduction
In the past 20 years, the medical education community has been increasingly interested in competency-based education to promote individual competence and guide curriculum.1, 2 This has led to the development and adoption of ‘competencies’ and ‘milestones’ to reduce complex clinical behaviors to discrete elements that facilitate assessment.3 However, competencies and milestones have been criticized for ‘anatomizing’ clinical skills and failing to account for the need to integrate multiple skills and attitudes required to provide good and safe care.4, 5
In response, ten Cate et al.6, 7, 8 proposed translating competencies into entrustable professional activities (EPAs) to acknowledge the interconnectedness of skills and behaviors in the provision of care. Taken together, EPAs describe the range of activities that a trainee must master in order to be trusted to care for patients without supervision. Since EPAs represent essential clinical work familiar to teachers and learners alike, they have been characterized as intuitive. However, there is little consensus as to the ‘right’ number and breadth of EPAs for a given specialty.8, 9 To be useful, the number of EPAs by which a trainee is assessed cannot be limitless. If they are too granular, assessment will be burdened by an impractical list of entrustment decisions.10 If they are too broad, the specific activities to be assessed will be ambiguous. In short, for a program of assessment to be both meaningful and practical, careful consideration must be given to the construction and selection of its EPAs.
Our fellowship training program in neonatal–perinatal medicine recently developed 14 EPAs that seemed neither so narrow that the number of assessments would be burdensome nor so broad as to lose meaning. Soon thereafter, the American Board of Pediatrics (ABP) proposed and adopted a separate set of five EPAs,11 several of which overlapped or duplicated ours. The goal of both sets was to delineate the essential units of clinical work in our specialty. Our purpose in this study was to use the two groups of EPAs to derive a single validated list. Using a Delphi process, we asked fellowship program directors to assess the importance and scope of each EPA as well as the inclusiveness of the lists.
Methods
Selection of Delphi panelists
Our study was reviewed by and received exemption from the Colorado Multiple Institutional Review Board (COMIRB #14-2384). We recruited Delphi panelists from the pool of program directors (PDs) of Accreditation Council for Graduate Medical Education (ACGME)-certified neonatal–perinatal medicine fellowships. A letter of invitation to participate, which included a statement that participation implied consent, was sent to neonatology PDs chosen randomly from a database maintained by the Organization of Neonatology Training Program Directors (ONTPD)12 and to elected members of the ONTPD council. This group was targeted because of their expertise in the development of competent neonatologists and because they would be charged with assessing the entrustment of each trainee. Following the recommendations for homogeneous expert panels,13 we sought a group of 15 to 30 panelists committed to participating in the multistep process. Two reminders were sent to each program director who did not respond.
Delphi items
We combined two independently generated sets of EPAs, each of which was developed to describe the essential clinical functions of a practicing neonatologist. The first set was the five EPAs recently adopted by the ABP11 (designated ABP 1 to 5 in Table 1 and Figures 1 and 2). The second set was 14 EPAs that we developed for our fellowship training program at the University of Colorado ~5 years ago (designated CU 1 to 14). Because these two sets were developed independently and each was intended to be comprehensive, they overlap. However, we did not attempt to integrate or combine the two sets prior to the Delphi evaluation.
Delphi procedures
We sought consensus as to importance and scope of each EPA. In accordance with usual practice, criteria for consensus were defined in advance14, 15 as ⩾75% of participants agreeing as to importance (‘very important’ or ‘essential’) of each EPA.16 In the first Delphi round, we asked participants to indicate ‘how important it is that a fellow be entrusted to complete each activity without supervision by the time they complete the program.’ We provided a 5-point Likert scale for responses, with anchors ranging from ‘not important’ to ‘essential.’ We also asked participants to rate the scope of each EPA by considering whether it ‘describes a sufficiently discrete set of knowledge and skills that can be observed and evaluated.’ Options on a 5-point Likert scale ranged from ‘too narrowly specified’ to ‘too broadly specified,’ with the middle anchored as ‘meaningfully specified.’ For both importance and level of specification, participants had the option of indicating that they did not understand an EPA as worded. Open-ended options allowed participants to suggest wording changes to improve any EPA and to recommend additional EPAs. We asked that respondents identify their program’s university affiliation so that we could contact them again for subsequent Delphi rounds, and we asked for the number of fellows in their program. Based on results from the first round, we developed questions for a second round which focused on EPAs that were judged to be overly broad. After clarifying the meaning of each, we asked participants to reconsider importance and scope. In addition, we asked panelists to evaluate two additional EPAs that were proposed by respondents in round 1.
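The pre-specified consensus rule above (an EPA is retained when ⩾75% of panelists rate it ‘very important’ or ‘essential’, i.e., the top two anchors of the 5-point Likert scale) can be sketched as follows. This is an illustrative reconstruction, not the authors' analysis code, and the example ratings are hypothetical:

```python
# Illustrative sketch of the pre-specified Delphi consensus rule:
# an EPA reaches consensus on importance when >= 75% of panelists
# rate it 4 ('very important') or 5 ('essential') on the 5-point scale.

CONSENSUS_THRESHOLD = 0.75  # defined in advance of round 1


def reaches_consensus(ratings):
    """Return True if >= 75% of Likert ratings (1-5) are 4 or 5."""
    if not ratings:
        return False
    top_two = sum(1 for r in ratings if r >= 4)
    return top_two / len(ratings) >= CONSENSUS_THRESHOLD


# Hypothetical round-2 panel of 16: 9 top-two ratings (56%) falls short.
example = [5, 5, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 2, 3, 2, 3]
print(reaches_consensus(example))  # 9/16 = 56% -> False
```

Under this rule, 12 of 16 top-two ratings (exactly 75%) would be the minimum for retention in round 2.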
Results
In the first Delphi round, we sent surveys to 37 fellowship program directors. Two did not reach intended recipients because of inaccurate e-mail addresses. Of 35 PDs with correct addresses, 20 (57%) responded. In the second round, we sent the survey to 19 of the 20 respondents from round 1 (one respondent did not list a program affiliation). Sixteen of the 19 (84%) completed the second survey. Respondent program size ranged from 2 to 12 fellows.
Delphi round 1
In the first round, ⩾75% of participants rated each of the 19 candidate EPAs as ‘very important’ or ‘essential’ for entrustment without supervision by completion of training (Table 1). On that basis, we combined overlapping EPAs from the two sets (ABP 4 and CU 7, ABP 5 and CU 14). When we asked about scope, most EPAs were predominantly rated as meaningfully specified. None were rated as too narrowly specified by more than 15% of respondents. However, four (ABP 5, CU 7, CU 9 and CU 14) were rated as somewhat or too broadly specified by >50% (Figure 1).
Delphi round 2
In the second round, we focused on those four EPAs considered to be somewhat or too broadly specified. The first EPA judged to be overly broad in round 1 was CU 9: ‘Cares for Newborns with Uncommon or Unrecognized Diagnoses.’ After we proposed entrustment criteria for this EPA, the proportion rating it as meaningfully specified increased from 35% to 56%. However, when importance was reconsidered in light of the clarified entrustment criteria, only 9/16 (56%) rated it as ‘essential’ or ‘very important,’ below the ⩾75% threshold set at the start of the study. Next, we combined two EPAs, ABP 5 and CU 14, which both address leadership and oversight of a NICU, and proposed entrustment criteria for this EPA. After considering the proposed criteria, 11/16 (69%) judged the combined EPA as ‘meaningfully specified,’ improved compared to round 1 responses. However, only 10/16 (63%) now rated the EPA as ‘essential’ or ‘very important.’ Based on the new ‘importance’ ratings, CU 9 and the combined ABP 5/CU 14 EPA were rejected.
We next proposed clarifications regarding CU 7: ‘Cares for Newborns Requiring Surgery,’ the last EPA that many indicated was too broad. We asked if participants favored dividing this EPA into three more focused surgical EPAs (chest/abdomen, airway and cardiac). Although these more narrow EPAs were judged as more meaningfully specified, none was judged to be as important as the broader EPA. In addition, after considering the three more narrow EPAs, 75% voted to retain the original ‘Care of the Newborn Requiring Surgery’ EPA. In our final list, this EPA was combined with ABP 4 since both specify the same activities.
Although ABP 1, ‘Manages Patients with Complex Multisystem Diseases in the NICU,’ was judged to be somewhat or too broadly specified by nearly half of respondents in round 1, we did not ask follow-up questions about ABP 1. Rather, because the activities that would be included in ABP 1 are described cumulatively by six narrower EPAs (CU 2 to 5, 8 and 11) that were all rated as sufficiently important and as better specified, we chose to replace ABP 1 with that group of six in our final list.
Lastly, in the second Delphi round, we asked participants to consider two additional EPAs proposed by panelists in round 1, ‘Care of the Stable, Convalescing Infant’ and ‘Discharge of the Medically Complex or High-Risk Infant.’ Both were rejected because ⩽50% indicated that either merited inclusion as a separate EPA.
Figure 2 summarizes our findings. We did not retain CU 9 or ABP 5/CU 14. We replaced ABP 1 with six more narrow EPAs (CU 2 to 5, 8, and 11). We combined ABP 2 with CU 1, and ABP 4 with CU 7. In sum, the process yielded 13 final EPAs.
Discussion
In this study, we sought to establish a set of EPAs that would define the clinical activities necessary for entrustment of trainees for the unsupervised practice of neonatology. We used a Delphi approach to evaluate both the importance and scope of EPAs contained in two discrete but overlapping EPA sets that had already been developed for neonatology, and solicited additional EPAs that might have been omitted from these two sets. The process yielded a final group of 13 EPAs that our panel judged to be both sufficiently important and appropriately specified to encompass the range of discrete clinical activities expected of a practicing neonatologist.
Our study is similar to others that have undertaken to develop consensus on the ‘right’ EPAs for a number of medical specialties.17, 18, 19, 20, 21, 22, 23, 24 Most of these studies have focused either on enumerating the essential activities of a profession or on evaluating the ‘importance’ of each individual EPA. Studies vary widely in the number and scope of suggested EPAs, even within the same specialty. ten Cate and Scheele originally proposed that 50 to 100 EPAs might be needed to cover a specialty’s activities, but more recently lowered that estimate to a maximum of 20 to 30 EPAs.7, 25 Two groups of internal medicine educators separately proposed that 16 and 27 individual EPAs be adopted to assess their residents.19, 22 Other studies attempting to define specialty-specific EPAs have been published for geriatrics, pulmonary and critical care medicine, gastroenterology, family medicine, psychiatry and anesthesiology.17, 18, 20, 21, 23, 24, 26 Further study of proposed EPA lists will be necessary to determine how reliably each can assess a trainee’s progression through residency or fellowship.
Our study is unique because, although we too asked panelists to rate the importance of the candidate EPAs, we additionally asked that they explicitly consider the ‘scope,’ or breadth, of each EPA as it would apply to assessment. Attention to the breadth of an evaluation tool like an EPA is critical to minimizing construct-irrelevant variation, a common threat to the validity of an assessment.27, 28, 29, 30, 31 Construct-irrelevant variation may exist when the assessment method includes elements that are irrelevant to the accurate evaluation of the subject’s competency, or when essential elements for accurate evaluation are missing from the method. If an EPA is very broad, one might reasonably be concerned that the content is not sufficiently discrete to be observed and reliably assessed. Consider, for example, a broad EPA like ‘Cares for Newborns.’ Assessing a fellow’s ability to admit and care for a baby with jaundice might not be a suitable surrogate for separately assessing her ability to care for a baby with congenital heart disease. Though both would fall under the same broad EPA, assessing the skills necessary for one might not reliably assess the skills necessary for the other. In contrast, assessment of a very narrow EPA might be reliable, yet results might not fully reflect a trainee’s overall capability. A fellow who can consistently intubate a small baby may nonetheless lack the skills to dependably lead a complex resuscitation. In summary, assessment of a broad EPA might be generalizable, but not reliable; entrustment for a narrow EPA might be reliable, but not generalizable. Consideration of EPA scope can be trivialized as a distinction between ‘lumpers’ and ‘splitters,’ but the real issue is how best to achieve reproducible, meaningful and defensible assessment of capability.
One proposed solution to a broad EPA has been creation of ‘nested’ EPAs aligned with a broader, ‘umbrella’ EPA.7, 32 Because they are more discrete and easily defined, nested EPAs are more accessible to both evaluator and trainee and should be associated with greater reliability. But, as discussed above, umbrella EPAs carry risks. To be entrusted in an umbrella EPA, a trainee would have to first be entrusted in each of the nested EPAs, raising a question about the practical value of umbrella EPAs. From our study, the relationship between the candidate EPA ‘Manage Patients with Complex Multisystem Disease in the NICU’ and several more narrow, discrete EPAs illustrates the tension between breadth and depth. Consider the EPAs ‘Cares for Newborns with Life-Threatening Infection’ and ‘Cares for Infants with Cardiopulmonary Failure,’ both of which could reside beneath an umbrella EPA like ‘Manages patients with complex, multisystem diseases.’ There is overlap in competencies required to master each, particularly in the ACGME competency domains of professionalism, interpersonal communication, and systems-based practice. However, there are differences between them in knowledge and skills, which likely led Delphi respondents to judge it important that fellows be explicitly entrusted in both. It was apparently not intuitive that assessment of one can be inferred from assessment of the other. As we gather experience and data in the attainment of entrustment in these distinct but related EPAs, we may discover that trainees consistently acquire and master the knowledge and skills required for several or all of the more discrete EPAs at the same time. If that proves to be the case, one or more could be merged. We will come to understand these relationships only by intentional analysis of data gathered after EPA assessments are implemented. 
In the meantime, we believe that substitution of amorphous, umbrella EPAs like ‘Manage patients with complex, multisystem diseases’ for more discrete, narrowly defined EPAs risks failing to identify targeted but potentially important deficiencies.
We were surprised that respondents judged the EPAs dealing with ‘running a NICU’ (one proposed by the ABP and one from our own training program) as overly broad. When we proposed entrustment criteria in round 2—anticipating change, requesting help, projecting confidence, delegating leadership, maintaining focus on patient safety—they rejected the EPA as insufficiently important. Although we did not ask participants to justify their decisions, they may have rejected these specific EPAs on the basis that they specify activities insufficiently relevant to entrustment to practice without supervision. They may have felt that the skills required to run a NICU are higher-level, synthetic skills more relevant to emerging leaders with several years of independent practice than to entrustment of graduating fellows, or they may have thought that the EPA failed to define a set of sufficiently discrete skills and activities to be reliably assessed.
We chose to limit our study to EPAs related to clinical care. In doing so, we omitted skills—most notably those related to scholarly achievement—which are essential elements of fellowship training. Our rationale is that EPAs were developed primarily as a means to ensure high-quality, safe patient care.32, 33, 34 We believe that applying EPAs to non-clinical skills misappropriates the concept of entrustment and minimizes its impact. Furthermore, we note that junior faculty members engaged in unsupervised clinical practice are not routinely expected to achieve independence in research pursuits for several years after completion of fellowship; we question using the ‘need for supervision’ construct as a criterion by which to judge progress in the research setting. We would advocate identifying alternative approaches to assess non-clinical aspects of training.
This study has a number of limitations, many of which are inherent in the Delphi process itself.14, 35 We chose neonatology fellowship training program directors as the panelists for our study, reasoning that they are those most consistently familiar with fellow education and assessment. Though consensus criteria for the optimal number of Delphi panelists remain elusive, we targeted 15 to 30 participants, a number well within the range of other published studies.13, 16 Delphi criteria by which to accept or reject a proposition are arbitrary, but a common threshold for consensus is 75%.16 We adopted a rating of ‘very important’ or ‘essential’ among ⩾75% of our participants as sufficient for retention. The interpretation that some candidate EPAs were overly broad was based on a qualitative interpretation of the panel’s responses. The lack of a standard criterion of consensus emphasizes that no approach to creating EPAs can do more than generate hypotheses that must be tested in practical application. Finally, as with all survey research, a Delphi survey depends on how the proposition is expressed. It is worth noting that the two seemingly identical EPAs dealing with care for neonates with surgical problems (ABP 4 and CU 7) and two similar EPAs dealing with delivery room resuscitation (ABP 2 and CU 1) evoked somewhat different responses.
In summary, this study identifies 13 EPAs that define the clinical practice of neonatology. However, there is much work to be done. Eventually, fellowship programs will need to create curricula that align with the EPAs, define more explicitly what is meant by the different levels of entrustment, and establish criteria by which to judge a trainee’s readiness to progress to the next level of entrustment. Rather than as a final conclusion, we view this list of EPAs as a starting point for what should be iterative and ongoing study to determine whether the list allows individual faculty members and training programs to engage in practical, meaningful and reliable assessment of their trainees that is both formative and summative.
References
Carraccio C, Wolfsthal SD, Englander R, Ferentz K, Martin C . Shifting paradigms: from Flexner to competencies. Acad Med 2002; 77: 361–367.
Hawkins RE, Welcher CM, Holmboe ES, Kirk LM, Norcini JJ, Simons KB et al. Implementation of competency-based medical education: are we addressing the concerns and challenges? Med Educ 2015; 49: 1086–1102.
Englander R, Carraccio C . From theory to practice: making entrustable professional activities come to life in the context of milestones. Acad Med 2014; 89: 1321–1323.
Huddle TS, Heudebert GR . Taking apart the art: the risk of anatomizing clinical competence. Acad Med 2007; 82: 536–541.
Lurie SJ . History and practice of competency-based assessment. Med Educ 2012; 46: 49–57.
ten Cate O . Entrustability of professional activities and competency-based training. Med Educ 2005; 39: 1176–1177.
ten Cate O, Scheele F . Competency-based postgraduate training: can we bridge the gap between theory and clinical practice? Acad Med 2007; 82: 542–547.
ten Cate O, Chen HC, Hoff RG, Peters H, Bok H, van der Schaaf M . Curriculum development for the workplace using entrustable professional activities (EPAs): AMEE Guide No. 99. Med Teach 2015; 37 (11): 983–1002.
Warm EJ, Mathis BR, Held JD, Pai S, Tolentino J, Ashbrook L et al. Entrustment and mapping of observable practice activities for resident assessment. J Gen Intern Med 2014; 29 (8): 1177–1182.
Jones MD Jr, Rosenberg AA, Gilhooly JT, Carraccio CA . Competencies, outcomes, and controversy—linking professional activities to competencies to improve resident education and practice. Acad Med 2011; 86: 161–165.
American Board of Pediatrics. Entrustable professional activities for subspecialties: neonatology. Available at https://www.abp.org/subspecialty-epas#Neonatology (accessed 6 July 2016).
American Academy of Pediatrics. United States Neonatal-Perinatal Training Program Centers, 2012. Available at https://www2.aap.org/sections/perinatal/pdf/ONTPDDirectory12.pdf (accessed 12 March 2017).
de Villiers MR, de Villiers PJT, Kent AP . The Delphi technique in health sciences education research. Med Teach 2005; 27 (7): 639–643.
Rowe G, Wright G, Bolger F . Delphi—a reevaluation of research and theory. Technol Forecast Soc Change 1991; 39: 235–251.
Rowe G, Wright G . The Delphi technique as a forecasting tool: issues and analysis. Int J Forecast 1999; 15: 353–375.
Diamond IR, Grant RC, Feldman BM, Pencharz PB, Ling SC, Moore AM et al. Defining consensus: a systematic review recommends methodologic criteria for reporting of Delphi studies. J Clin Epidemiol 2013; 67: 401–409.
Boyce P, Spratt C, Davies M, McEvoy P . Using entrustable professional activities to guide curriculum development in psychiatry training. BMC Med Educ 2011; 11: 96.
Shaughnessy AF, Sparks J, Cohen-Osher M, Goodell KH, Sawin GL, Gravel J Jr . Entrustable professional activities in family medicine. J Grad Med Educ 2013; 5: 112–118.
Hauer KE, Kohlwes J, Cornett P, Hollander H, ten Cate O, Ranji SR et al. Identifying entrustable professional activities in internal medicine. J Grad Med Educ 2013; 5: 54–59.
Rose S, Fix OK, Shah BJ, Jones TN, Szyjkowski RD . Entrustable professional activities for gastroenterology fellowship training. Gastroenterology 2014; 147: 233–242.
Leipzig RM, Sauvigne K, Granville LJ, Harper GM, Kirk LM, Levine SA et al. What is a geriatrician? American Geriatrics Society and Association of Directors of Geriatric Academic Programs end-of-training entrustable professional activities for geriatric medicine. J Am Geriatr Soc 2014; 62: 924–929.
Caverzagie KJ, Cooney TG, Hemmer PA, Berkowitz L . The development of entrustable professional activities for internal medicine residency training: a report from the education redesign committee of the Alliance for Academic Internal Medicine. Acad Med 2015; 90: 479–484.
Schultz K, Griffiths J, Lacasse M . The application of entrustable professional activities to inform competency decisions in a family medicine residency program. Acad Med 2015; 90 (7): 888–897.
Jonker G, Hoff RG, ten Cate O . A case for competency-based anaesthesiology training with entrustable professional activities. An agenda for development and research. Eur J Anaesthesiol 2015; 32: 71–76.
ten Cate O . Nuts and bolts of entrustable professional activities. J Grad Med Educ 2013; 5: 157–158.
Fessler HE, Addrizzo-Harris D, Beck JM, Buckley JD, Pastores SM, Piquette CA et al. Entrustable professional activities and curricular milestones for fellowship training in pulmonary and critical care medicine. Report of a multisociety working group. Chest 2014; 146 (3): 813–834.
Messick S . The interplay of evidence and consequences in the validation of performance assessments. Educ Res 1994; 23 (2): 13–23.
Downing SM . Validity: on meaningful interpretation of assessment data. Med Educ 2003; 37: 830–837.
Downing SM, Haladyna TM . Validity threats: overcoming interference with proposed interpretations of assessment data. Med Educ 2004; 38: 327–333.
Cook DA, Beckman TJ . Current concepts in validity and reliability for psychometric instruments: theory and application. Am J Med 2006; 119: 166.e7–16.
Newton PE . Clarifying the consensus definition of validity. Measurement 2012; 10 (1-2): 1–29.
Van Loon KA, Driessen EW, Teunissen PW, Scheele F . Experiences with EPAs, potential benefits and pitfalls. Med Teach 2014; 36: 698–702.
Carraccio C, Englander R, Holmboe ES, Kogan JR . Driving care quality: aligning trainee assessment and supervision through practical application of entrustable professional activities, competencies, and milestones. Acad Med 2016; 14 (2 Suppl): S38–S54.
ten Cate O . Entrustment decisions: bringing the patient into the assessment equation. Acad Med 2017; 92 (6): 736–738.
Okoli C, Pawlowski SD . The Delphi method as a research tool: an example, design considerations and applications. Inf Manag 2004; 42: 15–29.
Acknowledgements
The authors thank the members of ONTPD for participation in this study and Jennifer Gong for her help in the preparation of this manuscript.
Ethics declarations
Competing interests
The authors declare no conflict of interest.
Cite this article
Parker, T., Guiton, G. & Jones, M. Choosing entrustable professional activities for neonatology: a Delphi study. J Perinatol 37, 1335–1340 (2017). https://doi.org/10.1038/jp.2017.144