Introduction

In various areas of health care, digital applications hold great promise for expanding access to services, substituting for and/or complementing existing standards of care, and creating value and convenience for patients. Yet globally, health care systems are struggling with how to incorporate and leverage the accelerating pace of innovation and commercialization of new digital tools that could potentially improve the treatment of illness and maintain health. Mobile health apps (health apps) epitomize this challenge.

While there is an abundant supply of health apps in the commercial tech sector, the sheer number of options is itself a problem. Studies suggest that if confronted with too many choices, people struggle to make one1. A key challenge for the health care system is to facilitate the identification of safe and effective apps for health care practitioners and patients to generate the most health benefit as well as guide payer coverage decisions, where relevant2,3,4,5. Public debates across countries reveal a demand for ‘someone’ to provide a quality stamp on the apps that fulfil basic medical and privacy criteria, that is, to provide a labelling of apps that denotes which have achieved standards or endorsement of some type6. The European Institute for Standardization has announced that it recommends that all countries should develop such a framework7. There is also a general trend toward patient empowerment, and this is especially apparent in this space. Some regulation is needed, especially for apps that involve the diagnosis, treatment, and/or management of chronic or high-risk conditions.

Nearly all governments are attempting to define policy frameworks that will be effective for improving health in this context. While such frameworks will not replace market-based evaluation mechanisms (e.g., stars in app stores), they can complement them and play an important role in providing guidance in this context. Our goal in this study was to compare the national policy approaches currently in development and/or use for health apps in nine countries with well-developed health care and regulatory systems, many of which are taking different paths. We also contribute forward-looking recommendations that may be helpful for guiding future policy developments in this area.

Results

Commonalities and differences in national approaches to health app policy

We identified a variety of approaches to health app policy, with some commonalities across the countries studied. Table 1 outlines the status of framework development in each country. Belgium and Germany have overall come the furthest in operationalizing and implementing their market access and reimbursement approval frameworks for medical apps (excluding apps that do not fulfil criteria for medical devices). In Germany, qualifying apps (known by their German acronym ‘DiGA’) are currently being evaluated through the ‘Fast-Track’ process, and those approved for use and reimbursement are made available in a central directory of digital health applications. Belgium has implemented its mHealthBelgium validation pyramid, which has 3 levels, with criteria related to regulatory issues (level 1), safe communication and privacy (level 2), and financing and reimbursement (level 3). England’s emerging national approach, which includes assessment frameworks for evidence of effectiveness together with wider technical standards, also serves as a potential role model for many countries, although it, too, is under development and does not yet incorporate reimbursement approval. In all other countries, efforts are ongoing to replace multiple local, fragmented initiatives that have had little impact with national frameworks.
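To make the tiered structure concrete, the following is a minimal Python sketch of how a three-level scheme such as the mHealthBelgium pyramid might be represented. It is an illustration only, not the official implementation: the names `PYRAMID_LEVELS`, `AppAssessment`, and `highest_level` are hypothetical, and the sketch assumes that levels are cumulative (a level only counts if all lower levels are also met), which is an assumption not stated in the description above.

```python
from dataclasses import dataclass

# Criteria areas per level, as summarized in the paragraph above.
PYRAMID_LEVELS = {
    1: "regulatory issues",
    2: "safe communication and privacy",
    3: "financing and reimbursement",
}


@dataclass(frozen=True)
class AppAssessment:
    """Hypothetical record of which levels' criteria an app satisfies."""
    name: str
    passed_levels: frozenset


def highest_level(assessment: AppAssessment) -> int:
    """Return the highest consecutive level reached, or 0 if none.

    Assumption: levels are cumulative, so a gap at a lower level
    stops the climb even if a higher level's criteria are met.
    """
    level = 0
    for candidate in sorted(PYRAMID_LEVELS):
        if candidate not in assessment.passed_levels:
            break
        level = candidate
    return level
```

Under the cumulative assumption, an app meeting the level 1 and level 3 criteria but not level 2 would be placed at level 1.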

Table 1 National approaches to health apps policy (policy process).

Typically, several regulatory bodies are engaged, although one has primary authority, often with others covering specific aspects. For instance, healthcare supervision agencies, agencies responsible for market access and reimbursement approval of new medical procedures and products, and standardization bodies are often coordinating the initiatives in collaboration with actors representing patients (e.g., patient associations), professionals (medical professional societies), IT-vendor perspectives (industry representatives), actors responsible for data (e.g., in Sweden: data inspectorate), and consumer product approval (e.g., in the US: the Food and Drug Administration (FDA); in Sweden: Consumer Agency). The involvement of actors representing both clinical practice and consumer products is important in those countries where the frameworks are intended to cover both wellness apps and apps classified as medical products (SE, NL).

Pre-existing regulations (hard law in terms of legally binding legislation) in different domains influence the frameworks (which represent soft law) being developed. Emerging frameworks are aligned with, but serve to complement, national legislation (e.g., regarding medical devices (see Supplementary Note 1), medical documentation, patients’/consumer rights, and data protection), as extant legislation has not typically been adjusted to the health app context, and additional guidance, policies, and clarifying regulations are needed. International regulations and standards primarily influence emerging frameworks in European countries. The emerging international ISO standard (https://www.iso.org/standard/78182.html), the GDPR, and other EU regulations, such as the MDR, which came into full force in May of 2021, impact certain types of apps in European countries, while policies and guidance from the FDA and the PDPA influence the US and Singapore, respectively (Republic of Singapore Personal Data Protection Act of 2012 https://sso.agc.gov.sg/Act/PDPA2012).

Most countries envision a centralized process in which one actor/committee will use the framework to evaluate apps (such as the ‘Fast-Track’ process in Germany, which is run by the Federal Institute for Drugs and Medical Devices (BfArM), or the mHealthBelgium pyramid, which is coordinated by the industry federations beMedTech and Agoria). More decentralized approaches are also discussed, for instance, where one accreditation agency will ‘certify’ those actors who can evaluate apps (e.g., the national accreditation agency (SE)), or where local/regional/specialized actors will use available frameworks for their domain (NL).

Most countries envision that approved apps will be available through different channels, e.g., national health portals where they exist (e.g., DK, SE, NO), websites providing ‘catalogues’ or ‘directories’ of approved apps for specific domains (e.g., G, SI, NL), and commercial app stores, namely the App and Play stores (US), or a combination of these sources (e.g., in G and B, where approved apps are centrally listed in the national portal but where access is given through the App and Play stores, though often only after a patient receives an activation code).

Supplementary Table 4 summarizes the content of existing/emerging frameworks in relation to five criteria for evaluating health apps suggested by Levine et al.8.

Existing/emerging frameworks include transparency criteria in terms of requiring information be made available to end-users about the intended use and purpose of the app, medical trials used to evaluate the app, compliance with GDPR (European countries), and/or national data legislation (e.g., G, B, EN). Some countries also ask for information about the manufacturer and its value proposition (e.g., EN). The emerging ISO standard (used as inspiration in e.g., SE, NL), suggests users should consent to advertisements and use of data and requires the description of the app (in e.g., an App store) to be ‘accurate and clear’ (https://www.iso.org/standard/78182.html (2021)).

Regarding health content, evidence supporting the intended use is required. Sometimes both trials and ‘evaluations’ are allowed, and in some countries several kinds of ‘positive effects’ are accepted (medical, structural, and procedural effects in Germany). EN requires different levels of evidence depending on the app’s purpose. The emerging ISO standard refers to the use of ‘appropriate’ peer-reviewed scientific literature in the development of the health app.

Regarding technology, emerging criteria include robustness and interoperability with EHRs (EN, SE, ISO). Some countries have used self-evaluation (i.e., app producers self-rate their qualities) for this but intend to move towards external evaluation (NL). The emerging ISO standard includes the criterion of application size (SE).

Security/Privacy is operationalized as compliance with national and EU legislation governing privacy and data-security (GDPR) (DK, G, EN, SE), and may be further specified at the national level (e.g., as in Belgium’s level 2). Further, Germany has implemented data protection laws that encompass, but go above and beyond the requirements of the GDPR. In the US, the Health Insurance Portability and Accountability Act (HIPAA) governs many forms of patient data, but is far less comprehensive than the GDPR, which takes a far broader definition of ‘personal data’ and ‘data concerning health’. Some countries are discussing requiring the use of industry standards in risk models for vulnerability testing and the implementation of ISO/IEC 27001 or recognized equivalents by the health app manufacturer and all organizations providing associated services (NO). The emerging ISO standard suggests criteria such as protection against theft and viruses, signalling of breaches, authentication, data sharing, and maintenance (ISO, SE).

Usability is considered in EN in terms of demonstration of user-centred design, accessibility standards (WCAG), and development with iterative/agile principles. Proof of ‘ease of use’, defined as ‘intuitive usability and learnability of the [app] for the target groups addressed’ is a requirement for all apps in the German DiGA directory. The emerging ISO standard (SE, NL) considers functionality, aesthetics, and availability in multiple languages.

Discussion

We compared health policies regarding health apps across nine countries. Our results demonstrate that countries are at different stages of development and vary in the degree to which the evaluation of health apps is envisioned to be centralized, but there are several commonalities, including ongoing initiatives involving a set of national agencies, and the use of relevant existing and emerging international regulations (Table 1). There is great interest in the use of apps in all the countries evaluated, but even Belgium, Germany and the UK, which are relatively far along in their operationalization of frameworks, are struggling with efficient implementation. We acknowledge, though, that this is a space in which the power of individual governments may be limited and there is ongoing disintermediation of traditional gatekeepers such as healthcare organizations. Below, we discuss implications of the current developments and focus on points around which we believe international collaboration might be beneficial. We focus our discussion on two major types of future challenges: first, challenges tied to the implementation of frameworks; and second, challenges tied to specific criteria dimensions.

While previous work has highlighted the need for frameworks2,3,4,5 or focuses on evaluations in single countries6,9,10,11, we focus on showing the actual status of digital health app regulation in nine countries and three regions of the world. There are many reasons that international collaboration among researchers and policy makers could be valuable in this area, with an obvious one being that health apps do not respect national boundaries. The potential users of English-language apps, for instance, are not only consumers/patients in nations where English is the primary language but also individuals anywhere in the world who have English language proficiency. Thus, the expanding use of health apps within and beyond formal healthcare, and the health information exchanges this involves, implies a diminishing of the power of traditional (national) gatekeepers of medical information (and national governments). Standards and soft policy can play an important role here, but we expect the disintermediation to continue irrespective of standards. Monitoring, adjusting, and experimenting with how to address this to ensure safe and effective health apps is thus a key challenge.

As regards challenges related to the implementation of the emerging frameworks, most countries have opted for centralized approaches to evaluation. This is preferable to self-evaluation. However, centralized approaches also run the risk of creating bottlenecks. This risk seems important given the contrast between the vast and growing supply of health apps and the low number of apps that are currently ‘approved’ in countries with a well-developed process (for example, only 20 apps and 1 app had been approved for reimbursement in G and B, respectively, as of the beginning of September 2021). For this reason, a more decentralized approach, such as the ‘accreditation’ of evaluation agencies, may be a viable solution (as discussed in Sweden).

Countries that are further along in their operationalization limit their approvals to health apps meeting criteria for being defined as medical devices. This necessarily excludes products that are not classified by the International Medical Device Regulators Forum as ‘Software as a Medical Device’ (non-SaMD products) and raises the question of how wellness apps, which do not fulfil these criteria but can still create demonstrable value for patients, should be vetted (see Definitions below and Supplementary Note 1 for additional detail). Valuable guidance may come from the emerging ISO standard, which provides visual symbols representing different aspects and degrees of quality to be displayed within apps (thus guiding users regardless of the store or website through which the app is found), and from other creative approaches such as ‘nutrition labels’ for direct-to-consumer apps12, modelled on food product labelling, which have been proposed by digital health researchers but not yet implemented.

In most countries, a combination of approaches will most likely emerge, wherein some apps with intended use in clinical/self-care practice will be formally evaluated and displayed as formally ‘approved’ or ‘authorized’, while other wellness-oriented apps may be subject to market-based evaluation by potential consumers. Yet a basic level of quality check is important even for wellness-oriented apps. The platforms that enable sales of these apps, such as the established App and Play stores, therefore hold an important role in conveying information about apps, if not formally ‘approving’ them, before displaying them in their stores. While these intermediaries have several criteria in place (Supplementary Table 5 outlines the criteria used by Apple’s App Store and Google’s Play store), research shows that an alarming number of low-quality apps pass through them13,14,15,16. Hence, having a third party play this role, using for instance the emerging ISO standard, or potentially through standardized labelling requirements or crowdsourcing, would be an alternative.

Overall, a middle road may make sense in many cases: for example, when risk is low, the bar for approval can be low. In Europe, however, the MDR makes it likely that most apps that meet the definition of SaMD will be classified (more stringently than was previously the case) as at least Class IIa medical devices and therefore subject to regulation17. For apps that help to manage or treat chronic conditions, which are responsible for a large proportion of healthcare costs and require long-term use, or that involve high-severity short-term conditions, an additional certification of some sort, perhaps by a third party, is almost certainly advisable, and regulatory approval should be required for those that are high risk (such as suicide prevention apps13). This approach is consistent with the ‘risk-based framework’ outlined in the U.S. Food and Drug Administration (FDA)’s Precertification (‘Pre-Cert’) Program for Digital Health Products. Although Pre-Cert is still in its pilot phase, it is expected that it will expand to include other companies and software products beyond those meeting the definition of SaMD in the future (https://www.fda.gov/medical-devices/digital-health-center-excellence/digital-health-software-precertification-pre-cert-program (2021)).
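The tiered logic described in this paragraph can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a regulatory rule set: the `Oversight` categories and the `suggested_oversight` function are hypothetical names, and reducing an app to two yes/no attributes is a deliberate simplification of what any real framework would assess.

```python
from enum import Enum


class Oversight(Enum):
    """Hypothetical oversight tiers paraphrasing the paragraph above."""
    LOW_BAR = "low bar for approval"
    THIRD_PARTY_CERTIFICATION = "additional certification, e.g. by a third party"
    REGULATORY_APPROVAL = "regulatory approval required"


def suggested_oversight(high_risk: bool, chronic_or_high_severity: bool) -> Oversight:
    """Map two simplified app attributes to a suggested oversight tier.

    Assumption: high risk takes precedence over every other attribute,
    so it is checked first.
    """
    if high_risk:
        return Oversight.REGULATORY_APPROVAL
    if chronic_or_high_severity:
        return Oversight.THIRD_PARTY_CERTIFICATION
    return Oversight.LOW_BAR
```

For instance, a suicide prevention app would be passed in with `high_risk=True` and would map to `Oversight.REGULATORY_APPROVAL`, whereas a general wellness app with both flags false would map to `Oversight.LOW_BAR`.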

More broadly, as countries are expected to use different combinations of evaluations by central ‘trusted’ actors and market-based approaches, there will be an opportunity for cross-country knowledge exchange among researchers and policy makers, focusing on how centralized and market-based approaches can co-exist and complement each other, and the throughput vs. trust in evaluations achieved in countries using different approaches.

Regarding the specific criteria for app evaluation in specific domains, several questions warrant further attention among actors developing, implementing, and using such frameworks at the national and international levels. For instance, while most countries consider ‘transparency’, which is a prerequisite for informed consent, this domain may need further specification. As noted in the recent piece by Grundy et al.18, apps currently provide alarmingly low levels of information to consumers about data use. Against this backdrop, the inclusion of ‘use of data’ (as suggested by the emerging ISO standard) is welcome, but the term could mean many things. While most discussions concern the need for informing users about the immediate use of the data the apps collect and generate about them, allowing patients to consent to reuse by additional actors may be equally important, as patients may want to make their data as impactful as possible while ensuring privacy. Transparency regarding ‘how the app achieves its decisions’ (suggested by ISO) is similarly critical but needs to be better operationalized in all countries, not least in relation to the increasing incorporation of AI-based algorithms in apps.

With respect to health content, approvals in pioneering countries such as Germany have been based mainly on medical trials. As many apps aim at creating structural and procedural improvements to care (e.g., patient literacy, improved interaction between physicians and patients, improved sense of control among patients in self-management of their disease, etc.), different types of evidence beyond RCTs may be required. Real-world data (RWD) and real-world evidence (RWE) are expected to play a role in the ongoing evaluation of apps in practice5. Whether medical evaluations done in one country will be considered valid evidence in others is a further legitimate source of uncertainty. So far, national approval has been based on medical trials in a given country, but for small countries or products with a small patient population (e.g., those targeting individuals who suffer from a rare disease) this may not be practical, and such a requirement also creates a significant burden for companies.

Another dimension of health content involves the degree to which information is presented in a clear and accurate way on the app to its different audiences (purchasers, patients, or health care professionals). This aspect, which refers to information quality more than the evidence supporting the clinical value of the app, is largely non-operationalized in existing/emerging frameworks, although Germany does require manufacturers to present proof of the basic quality of medical content and usability, which combined can address this challenge for regulated apps (see Supplementary Table 4). Frameworks should address this aspect more fully, given that recent studies19,20,21 identify numerous safety concerns relating to the quality of information presented by apps.

A key issue regarding technological interoperability is whether data can be exchanged with electronic health records. This is considered in several existing/emerging frameworks (EN, SE, G, ISO). Yet, so far, most apps do not exchange data, even though this could potentially be highly beneficial, especially for chronic disease management. Most EHRs now do have open APIs, which should make exchange possible. In Germany, interoperability is a requirement of the Fast-Track, and further compatibility with the newly introduced e-prescription system and the electronic patient record is also planned. B and EN also have this requirement, but all countries struggle with how to implement automated tests in this area.

Security/privacy in Europe is defined in terms of GDPR compliance, while in the U.S. HIPAA is the relevant statute. While most apps do ask patients whether they consent to secondary use of their data, often this consent is buried. Further, while criteria such as protection against theft and viruses, signalling of breaches, authentication, data sharing, and maintenance work are discussed, there is uncertainty regarding their operationalization. Indeed, among traditional (regulated) medical devices that contain software, it is already known that there is a deficit in the provision of information about cybersecurity in publicly available documents, suggesting that more regulatory guidance and/or public policy is likely needed in this area22.

Usability represents a critical concern, as many apps score poorly in this regard, and many are especially inaccessible to patients with low language skills or literacy, even though this group may particularly stand to benefit. It is also unclear what group(s) or institution(s) should evaluate app usability, as perceptions of this could differ markedly among patients with different conditions and experiences. This is perhaps where an ongoing crowdsourced/market-based rating system could particularly complement initial evaluations performed by centralized actors, even though it would create a risk of gaming.

To date, little attention has been given to criteria and standards for the continuous updating of health apps, in terms of both software performance and content, for example as new clinical guidelines emerge or as new evidence on mobile interventions becomes available (e.g., on human-computer interaction, AI, or the meaning of sensor data). This domain requires attention because the field is emerging and dynamic: medical evidence progresses, and mobile phones, their operating systems, features, and functionalities change even faster. This is also critical considering the evidence19 suggesting that apps on the market exhibit many flaws in software functionality, which could potentially be addressed in updates.

Finally, accountability (not only for medical effects but also for side effects of apps and changes in consumption patterns triggered by the app)23 and the related issue of the integration of apps into care pathways24 are two dimensions of critical importance for the effective implementation of apps. These issues are lacking in most emerging frameworks and warrant future attention. ISO’s criteria for ‘social benefit’ (5.2.5) (https://www.iso.org/standard/78182.html (2021)) and Belgium’s requirements for level 3 can be used as inspiration here, as they encourage app developers to include economic analyses that consider healthcare savings in healthcare settings other than the one where the cost is generated, and benefits for society more broadly.

Overall, a balance needs to be achieved between detailed evaluation criteria on the one hand, and the applicability of frameworks on the other. As most emerging/existing frameworks represent soft law (not legally binding but based on voluntary use) and/or early-stage programs or pilot projects, the usability of the frameworks themselves (also considering the time and other costs of using them) is important too. This, again, speaks for a staged model with different ‘levels’ of approval, in order to strike a good balance between product risk and information/regulatory requirements.

In summary, while ongoing initiatives in many countries are ambitious and continue to make progress in the service of bringing better products to patients, the effectiveness of approaches in use is uncertain, as the operationalizations of criteria to date are rarely sufficiently specific to offer providers and patients the guidance they need to make evidence-based decisions about apps.

Current end-user interfaces are especially immature. Ideally, a clinician seeing a patient with a chronic or acute condition might have available a portfolio of favoured/vetted/approved apps that they could pick from, based on the patient’s characteristics. A clinician should also be able to prescribe an appropriate app for the patient, making it easy for the patient to access the intended product, which in most instances would interoperate with their electronic health record24. Indeed, this is the goal (and a key early success) of the German system for regulating and approving ‘prescribable apps’ for individuals in the statutory health insurance system.

Patients would like to be able to search for certified apps (fulfilling basic criteria) for both health and wellness as well as chronic conditions that they feel best meet their needs. For example, a patient with diabetes might opt to try several apps, ultimately settling on the one they find most motivating. Thus, the presentation of app quality to end-users is critical. If apps are to be provided through multiple channels, the apps themselves may need to include information about their quality level and other features such as data privacy and security. The emerging ISO standard provides a symbol system with visual ‘labels’ to be displayed in the app, to guide users. This may be viable—assuming it becomes a standard implemented globally. Search functions need to complement this, allowing users to search for apps for specific intended uses, with well-established minimal levels of quality.

Payers are another group facing the selection challenge. As with clinicians and patients, guiding interfaces that facilitate the identification of meaningful apps from the perspective of this group are also needed. These actors act as gatekeepers, since apps that do not pass their filters will not be widely used while those that do may have large audiences.

From the app manufacturer’s perspective, such criteria may or may not be welcome, and many developers may be concerned about barriers to entry that they perceive to be ‘too high’. But manufacturers would likely welcome approaches that enabled approvals in one country to be transferrable or expedited in others. Regarding transparency, privacy, and security, compliance with GDPR provides some degree of transferability across European countries. But in other contexts, criteria and associated regulations are sometimes primarily available in the native language (e.g., Germany). This is a key issue for app producers in small countries.

In this study, we have explored the current policies around the app marketplace across nine countries and based on this evaluation provide a set of recommendations and common issues for consideration. Health apps and the exchange of data between new actors will continue to undermine the power of traditional gatekeepers. Nonetheless, national standards could play an important role in creating awareness in markets, setting norms, and safeguarding basic quality dimensions. Clearly, this market is increasingly global and international collaboration could be beneficial in many ways, for example around issues of app transparency, health content, technology, and security/privacy. Further efforts of international researchers, practitioners, and users to identify and articulate common issues across countries as well as important settings for policy evaluation will therefore be vital to the ongoing growth and development of this nascent setting with great potential to improve care in new ways.

This study has limitations. This is an exceptionally broad area of health technology and we elected to focus on apps rather than digital health more broadly. We included only a small number of countries, but we intentionally selected a diverse group of settings that are some of the most advanced with respect to the regulation and use of health apps.

Methods

We evaluated seven European countries, as well as the United States and Singapore. We purposively selected countries that have made at least some progress in this area, but with varying approaches. For pragmatic reasons, we excluded countries to which we had no direct access. The following nine countries were included: Sweden, Norway, Denmark, Netherlands, Belgium, Germany, England, the United States, and Singapore. See an overview of participating countries in Supplementary Table 1. Drawing on previous studies involving cross-country comparisons of regulatory (policy) approaches to health IT25, we initially approached the nine different national policy contexts based on an analytic model for policy analysis26 that distinguishes between policy context, policy process, and policy content (see Supplementary Table 2). This study focused on policy content: who/what is regulated26. We limited our focus to national-level policy developed by the government, governmental agencies, and national standard-setting bodies. We excluded frameworks developed by regions or local, specialised communities.

Definitions

Health app

We define a health app as a computer program or software application (designed to run on a mobile device) “intended to be used specifically for managing, maintaining, or improving the health of individual persons, or the delivery of care” (ISO https://www.iso.org/standard/78182.html (2021), p 5). Health apps may target specific medical conditions and clinical practice areas, or they may be generic, aiming to improve health and wellness more generally, for example by facilitating communication between patients and clinicians. There are professional as well as patient-facing apps. We incorporate both software (SW) and data generated/stored by the app in our definition of a health app. Hardware was excluded from consideration, although health apps can be used on various devices including smartphones and personal computers. A health app may be categorized as ‘Software as a Medical Device’ (SaMD) if it meets relevant criteria and may thus be subject to medical device regulations17, in particular the EU’s Medical Device Regulation (MDR, see Supplementary Note 1 for additional detail). The International Medical Device Regulators Forum defines SaMD as ‘software intended to be used for one or more medical purposes that perform these purposes without being part of a hardware medical device’ (IMDRF http://www.imdrf.org/docs/imdrf/final/technical/imdrf-tech-131209-samd-key-definitions-140901.pdf (2013)). We focus here on the policies intended to complement extant international and national medical device regulations. We excluded frameworks developed by specialist/patient communities and frameworks developed at local and regional levels.

Approval—market access

The permission to introduce the app to the market (for instance, through an app store), independently of whether it is used by a clinician or not; it could be used by a citizen for private purposes only. Reimbursement approval: the permission and/or mandate to reimburse for use of the app by (typically tax-funded) third-party purchasing actors such as insurers/regions/states. Both market access and reimbursement approval are part of what we refer to as the policy approach to apps. We exclude the various efforts made by interest groups, industry stakeholders, and associations, focusing solely on governmental initiatives here.

Policy

We distinguish between ‘soft’ and ‘hard’ regulation. While there is no easy dividing line between soft versus hard regulation, we defined hard law as national legislation that is mandatory and absolute (also referred to as binding or rule-based governance), and soft law as alternative forms of governance, which are conditional or voluntary. Soft law refers to rules that are not legally binding, for example, recommendations, agreements, national action plans, or policy documents. Soft law entails normative commitment and may have political effects27,28. This implies that soft law shall be considered politically binding rather than legally binding. While soft law is sometimes referred to broadly as regulation that relies on open-ended processes such as benchmarking and peer group audit, we have only included national recommendations here.

Data collection

Document review and analysis constituted the primary source of data in the study. Based on the guiding definitions above, participating researchers gathered documents from their respective countries describing hard and soft laws relating to each country’s policy approach to health apps, as of mid-2021 (Q2–Q3). Dr. Stern, who is currently based in Germany, collected the data for Germany given her expertise in its Fast-Track process. Documents included healthcare legislation, national strategies and e-Health reports, and technology and e-service standards. We also performed interviews with key individuals involved in developing frameworks (N = 14) (see Supplementary Note 3). Selected parts of the vast amounts of information gathered about each country were translated by the authors from the local language into English and sorted in terms of a framework with separate categories for policy context, process, and content, and sub-categories representing the target of the regulation (possible criteria dimensions). Each author completed the framework for their respective country, which was double-checked and revised by the first and last author at each stage of data collection and revision. Work on the table content was iteratively completed by all authors and updates were communicated through September of 2021. Subsequent changes in regulations were not included.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.