Health data is becoming big news and big business. Such data has tremendous potential to support more-efficient health-service planning and delivery, help clinicians monitor patient safety, improve treatments and prevention, and advance the understanding of disease. It is the fuel for many of the developments in machine learning that some argue will revolutionize health, care and medical science in the coming years.

I lead Understanding Patient Data, an initiative based at Wellcome in the UK. We seek to understand people’s views, concerns or questions about health data and advocate for policy and practice to be responsive to these public views and values, thereby building trustworthiness in practice. This role enables me to work closely with the patient-data community, ranging from patient-advocacy groups to clinicians, charities, researchers, policymakers, data holders, media and industry, to identify emerging themes in people’s interests and attitudes about data use.

These data often come from health records, originally collected in the context of a confidential relationship between clinician and patient. The data are personal and sensitive, and people care what happens to them, even if stripped of identifying information. This is easy to forget when we are seeking out patterns in rows of numbers and developing lines of code. Data tables do not tell the story of a journey through cancer, an exhausting battle with pain, the uncertainty and ambiguity of a diagnostic odyssey, or the anxiety of an acute trauma.

Many important debates are happening among technologists, policymakers, researchers and clinicians about how to harness the power of data to drive innovation in health. However, one vital set of perspectives is absent: that of the patients on whom this all depends.

The trust of patients and the public cannot be taken as a given. A vital starting point for the research community is to recognize that we cannot simply expect people to trust those who are using health data, or assume that people just need to be educated and will then accept the benefits. This approach will not succeed in a world of increasing mistrust, misinformation and legitimate concerns about data-brokering as a business model.

Some initiatives try to resolve this challenge by promising to put control of all data in the hands of patients. However, as a patient, I have little interest in the complex dynamics of how data move through the healthcare system, let alone the research and life-sciences sector. I do not want to have to navigate daily decisions about my data; I want to know the system responsible for managing my data has my interests at heart, so I do not have to worry. The solution is not to eliminate the need for trust.

Instead, we should focus on building a system that is worthy of people’s trust. That means fundamentally shifting how we talk about data, but also, critically, making sure the people on whose data we rely are at the heart of how the data are managed, used and protected.

Taking a patient-centered approach forces us to find new ways to engage people in how health data is used, at a time and in a way that works for them. Health data is not the thing people want to talk or learn about when they see their clinician: they are likely to be in pain, or anxious or frustrated. Data can also be a complex, technical topic, and it is often hard to provide straightforward responses to people’s questions. As anyone working with data about people will know, the question “Can I be identified from the data?” will not have a simple answer. Researchers therefore need to find novel ways of articulating the value and benefits of what they do with health data. We encourage our research partners to use a data citation when showcasing their work, to express gratitude to patients, as a simple way to increase the visibility of data use (the useMYdata group has developed a citation to acknowledge when patient data underpins research).

Given the complexity of the subject matter, it may seem counter-intuitive to involve patients and the public in decisions about data. But our research at Understanding Patient Data has identified a real appetite for greater citizen participation in governance and decision-making, especially when the questions resonate with people’s lives, health and care. Openness about the risks as well as potential benefits is vital; this grounds the conversation in realistic trade-offs rather than relying on platitudes about privacy protection. These are not new ideas, and there are plenty of research initiatives developing radically democratized decision-making for data use, but we have yet to see these implemented at a systems level.

Centering the data narrative on patients and the public reminds us of two things. First, it reminds us that data use is a means to an end, not an end in itself. It is a means to building a better health system with richer research to inform better care and treatment for people. As an advocate from the patient group useMYdata powerfully stated: “It may not help me, but may help others… It could save future generations from suffering some of the diseases in today’s world” (useMYdata, personal communication).

Second, data use relies on an implicit exchange between patients and the people using data derived from their records. It is only fair that those using the data acknowledge the value of this resource and seek to embed patient and public views and values in the decisions they make.

Ultimately, health data is collected by people, from people and for people. If researchers want to be trusted with data, we should trust people to help us shape the rules so that the data revolution in healthcare benefits everyone.