It may take a patent lawyer to fully understand the scope of US patent number 7,777,022, but one thing is clear: at first glance, it certainly appears broad. The patent includes 4.2 million genetic sequences, some of which were identified computationally in a fishing trip for sequences that have applications in virology.

In June, the US Supreme Court determined that patents should no longer be granted for ‘inventing’ naturally occurring human genes, ending 30 years of the practice at the US Patent and Trademark Office. The decision will probably affect the growing genetic-diagnostics industry, and its influence will extend to patents on genes from other organisms. But it did not abolish all claims on DNA sequences: some have estimated that the case will affect only about 8,000 of the at least 72,000 US patents that mention DNA sequences of one sort or another.

That leaves businesses with the unenviable task of sifting through the remainder to determine which, if any, will affect the commercialization of a given invention. Patent 7,777,022 highlights the growing difficulty in doing so: although it lists millions of sequences, it lays claim to only a few. A firehose of data and limited search tools make it impossible for all but highly trained patent specialists to make sense of the landscape around any technology. Such specialists do not come cheap: companies invest millions each year to keep track of the shifting intellectual-property landscape. Those that cannot afford the fee take the risk of being unable to patent their discoveries, or of being sued.

On 6 December, a study published in Nature Biotechnology took an important step towards rectifying that problem by revealing an open-source database that allows interested parties to map out the patent landscape around a technology without racking up exorbitant legal fees (O. A. Jefferson et al. Nature Biotechnol. 31, 1086–1093; 2013).

The database, called the Lens (www.lens.org/lens), was created by Cambia, a non-profit organization in Canberra dedicated to facilitating innovation. It pulls together information from more than 90 patent jurisdictions worldwide. The Lens can be used to investigate patents of any ilk. But it has dedicated tools to analyse patents on DNA and protein sequences, and has plans to develop similar tools for other classes of patents, including those for circuits, software and chemicals.

The Lens is a bold effort to bring clarity and parity to the analysis of patents. It is also an innovation in need of support. Powered by eight busy software engineers, and funded by a patchwork of foundations and the Queensland University of Technology in Brisbane, Australia, it is tackling big-data problems that few have dared to take on. It will work best when it has cultivated a wiki-style following of users willing to take the time to annotate content, develop tools and share analyses.

Such a following can be hard to come by when academics and business leaders are already juggling busy schedules and scrambling for funding of their own. Cambia founder and chief executive Richard Jefferson is quick to acknowledge that some previous open-source efforts met with much enthusiasm but little participation from the academic community. It would be worth the effort for funders and institutions to find ways of incentivizing participation in an open-source patent effort.

Technology-transfer offices can help by logging the allocation of licences and changes in ownership in patent-assignment databases, where possible. A recent study led by Arti Rai, a specialist in intellectual-property law at Duke University in Durham, North Carolina, found that many universities fail to comply with basic requirements to acknowledge the contribution of federal funding to inventions in patent databases (A. K. Rai and B. N. Sampat Nature Biotechnol. 30, 953–956; 2012). Such information is important to track the history of the patent and the impact of federal research funding, as well as to allow the federal government to pursue its rights regarding such patents.

In the United States, the push to boost patent transparency has gained much-needed attention from on high. Revelations that some businesses, sometimes known as ‘patent trolls’, have been amassing large patent estates and using them to threaten other firms with litigation have caught the attention of the US Congress and the administration of President Barack Obama. Lawmakers are now considering legislation to rein in patent trolls, in part by creating reporting requirements that will help to clarify who owns a given patent, information that is currently hard to come by.

But the US patent system, troubled though it is, is not the only system that makes it difficult to track patents. In a survey published along with the Lens analysis, Cambia researchers noted that many patent systems do not routinely post their patents in a machine-readable format, making it difficult to search and analyse them. Where possible, it is time for such systems to address these flaws.

On the first day of many introductory patent-law classes, students are taught about the ‘patent bargain’. This is the foundation upon which the patent system is built: in exchange for protection for an invention, the inventor agrees to publicize their creation so that others may build upon it. The idea behind patenting was thus to put innovation into the public domain — yet the patent system has developed too many nooks and crannies in which information can be hidden away.

It is time to return to the bargain at the root of the patent system, and to use the computational and social-media tools at our disposal to publicize inventions, rather than obscure them.