TOOLBOX
24 September 2018

Machine learning gets to grips with plankton challenge

Marine biologists are using artificial intelligence to help them identify objects in millions of images.

Jeffrey M. Perkel


Kelly Robinson (left) and the ISIIS team, led by Bob Cowan (right).

When they think about big data, most researchers probably imagine genomics, neuroscience or particle physics. Kelly Robinson’s data challenge involves plankton.

“A lot of things that we enjoy seafood-wise, from fish to oysters to mussels to shrimp, almost everything starts their lives as plankton,” says Robinson, who studies marine ecosystems at the University of Louisiana at Lafayette. In photographs, plankton look like floating specks of dust, and her research involves quantifying and mapping their distribution and predator–prey interactions. The problem is that she must do so across millions upon millions of images.

Robinson collects data by towing a remote-camera platform called ISIIS — the In Situ Ichthyoplankton Imaging System — behind a boat. ISIIS captures about 80 photos per second, or 288,000 images (660 gigabytes) per hour. For one project in the Straits of Florida, when Robinson was a postdoc, she generated 340 million pictures; a colleague working in the Gulf of Mexico generated billions.

“You start to learn about things that you never thought you would learn,” Robinson says, “like the number of files that you can store on an individual computer. It’s 30 million, by the way, on your regular PC.” On her most recent cruise, Robinson sailed with 52 two-terabyte hard drives, which a student had to monitor and replace as they filled up. Someone must then get that collection to the university, convert the files to a Linux-compatible format and upload them to a server, a process that takes 24 hours per drive.
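To give a sense of scale, the figures quoted above can be combined in a short back-of-the-envelope calculation. The Python sketch below is illustrative only: it assumes decimal units (1 terabyte = 1,000 gigabytes), a constant capture rate, and drives that are uploaded one at a time, none of which is stated by Robinson's team. The function names are made up for this example.

```python
# Figures quoted in the article.
FRAMES_PER_SECOND = 80
GB_PER_HOUR = 660
DRIVE_CAPACITY_TB = 2
DRIVES_ON_CRUISE = 52
UPLOAD_HOURS_PER_DRIVE = 24
STRAITS_OF_FLORIDA_IMAGES = 340_000_000

# Assumption: decimal storage units.
GB_PER_TB = 1000
MB_PER_GB = 1000


def images_per_hour(fps=FRAMES_PER_SECOND):
    """Images captured in one hour at a constant frame rate."""
    return fps * 3600


def mb_per_image():
    """Average image size implied by the hourly data volume."""
    return GB_PER_HOUR * MB_PER_GB / images_per_hour()


def hours_to_fill_drive():
    """Recording time before one drive is full."""
    return DRIVE_CAPACITY_TB * GB_PER_TB / GB_PER_HOUR


def total_upload_days():
    """Days needed to upload every drive, assuming one drive at a time."""
    return DRIVES_ON_CRUISE * UPLOAD_HOURS_PER_DRIVE / 24


def imaging_hours(n_images):
    """Hours of continuous imaging needed to produce n_images."""
    return n_images / images_per_hour()


if __name__ == "__main__":
    print(f"Images per hour: {images_per_hour():,}")
    print(f"Average image size: {mb_per_image():.2f} MB")
    print(f"Hours to fill one drive: {hours_to_fill_drive():.1f}")
    print(f"Days to upload all drives: {total_upload_days():.0f}")
    print(f"Imaging hours for 340 million images: "
          f"{imaging_hours(STRAITS_OF_FLORIDA_IMAGES):,.0f}")
```

Under these assumptions, each image averages a little over 2 megabytes, a drive fills in roughly 3 hours of towing, and uploading all 52 drives in sequence would take about 52 days.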

The team uses machine-learning software to automatically pick out and identify objects in the images. But the algorithms must be taught what to look for: this is a starfish, that is a prawn. Such objects are relatively rare in the water, so finding enough pictures for the training set takes time. Over two months, Robinson and her team manually sorted through 2 million images to find enough to feed the algorithm. “It’s a little mind-numbing, but if you’re under the gun you can do it,” she says.

Naturally, the team is looking to optimize the process. Working with colleagues at Oregon State University in Corvallis, where she was a postdoc, Robinson is testing whether she can accelerate her work by processing the images on multiple graphics processing units (GPUs), the processors found on video cards, running in parallel. She is also looking into cloud computing as an alternative to Earth-bound clusters.
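The general idea behind running image processing in parallel can be sketched in a few lines of Python. This is not the team's pipeline, which the article does not describe. It simply splits a list of image files into one batch per worker and processes the batches concurrently. Threads stand in for GPUs here, and `classify_batch` is a hypothetical placeholder for whatever classifier would actually run on each device.

```python
from concurrent.futures import ThreadPoolExecutor


def shard(paths, n_workers):
    """Deal the paths round-robin into n_workers batches."""
    return [paths[i::n_workers] for i in range(n_workers)]


def classify_batch(batch):
    """Hypothetical stand-in for a classifier running on one GPU.

    A real implementation would load each image and return a predicted
    label; this placeholder returns a fixed label for every path.
    """
    return [(path, "unlabelled") for path in batch]


def process_in_parallel(paths, n_workers=4):
    """Classify all paths by running one batch per worker concurrently."""
    batches = shard(paths, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Collect results inside the 'with' block, in batch order.
        results = list(pool.map(classify_batch, batches))
    return [item for batch in results for item in batch]


if __name__ == "__main__":
    example_paths = [f"img_{i}.png" for i in range(10)]
    for path, label in process_in_parallel(example_paths, n_workers=3):
        print(path, label)
```

Because each image can be classified independently of the others, this kind of workload divides cleanly across devices, which is why adding GPUs is a plausible way to shorten processing time.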

But infrastructure goes only so far; what the team really needs, she says, is more people to crunch the numbers. Unfortunately, data scientists are in high demand, and industry jobs are lucrative. “We have a lot of turnover,” she says.

Nature 561, 567 (2018)

doi: https://doi.org/10.1038/d41586-018-06792-5
