Lung cancer prediction using machine learning and statistical techniques. Credit: Samarendra Das and Braja Bruti Das, CC-BY 4.0

Volatile organic compounds (VOCs) in exhaled breath samples can act as biomarkers for diagnosing lung cancer [1]. These VOCs can be used to distinguish patients with lung cancer from healthy smokers and non-smokers.

This approach paves the way for a non-invasive method to detect lung cancer, says an international research team.

Current lung cancer screening techniques are expensive and complex. To find a cost-effective alternative, scientists measured the molecular concentrations of VOCs in breath samples from untreated cancer patients, patients with benign pulmonary nodules, and healthy individuals. They captured and analysed the VOCs using silicon microreactor technology. Each microreactor has thousands of micropillars coated with a specific chemical.

Broadly, the researchers found that 16 VOCs were statistically significant. Of these, butyraldehyde and butyric acid ranked first and second, respectively, for distinguishing cancer patients from healthy participants.
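As a rough illustration of how candidate biomarkers can be ranked by how strongly they separate two groups, the sketch below scores each compound with Welch's t-statistic and sorts by absolute value. The article does not say which statistical test the team used, so the choice of test is an assumption, and the compound names (`voc_a`, `voc_b`, `voc_c`) and concentrations are synthetic placeholders, not data from the study.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch's t-statistic for two independent samples with unequal variances."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        var_a / len(a) + var_b / len(b)
    )


def rank_features(cases, controls):
    """Return feature names sorted by absolute Welch t-statistic, largest first."""
    scores = {name: abs(welch_t(cases[name], controls[name])) for name in cases}
    return sorted(scores, key=scores.get, reverse=True)


# Synthetic concentrations in arbitrary units (assumption, NOT study data).
rng = random.Random(0)
shifts = {"voc_a": 3.0, "voc_b": 1.0, "voc_c": 0.0}  # hypothetical group differences
controls = {name: [rng.gauss(5.0, 1.0) for _ in range(40)] for name in shifts}
cases = {
    name: [rng.gauss(5.0 + shift, 1.0) for _ in range(40)]
    for name, shift in shifts.items()
}

ranking = rank_features(cases, controls)
print(ranking)
```

A larger absolute statistic means the group means are further apart relative to their sampling noise. A real analysis would also need to correct for testing many compounds at once.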

The researchers repeated the cross-validation of the data 500 times, using different combinations of VOCs. The top three VOCs (butyraldehyde, butyric acid and dicyclohexyl ketone) helped detect lung cancer with 92% accuracy.
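The idea of repeating cross-validation over different combinations of compounds can be sketched as follows. This is a minimal illustration, not the team's pipeline: it assumes repeated random train/test splits (Monte Carlo cross-validation) and a simple nearest-centroid classifier, neither of which is specified in the article, and it runs on synthetic data with hypothetical feature names.

```python
import itertools
import random
import statistics


def nearest_centroid_accuracy(train, test, features):
    """Fit one centroid per class on train; return accuracy on test."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in train if y == label]
        centroids[label] = [statistics.mean(r[f] for r in rows) for f in features]
    correct = 0
    for x, y in test:
        # Squared Euclidean distance from the sample to each class centroid.
        dist = {
            label: sum((x[f] - c) ** 2 for f, c in zip(features, centre))
            for label, centre in centroids.items()
        }
        correct += min(dist, key=dist.get) == y
    return correct / len(test)


def repeated_cv(data, features, n_repeats=500, test_fraction=0.2, seed=0):
    """Mean held-out accuracy over n_repeats random train/test splits."""
    rng = random.Random(seed)  # same seed, so every combination sees the same splits
    n_test = max(1, int(len(data) * test_fraction))
    scores = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        scores.append(nearest_centroid_accuracy(train, test, features))
    return statistics.mean(scores)


# Synthetic data (assumption, NOT study data): label 1 is shifted in two features.
rng = random.Random(1)
names = ["voc_a", "voc_b", "voc_c", "voc_d"]  # hypothetical compound names
shift = {"voc_a": 3.0, "voc_b": 1.0, "voc_c": 0.0, "voc_d": 0.0}
data = [
    ({n: rng.gauss(shift[n] * label, 1.0) for n in names}, label)
    for label in (0, 1)
    for _ in range(30)
]

# Score every combination of one, two or three features.
results = {
    combo: repeated_cv(data, combo, n_repeats=100)
    for k in (1, 2, 3)
    for combo in itertools.combinations(names, k)
}
best = max(results, key=results.get)
print(best, round(results[best], 3))
```

Averaging over many random splits makes the accuracy estimate less dependent on any single partition of a small sample, which is why the repetition matters when only a few dozen participants are available.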

The team, which included researchers at the ICAR-Indian Agricultural Statistics Research Institute in New Delhi, developed machine-learning-based classification models with some of the VOCs and established their relevance for classifying patients with lung cancer. This saved time, lowered the cost of experiments and helped with the early detection of lung cancer, which is key to improving survival rates.

This technique may be extended to detect other diseases, including COVID-19, the researchers note.