Introduction

Female breast cancer (BC) is the most commonly diagnosed malignancy and one of the leading causes of cancer-related deaths worldwide; an estimated 2.26 million new cases and 680,000 deaths were recorded in 2020 alone [1]. Although significant advances have been made over the past few decades in the prevention, diagnosis and treatment of BC [2], it remains a major global health burden. Distant (metastatic) recurrence is a significant clinical issue and is responsible for the majority of BC deaths [3]. Five-year survival of patients with localised BC is 99%, decreasing to 27% when the disease is diagnosed at a late stage [4]. Early detection and diagnosis, when the tumour is still localised [5], is imperative for improved patient outcome.

Various clinical methods exist for BC detection and diagnosis. Mammography is the gold-standard method for early BC detection and forms the basis of Mammographic Screening Programmes, introduced globally to detect small malignant tumours before symptoms develop. Although existing evidence suggests that mammography can reduce risk of mortality [6, 7], there has been considerable debate regarding the overall efficacy of mammography [8], both as a screening and as a diagnostic tool. Mammography is far less sensitive in women with dense breast tissue and there is also debate surrounding the appropriate age for screening [7, 9]. Moreover, overdiagnosis and subsequent over-treatment are now recognised as a major issue surrounding screening mammography [10]. However, imaging alone cannot be used for the diagnosis of BC; currently the only way to make a definitive diagnosis is through histopathological analysis of the patient’s tissue. Obtaining tissue involves invasive procedures such as surgical excision or needle core biopsy. Approximately 80% of biopsies are negative for BC, thus rendering these invasive procedures unnecessary in most cases. Moreover, the morphological and molecular heterogeneity of breast tissue also presents challenges and may lead to interpretive disparity amongst pathologists. These limitations provide the motivation to develop new techniques capable of rapidly and accurately detecting and diagnosing BC in its early stages. Spectroscopic methods are emerging as powerful tools within biomedical research [11, 12] as they are non-invasive and have a high real-time spatial resolution. Of the many spectroscopic techniques available, Raman spectroscopy (RS) is particularly appealing as it has demonstrated the potential to provide clinically relevant diagnostic information rapidly, non-destructively and objectively in a variety of specialities [13,14,15]. This review will explore the application of RS in BC diagnosis.

Raman spectroscopy for biological applications

RS is an optical spectroscopic technique that can probe the vibrational modes associated with chemical bonds in a sample; as different samples have distinct chemical compositions, a sample-specific spectral fingerprint, or ‘Raman spectrum’, can be obtained. This spectrum contains numerous bands at various frequencies, which are characteristic of the structural features and functional groups of a particular molecule. A more detailed description of the fundamental principles of RS is provided in Supplementary File 1 and Supplementary Fig. S1.

Different cells and tissues are made up of distinct combinations of proteins, carbohydrates, lipids and nucleic acids, each of which has a number of associated vibrational modes that can be probed by RS to reveal information on their structure and composition. Accordingly, RS can provide details on the current state and activity of cells and tissues. Fig. 1a displays an example of a Raman spectrum recorded from MCF-7 BC cells, highlighting the three main regions of the spectrum: fingerprint (200–1800 cm−1), silent (1800–2700 cm−1) and high wavenumber (2700–3300 cm−1). The high wavenumber region is dominated by C–H, CH2, O–H and N–H vibrations of lipids, proteins and water, whereas the ‘silent region’ is mostly free of bands from biological material; cellular Raman vibrations arising from triple-bond functional groups such as alkynes are an exception [16]. Most peaks for biological samples can be found within the ‘fingerprint’ region, where the richest molecular vibrational modes of proteins, lipids, carbohydrates and nucleic acids are found. A typical biological spectrum of this region is shown (Fig. 1b), with examples of peak assignments from studies centred around the detection of BC based on specific Raman signatures.
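The three spectral regions quoted above can be expressed as a simple lookup. The following is a minimal sketch only: the function name `raman_region` and the handling of the shared boundaries (1800 and 2700 cm−1 are assigned to the higher region) are illustrative assumptions rather than conventions taken from the literature.

```python
# Region boundaries in cm^-1, as quoted in the text above.
REGIONS = [
    ("fingerprint", 200.0, 1800.0),
    ("silent", 1800.0, 2700.0),
    ("high wavenumber", 2700.0, 3300.0),
]


def raman_region(wavenumber):
    """Return the name of the spectral region containing a Raman shift (cm^-1).

    Assumption: intervals are half-open [low, high), except that the upper
    limit of the last region (3300 cm^-1) is included.
    """
    for name, low, high in REGIONS:
        if low <= wavenumber < high:
            return name
    if wavenumber == REGIONS[-1][2]:
        return REGIONS[-1][0]
    return "outside the regions listed"
```

For instance, the CH2 band near 1446 cm−1 discussed later in this review falls within the fingerprint region under this scheme.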

Fig. 1: Examples of Raman spectra and peak assignments.

a Extended scan from the Luminal A breast cancer cell line MCF-7, illustrating the three main regions of the Raman spectrum: the fingerprint, silent and high wavenumber regions. Whilst the majority of biologically relevant molecular vibrations exist within the fingerprint region, both the silent and high wavenumber regions may also contain molecular vibrations from a limited number of biomolecules. b Representation of the fingerprint region of a cellular Raman spectrum (left) with a variety of peaks that correspond to molecular vibrations of amino acids, proteins, lipids, nucleic acids and carbohydrates. Highlighted are examples of peak assignments from studies centred around the detection of breast cancer based on changes to specific Raman signatures (right). Image kindly provided by Renishaw. c Raman spectrum obtained from normal breast tissue displaying sources of spectral interference including noise, fluorescence and cosmic spikes (arrows).

Biological samples are highly complex and heterogeneous; they contain an extensive number of Raman-active vibrational modes, particularly within the fingerprint region. Thus, extracting relevant information from such data-rich spectra is challenging. The many discrepancies within the literature in the assignment of spectral peak positions reflect this [17]. Raman spectra of biological samples contain wide peaks that represent a combination of different molecules, and neighbouring molecular interactions may cause a shift in peak position relative to that of the isolated functional group. Furthermore, most molecules within biological samples that contain Raman-active vibrational modes also exhibit fluorescence upon excitation; cellular autofluorescence is several orders of magnitude more intense than the Raman signal and thus can hinder spectral interpretation. Additionally, noise sources, generated either internally within the Raman system itself or externally, include cosmic rays, dark current, shot noise and readout noise, which further contaminate the spectra and complicate subsequent data processing [18]. Therefore, computational data pre-processing, to remove undesirable background noise (Fig. 1c), should be carried out prior to analysis. Computational spectral denoising techniques are beyond the scope of this review and are reviewed elsewhere [19]. Moreover, if spectra of the same sample are not obtained simultaneously, under identical experimental parameters, then they may be subject to a shift in peak intensity. Normalisation is applied to address this disparity and a number of different approaches are available [19]. Prior to spectral analyses, it is important to identify any outliers within the dataset that could adversely affect interpretation and classification. Principal component analysis (PCA) and signal-to-noise ratio-based thresholding methods are two examples of outlier-detection algorithms [20].
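Two of the pre-processing steps mentioned above, cosmic spike removal and normalisation, can be sketched in a few lines. This is an illustrative example under stated assumptions and not a method taken from the cited reviews: the median-filter despiking rule, the window size, the threshold and the function names `despike` and `vector_normalise` are all hypothetical choices, and fluorescence baseline correction is deliberately omitted.

```python
import math
import statistics


def despike(intensities, window=5, threshold=5.0):
    """Replace narrow outliers (e.g. cosmic spikes) with the local median.

    Assumption: a point is a spike if it deviates from the median of its
    neighbourhood by more than `threshold` robust standard deviations,
    estimated from the median absolute deviation (MAD) of all residuals.
    """
    half = window // 2
    n = len(intensities)
    medians = [
        statistics.median(intensities[max(0, i - half):min(n, i + half + 1)])
        for i in range(n)
    ]
    residuals = [x - m for x, m in zip(intensities, medians)]
    mad = statistics.median([abs(r) for r in residuals]) or 1e-12
    limit = threshold * 1.4826 * mad  # 1.4826 scales MAD to a Gaussian sigma
    return [m if abs(r) > limit else x
            for x, m, r in zip(intensities, medians, residuals)]


def vector_normalise(intensities):
    """Scale a spectrum to unit Euclidean length (one of several normalisations)."""
    norm = math.sqrt(sum(x * x for x in intensities))
    if norm == 0:
        raise ValueError("cannot normalise an all-zero spectrum")
    return [x / norm for x in intensities]
```

Vector normalisation is used here only because it is simple to state; other schemes, such as scaling to the area or to a reference band, follow the same pattern.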

Following pre-processing, corrected Raman spectra are analysed to recognise vibrational signatures. Univariate methods can be used to analyse biological samples, but these are somewhat restricted to small datasets, for example, in studies on rare diseases or where surgical resection yields only a few samples suitable for Raman analysis. Moreover, univariate methods only make use of a limited number of variables from the whole spectrum, such as relative changes in band position or intensity, resulting in a massive loss of information. Typically, Raman datasets in biological studies are large and multifaceted, and require more efficient, multivariate approaches, such as chemometrics, for a truly exploratory and comprehensive analysis. These methods involve an in-depth analysis of many spectra simultaneously, to allow useful trends and patterns to emerge from the data, which may otherwise have been missed if the spectra were analysed only individually. Moreover, such patterns can be modelled and subsequently used as a predictor for similar newly acquired data.

Multivariate methods can be classified as either unsupervised or supervised, depending on the objective of the analysis and any a priori knowledge of the sample [20]. Unsupervised methods, including PCA [21], k-means clustering [22] and hierarchical clustering analysis [23], do not require any previous knowledge of the sample and are thus exploratory in nature. They can assist in identifying patterns and trends within the dataset and in the objective creation of groupings. Conversely, supervised methods, including partial least squares (PLS) [24], linear discriminant analysis (LDA) [25] and multiple linear regression [26], rely on pre-existing class labels within the dataset, for example, the histopathological diagnosis, and thus are concerned with pattern recognition.
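To illustrate the unsupervised case, the sketch below implements a bare-bones k-means clustering (Lloyd's algorithm) in plain Python. It is a toy example, not the implementation used in any cited study: the deterministic initialisation from the first k points and the idea of representing each spectrum by two band intensities are assumptions made purely for illustration. No class labels are supplied; the groupings emerge from the data alone, whereas a supervised method would additionally require a label such as the histopathological diagnosis for each spectrum.

```python
import math


def kmeans(points, k, iterations=20):
    """Cluster feature vectors with Lloyd's algorithm.

    Assumption: centroids are initialised from the first k points, which keeps
    the example deterministic but is not a recommended initialisation.
    Returns (labels, centroids).
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

In practice each spectrum would contribute hundreds of variables rather than two, which is why dimensionality reduction by PCA often precedes clustering or classification.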

Raman variants in breast cancer studies

RS, in its simplest form, refers to the phenomenon of Spontaneous Raman scattering (Fig. 2). Spontaneous Raman scattering has seen increasing popularity in biomedical research, but the Raman effect is inherently weak and is essentially competing with stronger signals from Rayleigh scattering and tissue autofluorescence. Experimentally, the use of higher-powered lasers and longer signal acquisition times can improve the Raman signal, and photobleaching can quench the autofluorescence. However, the potential damage that these experimental approaches can induce poses significant challenges to both biological and clinical applications. Consequently, numerous Raman-based techniques have been developed to enhance the weak Raman signal and overcome issues with fluorescence interference, whilst minimising photodegradation. Furthermore, the development of techniques such as Spatially offset RS and Transmission RS makes it possible to probe biomolecules in deep tissue layers for potential non-invasive in vivo measurements. The basic principles of these variants, and examples of where they have been applied in BC detection and diagnosis, are described in Table 1.

Fig. 2: Schematic representation of the main components found within a typical spontaneous Raman scattering micro-spectroscopy system.

Laser light is guided through a beam expander onto a series of mirrors that focus the light onto the sample through a microscope objective lens. Scattered light is collected with this same objective lens in a 180° backscatter sampling geometry. Rayleigh scattered light is reduced through the use of edge filters and then the Raman scattered light is focused through an entrance slit and dispersed by a diffraction grating onto the detector. Variations in this general setup typically exist between manufacturers.

Table 1 Examples of Raman variants with applications in breast cancer.

Raman spectroscopy in breast cancer detection and diagnosis

Morphological and functional changes that occur during the malignant transformation of cells and tissue are accompanied by significant biochemical changes. For example, unchecked cellular proliferation, a hallmark of cancer [27], increases the production of DNA, RNA and proteins and disrupts lipid metabolism. These changes, at the biochemical level, occur much earlier than the onset of clinical symptoms. As such, Raman-based methods can be used to probe and quantify these altered molecular signatures, with spectra serving as biomarker profiles for early disease classification and tumour grading.

Moreover, this information can be obtained without destroying the material since RS does not require extensive sample preparation or labelling. Consequently, additional processing and analyses are possible following the Raman acquisition, which, in the case of diagnosing cancer, is essential to build a complete picture of individual tumours. Its non-destructive nature, in addition to its ability to be adapted to optical fibre techniques, demonstrates its potential for in vivo intra-operative or bedside applications. Moreover, RS does not require samples to be in a solid state; hence, there has been increasing interest in analysing circulating markers in different body fluids [28,29,30,31,32] for early cancer detection and diagnosis. The following sections discuss ex vivo tissue, in vivo and liquid biopsy Raman diagnostic applications in BC.

Ex-vivo applications

Early Raman studies on breast tissue focused on the qualitative analysis of spectra; identifying specific features associated with the appearance of benign or malignant breast changes. The first observations made on human breast tissue demonstrated the ability of Fourier-Transform (FT) RS, with Near-Infrared (NIR) excitation (1064 nm), to discriminate normal and abnormal tissue [33]. Since then, considerable effort has focused on discerning the biochemical changes that accompany these spectral changes associated with breast pathology.

The most prominent biochemical differences between normal and abnormal breast tissue concern lipids, carotenoids and proteins. Spectra obtained from normal, benign and malignant breast tissue using visible excitation revealed a marked reduction in the intensity of peaks corresponding to vibrational modes of β-carotene and lipids in abnormal versus normal breast tissue [34]. The same group again demonstrated this reduction in lipid peak intensity in diseased breast tissue using NIR excitation [35]. They also found similarities between the spectra of human collagens and those of infiltrating ductal carcinoma specimens. The relevance of the vibrational modes of lipids, carotenoids and proteins as potential spectral biomarkers for distinguishing pre-cancerous/cancerous and noncancerous breast tissue has been further corroborated [36,37,38,39,40,41,42,43].

A common observation is that spectral differences between benign and malignant breast tissue are less obvious than those between normal and abnormal tissue. For RS to be applied as a diagnostic tool, its ability to distinguish benign and malignant breast lesions, with a high sensitivity and specificity, is imperative. Indeed, benign and malignant breast tissue could be distinguished, for the first time, after performing PCA and logistic regression on spectra [44]. This diagnostic algorithm allowed correct classification of normal, benign and malignant lesions with 93%, 87% and 100% accuracy, respectively [44]. The coupling of multivariate statistical methods with RS has since proven to be indispensable for discriminating benign and malignant breast lesions [45,46,47,48,49,50,51,52,53,54].
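Since sensitivity, specificity and accuracy are quoted throughout this review, it may help to recall how they follow from a comparison of predicted and reference (histopathological) labels. The sketch below uses the standard definitions; the function name `diagnostic_metrics`, the label strings and the choice of "malignant" as the positive class are illustrative assumptions, and both classes are assumed to be present in the reference labels.

```python
def diagnostic_metrics(true_labels, predicted_labels, positive="malignant"):
    """Compute sensitivity, specificity and accuracy for a binary diagnosis.

    sensitivity = TP / (TP + FN): fraction of positive cases detected
    specificity = TN / (TN + FP): fraction of negative cases correctly cleared
    accuracy    = (TP + TN) / all cases
    """
    tp = fp = tn = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive:
            if predicted == positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == positive:
                fp += 1
            else:
                tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }
```

Because accuracy depends on the proportion of malignant cases in the dataset, sensitivity and specificity are usually the more informative pair when studies with different case mixes are compared.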

Breast tissue abnormalities are inherently heterogeneous, both at the morphological and molecular level. This is reflected by the variety of benign and pre-cancerous/cancerous conditions that can affect the breast. Indeed, there has been considerable focus on distinguishing the wide variety of breast pathologies using RS. Efficacy of FT RS in differentiating benign, pre-invasive and malignant breast tissue has been demonstrated [55]. Through qualitative spectral analysis, a number of key differences were observed, including intensity changes of bands corresponding to amino acids, nucleic acids, proteins, carbohydrates and lipids. These differences could form the basis of spectral biomarkers for classifying different breast tissue states. Similarly, spontaneous RS and shell-isolated nanoparticle-enhanced Raman spectroscopy (SHINERS) have been used to distinguish normal, benign, precancerous, pre-invasive and invasive malignant breast tissue [56, 57]. In keeping with other studies, the authors found that the main spectral features of normal breast tissue were associated with lipid vibrational modes, whereas the diseased tissues showed stronger peaks corresponding to proteins and nucleic acids. They observed both a gradual increase in protein and nucleic acid concentration and a decrease in lipid concentration as the breast tissue became more malignant. PLS-discriminant analysis demonstrated a high level of classification accuracy for five different tissue types. In addition to its ability to distinguish the morphological subtypes of BC, RS has also been able to classify the different molecular subtypes: Luminal A, Luminal B, HER2+ and triple negative [58]. Four hundred Raman spectra were obtained from breast tissue microarrays of various morphological classifications, stratified into their molecular subtypes, using a dispersive micro-Raman system operating at a 532 nm wavelength. Differences between the subtypes were observed in the intensity, position and shape of certain peaks, including the 1583 cm−1 peak of tryptophan and the 1667 cm−1 Amide I peak. LDA was able to correctly classify the BC subtypes with a specificity of 70%, 100%, 90% and 96.7% for luminal A, luminal B, HER2+ and triple negative BC, respectively.

Microcalcifications are small calcium deposits commonly observed on mammograms, as a result of benign cystic [59], or cancerous/early precancerous breast changes. Mammographic microcalcification classification relies on morphology and distribution rather than chemical composition and thus represents a major source of unnecessary biopsies in benign cases. Microcalcifications can be classed as either Type I or Type II. Type I microcalcifications are composed of calcium oxalate and are most often associated with benign lesions, while Type II microcalcifications are composed of calcium phosphate, mainly hydroxyapatite [60, 61], and are associated with both benign and malignant lesions. Calcium oxalate and calcium hydroxyapatite scatter photons effectively and have different spectral signatures; thus, RS is powerful for differentiating type I and II microcalcifications [62,63,64,65]. The utility of RS in BC diagnosis, based on microcalcification status, is also dependent on its ability to distinguish benign and malignant type II calcifications. Subtle chemical differences between the two have been reported: microcalcifications in malignant breast ducts contained lower levels of carbonate and higher amounts of protein compared to those in benign ducts [65]. The significance of carbonate as a spectroscopic marker for differentiating type II microcalcifications has been further demonstrated [66]. More recently, a Raman mapping approach was used to study the whole area of each microcalcification for a more intuitive description of all components within the lesion [61]. In addition to hydroxyapatite, whitlockite and amorphous carbonate were found in some benign type II microcalcifications. Raman profiles from microcalcifications could be correlated to their respective diagnostic categories with high sensitivity and specificity. In a pilot study, type II microcalcifications were also identified in benign and malignant canine mammary tumours, with good discrimination between benign and malignant lesions, supporting the use of RS in the clinical diagnosis and management of human and canine BC [67].

Yang et al. [68] combined spectral and morphological features in their analysis of type II microcalcifications for improved discrimination of benign and malignant cases. They extracted the relative abundance of the main chemical constituents of the microcalcifications (hydroxyapatite, carbonate and protein) using spontaneous and stimulated Raman spectroscopy (SRS) and found correlations between their abundance and tumour malignancy, similar to previous reports. Next, using SRS microscopy, they extracted features related to geometry (circularity, area, perimeter and Fourier descriptor) and texture from each calcification, and found statistically significant differences between the benign and malignant cases. They then selected the most informative spectral and morphological features and analysed them with a support vector machine-based classification algorithm. Interestingly, it was the combination of extracted spectral and geometric features, as opposed to either pure spectroscopy or imaging-based methods, that yielded the best accuracy (99.05%) and precision (98.21%) for diagnosing tumour malignancy.
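The idea of joining spectral and geometric descriptors into a single feature vector, which a classifier then consumes, can be sketched as follows. This is not the pipeline of the study described above: the circularity formula 4πA/P² is the common textbook definition and is assumed here, and the function names and dictionary keys are hypothetical.

```python
import math


def circularity(area, perimeter):
    """Return 4*pi*A / P**2, which is 1 for a perfect circle and smaller
    for elongated or irregular outlines (assumed definition)."""
    if perimeter <= 0:
        raise ValueError("perimeter must be positive")
    return 4.0 * math.pi * area / perimeter ** 2


def combined_feature_vector(spectral, area, perimeter):
    """Concatenate spectral abundances with geometric descriptors.

    `spectral` is a dict with the hypothetical keys 'hydroxyapatite',
    'carbonate' and 'protein' holding relative abundances.
    """
    return [
        spectral["hydroxyapatite"],
        spectral["carbonate"],
        spectral["protein"],
        area,
        perimeter,
        circularity(area, perimeter),
    ]
```

Vectors of this kind, one per calcification, could then be passed to a support vector machine or any other supervised classifier; features on very different scales would normally be standardised first.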

As mentioned previously, breast abnormalities are highly heterogeneous and thus a binary classification system of either benign or malignant type II microcalcifications does not consider the broad spectrum of existing breast pathologies. A finer classification is necessary to better define calcification types. SHINERS, in combination with PCA, was used to accurately distinguish type II microcalcifications between fibroadenoma, atypical ductal hyperplasia and ductal carcinoma in situ [69]. More recently, hyperspectral SRS was combined with second harmonic generation imaging to separate specific microcalcification diagnostic categories, such as fibroadenoma, atypical ductal hyperplasia, ductal carcinoma in situ and invasive ductal carcinoma [70].

A tissue microarray study of 79 normal and 499 malignant breast tumours confirmed RS was a powerful technique in distinguishing normal from malignant mammary tissues with a sensitivity and specificity of 90% and 78%, respectively [71]. Lipid, particularly fatty acid spectra, dominated the normal tissue whereas protein spectra characterised the cancerous samples. While several previous studies used fresh and frozen breast samples to preserve the chemical composition of tissue, this study confirmed that RS can be easily used on paraffin wax embedded tissues thus widening its applicability.

A recent, albeit small, study of 36 frozen and paraffin-embedded breast lesions, including normal, in situ carcinoma and invasive cancer, confirmed the reliability of the technique in differentiating benign from cancerous tissue in paraffin sections, despite some alteration of the chemical composition of the tissues due to dewaxing [72]. Nodal assessment, traditionally done by ultrasound imaging, with or without nodal sampling, is an integral component of the triple assessment of BC. Using frozen sections of axillary lymph nodes and comparison with the histological findings on paraffin sections, Raman imaging has shown promising potential for establishing nodal metastasis and for comparing the chemical composition of metastatic lymph nodes with that of the primary BC [73].

In-vivo applications

Several proof-of-concept studies using portable, hand-held Raman probes for classifying ex vivo cancerous and non-cancerous breast tissue [22, 35, 47, 50, 74,75,76,77,78]; and for axillary lymph node assessment [79, 80] have been reported. This demonstrates the translational potential for possible in vivo applications for diagnosing BC intraoperatively; delineating tumour/normal tissue margins; eliminating residual tumour and intraoperative sentinel lymph node assessment.

The first in vivo Raman spectra from breast tissue were obtained from patients undergoing a partial mastectomy. Using a clinical Raman system and optical fibre Raman probe, surgical margins of the remaining tumour cavity following excision were assessed to investigate the feasibility of the technique for real-time, intraoperative margin assessment [81]. When comparing results to those from traditional histopathological assessment, Raman achieved an overall accuracy of 93%. Subsequently, spectra were acquired from breast tumour-bearing Sprague-Dawley rats using a dispersive Raman spectrometer (785 nm) coupled to a fibre-optic probe [82]. Probing eight regions of the tumoral mammary gland, the authors noted a continuous decrease in band intensity, mainly of the 1446 cm−1 band, which most likely corresponds to vibrations of CH2 [83], as they moved from the histologically normal tumour marginal region to the actual lesion itself. A hand-held spectroscopic device, termed the ‘SpectroPen’, that can record both fluorescence and Raman signals has been developed [84]. This device, in combination with fluorescent and SERS contrast agents, was able to detect cancer in 4T1 tumour-bearing mice in vivo, both pre- and intra-operatively. Moreover, it could evaluate, in real time, the positive and negative tumour margins of the remaining tumour cavity. The same approach has been successfully applied to human BC for intraoperative margin assessment in breast-conserving surgery [85]. This is a promising application that could optimise patient management and reduce the need for further surgical procedures to achieve clear margins.

A strategy for image-guided surgical resection of murine breast tumours and intra-operative eradication of residual microtumours using a nanoprobe that combined photoacoustic imaging with SERS detection and photothermal therapy has been reported [86]. This achieved complete ablation of microtumours without local recurrence in a breast-tumour induced mouse model. If extended to human subjects, this could drastically improve patient outcomes, as residual tumour cells drive disease relapse.

Also in mouse models, the feasibility of obtaining spectra from the breast, transcutaneously, has been demonstrated [87,88,89,90], with 99% efficiency reported for the classification of transcutaneous normal and transcutaneous breast tumour tissue [90]. In vivo transcutaneous spectra were acquired using a commercial Raman spectrometer coupled to a fibre-optic probe. Although mouse models were used, spectra were comparable to those obtained from both ex vivo and intraoperative in vivo human breast tissue [90].

As mentioned previously, the detection of microcalcifications in mammograms can be an early sign of BC. The capability of RS in characterising type I and II microcalcifications in ex vivo tissue samples has been reviewed in the previous section. However, analysis in this way requires an invasive biopsy, which in patients who only harbour non-cancerous microcalcifications, is unnecessary as only a small proportion of microcalcifications detected with mammography are malignant [91]. This, therefore, necessitates the development of new techniques capable of characterising microcalcifications in vivo and in real time. Several groups have applied fibre-optic Raman sampling to probe the elemental constituents of microcalcifications. A portable clinical Raman system, delivering NIR excitation was used to examine freshly excised tissue [62]. A logistic regression diagnostic algorithm detected microcalcifications with a sensitivity of 86% and specificity of 96% and was able to characterise type I and II microcalcifications based on the presence or absence of the 912 cm−1 and 1477 cm−1 (calcium oxalate) or 960 cm−1 (calcium hydroxyapatite) bands. Using the same clinical Raman system, this group later examined biopsy samples (normal and lesions with/without microcalcifications) and developed a single-step algorithm to determine both microcalcification status and overall clinical diagnosis [92]. Their diagnostic algorithm yielded a sensitivity and specificity of 62.5% and 100% for diagnosing BC and an overall accuracy of 82.2% for classifying normal, benign and malignant breast samples.

A non-invasive Raman method for breast microcalcification characterisation would involve the use of instrumentation that can probe these calcifications, at a considerable depth, through different skin layers. However, the Raman signal weakens as the tissue sampling depth increases because the superficial Raman and fluorescence signals overwhelm the system. Thus, Raman methods capable of removing this fluorescence and enhancing the deep Raman signals could overcome this limitation and potentially have a role in the detection and evaluation of microcalcifications in the clinic. The application of Kerr-gated RS for depth profiling of microcalcifications beneath the surface of chicken breast and fatty tissues, as well as normal and cancerous breast tissue, has been investigated [63]. Spectra were obtained from type I and II calcification standards through the different types of breast sections at depths of 0.9 mm. This same group also demonstrated the efficacy of spatially offset Raman spectroscopy and transmission Raman spectroscopy (TRS) to detect and characterise the chemical composition of calcified material through a 2–10 mm and 16 mm thick block of chicken breast tissue [64, 93]. Coherent anti-Stokes Raman micro-spectroscopy has also been applied to successfully image and distinguish type I and II calcifications buried at a depth of 2 mm between chicken tissue [94]. Using an improved version of the original TRS approach and a breast phantom constructed from porcine skin, adipose and muscular tissue, it was possible to detect and identify the composition of calcification standards at a clinically relevant depth of 27 mm [95]. More recently, an advanced TRS instrument capable of detecting calcifications similar to those seen in patient samples, in soft tissue at a 40 mm tissue depth, has been developed [96].

These studies have established that deep Raman can identify and distinguish both type I and II microcalcifications through up to 40 mm of tissue. However, to have utility in a clinical setting, it must also be able to distinguish benign and malignant type II microcalcifications. Transmission Raman methods can differentiate benign and malignant type II microcalcifications based on carbonate content [97]. The level of carbonate substitution in type II microcalcifications is known to vary significantly, depending on whether these are found within benign or malignant tissues [65]. Calcification standards, with different percentages of carbonate substitution, were inserted into a cuvette and buried within porcine soft tissue samples (5.6 mm). The group were able to determine the level of carbonate substitution through the 5.6 mm sample, by monitoring both the band position and width of 960 cm−1 (phosphate), and thus differentiate and probe the composition of type II microcalcifications.
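Monitoring the position and width of a band, as described above for the 960 cm−1 phosphate band, amounts to locating the maximum within a window and measuring the full width at half maximum (FWHM). The sketch below shows one simple way of doing so; it is not the procedure of the cited study. The function name, the 940–980 cm−1 search window and the use of linear interpolation are assumptions, and the spectrum is assumed to be baseline-corrected with wavenumbers in ascending order. No calibration between band width and carbonate content is implied.

```python
def band_position_and_fwhm(wavenumbers, intensities, low=940.0, high=980.0):
    """Return (peak position, FWHM) of the strongest band in [low, high] cm^-1.

    Assumptions: ascending wavenumber axis, zero baseline, and a band that
    falls below half of its maximum on both sides inside the window.
    """
    window = [i for i, w in enumerate(wavenumbers) if low <= w <= high]
    peak = max(window, key=lambda i: intensities[i])
    half = intensities[peak] / 2.0

    def half_max_crossing(step):
        # Walk away from the peak until the next sample is at or below half max.
        i = peak
        while window[0] <= i + step <= window[-1] and intensities[i + step] > half:
            i += step
        j = i + step
        if not window[0] <= j <= window[-1]:
            raise ValueError("band does not reach half maximum inside the window")
        # Linear interpolation between sample i (above half) and j (at or below).
        fraction = (intensities[i] - half) / (intensities[i] - intensities[j])
        return wavenumbers[i] + fraction * (wavenumbers[j] - wavenumbers[i])

    left = half_max_crossing(-1)
    right = half_max_crossing(+1)
    return wavenumbers[peak], right - left
```

For a Gaussian band the FWHM equals roughly 2.355 times the standard deviation, which provides a convenient check of the routine on synthetic data.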

Biofluids

Breast tumours exhibit marked intra-tumoral heterogeneity [98]; tumour cells within one compartment are often molecularly and genetically distinct from those in another. It is difficult to capture the complete tumour landscape in a single tissue biopsy. Accordingly, there is growing interest in the use of liquid biopsies in the early detection and diagnosis of BC as this facilitates the rapid, real-time analysis of the tumour as it evolves.

The first proof-of-concept study for BC diagnosis through fluid biopsy analysis using RS was performed on serum [99]. Using conventional Raman methods combined with PCA and LDA, BC patients’ samples were discriminated from healthy controls with a sensitivity and specificity of 97% and 78%, respectively. Using PCA loading vectors, the authors were able to characterise relevant differences in band position between both groups in seven band ratios, corresponding to proteins, phospholipids and polysaccharides. Using a similar micro-Raman spectroscopy set-up, analysis of serum from healthy individuals and patients with ductal carcinoma identified bands that differentiated the two types of serum sample; k-means clustering was applied to investigate their significance [100].

Whilst conventional RS methods have demonstrated efficacy for serum-based BC detection, a recent study compared its performance to SERS [101]. Normal Raman and SERS spectra were acquired from each serum sample; samples included those obtained from BC patients of Stage II-IV disease and normal individuals. SERS afforded a better overall diagnostic performance with improvements in both sensitivity and specificity over conventional Raman. A number of additional studies have since been undertaken to demonstrate the efficacy of SERS to accurately classify and discriminate serum (and serum-albumin) from BC patients and healthy controls [24, 102, 103]. In addition to distinguishing normal and cancerous samples, SERS analysis was also able to differentiate patients at different stages of disease, which agrees with an earlier study that used SERS, in combination with multivariate analysis, to distinguish serum samples from localised and locally advanced Luminal A BC patients with a sensitivity and specificity of over 80% [104]. For serum Raman analysis to be adopted for BC screening regimes, serum from BC patients must have its own unique biochemical signature relative to other malignancies. Indeed, serum-based Raman analysis has been able to distinguish BC patients from other malignancies, including: colorectal, lung, ovarian, oral, liver, leukaemia and cervical [105,106,107].

The potential for Raman micro-spectroscopy to aid in the diagnosis and staging of BC based on blood plasma composition has also been explored [108]. When comparing normal and BC samples (stage I-IV), a number of differences were observed in certain Raman bands, corresponding to vibrational modes of numerous biomolecules in the fingerprint region. Stage II and stage III samples appeared biochemically similar, whereas the spectra from stage IV patients were more distinct. Nonetheless, the group applied PCA-Factorial Discriminant Analysis to their data and were able to differentiate all stages (II, III and IV) of the BC samples from each other, as well as from the normal controls, with very high sensitivity and specificity. This ability to detect BC through analysis of blood plasma by RS has recently been corroborated [109].

RS has also shown efficacy in probing whole blood samples to distinguish BC patients from normal controls. A PLS regression multivariate model, based on the Raman spectra from BC positive and healthy participants, was not only able to identify potential spectral biomarkers (vibrational modes of lycopene, phosphatidylserine, quinoid ring, calcium oxalate and calcium hydroxyapatite) for BC detection but also differentiate the two groups with a 90% sensitivity and 75% specificity [31].

Saliva is known to host a variety of biomarkers in many conditions [110,111,112,113]; recently, there has been growing interest in salivary biomarkers for early BC detection [114]. The appeal of saliva sampling is largely attributable to the fact that it is a simple, non-invasive and low-cost procedure. Several groups have applied SERS to saliva and shown its potential as a medium for BC diagnosis. SERS spectra of purified salivary proteins from normal controls and patients with benign and malignant breast tumours revealed six prominent peaks, corresponding to different bond vibrations of salivary proteins [115]. The intensities of most of these peaks differed significantly between the three groups. Furthermore, PLS-discriminant analysis, combined with leave-one-patient-out cross-validation, allowed the authors to discriminate salivary proteins with high sensitivity and specificity in all groups.
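
The leave-one-patient-out scheme mentioned above can be sketched as follows. This is a minimal illustration of the splitting logic only, assuming that each spectrum carries a patient identifier; the identifiers and the function name are hypothetical, and the classifier itself (PLS-discriminant analysis in the cited work) is deliberately left out.

```python
from collections import defaultdict


def leave_one_patient_out(patient_ids):
    """Yield (held_out_patient, train_indices, test_indices).

    All spectra from one patient are held out together, so spectra from
    the same patient never appear in both the training and the test set.
    """
    by_patient = defaultdict(list)
    for index, pid in enumerate(patient_ids):
        by_patient[pid].append(index)

    for pid in sorted(by_patient):
        test = by_patient[pid]
        train = [i for i, other in enumerate(patient_ids) if other != pid]
        yield pid, train, test


# Hypothetical example: six spectra acquired from three patients.
patient_ids = ["P1", "P1", "P2", "P3", "P3", "P3"]
folds = list(leave_one_patient_out(patient_ids))

for pid, train, test in folds:
    print(f"held out {pid}: train={train}, test={test}")
```

Splitting by patient rather than by spectrum matters because several spectra are usually acquired per patient; holding out individual spectra would let the model see the same patient during training and testing, which tends to inflate the apparent performance.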

Elevated salivary levels of sialic acid have been shown to correlate with the presence of BC [116]. A feasibility study that applied SERS to quantify sialic acid in saliva samples from healthy individuals and BC patients showed that the median sialic acid concentration of the healthy samples (3.5 mg dL−1) was significantly lower than that of the BC samples (18.5 mg dL−1) [117]. SERS may therefore have future utility as a simple test for quantifying sialic acid concentrations for BC diagnosis.

Urine-based RS has also been explored in the context of BC diagnosis. Raman spectra, acquired from the urine (unprocessed and concentrated) of adenocarcinoma-bearing and normal Sprague-Dawley rats, using a fibre optic Raman microprobe (785 nm), revealed intensity changes of several Raman bands between the two groups, including those corresponding to bond vibrations of urea and creatine [118]. PCA and PC-LDA were applied to the spectra of both unprocessed and concentrated urine; these algorithms were able to classify unprocessed urine with a 72% sensitivity and 80% specificity and concentrated urine with a sensitivity of 91% and specificity of 80%. The study also explored the feasibility of urine-based Raman for the early detection of BC by obtaining spectra from rats, prior to any breast tumour development. Indeed, they were able to classify cancerous urine samples, in the very early stages, with 72.5% sensitivity and 83% specificity [118]. Subsequently, its feasibility for BC diagnoses in human subjects has been demonstrated [30, 119].

Lacrimal fluid (tears) is perhaps the most unexpected potential source of biomarkers for BC detection. One might assume that changes to the composition of tears only accompany ocular disorders; however, there has recently been growing interest in investigating tear fluid in the context of systemic diseases, including BC [120]. In comparison to other body fluids, tears are relatively simple in composition [121] and hence require highly sensitive methods to detect their low-abundance analytes. Accordingly, the practicality of SERS-based analysis of human tears for the discrimination of BC patients and healthy controls has recently been demonstrated. Using a portable Raman system and a leave-one-out cross-validation-assisted PC-LDA identification method, a clinical sensitivity of 92% and specificity of 100% for detecting the presence of BC were achieved [29].

Current challenges and future outlook for clinical implementation

The capabilities of RS hold promise for future applications in BC detection and diagnosis. However, several barriers remain with regard to its widespread clinical translation.

Good general information is currently available for obtaining spectra from biological samples [122], but there is a real lack of uniformity in protocols amongst studies. This includes the way the samples are prepared, the substrate on which they are mounted, the spectrometer instrument settings and the computational pre-processing methods applied (Table 2). The use of different protocols for analysing the same samples can result in significantly different spectra. As spectral differences between cancerous and non-cancerous samples can be very subtle, experimental variability may be responsible for disparity amongst studies with regard to spectral biomarkers for the same disease [123], or may even lead to the discovery of false biomarkers. Therefore, it is of paramount importance to optimise and standardise the experimental setup, as well as validate its robustness, for future analysis of biological samples.

Table 2 Examples of published data applying Raman spectroscopy in breast cancer, showing lack of uniformity in sample preparation, Raman spectrometer parameters and spectral pre-processing technique.

As we have highlighted, RS has significant potential as a rapid in vivo tool to probe the biochemical changes that accompany malignancy. Over the past few decades, a variety of innovative fibre-optic based Raman probes have been developed to take Raman from benchtop to bedside [123]. These can be inserted into the working channel of an endoscope for assessment of both hollow and solid organs. To date, there have been a number of large clinical studies conducted for skin [124], oral [125], GI [126] and cervical cancer [127]. However, research outcomes for BC have relied on small sample sizes, and validation in large-scale randomised clinical studies is therefore warranted.

Moreover, analysing live tissues presents its own challenges. As discussed earlier, the Raman signal is inherently weak and can be compromised by background emanating from the measurement device or the tissue itself. Most probes that have been developed use NIR excitation, which can limit autofluorescence interference. Furthermore, many of these probes are based on silica optical fibres, signals from which can overwhelm the tissue spectrum. Longer integration times or higher laser powers are often used to improve the signal-to-noise ratio when analysing resected tissue samples. However, this can be impractical and potentially unsafe for in vivo applications. Indeed, one of the most important considerations is the laser-tissue interaction. RS is generally considered non-destructive; however, the long-term effects of repeated irradiation on tissue architecture are unclear, and rigorous safety testing is imperative prior to clinical implementation. In addition to safety, the clinical performance of RS must also be demonstrated: it should be comparable with, or outperform, the existing standard of care, and there should be economic value to health care systems in adopting such Raman devices into routine practice.
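
The trade-off between signal quality and acquisition conditions noted above can be made concrete with a simple scaling argument. The relation below is a sketch that assumes an idealised, shot-noise-limited measurement in which the detected Raman signal grows linearly with laser power P and integration time t; the symbols S, k, P and t are introduced here for illustration, and real in vivo measurements, with fluorescence and fibre background, will generally perform worse.

```latex
S = k P t, \qquad
\sigma_{\text{shot}} = \sqrt{S}, \qquad
\mathrm{SNR} = \frac{S}{\sigma_{\text{shot}}} = \sqrt{k P t}
```

Under these assumptions, doubling the SNR requires a fourfold increase in either integration time or laser power, which illustrates why gains that are achievable on resected tissue can be difficult to reproduce within the time and safety constraints of in vivo use.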

The coupling of RS with multivariate analysis provides a means of processing and analysing the rich information contained within Raman spectra. However, RS is unfamiliar to many clinicians, and the complex and laborious task of analysing the large volume of data may serve as a hindrance and disrupt the standard workflow. Therefore, it is essential that fully automated spectral diagnostic frameworks are created, using machine learning and/or artificial intelligence approaches, whose output can be easily interpreted to help in clinical decision making. Not only is ease of interpreting the output imperative for translation to the clinic, but so too is ease of obtaining the data and operating the instrument, to ensure repeatability and reproducibility amongst centres. Moreover, it is imperative that clinicians are competent in interpreting these data in context. Every patient is different; some may have underlying fibrocystic breast disease and others may have had previous radiotherapy or surgery, leading to tissue damage and scarring. As such, to embrace the new era of personalised medicine, the clinician must be able to interpret the Raman results on an individual basis.

The COVID-19 pandemic has caused widespread disruption to multiple aspects of cancer care, including the suspension of mammographic screening programmes and the deferral of routine diagnostics. Consequently, cancers that would otherwise have been detected early could now progress and become more difficult to treat. Even before the pandemic, remote health monitoring, in the form of home and wearable optical technologies, had been rising in popularity. COVID-19 may have heightened their appeal, particularly in the medical space. Currently, these technologies use only a fraction of the physiological data that can be accessed by optical sensors, such as for heart rate and blood glucose monitoring. However, the potential exists to leverage portable and remote optical technologies for oncological applications [128] in the COVID and post-COVID landscape. Low-cost, handheld portable Raman spectrometers have already demonstrated capability for in vivo detection of skin cancers [129]. Moreover, a proof-of-concept study demonstrated the potential of leveraging a mobile phone camera and computing system to incorporate a Raman spectroscope into a phone [130]. Using a standard mobile phone camera as a detector for a Spatial-Heterodyne (miniature) Raman spectrometer (SHRS), spectra from ammonium nitrate and sodium sulphate could be obtained. The spectra obtained by this device were comparable to those obtained with a miniature SHRS with high-quality optics and a CCD detector, although, as expected, with a lower SNR and signal intensity in the former. Refinements to this type of device could lead to the development of miniature, affordable, high-throughput Raman spectrometers that could eventually be trialled in the medical realm to transform at-home health care delivery. As highlighted recently, the management of BC has been modified during the pandemic, potentially contributing to delays in diagnosis [131]. Although not a replacement for conventional diagnosis, RS has the potential to be used in easily accessible regions, like the breast, as a supplementary screening tool to identify, prioritise and streamline the most at-risk individuals for further testing.