Introduction

Conventional tissue genotyping is traditionally considered the standard method for identifying genomic alterations in cancer diagnosis. However, this approach cannot be used when tumor tissue is insufficient, and it is poorly suited to treatment follow-up. Liquid biopsy-based genotyping from serum/plasma or other body fluids, coupled with advanced molecular technologies, can provide a non-invasive method for genetic analysis. These samples contain cell-free nucleic acids (cfNAs), which are valuable markers in diagnostic protocols such as prenatal diagnosis of genetic diseases and detection of cancer and cardiovascular diseases (Fig. 1).

Fig. 1

Liquid biopsy scheme. Human blood contains cells, extracellular membrane vesicles (EMVs), cell-free DNA (cfDNA), cell-free RNA (cfRNA) and proteins that can be used as biomarkers for various diseases

History of cell-free nucleic acids

The cfNAs are present in biological fluids independently of cells. They were first described by Mandel and Métais [1] in 1948, but their discovery went largely unnoticed at the time. Almost 20 years later, Tan et al. [2] observed a high concentration of cfNAs in blood samples of patients with systemic lupus erythematosus. Steinman [3] estimated the cell-free DNA (cfDNA) concentration to be 10–30 ng/mL in healthy adults. Leon et al. [4] detected cfDNA in malignant diseases and proposed using it as a prognostic factor to determine the efficacy of cancer treatments; a decreasing concentration of cfDNA was detected in those cancer patients who were cured [4]. Unfortunately, molecular methods at that time were not advanced enough for that purpose. Later, in 1994, Sorenson et al. [5] isolated cfDNA from healthy individuals and cancer patients. In oncology, however, cfNAs did not attract broad attention from scientists. The situation was different in prenatal diagnosis: in 1997, Lo et al. [6] demonstrated circulating cell-free fetal DNA (cffDNA) in maternal plasma by detecting Y chromosome-specific DNA fractions and RhD, introducing a novel field in prenatal care, non-invasive prenatal diagnostics. Although determination of aneuploidies from cffDNA was set as the ultimate goal of non-invasive prenatal diagnosis, we had to wait another 18 years for non-invasive detection of trisomy 21 to become possible, until next-generation sequencing (NGS) technologies were introduced and became widely available [7]. This opened a new horizon for wider use of NGS for other trisomies, subchromosomal aberrations and even monogenic disorders. The success of NGS-based sequencing of cffDNA in prenatal testing raised the interest of researchers and clinicians working in oncology, cardiovascular diseases and other fields, and there are now numerous applications in the diagnosis of these diseases [8].

Most reviews of cfNAs deal only with DNA; only a few cover RNAs (microRNA (miRNA), non-coding RNA (ncRNA)), or they discuss the topic only from the perspective of a single disease or one type of cfNA. Yet numerous cell-free nucleic acid molecules participate in different physiological processes, and their exact physiological roles are not yet known. Experiments on graft hybrids have shown that cfNAs can pass from somatic cells to germ cells and thus be transmitted to offspring. This means that new features can be acquired through cfNAs, which could contribute to evolution. Anker and Stroun [9] build on these findings and suggest that cfNAs may play a role in evolution similar to that of Darwin’s gemmules. During our literature search, we found that a comprehensive review of this field was missing, so we collected the available information. CfNAs comprise nuclear DNA, mitochondrial DNA (mtDNA) and various kinds of RNA molecules, which are listed in Tables 1 and 2.

Table 1 Cell-free DNA molecules in serum/plasma
Table 2 Cell-free RNA molecules in serum/plasma

Cell-free DNA

The cfDNA is highly fragmented, to a predominant size of 166 bp, and further peaks are observable on electropherograms at around 300 bp and 450 bp. Volik et al. [10] reported that 90% of total cfDNA is present in the low-molecular-weight band. Duque-Afonso et al. [11] found that the majority of cfDNA fragments range between 80 and 200 bp. Diehl et al. [12] reported that the percentage of mutant molecules of the APC (adenomatous polyposis coli) gene in colorectal cancer patients increased 5–20-fold and the average fragment size decreased. Enrichment of DNA fragments containing mutant sequences was reported in cancer patients by several authors, suggesting that they could be used in diagnostic protocols [12].
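The size figures above can be turned into a simple summary of a fragment-length list. The following is a minimal Python sketch, not a published pipeline: the function names, the nearest-peak assignment rule and the example lengths are assumptions made for illustration, while the peak positions (166, 300 and 450 bp) and the 80–200 bp window are the values quoted in the text.

```python
from collections import Counter

# Peak positions in bp as quoted in the text; assigning a fragment to the
# nearest peak is an illustrative assumption, not an established method.
REPORTED_PEAKS_BP = (166, 300, 450)


def nearest_peak(length_bp):
    """Return the reported peak (bp) closest to a fragment length."""
    return min(REPORTED_PEAKS_BP, key=lambda peak: abs(peak - length_bp))


def summarize_fragments(lengths_bp, window=(80, 200)):
    """Count fragments per nearest peak and return the fraction of
    fragments whose length falls inside `window` (inclusive bounds)."""
    if not lengths_bp:
        raise ValueError("lengths_bp must not be empty")
    counts = Counter(nearest_peak(n) for n in lengths_bp)
    in_window = sum(window[0] <= n <= window[1] for n in lengths_bp)
    return counts, in_window / len(lengths_bp)


# Hypothetical fragment lengths, for illustration only (not real data).
example_lengths = [95, 150, 166, 170, 185, 310, 440]
counts, fraction_80_200 = summarize_fragments(example_lengths)
print(dict(counts), round(fraction_80_200, 2))
```

In practice the lengths would come from an electropherogram export or from paired-end read alignments, and the window could be adjusted to match the assay.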

With the aim of uncovering the full diversity of cfDNA in the circulation, including ultrashort (<100 bp) double-stranded DNA (dsDNA), single-stranded DNA (ssDNA) and dsDNA with nicks in both strands, Burnham et al. [13] applied an ssDNA library preparation method to sequence the cfDNA in the plasma of lung transplant recipients. This method yields a greater proportion of sub-100 bp nuclear genomic cfDNA and an increased relative abundance of mitochondrial and microbial cfDNA. Furthermore, they found that the fragmentation profiles of microbial and mitochondrial DNA in plasma are highly similar, indicating that they are exposed to similar degradation processes [13]. The origin of cfDNA is not fully understood, but there are many mechanisms by which DNA can enter the circulation. The cfDNA fragments carry signals from which the tissue or cellular origin of the DNA can be inferred. Analysis of these signals showed that cfDNA in the blood plasma of healthy individuals is primarily of lymphoid and myeloid origin [14]. This is consistent with the finding that the main source of cfDNA is apoptosis of hematopoietic cells and that the contribution from other tissues is minimal [15]. In oncological patients, a major proportion of cfDNA is formed by apoptosis and necrosis of tumor cells, but the release of DNA from necrotic cells occurs only through phagocytosis [16]. In addition to cell death, neutrophils can mediate the immune response by releasing neutrophil extracellular traps (NETs) that can trap and kill various pathogens [17]. These are extracellular network structures composed of both nuclear and mitochondrial DNA fibers, which are covered by various proteins such as histones and proteases [18]. High levels of NETs have been shown to correlate with levels of circulating DNA, suggesting that DNA released from cells during NETosis contributes to circulating DNA [19].
Another route by which DNA reaches the circulation is the active release of newly synthesized DNA via vesicles and lipoproteonucleotide complexes [20]. Based on the form in which they appear in the circulation, cfDNA molecules can be divided into three basic categories: free DNA fragments, vesicle-bound DNA and DNA-macromolecular complexes, which are described below.

Free DNA fragments

Free DNA fragments are naked sequences that are not bound to any other molecule or surface. During cell death, genomic DNA is cleaved and released into the circulation, but only DNA that is associated with proteins can resist cleavage by DNase, whereas free DNA fragments are cleaved in body fluids until they are completely lost [12].

Vesicle-bound DNA

Nucleic acids can be contained in extracellular membrane vesicles (EMVs), where they are protected from degradation [21]. These membrane structures mediate intercellular communication. Based on their size and origin, they are divided into exosomes, microvesicles and apoptotic bodies (ABs) (Table 3 and Fig. 2). Vesicles can be released from both healthy and tumor cells. EMVs that are released from tumor cells contain various oncogenic factors in the form of oncoproteins and nucleic acids and have therefore been named oncosomes. When an oncosome is introduced into a recipient cell, these factors can transfer the tumor cell phenotype to idle cells [22].

Table 3 Main characteristics of the extracellular vesicles
Fig. 2

Formation of extracellular membrane vesicles (EMVs). Based on their size and origin, EMVs are divided into exosomes (40–100 nm), microvesicles (50–3000 nm) and ABs (800–5000 nm). EMVs are loaded with cellular content such as nucleic acids and proteins. ABs, as a product of apoptosis, can also carry cellular organelles, e.g., nucleus, mitochondria and endoplasmic reticulum (ER)

DNA-macromolecular complexes

Circulating nucleosomes are DNA–protein complexes that result from the cleavage of chromosomal DNA during apoptotic cell death [23]. These are dsDNA fragments of 180–200 bp that are wrapped around the octameric histone protein complex. Nucleosomes are linked to each other by 20–90 bp of DNA, and the sequence that is wrapped around the octamer is 147 bp long [17]. DNA that is directly bound to nucleosomes is protected from nuclease digestion, and therefore it occurs in the circulation in the form of mono- and oligo-nucleosome fragments [24]. Nucleosome spacing has been shown to vary between different cell types depending on tissue-specific gene expression. This is because nucleosomes are positioned near transcription start sites, at the sites where transcription factors bind to the DNA. These factors protect the DNA from nuclease cleavage during apoptosis, resulting in a specific fragmentation pattern or transcription factor footprint that provides information about the tissue of origin of this cfDNA [12].
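As a quick consistency check on the nucleosome figures above, the expected length of one nucleosome repeat can be computed from the 147 bp core and the 20–90 bp linker. This is a minimal Python sketch under the simplifying assumption that a protected fragment consists of one core plus one full linker; the function name is hypothetical.

```python
CORE_BP = 147         # DNA wrapped around the histone octamer (from the text)
LINKER_BP = (20, 90)  # linker DNA between nucleosomes (from the text)


def repeat_length_range(core_bp=CORE_BP, linker_bp=LINKER_BP):
    """Return (min, max) length in bp of one core plus one linker.

    Assumes a single, fully retained linker per nucleosome, which is a
    simplification made for this illustration.
    """
    return core_bp + linker_bp[0], core_bp + linker_bp[1]


low, high = repeat_length_range()
print(low, high)  # 167 237
```

The 180–200 bp fragments quoted above fall inside this 167–237 bp range, and the 166 bp peak reported for cfDNA sits just below its lower bound, which would be consistent with a linker trimmed to roughly 20 bp.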

In 1987, Felgner et al. [25] designed DNA/cationic lipid complexes as a highly efficient transfection procedure. They showed that lipid particles spontaneously interact with DNA through electrostatic interactions and facilitate the delivery of functional DNA into cells [25]. DNA/lipid complexes were shown to increase the half-life of DNA in the plasma of mice, suggesting that lipids protect DNA from degradation [26]. Thierry et al. [27] proposed that the formation of DNA/lipid complexes results from the universal tendency of nucleic acids to self-organize with cationic lipids and suggested the possible existence of a ubiquitous self-organization of genetic materials.

There are also macromolecular complexes of DNA and lipoproteins, called virtosomes. Lipoproteins bound in virtosomes protect the nucleic acids from degradation by nucleases. The nucleic acids, proteins and lipids involved in the formation of these complexes are newly synthesized molecules that are produced at approximately the same time. They are subsequently actively released from living cells, and they are thought to play a role in intercellular communication [17]. Virtosomes can enter recipient cells, incorporate their DNA into the genome and subsequently modify the biology of these cells. These modifications include various immunological changes or the transformation of normal cells into tumor cells.

Cell-free mtDNA

Mitochondria play a central role in energy production, cell proliferation and apoptosis. The mitochondrial genome is present as 100–10,000 copies of a circular molecule 16,569 bp in size. It contains 37 genes encoding 13 polypeptides of the respiratory chain, 22 transfer RNAs (tRNAs) and 2 ribosomal RNAs (rRNAs). Analysis of cell-free mitochondrial DNA (cf-mtDNA) should take into account the structural differences between mitochondrial and genomic DNA. Since mtDNA is a small molecule that is not protected by histones, cf-mtDNA fragments are expected to differ from nuclear cfDNA fragments [17]. The size of cf-mtDNA in biological fluids ranges from 30 to 80 bp, with peaks at 42–60 bp, and some authors report even larger sizes (Table 1) [28]. Like genomic DNA, mtDNA can also be transported by EMVs. Interestingly, EMVs can harbor the full mitochondrial genome, which can be transferred to cells with damaged metabolism and thus restore their metabolic activity. On the other hand, it has been shown that such horizontal transfer of mtDNA by EMVs can awaken dormant tumor cells and induce resistance to therapy [29]. Variants and copy number variations of mtDNA have been associated with various diseases, including type 2 diabetes, atrial fibrillation and cancer. These variations were shown to be a contributing factor in colorectal, breast and lung cancer and, in addition, these changes can be analyzed in cf-mtDNA [17, 30]. This suggests that cf-mtDNA analysis has potential as a biomarker for some diseases.

Cell-free RNA

RNAs are relatively unstable molecules that are susceptible to degradation by ribonucleases [31]. Therefore, circulating cell-free RNAs (cfRNAs) are encapsulated within EMVs or form ribonucleoprotein complexes that protect them from nuclease activity [32, 33]. Even though cfRNAs are relatively poorly studied, several studies have confirmed that they can be used as biomarkers for various diseases, including cancer [31]. Several types of coding and non-coding cfRNAs that are involved in translation, processing and regulation of gene expression are present in body fluids (Table 2).

RNA in the circulation is protected from degradation by EMVs and ribonucleoprotein complexes, whereby nucleic acid molecules are selectively packaged according to the viability and origin of the cells [33]. During programmed cell death, ABs containing RNA, DNA, proteins and cellular organelles are released. ABs are the largest EMVs; they are released from all types of cells, are involved in phagocytosis and play a role in the horizontal transfer of oncogenes (Table 3) [33]. In living cells, RNA is released via exosomes and microvesicles. Exosomes are the smallest EMVs; they can carry DNA, RNA, proteins and lipids and play a role in intercellular communication (Table 3) [34]. Microvesicles are produced by outward budding of the plasma membrane; they also participate in intercellular communication and contain proteins, lipids, and different RNAs and DNAs (Table 3) [33]. Circulating miRNA was shown to bind to high-density lipoprotein (HDL) or RNA-binding proteins, and these nucleoprotein complexes are thought to be released from both living and dying cells [35]. In addition to the mechanisms listed, extracellular RNA of exogenous origin, from food and the human microbiome, has been reported to enter the circulation [36].

Protein-coding RNA

Cell-free mRNA (cf-mRNA) is fragmented and of low abundance, and hence its detection is difficult. This is why the majority of studies focus on the analysis of small non-coding RNAs, especially miRNAs, which are more stable and abundant and therefore easier to detect [37]. Despite these limitations, the presence of extracellular mRNA in the circulation of oncological patients was confirmed in 1999, and these results indicated that cf-mRNA may play an important role in the diagnosis and monitoring of cancer. It has been shown that the level of cf-mRNA is increased in oncology patients, and many studies have confirmed that cf-mRNA analysis can serve as a suitable marker for cancer [38].

Non-coding RNA

Long non-coding RNA

Long non-coding RNAs (lncRNAs) are transcripts longer than 200 nucleotides that exhibit tissue-specific expression and are involved in epigenetic regulation [39]. They have been shown to form a relatively stable secondary structure, which is why lncRNAs are found in body fluids [40]. Some lncRNAs are selectively loaded into exosomes and can be transferred into other cells, where they modulate the functions and viability of those cells [41]. In several studies, circulating lncRNAs have been described as suitable biomarkers for cardiovascular and tumor diseases [42].

MicroRNA

The miRNAs are small, single-stranded non-coding RNA molecules 18–24 nucleotides in length. They are important regulators of gene expression and play a key role in many biological processes [43]. Regulation of gene expression by miRNA takes place at the post-transcriptional level and involves the Argonaute 2 (AGO2) protein. Extracellular miRNAs have been shown to be enclosed in exosomes, but they can also be associated with the AGO2 protein or nucleophosmin 1 (NPM1), which protect them from degradation [35]. Several studies have shown that most cell-free miRNA (cf-miRNA) is present in these ribonucleoprotein complexes rather than in vesicular form [44]. In addition to these proteins, miRNA can be associated with HDL, which is involved in intercellular communication. HDL carries miRNA molecules that can be transported to recipient cells, where these miRNAs perform their regulatory function. Although the mechanism by which HDL associates with miRNA has not been elucidated, it is clear that HDL protects bound miRNAs from RNase activity [45]. Experimental studies suggest that blood cells are the major source of circulating miRNAs in plasma and that the level of these miRNAs depends largely on the total number of blood cells and on hemolysis. These factors can change the level of known plasma miRNA cancer biomarkers by up to 50-fold and should therefore be taken into account in analyses [46].

Transfer RNA

The tRNAs are 73–93-nucleotide-long molecules that transfer amino acids to the site of protein synthesis. Deep sequencing analysis identified 30–33-nucleotide-long tRNA fragments in the blood serum of humans and mice [47]. Comparable results were observed in the serum of cattle, where 28–40-nucleotide-long tRNA fragments were identified, with 98.7% of circulating tRNAs ranging from 30 to 34 nucleotides [48]. Most circulating tRNA fragments contain the 5′ end of the tRNA, whereas only trace amounts of 3′ tRNA fragments were detected. Most of these molecules were shown to be present in nucleoprotein complexes and to originate from hematopoietic and lymphoid tissues [49]. Consequently, these circulating 5′ tRNA fragments are thought to serve as signaling molecules involved in intercellular communication between hematopoietic and lymphoid tissues [50]. Haussecker et al. [51] suggest that tRNA fragments function as miRNA competitors and can thus affect RNA silencing and, subsequently, the regulation of gene expression.

YRNA

YRNAs, with a length of 84–112 nucleotides, belong to a group of short non-coding molecules that form stem-loop structures [50]. These molecules associate with Ro60 to form ribonucleoprotein complexes, but their function is poorly studied [52]. Deep sequencing identified YRNA-derived fragments of 27–33 nucleotides in human blood serum and plasma. YRNA has been shown to undergo cleavage, resulting in stable fragments derived from the 5′ end and less stable fragments from the 3′ end. As a result, up to 95% of all detected YRNAs are 5′ YRNA fragments. In the circulation, these fragments were detected only in nucleoprotein form and not enclosed in exosomes or microvesicles. Further research is needed to verify whether changes in the expression of individual YRNAs are associated with particular diseases and can be used as biomarkers [47].

PIWI-interacting RNA

Piwi-interacting RNAs (piRNAs) are short non-coding RNA molecules, 26–31 nucleotides long, that mostly originate from genomic regions enriched with repetitive transposon sequences [49]. These molecules associate with the Argonaute protein PIWI to form a complex that plays a role in transposon silencing through CpG methylation, chromatin remodeling and RNA transcript degradation [53]. In addition, piRNA is involved in gametogenesis, in the development and maintenance of germ cells and in epigenetic regulation [54]. It has been reported that piRNA also participates in the epigenetic regulation of cancer and various other diseases [55]. Several studies have confirmed that some piRNA molecules have altered expression in tumor tissue compared to healthy samples [56]. piRNA has been shown to be stable in blood serum and plasma inside exosomes, and it is thought that, like miRNA, piRNA is also present in the form of nucleoprotein complexes [57]. Therefore, piRNA could serve as a valuable blood-based biomarker for the detection and monitoring of cancer [55].

Circular RNA

Circular RNAs (circRNAs) are endogenous non-coding RNA molecules that are very stable and evolutionarily conserved [58]. These molecules exhibit tissue-specific expression, which also depends on the developmental stage. Sequence analysis has shown that more than 90% of circRNAs originate from exon sequences and are likely to have an important function in post-transcriptional regulation of gene expression [35, 59]. This regulation takes place through complementary binding of the circRNA to miRNA molecules, which affects their regulatory activity [60, 61]. Bioinformatic analysis has shown that circRNAs are not produced by chance; genes coding for kinases are the ones most often subjected to circularization. Interestingly, in many cases circularization is more frequent than production of mRNA [62]. circRNAs are located predominantly in the cytoplasm, but it has been shown that human serum contains a large number of intact and stable circRNA molecules. Like other RNAs, circRNA retains its biological activity in exosomes, which can be transferred to recipient cells. It has been reported that circRNAs are involved in the development of multiple diseases such as atherosclerosis, nervous system disorders and cancer [61]. Research on stomach and colon cancer has shown that circRNAs released from tumor tissue can be used for diagnostic purposes and therefore appear to be potential biomarkers for cancer [61].

Other non-coding RNAs

Most studies focus on miRNA, while other short sequences are considered degradation products of different RNAs and are therefore excluded from the analysis. However, some small non-coding RNAs have been shown to undergo processing into even smaller fragments that can perform various biological functions [63]. In addition to the types of RNA listed above, other non-coding RNA molecules such as rRNA, small nuclear RNA (snRNA) and small nucleolar RNA (snoRNA) have been identified in human blood. However, these nucleic acids account for less than 1% of all small RNAs [47]. This suggests that there are many other types of RNA molecules in human blood that could be valuable biomarkers for various diseases (Table 2).
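The length ranges quoted in the preceding subsections overlap, which is one reason why read length alone cannot assign a circulating small RNA to a class. The following minimal Python sketch illustrates this; the dictionary and function names are hypothetical, the ranges are those given in the text (miRNA 18–24, YRNA fragments 27–33, piRNA 26–31 and tRNA fragments 30–33 nucleotides), and real annotation would rely on sequence alignment rather than length.

```python
# Length ranges in nucleotides (inclusive), as quoted in the text for the
# forms detected in serum/plasma. Length-only lookup is an illustrative
# assumption, not an annotation method.
LENGTH_RANGES_NT = {
    "miRNA": (18, 24),
    "YRNA fragment": (27, 33),
    "piRNA": (26, 31),
    "tRNA fragment": (30, 33),
}


def candidate_classes(length_nt):
    """Return the sorted names of all classes whose range contains length_nt."""
    return sorted(
        name
        for name, (low, high) in LENGTH_RANGES_NT.items()
        if low <= length_nt <= high
    )


for length in (22, 25, 31):
    print(length, candidate_classes(length))
```

A 22-nucleotide read matches only the miRNA range, a 25-nucleotide read matches none of the listed ranges, and a 31-nucleotide read is compatible with YRNA fragments, piRNA and tRNA fragments at once.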

Future perspectives

CfDNAs are already used in clinical practice in the form of non-invasive prenatal testing. A large portion of invasive diagnostic sampling (amniocentesis and chorionic villus sampling) has already been replaced, and fetal loss and maternal psychological stress have been reduced considerably. Fetal mRNA-based determinations have been published (placental lactogen, human chorionic gonadotropin, chromosome 21-encoded mRNAs), and they also have potential clinical applications [8].

Diagnosis of monogenic disorders and prediction of congenital heart diseases, preeclampsia, intrauterine growth retardation and gestational diabetes using microRNAs or other non-coding RNAs are already in the research phase and could enter clinical practice soon.

Tumor-derived plasma cfNAs are accessible by liquid biopsy. Commercial kits produced by different manufacturers seem to provide clinicians with valuable information for tumor classification and treatment monitoring. Plasma DNA tissue mapping and CancerLocator give new perspectives in cancer diagnosis [64]. More specific and sensitive platforms are under development or in the trial phase. Application of this technology is revolutionizing cancer diagnosis and treatment.

Cardiovascular diseases are the main cause of mortality and morbidity worldwide. The cfNA-based diagnostic methods open a new avenue for their clinical management. There is intensive research in this area, but large population-based studies are missing. Further clinical applications and expansion in this area are likely to follow.