Agreement between commercially available ELISA and in-house Luminex SARS-CoV-2 antibody immunoassays

Serological diagnostic of the severe respiratory distress syndrome coronavirus 2 (SARS-CoV-2) is a valuable tool for the determination of immunity and surveillance of exposure to the virus. In the context of an ongoing pandemic, it is essential to externally validate widely used tests to assure correct diagnostics and epidemiological estimations. We evaluated the performance of the COVID-19 ELISA IgG and the COVID-19 ELISA IgM/A (Vircell, S.L.) against a highly specific and sensitive in-house Luminex immunoassay in a set of samples from pregnant women and cord blood. The agreement between both assays was moderate to high for IgG but low for IgM/A. Considering seropositivity by either IgG and/or IgM/A, the technical performance of the ELISA was highly imbalanced, with 96% sensitivity at the expense of 22% specificity. As for the clinical performance, the negative predictive value reached 87% while the positive predictive value was 51%. Our results stress the need for highly specific and sensitive assays and external validation of diagnostic tests with different sets of samples to avoid the clinical, epidemiological and personal disturbances derived from serological misdiagnosis.

In the context of an ongoing pandemic, the importance of accurate diagnostic methods for disease control and elimination has been underpinned. Coronavirus disease 2019 (COVID- 19) serological diagnosis based on antibody detection against the severe respiratory distress syndrome coronavirus 2 (SARS-CoV-2) is a valuable tool that allows for the determination of immunity development and surveillance of exposure to the virus 1,2 . In the case of COVID-19, it is thought that antibodies mediate protection via a myriad of functions 3 . Therefore, good serological tests are required not only for epidemiological surveillance and policy implementation, but also for helping elucidate the mechanisms involved in protection and the susceptibility to reinfection after exposure to the virus 1,2 . Additionally, serological tests may also be useful for establishing vaccine-induced protection and finding blood donors that qualify to obtain plasma that could be used as a treatment for severe COVID-19 patients 1,2,4 .
The definition of a good serological test in terms of technical performance is based on the values of specificity (SP) and sensitivity (SE). The clinical relevance of a serological test is defined by the -negative (NPV) and positive (PPV) predictive values. These last two parameters depend on the prevalence of the disease, while theoretically SP and SE do not.
Since achieving a high score of both SP and SE is generally difficult, prioritizing one of them during the test development process has personal, social and clinical implications, especially in the context of a pandemic where the prevalence of disease might not be correctly defined.
Several antibody detection tests for SARS-CoV-2 are available in the market, such as rapid diagnostic tests (RDT), enzyme-linked immunosorbent assays (ELISA), neutralization assays, and chemiluminescent immunoassays (CLIA) 5 . Here we compared the performance of a commercially available COVID-19 ELISA (Vircell Microbiologists, Granada, Spain) with the performance of an in-house fluorescence-based, high-throughput and multiplex Luminex immunoassay 6 . The Vircell COVID-19 ELISA (from now on ELISA) detects specific IgG or IgM and IgA together (IgM/A) against the nucleocapsid (N) and spike (S) antigens adsorbed on solid phase 7 . The Luminex immunoassay has been optimized for the detection of specific IgG, IgM and IgA separately against a multiplex panel of 5 antigens adsorbed on magnetic beads that are in suspension 6 . www.nature.com/scientificreports/ Thanks to the simplicity of the method and its automatization, ELISA is used in the clinical practice [8][9][10][11][12] . Internal validation from Vircell, S.L. reports a 88% SE and 99% SP for IgM/A assay and a 85% SE and 98% SP for the IgG assay 7 . External validation of widely used tests during this COVID-19 pandemic is essential to assure correct diagnosis and epidemiological estimations. We have previously reported that the in-house Luminex assay reaches up to 100% SP and 95.78% SE 6 . Our aim is to report the performance of the ELISA IgG and IgM/A externally assessed using our highly specific and sensitive Luminex immunoassay.

Methods
Study population and sample selection. We analyzed 283 samples from peripheral blood from pregnant women and cord blood (Table 1). Of those, 168 mothers belonged to a larger cohort of pregnant women whose samples were collected in the context of a study focused on the effects of COVID-19 on pregnancy outcomes. The study population and sample collection methods have been described elsewhere 13 . The selection of samples in this report was based on the results obtained by ELISA and the suspicion of low SP by the IgM/A assay. In particular, it was the detection of 3 IgG and IgM/A positive tests in cord blood samples paired with mothers who had evidence of past and/or present infection that led to further investigation. While IgG crosses the placenta efficiently, a very limited amount of IgA passes from mother to fetus and IgM does not cross the placenta during normal pregnancies 14 . As a result, the presence of an IgM/A positive test suggested congenital COVID-19 or false-positive results. This suspicion led to further analyze these 3 cord blood samples with a more specific and sensitive immunoassay such as Luminex. The discovery that these 3 samples were negative by Luminex suggested that maternal samples could have also been affected by a lack of SP. Therefore, the selection included the aforementioned IgG + and IgM/A + cord blood samples (n = 3) together with IgG-and IgM/A-(n = 23), IgG + and IgM/A + (n = 23), IgG + and IgA/M-(n = 2), and IgG-and IgM/A + (n = 120) samples from pregnant women. The remaining samples were from healthy pre-pandemic pregnant women (n = 73) and cord blood (n = 39) ( Table 1) Briefly, samples were diluted and, in the case of the IgM/A assay, they were then incubated with an IgG sorbent to eliminate IgG from plasma and any possible interference. Then, in both assays, samples and controls from the kit were incubated at 37 °C for 45 min. After a washing step, they were incubated with peroxidase-conjugated detection antibodies (anti-human IgG or anti-human IgM + IgA) at 37 °C for 30 min. After another wash, they were incubated with a substrate solution for 20 min in the dark. Finally, a stop solution was added and the optical density was read at 450 nm. The cut-off value was established based on the manufacturer's procedure. The ELISA IgG and IgM/A assays are based on the detection of specific antibodies against the N and S antigens adsorbed on a solid surface. Quantification is based on optical density after the reaction of the enzyme-linked secondary antibodies in contact with a substrate, and it is detected in a spectrophotometer.
Luminex. The in-house Luminex immunoassay has been optimized for the detection of specific IgG, IgM and IgA separately against a multiplex panel of antigens adsorbed on magnetic beads that are in suspension 6 . Each magnetic bead region is characterized by a unique mix of two fluorochromes that allows their identification by laser excitation. Each antigen is coupled to a specific bead region and the multiplex panel included: the full-length N (N FL) antigen, a specific C-terminal region of N (N CT) 15 , the full-length S, the subunit 2 from the S antigen (S2) and the receptor binding domain (RBD) in S1. Quantification is based on the detection of fluorescence emission by a phycoerythrin-labelled secondary antibody and it was detected in a Luminex FLEXMAP 3D instrument (Luminex Corporation, Austin, Texas, United States of America).

Results and discussion
Moderate to high agreement between specific IgG measured by ELISA and Luminex. Most of the IgG positive samples detected by ELISA were also classified as IgG positive by each of the antigens included in the Luminex panel ( Fig. 1 and Table 2), and many of the IgG negative samples classified by ELISA were also classified as IgG negative by each of the antigens included in the Luminex panel.  Table 2). The antigen from the Luminex panel with the highest percentage of both positive and negative agreement for IgG was S2, which is part of the S antigens included in the ELISA.
Considering the classification obtained by the combination of IgG responses against all antigens included in the Luminex multiplex panel, the positive agreement was 54% while the negative agreement was 84.03% (Table 2). This resulted in 23 samples (all from mothers) with discrepancy for IgG: 1 sample Luminex negative and ELISA positive (LM-E +) and 22 samples Luminex positive and ELISA negative (LM + E−) ( Fig. 1   www.nature.com/scientificreports/ not PCR tested, 1 sample was asymptomatic and PCR positive, 1 sample was symptomatic 1-2 months prior to sample collection and PCR negative, and 1 sample was symptomatic during the previous 7 days to sample collection and PCR positive. The only LM-E + sample was asymptomatic and PCR negative. For IgG, it is plausible that 19 patients without symptoms or a positive PCR showed detectable IgG levels by Luminex since this immunoglobulin generally appears later in the immune response once the viral load has decreased and could be under the limit of PCR detection. Additionally, the fact that they were asymptomatic also suggests lower viral loads 16 . Increased SE of the Luminex technology compared to ELISA might be the reason for this discrepancy. The 3 cord blood samples had concordant results between Luminex and ELISA. This most certainly reflects transplacental transfer of maternal IgG, since the 3 mothers had detectable levels of specific IgG detected by both tests and IgG is able to cross the placenta 14 . Since the in-house Luminex assay has previously been assessed and demonstrated an excellent performance 6 , we used the results obtained by Luminex as the gold standard to evaluate the performance of ELISA (Table 3). The usage of Luminex as gold standard is further supported by the inclusion of several antigenic regions and three immunoglobulin isotypes that allow capturing a variety of immunological responses that cannot be attained by a PCR test, for example. Therefore, we evaluated the ELISA serological test based on a serological reference (Luminex) assay. For IgG, the SE was 55%, while the SP was 99%. As for the PPV and NPP, the percentages were 96% and 85%, respectively. The performance for IgG, especially the SE (90%), improved when the analysis was performed in a subset of samples with > 15 days since onset of symptoms (Supplementary Table S1). Taking into   www.nature.com/scientificreports/ account only the subjects with a PCR test done (N = 110), for IgG, the SE was 62%, while the SP was 90%. As for the PPV and NPP, the percentages were 44% and 95%, respectively.

Low agreement between specific IgA/M measured by ELISA and IgA and IgM measured by
Luminex. Almost all the IgM/A negative samples detected by ELISA were also classified as IgM or IgA negative by each of the antigens included in the Luminex panel, with 8 exceptions that correspond to 3 samples ( Fig. 1 and Table 2). On the contrary, although a proportion of IgM/A positive samples classified by ELISA were also classified as positive by each of the antigens included in the Luminex panel, many of them were classified as IgM or IgA negative by Luminex ( Fig. 1 and Table 2). The percentage of agreement for positive samples ranged from 4.79% for N CT to 16.44% for N FL, and from 8.90% for N CT to 21.23% for S2, for IgM and IgA, respectively. The percentage of agreement for negative samples ranged from 15.19% for S2 to 17.01% for N FL and from 15.60% for S to 17.12% for N FL, for IgM and IgA, respectively ( Table 2). In this case, the antigens from the Luminex panel with the highest percentage of both positive and negative agreement were N FL for IgM and N FL and S2 for IgA (Table 2).
Considering the classification obtained by the combination of all the antigens included in the Luminex multiplex panel, the positive agreement was 23.29% for IgM and 30.14% for IgA, while the negative agreement was 17.65% for IgM and 17.74% for IgA ( Table 2). As a consequence, there were 113 (LM-E + : 109 mothers and 3 cords; LM + E−: 1 mother) and 105 (LM−E + : 99 mothers and 3 cords; LM + E−: 3 mothers) discrepant samples for IgM and IgA, respectively ( Fig. 1 and Table 2). Among the LM-E + discrepant samples from mothers, 95 out of 109 for IgM and 86 out of 99 for IgA were asymptomatic and either PCR negative or were not PCR tested. For both IgM and IgA, 1 sample was PCR positive and asymptomatic, and the remaining discrepant samples were all symptomatic with either positive PCR (N = 3 for IgM and N = 3 for IgA), negative PCR (N = 8 for IgM and N = 7 for IgA) or a PCR was not performed (N = 2 for IgM and N = 2 for IgA). The only LM + E− sample for IgM was asymptomatic and PCR negative. Among the 3 LM + E− samples for IgA there were 2 asymptomatic, one with a negative PCR and one without PCR, and 1 symptomatic with a negative PCR. The 3 cord samples had discrepant results. There were no detectable levels of specific IgM and IgA against any of the antigens measured by Luminex, while ELISA gave a positive result. The corresponding mothers were: 1 LM + E + for both IgM and IgA, 1 LM-E-for IgM and LM + E− for IgA, and 1 LM + E− for both IgM and IgA. In normal pregnancies, transplacental transfer of IgA is very limited and IgM does not cross the placenta 14 but, in cases with severe infections, significant levels of IgA and IgM can cross the placenta 17 . In this study, none of the 3 mothers had severe COVID-19, which, together with the Luminex negative results, suggests that the results obtained by ELISA are false positive. However, vertical transmission of SARS-CoV-2 and the subsequent induction of immunoglobulins in the fetus should not be completely ruled out in cases of positive mothers, since transplacental transmission has been demonstrated in a severe COVID-19 pregnant case 18 .
Regarding the performance of ELISA (Table 3) using Luminex results as gold standard, the SE was 97% for IgM and 94% for IgA, while the SP was as low as 18% for both IgM and IgA due to the many false-positive results. As for the clinical performance parameters, PPV was 23% for IgM and 30% for IgA, while the NPP was 96% for IgM and 88% for IgA. In the subset of samples with > 15 days since onset of symptoms, the performance for IgM and IgA supports a poor performance in terms of SP (31% for IgM and 27% for IgA) and PPV (47% for IgM and 53% for IgA) (Supplementary Table S1). Regarding the performance of ELISA based on subjects with a PCR test done, IgM/A reached a 100% SE while the SP was as low as 23% due to the many false-positive results. As for the PPV and NPP, the percentages were 15% and 100%, respectively.
The vast majority of discrepant samples (LM−E +) were asymptomatic and either PCR negative or not tested. In other words, the pretest probability of infection was very low and supports the classification of these discrepant samples as false-positive results. At the moment, there is no recognized standard serological test to assess the sensitivity and specificity of other serological assays. Generally, neutralization assays and PCR tests have been considered as the gold standard in other studies [19][20][21][22][23][24][25] , but other assays such as CLIA or immunofluorescence assays (IFA) have also been used 26,27 . In this case, the use of PCR results as gold standard showed a very similar technical performance (SE and SP) compared to that obtained by using Luminex as stated above, although the set of samples used was slightly different since not all samples were from individuals with a PCR test done. In fact, the use of an assay as a surrogate "gold standard" is accepted once it has shown a high SE and SP 2 .
Highly imbalanced technical and clinical performance for ELISA. Considering the results based on seropositivity by any of the isotypes included in the immunoassays (Table 3), the calculation of technical performance metrics resulted in a highly imbalanced performance for ELISA, with 96% SE at the expense of a 22% SP. As for the clinical performance, the NPV reached 87% while the PPV was 51%.
When the prevalence of infection is low, the PPV of a test strongly relies on a high SP. For example, given a 95% SE and 5% prevalence, a decrease in SP from 100% to 95% would result in PPV dropping from 100% to 50%. On the contrary, in a scenario of 95% SE and 20% prevalence, a decrease in SP from 100% to 95% would result in a less steep fall of PPV from 100% to 83% 28 . Conversely, changes in SE within the same context of prevalence do not affect so markedly the PPV and NPV.
In the case of ELISA, the low SP and PPV that we report here have important implications if this test is to be used in hospitals for their screenings. In particular, a 51% PPV implies that almost half of the people with a positive serological test would not have passed the disease. Therefore, besides the overestimation of exposure to the virus, this would have an impact on the behavior of the people. Although people with positive serological tests are advised to keep the self-isolating measures, it is likely that they get a feeling of (false) protection and expose themselves more easily to the virus, increasing the chances of infection. In addition, false-positive diagnoses may lead to unnecessary treatment and psychological distress. To avoid false-positive results and their  24 , and 71-100% 23 . The SE increased with days since onset of symptoms and disease severity 23,24,30 . In our case, the SE for IgG does not reach the highest value reported by any of these papers even after stratification by > 15 days since onset of symptoms, although it improved, as expected, from a 55% to 90% SE.
A limitation of this study was the small sample size of women with information about onset of symptoms. Ideally, performance validation should be done stratifying by days since onset of symptoms due to the kinetics of antibody production, which is delayed with respect to the onset of the infection. This is especially relevant in the case of IgG, which is the last immunoglobulin to develop during the course of an immune response. Therefore, the performance of a test highly depends on the time passed since the onset of infection, which can be monitored by the onset of symptoms or time since positivity of PCR in asymptomatic cases. In fact, our Luminex assay and others have demonstrated that performance reaches excellent levels at > 10 to > 14 days since onset of symptoms 6,23,24,30 . In this report, despite the small sample size, IgG SE assessment after stratification by days since onset of symptoms also reached an acceptable level of 90%.
Concerning IgG SP, the same studies report the following ranges: 53% 24 , 90% 25 , 83-95% 23 , 96% 22 and 97% 30 . None of these values are as high as the 99% SP reported here, although most of them reach a considerably high SP.
With reference to IgM/A, the reported SE are as follows: 76-96% 24 , 29-100% 30 and 77% 22 . In this case, SE also increased with days since onset of symptoms and disease severity 24,30 . The SE that we report here for IgM/A is concordant with the highest levels of those results. As for the SP, only one study reports a relatively high SP of 83% 22 , while the other two inform an SP of 23% 24 and 46% 30 . These values are not as low as what we report here but also show a very poor performance due to an elevated number of false-positive samples. In fact, IgM is wellknown for being a source of false-positive results in immunoassays for many other infectious diseases due to its high non-specific reactivity, caused by cross-reactivity with other pathogens and the formation of rheumatoid factor 31 . However, the latter is solved in the ELISA IgM/A from Vircell by the use of an IgG sorbent that removes rheumatoid factor complexes.
One explanation for the disagreement observed in some cases could be the source of biological samples. One study has reported an unsatisfactory performance of an immunochromatographic IgM/IgG rapid test in pregnant women due to an elevated false-positive rate, and argues that the complexity of the immunological changes during pregnancy might be the underlying reason 26 . Additionally, detection of cross-reactive antibodies generated by other coronaviruses or infectious diseases has been described and could be a source of false-positive results not only for IgM 15,24 .
In conclusion, we show a very low SP for the ELISA IgM/A compared to the high values reported by Vircell. In addition, the SE for the ELISA IgG was also lower than expected, although this value would probably be higher if only samples with > 14 days since onset of symptoms were considered. Our results stress the need for highly specific and sensitive assays and external validation of diagnostic tests with different sets of samples.

Data availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.