PIGG defines the Emm blood group system

Emm is a high incidence red cell antigen with eight previously reported Emm− probands. Anti-Emm appears to be naturally occurring yet responsible for a clinically significant acute hemolytic transfusion reaction. Previous work suggests that Emm is located on a GPI-anchored protein, but the antigenic epitope and genetic basis have been elusive. We investigated samples from a South Asian Indian family with two Emm− brothers by whole genome sequencing (WGS). Additionally, samples from four unrelated Emm− individuals were investigated for variants in the candidate gene. Filtering for homozygous variants found in the Emm− brothers and by gnomAD frequency of < 0.001 resulted in 1818 variants with one of high impact: a 2-bp deletion causing a frameshift and premature stop codon in PIGG [NM_001127178.3:c.2624_2625delTA, p.(Leu875*), rs771819481]. PIGG encodes a transferase, GPI-ethanolaminephosphate transferase II, which adds ethanolamine phosphate (EtNP) to the second mannose in a GPI-anchor. The four additional unrelated Emm− individuals had various PIGG mutations: deletion of Exons 2–3, deletion of Exons 7–9, insertion/deletion (indel) in Exon 3, and a new stop codon in Exon 5. The Emm− phenotype is associated with a rare deficiency of PIGG, potentially defining a new Emm blood group system composed of EtNP bound to mannose, part of the GPI-anchor. The results are consistent with the known PI-linked association of the Emm antigen, and may explain the production of the antibody in the absence of RBC transfusion. Any association with neurologic phenotypes requires further research.

www.nature.com/scientificreports/
Here we report the serologic and molecular investigation, including whole genome sequencing (WGS), of samples from a South Asian family in which two brothers were found to be Emm− with anti-Emm present in the plasma. The apparent genetic cause of the Emm− phenotype in this family was a 2-bp deletion in the PIGG gene, which causes a frameshift and predicted premature stop codon [c.2624_2625delTA, p.(Leu875*)]. PIGG encodes an enzyme, ethanolamine phosphate transferase 2, that modifies the second mannose of the GPI-anchor by adding ethanolamine phosphate (EtNP). PIGG deletion mutations were also present in samples from two additional unrelated probands of Japanese and European ancestry previously investigated in our laboratory, and in two generations of a North African family with two sisters who also had Emm− RBC phenotypes with anti-Emm identified during pregnancy. Mutations included homozygous deletion of PIGG Exons 2 and 3, deletion of Exons 7 through 9, and an insertion/deletion (indel) in Exon 3. Lastly, two siblings reported previously and known to have loss of function PIGG mutations 11 were found to have Emm− RBC phenotypes.

Methods
Human participant statement. All methods were carried out in accordance with relevant guidelines and regulations. Serologic typing and antibody identification had been previously performed as part of routine clinical testing. All other experimental protocols, including targeted sequencing and WGS, were performed on archived samples with approval from the Mass General Brigham HealthCare Human Research Committee (IRB), which is the umbrella organization IRB that oversees the Brigham and Women's Hospital. The IRB approved access to previous clinical results and new experiments on archived samples under an excess clinical sample protocol, which was deemed to be minimal risk given no direct patient involvement and thus was exempt from obtaining informed consent.
Samples and serologic reactivity of anti-Emm. Blood samples were collected in EDTA. RBC antigen typing and antibody identification were performed by standard tube methods 12 . Genomic DNA was isolated by standard methods (QIAamp, QIAGEN, Inc. Valencia, CA) from fresh samples and frozen RBC samples.
Briefly, interaction between antibody and red cells is observed as agglutination and often requires use of a secondary anti-human globulin reagent. Serum or plasma containing the antibody and red cells of known phenotypes are incubated at 37 °C, centrifuged, and examined for agglutination. For the indirect antiglobulin test (IAT), following incubation the cells are washed with phosphate buffered saline to remove unbound immunoglobulins; anti-human globulin is then added, and the cells are centrifuged and examined for agglutination. For in vitro enhancement of the interaction of red cell antigens and antibodies, low ionic strength solution (LISS) or polyethylene glycol (PEG) is added prior to incubation. Treating the test cells with proteolytic enzymes such as papain, ficin, or trypsin can enhance antibody-antigen reactions or, together with dithiothreitol (DTT) treatment, can help identify the specific antibody target based on the known sensitivity or cleavage pattern of the red cell protein.
Proband 1, a 65-year-old group AB, D+ Indian male with heart disease, presented with an antibody that reacted with all panel cells tested but did not react with autologous RBCs. The reactivity in saline at 4 °C and by the IAT with LISS enhancement was 2+, with 3+ reactivity by IgG gel test. The antigen being detected was resistant to treatment with papain, trypsin, α-chymotrypsin, and dithiothreitol (DTT); papain treatment enhanced the reactivity. Testing of the RBCs with antibodies to high prevalence antigens from our collections showed that his RBCs were Emm−. His plasma was nonreactive with three examples of Emm− RBCs from our collections: two from Caucasian U.S. patients 4 and one from the Japanese Red Cross 5 . Testing of the family revealed a compatible younger brother whose RBCs were Emm− and, although he too had never been transfused, his plasma contained anti-Emm (1+ by IgG gel). RBCs of the proband's daughter, son, and wife were Emm+ and incompatible when tested with the proband's plasma.
DNA was extracted from stored RBCs of Proband 2, who was previously reported in abstract form in 2013 5 . This 58-year-old Japanese man with total blindness and renal carcinoma, who had never been previously transfused, was in urgent need of transfusion due to massive bleeding. An antibody reacting 2+ with all cells tested was detected in the plasma, but his clinical condition required transfusion of crossmatch-incompatible blood. He experienced an acute hemolytic transfusion reaction (HTR); his RBCs reacted in the DAT: 1+ on day 1, 2+ on day 3, and negative on day 7, suggesting complete removal of the transfused RBCs from circulation. The antibody was identified as anti-Emm and was shown to have both IgG1 and IgG3 components and to fix complement. The titer of the antibody increased from 16 (saline-IAT) pre-transfusion to 128 by day 10.
DNA was extracted from stored RBCs of Proband 3, who was previously reported by our laboratory in an abstract in 1998 4 . This 70-year-old untransfused male of European ancestry was admitted for transurethral resection of the prostate (TURP) and, like other Emm− probands, presented with an antibody that was reactive with all cells tested but nonreactive with autologous RBCs. The antibody reactivity was strong: 3+ to 4+ by albumin and PEG IAT and 4+ when tested against enzyme treated and DTT treated RBCs.
Proband 4 and her family are of North African origin 6 . In 2012 this 26-year-old female presented at week 25 of her first pregnancy with an antibody against a high frequency antigen resistant to trypsin, ficin, DTT, chymotrypsin and AET. The antibody reacted 2+ in the PEG IAT, and weakly positive in the LISS gel technique, with a negative auto control. The antibody was identified as anti-Emm and her RBCs were Emm−. She had never received a blood transfusion. Her plasma was incompatible with RBCs from her husband, parents, and six of her seven siblings. One sister was found to be compatible and was also Emm−.
Proband 5 and her brother, who are of Palestinian origin and were previously reported to have a loss of function PIGG variant with nonprogressive severe generalized ataxia and tonic clonic seizures with moderately delayed development 11 , were also investigated. Neither child had ever been transfused. Their RBCs typed as Emm−. Anti-Emm was not detected in the plasma.
Sanger sequencing. PIGG Exons 1-13 were amplified (HotStarTaq Master Mix, Qiagen, Hilden, Germany) using primers (Table S1).
Sequencing reads were aligned to the GRCh38/hg38 human reference genome using BWA-MEM v0.7.12-r1039 21 . The resulting SAM file was then processed in a series of steps: samblaster v0.1.24 22 to addMateTags and removeDups, samtools v1.7 23 to convert to BAM and sort the file by genomic coordinates, Picard v2.5.0 13 to AddOrReplaceReadGroups, and then samtools v1.7 23 to create the BAM index file. The Integrative Genomics Viewer (IGV) 15 was used to view the sequence depth of coverage and paired sequence reads.
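The post-alignment steps listed above can be sketched as an ordered list of command lines. This is only an illustration of the step order described in the text, not the authors' actual script: the file names, the `picard.jar` path, and all read-group values (`RGLB`, `RGPL`, `RGPU`) are hypothetical placeholders, and the function only builds the commands without running them.

```python
from typing import List


def build_alignment_commands(ref: str, fq1: str, fq2: str, sample: str) -> List[str]:
    """Return shell command lines for alignment and BAM preparation, in order.

    All file names and read-group values are illustrative placeholders.
    """
    sam = f"{sample}.sam"
    dedup_sam = f"{sample}.dedup.sam"
    sorted_bam = f"{sample}.sorted.bam"
    rg_bam = f"{sample}.rg.bam"
    return [
        # 1. Align paired-end reads to the reference with BWA-MEM
        f"bwa mem {ref} {fq1} {fq2} > {sam}",
        # 2. Add mate tags and remove duplicates with samblaster
        f"samblaster --addMateTags --removeDups -i {sam} -o {dedup_sam}",
        # 3. Convert to BAM and sort by genomic coordinates with samtools
        f"samtools view -b {dedup_sam} | samtools sort -o {sorted_bam} -",
        # 4. Attach read-group information with Picard (placeholder values)
        "java -jar picard.jar AddOrReplaceReadGroups "
        f"I={sorted_bam} O={rg_bam} RGID={sample} RGLB=lib1 "
        f"RGPL=ILLUMINA RGPU=unit1 RGSM={sample}",
        # 5. Index the final BAM so that IGV can load it
        f"samtools index {rg_bam}",
    ]


commands = build_alignment_commands("GRCh38.fa", "reads_1.fq", "reads_2.fq", "proband1")
for line in commands:
    print(line)
```

Keeping the commands as data rather than executing them makes the order of operations easy to review before anything is run on real samples.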

Results
Variant identification. Figure 1A shows the pedigree of the South Asian family of Proband 1 and the RBC phenotypes including his brother, spouse, and children. Figure 1B shows the approach to identify the genetic variants (SNVs and indels) that might be associated with the Emm− phenotype. Variants present in WGS were enriched through a series of filtration criteria that included the prediction of the impact of the genomic change as none, low, moderate, or high using VEP (Fig. 1C). Initial analysis identified 5,149,248 variants in the Emm− proband, with 1,942,938 homozygous, including 504 of predicted high impact. Variants were then filtered for homozygous variants present in both the Emm− proband and his brother, which left 1,427,285 variants with 360 of high impact. Filtering for those heterozygous in his Emm+ children reduced the number of variants to 425,464 with 106 of high impact. The remaining variants were then filtered for variants either not present in the gnomAD variant database or with a frequency < 0.001, which resulted in 1,818 with 1 predicted to have a high impact. The high impact variant was a 2-bp deletion in Exon 12 of the PIGG gene, which results in a frameshift and predicted premature stop codon designated c.2624_2625delTA, p.(Leu875*) in the cDNA (rs771819481) with chromosomal location hg38:chr4:533870_533871delTA. Figure 1D shows the reference sequence and biallelic WGS of the affected region of Exon 12 for each family member. The Emm− brothers were homozygous, the spouse was wild type, and the daughter and son were heterozygous. Primers were designed (Table S1) to amplify and Sanger sequence all PIGG Exons to confirm the WGS results. Figure 2A shows the amplification and Sanger sequence results for Exon 12, which confirmed the absence of a deletion in an Emm+ control and in the spouse, homozygosity for the 2-bp deletion in the Emm− proband and his brother, and heterozygosity in the two Emm+ children.
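The filtering cascade described above can be illustrated with a minimal sketch. The `Variant` record, the genotype labels, and the toy variant list are hypothetical and exist only to show the logic (homozygous in both affected brothers, heterozygous in the Emm+ children, absent from gnomAD or with frequency < 0.001, predicted high impact); it is not the pipeline used in the study.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Variant:
    name: str
    genotypes: Dict[str, str]    # family member -> "hom_alt", "het", or "hom_ref"
    gnomad_af: Optional[float]   # None when the variant is absent from gnomAD
    impact: str                  # "none", "low", "moderate", or "high"


def filter_candidates(variants: List[Variant], affected: List[str],
                      carriers: List[str], max_af: float = 0.001) -> List[Variant]:
    """Keep variants that satisfy every inheritance, frequency, and impact filter."""
    kept = []
    for v in variants:
        if any(v.genotypes.get(m) != "hom_alt" for m in affected):
            continue  # must be homozygous in every Emm− family member
        if any(v.genotypes.get(m) != "het" for m in carriers):
            continue  # must be heterozygous in every Emm+ child
        if v.gnomad_af is not None and v.gnomad_af >= max_af:
            continue  # too common in gnomAD
        if v.impact != "high":
            continue  # keep only predicted high-impact changes
        kept.append(v)
    return kept


def family(proband: str, brother: str, son: str, daughter: str) -> Dict[str, str]:
    return {"proband": proband, "brother": brother, "son": son, "daughter": daughter}


# Toy data: only the first entry mirrors a variant named in the text.
toy = [
    Variant("PIGG c.2624_2625delTA", family("hom_alt", "hom_alt", "het", "het"), 0.000014, "high"),
    Variant("decoy_common", family("hom_alt", "hom_alt", "het", "het"), 0.12, "high"),
    Variant("decoy_brother_het", family("hom_alt", "het", "het", "het"), None, "high"),
    Variant("decoy_low_impact", family("hom_alt", "hom_alt", "het", "het"), None, "low"),
    Variant("decoy_child_ref", family("hom_alt", "hom_alt", "hom_ref", "het"), 0.0002, "high"),
]

result = filter_candidates(toy, affected=["proband", "brother"], carriers=["son", "daughter"])
print([v.name for v in result])
```

Each filter removes a different decoy, which mirrors how the successive criteria in the text narrow millions of variants to a single candidate.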
Figure 2B shows the amplification products for the 13 PIGG Exons in an Emm+ control, and from Proband 2 and […].
Figure 1. […] Also shown are the proband's spouse and two children who are Emm+ and were shown to lack the antibody. (B) Study Outline. Emm serologic RBC typing was performed with in-house reagents and short read whole genome sequencing (WGS). (C) Variant Identification. Illustration of the strategy used to enrich for loss of function mutations associated with familial inheritance and Variant Effect Predictor (VEP) scoring to identify the genetic cause of the Emm− phenotype. PIGG was the only candidate gene that passed our filtering strategy. (D) WGS alignments. The IGV genomics viewer shows the wild type sequence (upper) and PIGG Exon 12 biallelic sequence for each family member (below). Individual sequence reads are shown in gray for positions corresponding to the hg38 reference sequence, with bar plots above (dark grey) reflecting the relative number of reads at that position. Emm− family members were homozygous for a 2-bp deletion in Exon 12, predicted to cause a frameshift and premature stop, designated as c.2624_2625delTA, p.(Leu875*), rs771819481, with chromosomal location hg38:chr4:533870_533871delTA. The spouse was wild type, and the daughter and son are heterozygous.
Long range amplification targeting Exons 6 and 10 in the sample from Proband 3 (Fig. 2D), compared with a control, confirmed the presence of a deletion in PIGG. Targeted NGS of the product identified a 6.3 kb deletion spanning Exons 7, 8, and 9 and located the breakpoint regions in Introns 6 and 9 [g.24216_30561del, p.(Asp372Alafs*18)]. Sanger sequencing confirmed the specific breakpoint sequence (Fig. 2D).
Figure 3A shows the pedigree of the North African family of Proband 4 and the RBC phenotypes of her parents, three brothers, and four sisters. Amplification of PIGG Exons 1-13 from genomic DNA from the proband found all Exons were present (Fig. 3B left).
However, amplification of Exon 3 resulted in a smaller product compared to control in the proband and her Emm− sister (Fig. 3B right) and suggested the presence of a deletion in Exon 3 in PIGG as the cause of the Emm− phenotype in this family. The results identified the parents as potential heterozygote carriers along with three of the seven siblings, and revealed the three wild-type individuals in the family. Products obtained by PCR amplification of Exon 3 were sequenced by both targeted NGS and Sanger and showed the Emm− phenotype was associated with homozygosity for a deletion/insertion (indel) event with deletion of 74 bp consisting of 51 bp of Intron 2 and the first 23 nucleotides of Exon 3, along with insertion of 5 bp, GACTT (Fig. 3C), designated c.361-51_383delinsGACTT, p.(Ala121_Pro128delinsAspPhe). Sanger sequencing confirmed the specific breakpoint sequence.
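The sizes quoted for the two structural changes described in this section can be cross-checked from their HGVS coordinates. The short sketch below does this arithmetic; the helper name `span_length` is ours, and it assumes the usual convention that an HGVS range includes both end positions.

```python
def span_length(start: int, end: int) -> int:
    """Number of positions in an inclusive coordinate range."""
    return end - start + 1


# Proband 3: g.24216_30561del, described in the text as a 6.3 kb deletion
large_deletion_bp = span_length(24216, 30561)

# Proband 4: c.361-51_383delinsGACTT
intronic_bp = 51                       # c.361-51 through c.361-1 (Intron 2)
exonic_bp = span_length(361, 383)      # first nucleotides of Exon 3
indel_deleted_bp = intronic_bp + exonic_bp
inserted_bp = len("GACTT")
net_change_bp = inserted_bp - indel_deleted_bp

print(large_deletion_bp)   # 6346 bp, i.e. about 6.3 kb
print(exonic_bp)           # 23 nucleotides of Exon 3
print(indel_deleted_bp)    # 74 bp deleted in total
print(net_change_bp)       # -69 bp net change after the 5-bp insertion
```

Both results agree with the sizes given in the text (a 6.3 kb deletion, and a 74-bp deletion made up of 51 intronic and 23 exonic bases with a 5-bp insertion).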
RBCs from Proband 5 and her brother, previously reported to have a PIGG loss of function variant [c.1640G > A, p.(Trp547*), rs547951371] identified by WES 11 , were found to be Emm− and Sanger sequencing confirmed the variation.
Rare PIGG mutations and Emm− phenotypes. Figure 4A illustrates the five different PIGG mutations associated with the Emm− phenotypes reported here. The PIGG 2-bp deletion mutation, c.2624_2625delTA, was found in gnomAD v3 with an allele frequency of 0.000014 (2/143,344 alleles; query 4-533,869-TTA-T, Table S2) and was present in two heterozygous South Asians (2/3,052 alleles; 0.0006553 frequency), the same ethnicity as Proband 1, who was homozygous for this mutation. The PIGG stop codon found in Proband 5, c.1640G > A, was found in one heterozygous Non-Finnish European in gnomAD (1/143,298 alleles; 0.000007 frequency). gnomAD SVs v2.1 contains no deletions paralleling those found in the probands here, nor large deletions encompassing other regions of PIGG. Figure 4B (and Table S2 […]
Figure 5 illustrates steps in the complex GPI anchor biosynthesis pathway in mammalian cells. GPI-anchoring is a post-translational modification, with the core GPI assembled on the endoplasmic reticulum (ER) membrane. Building of the GPI-anchor begins with inositol phosphate to which glycans are added, including glucosamine and three mannose residues (Man1, 2, 3), along with the addition of ethanolamine phosphate (EtNP) side chains 7,9 . PIGG and PIGF together encode the GPI-ethanolamine transferase II (GPI-ETII) enzymatic complex, which adds EtNP to Man2, converting the GPI precursor H7 to H8 (Fig. 5) 24 . PIGG loss of function in Emm− individuals would cause loss of addition of EtNP to the second mannose residue (Man2) in the GPI link (shown on H8). The protein is then linked to the GPI-anchor, with remodeling by PGAP1, which removes the acyl chain linked to inositol, and PGAP5, which removes some of the EtNP on Man2. This remodeling is thought to regulate transport to the Golgi in some cell types 11,25 . However, RBCs have been reported to retain the acyl chain that the deacylase PGAP1 removes in other cells 7 .
To gain additional insight into the structure of the potential target of anti-Emm, we searched the literature for studies measuring expression of PIG genes and PGAP genes in erythroid cells. Merryweather-Clarke et al. 26 performed high resolution transcriptome analysis of cultured erythroid cells during maturation. Although not the main topic of that report, the supplemental data confirms no to very low expression of PGAP1 and absence of expression of PGAP5 in erythroid cells. As the function of PGAP5 is to remove EtNP on Man2 in the GPI cellular pathway, this suggests that EtNP on Man2 remains and is robustly expressed on RBCs and is the target of anti-Emm.

Discussion
We show here, using family inheritance analysis and a genome wide search, that PIGG is responsible for the Emm blood group system. The absence of Emm on PNH cells had strongly suggested that Emm was on a GPI-anchored protein 4,8 , but the finding that Emm is a component of the anchor itself is an unexpected discovery. The presence of rare loss of function mutations in five unrelated and ethnically diverse families or individuals with Emm− phenotypes that correlate with the inheritance of unique mutations, paired with the role of this transferase in the biosynthesis of GPI-anchors, supports PIGG as the gene responsible. PIGG encodes ethanolamine phosphate transferase 2, which modifies the second mannose (Man2) of the GPI-anchor by adding ethanolamine phosphate (EtNP) (Fig. 5). Loss of function of PIGG in Emm− individuals would result in absence of EtNP on Man2 on the GPI-link (Fig. 5, H8). The finding that Emm is part of the GPI-anchor supports previous results showing that Emm− individuals have GPI-anchored proteins, yet anti-Emm does not bind to PNH RBCs, which lack both GPI-anchors and their linked proteins.

[Figure 2 panel labels. Gel lanes: Control, Proband 2, No DNA; Control, Proband 1, No DNA, Son, Daughter, Spouse, Brother. Panels: NGS Breakpoint (paired split reads spanning deletion); Breakpoint Sequence (TGGCTAACACAGTGAGATTCTCCTGTCTCA).]
The association of the Emm− phenotype and PIGG loss of function mutations may have important biological relevance. PIGG loss of function mutations have only recently been reported to be responsible for neurologic phenotypes including seizures, developmental delay, intellectual disability, and hypotonia 10,11 ; this is an active area of investigation. The South Asian family members investigated here by WGS, and their physician, indicated there was no history of neurological or related symptoms. Many of the previous reports of individuals with Emm− RBC phenotypes are years, if not decades, old; these individuals were tested here from archived samples, and their medical histories are largely unknown. Only the first proband discovered, who was living in France and born in Madagascar, was reported to be suffering from an unknown neurological disease 3 . The Japanese proband (Proband 2) had a diagnosis of total blindness and renal carcinoma and, as shown here, the previously reported siblings with congenital neurologic symptoms have an Emm− phenotype. It is possible that the other Emm− probands had additional clinical phenotypes, but such information is generally not shared with the Transfusion Service, whose main concern is ensuring transfusion compatibility. The association of this rare RBC phenotype with mutation in PIGG suggests that further medical history and evaluation of the probands and families reported here may shed light on the biological effects of different rare PIGG mutations. It has been suggested that fibroblasts are more sensitive to pathogenic variants in GPI synthesis, as relates specifically to PIGG 11 , and are well suited to screen for GPI anchor deficiencies. Typing of the RBCs for Emm antigen, and/or screening of the plasma for anti-Emm, would offer an accessible and rapid alternative for screening samples for pathogenic defects in GPI synthesis.

The prevalence of the Emm− phenotype in populations is of medical relevance for blood transfusion, as anti-Emm has been shown to be both naturally occurring and to cause hemolytic transfusion reactions. PIGG mutations are very rare, but there appear to be a large number of different PIGG loss of function variations in current genomic databases. The 2-bp deletion mutation (rs771819481, c.2624_2625delTA) found in the South Asian family is in gnomAD v3 with an allele frequency of 0.000014. However, the mutations found in the Japanese proband (deletion of Exons 2 and 3), the European proband (deletion of Exons 7 through 9), and the North African family (indel in Exon 3) have not been previously reported and are not found in gnomAD SVs v2.1. Homozygous loss of function mutations in PIGG are very rare, with only one found in gnomAD v3 [rs150259543, c.1515G > A, p.(Trp505Ter)]. This allele is found in 102/143,304 alleles, with a frequency of 0.000712. Rare heterozygous carriers are found in every ethnic group with the exception of Amish and Ashkenazi Jewish, and with the highest allele number in Non-Finnish Europeans. The allele frequency of PIGG loss of function is 0.00107, which predicts a homozygous Emm− prevalence of 1/878,969 individuals.
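The frequency arithmetic in the paragraph above can be reproduced with a short sketch. It assumes that the predicted prevalence follows from Hardy-Weinberg proportions (homozygote frequency = q²), which is our reading of how such a prediction is usually derived and is not stated explicitly in the text; the function names are ours.

```python
def allele_frequency(allele_count: int, allele_number: int) -> float:
    """Observed allele frequency: variant alleles divided by alleles genotyped."""
    return allele_count / allele_number


def homozygote_prevalence(q: float) -> float:
    """Expected homozygote frequency under Hardy-Weinberg proportions (assumed)."""
    return q * q


af_2bp_deletion = allele_frequency(2, 143_344)   # c.2624_2625delTA in gnomAD v3
af_trp505ter = allele_frequency(102, 143_304)    # rs150259543

q_lof = 0.00107                                  # combined loss-of-function frequency, as quoted
one_in = 1 / homozygote_prevalence(q_lof)

print(f"{af_2bp_deletion:.6f}")   # 0.000014
print(f"{af_trp505ter:.6f}")      # 0.000712
print(f"1 in {one_in:,.0f}")      # roughly 1 in 873,000 with the rounded q
```

With q rounded to 0.00107 the estimate is about 1 in 873,000, close to the reported 1/878,969; the small gap is consistent with the reported figure having been computed from an unrounded allele frequency, although the text does not say so.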
Limitations to this study include the lack of in vitro expression studies and structural studies to determine the precise epitope that composes Emm on the GPI-anchor. Further studies are needed to define the structure of Emm. Nevertheless, the results of these five unrelated families already provide strong evidence linking PIGG with the phenotypic expression of the Emm antigen.
The findings here illustrate the power of using WGS with family cohorts to uncover the genetic basis of blood group systems. The observation that the Emm− RBC phenotype reveals underlying mutations in PIGG provides an approach to find other individuals with Emm null phenotypes, and may offer a potential diagnostic. The ISBT Red Cell Immunogenetics and Blood Group Terminology working party has designated Emm as the 42nd blood group system.