Repeated inversions at the pannier intron drive diversification of intraspecific colour patterns of ladybird beetles

How genetic information is modified to generate phenotypic variation within a species is one of the central questions in evolutionary biology. Here we focus on the striking intraspecific diversity of more than 200 aposematic elytral (forewing) colour patterns of the multicoloured Asian ladybird beetle, Harmonia axyridis, which is regulated by a tightly linked genetic locus h. Our loss-of-function analyses, genetic association studies, de novo genome assemblies, and gene expression data reveal that the GATA transcription factor gene pannier is the major regulatory gene located at the h locus, and suggest that repeated inversions and cis-regulatory modifications at pannier led to the expansion of colour pattern variation in H. axyridis. Moreover, we show that the colour patterning function of pannier is conserved in the seven spotted ladybird beetle, Coccinella septempunctata, suggesting that H. axyridis’ extraordinary intra-specific variation may have arisen from ancient modifications in a conserved elytral colour patterning mechanisms in ladybird beetles.

axyridis' extraordinary intra-specific variation may have arisen from ancient modifications in 1 2 a conserved elytral colour patterning mechanisms in ladybird beetles. 1 3 colour elements (mosaic dominance 10 ). Whether all of the supposed alleles linked to the h 3 6 locus correspond to a single gene or multiple genes is yet unknown. Elucidating the DNA 3 7 structure and the mechanisms underlying the evolution of this tightly linked genetic locus 3 8 that encodes such a strikingly diverse intraspecific colour pattern polymorphism would 3 9 provide a case-study that bears upon a major evolutionary-developmental biology question; 4 0 how does morphology evolve? Here we show that the gene pannier is responsible for 4 1 controlling the major four elytral colour patterns of H. axyridis. Moreover, we illustrate how 4 2 modification to this ancient colour-patterning gene likely contributed to an explosive 4 3 diversification of colour forms. 4 4 4 5 Enriched DNA binding motifs of Drosophila transcription factors involved in wing formation are listed (p < 0.05). a. Allele-or Specie-specifically enriched motifs strain upstream intergenic region upstream region of 1st intron downstream region of 1st intron h C EcR, foxo Myc,en,exd,HLH106,h,ss,pan Mad,Mnt,pad,Poxn,sd,tgo,tai,ab,crc,crol,Dr,rn,sd h A exd,pan,vvl ab,Spps,brk,usp,pnr usp h kni,rn,sqz ato,ab,Ets21C,rn,sqz al,lms,nub,sqz,tup C. sep ato,EcR,h,ss,eg,tai,pnr,tgo E(spl)mβ,ken,brk,tgo,h,tai b. Commonly enriched motifs group upstream intergenic region upstream region of 1st intron downstream region of 1st intron

Figure 1. Intraspecific genetic polymorphisms of elytral colour patterns in
Harmonia Mad, brk, ab, taxi, tgo lms, exd, hth, tgo, tai, Mad al, brk, tx, E(spl) identify specific regulatory inputs to pannier will help clarify the regulatory mechanisms 3 2 5 underlying the generation of highly diverse intraspecific polymorphism at the interspecific 3 2 6 level. Another important issue to clarify whether pannier is indeed the hotspot of 3 2 7 morphological evolution in ladybird beetles is whether pannier is responsible for the 3 2 8 remaining >20 minor colour patterns in H. axyridis.

0
Insects 3 3 1 Laboratory stocks of H. axyridis and C. septempunctata were derived from field 3 3 2 collections in Japan. They were reared at 25°C and usually fed on artificial diet 50 , or fed on 3 3 3 the pea aphid Acyrthosiphon pisum (kindly provided by Dr. T. Miura) for egg collection.

4
Larvae and pupae analysed in this study were not sexed. Pupa elytral discs were dissected in a potassium phosphate buffer (100 mM 3 3 8 KH 2 PO 4 /K 2 PO 4 , 150 mM NaCl, pH 6.3) on ice. PO staining was performed using 0.4 mg/ml 3 3 9 dopamine as a substrate for 2 hours at room temperature as previously described 13  times in 100% methanol and stored in 100% methanol at −20°C until use. After dehydration, 3 5 0 the elytra were embedded in 4% carboxymethyl cellulose (CMC; FINETEC), and were 3 5 1 frozen in hexane cooled with dry ice. The freeze-embedded elytra were stored at −80°C until 3 5 2 use. The 6 μm frozen sections were prepared as described previously 51 using an adhesive film 3 5 3 (Cryofilm Type 1; FINETEC). Sections of the PO activity stained elytra were mounted in 3 5 4 PBS and photographed using IX70. For nuclear and F-actin staining, sections were treated 3 5 5 with 2.5 μg/ml propidium iodide, 1 mg/ml RNase A and 5 U/ml AlexaFluor 488 phalloidin 3 5 6 (Molecular Probes) for 1 hour at 37°C under a dark condition. After washing three times in 3 5 7 PBS, the sections were mounted in an antifade reagent (FluoroGuard TM ; Bio-Rad), and 3 5 8 images were captured with a confocal laser-scanning microscope (LSM 510; Carl Zeiss).

5 9
For localisation of carotenoid, elytra at 96 h AP were embedded and sectioned as 3 6 0 described above. All procedures were rapidly performed to prevent diffusion of carotenoids.

6 1
The sections were dried for 1 min, mounted in PBS and immediately photographed using  (ORF) of pannier were determined through direct sequencing of the PCR products treated 3 7 9 with ExoSAP-IT (Affymetrix). Sequencing was performed by DNA sequencing service 3 8 0 (FASMAC) using the primers listed in Supplementary Gene expression analysis by RT-PCR 3 8 7 For the gene expression analysis in each developmental stage, elytral tissues of 3 8 8 three individuals of H. axyridis (h C ) and C. septempunctata were dissected as described 3 8 9 above. Six elytral tissues from each sampling stage were pooled in one tube. Total RNA 3 9 0 extractions and the subsequent first-strand cDNA syntheses (using 425 ng and 267 ng of total 3 9 1 RNA for H. axyridis and C. septempunctata samples, respectively) were performed as 3 9 2 described above. Three μl of 100 and 62.8 times diluted H. axyridis and C. septempunctata 3 9 3 first-strand cDNA was used as a template for PCR, respectively. The PCR cycle numbers are 3 9 4 35 for all genes. A set of primers #1 and #2 for each gene was used for this analysis 3 9 5 (Supplementary Table 7). 3 9 6 For the gene expression analysis in the future red and black regions, red and black 3 9 7 regions of pharate adult elytra at 84 h AP were bored with a needle. Internal diameters of 0.7 3 9 8 and 0.6 mm were used for H. axyridis and C. septempunctata, respectively. In the case of C. 3 9 9 septempunctata, elytra stained with PO activity were used for boring since carotenoid 4 0 0 localisation was not observable unlike H. axyridis. cDNA synthesis was performed as 4 0 1 described above, using as much total RNA as we could extract. 20 μl of 10 times diluted 4 0 2 and Cs-pnr were injected into two-day-old forth (final) instar larvae, respectively. 4 1 5 Approximately 2.0 to 2.7 μg and 1.4 to 2.7 μg of the EGFP dsRNA were injected into H. 4 1 6 axyridis and C. septempunctata larvae as negative controls, respectively. Different amount of 4 1 7 dsRNA for each gene in this range gave no difference in phenotypic effects (data not shown). 4 1 8 In order to give enough time for the completion of pigmentation, images of adults were 4 1 9 captured at more than 2 days after eclosion using a digital microscope (VHX-900, Keyence). fine forceps after fixation. To reduce non-specific probe hybridisation, fixed, separated and 4 2 7 detergent-permeabilised elytra epidermal samples were stored in 100% methanol for more 4 2 8 than 12 hours at −30°C, and prehybridisation treatment was extended to two overnights. The 4 2 9 ventral epidermis samples were not used for analysis because of high non-specific 4 3 0 background signals. pannier antisense probes were designed at 5' and 3' regions of ORF pair, longer and shorter scaffold were called "primary-bubble" and "secondary-bubble", 4 6 9 respectively. Primary-bubbles, secondary-bubbles and non-bubble scaffolds are collectively 4 7 0 called "phased-scaffolds". 4 7 1 In addition, Platanus2 can connect primary-bubbles and non-bubble scaffolds to construct 4 7 2 long "consensus scaffolds", which consists of mosaic structure of haplotypes (i.e. paternal 4 7 3 and maternal haplotypes are mixed). Employing the strategy of Platanus2, certain highly 4 7 4 heterozygous regions were expected to be assembled contiguously compared to Platanus. 4 7 5 Using the markers of the responsible region of wing pattern (pannier-locus), we found that 4 7 6 two long bubbles and one short non-bubble scaffold corresponded to the locus. Consequently, 4 7 7 one consensus scaffold was constructed from these phased scaffolds, which covering the 4 7 8 breakpoint markers at the pannier locus. We used that consensus scaffold (3.13 Mb) for the 4 7 9 downstream in silico sequence analyses. We assessed the completeness of the genome 4 8 0 Gap filling of the genomic scaffolds at the pannier locus 5 3 9 Concerning the genome assemblies of H. axyridis, we used minimap2 64 (ver. 2.9) 5 4 0 and PBjelly 65 (ver. PBSuite_15.8.24) software to fill gaps around the pannier locus. In each 5 4 1 genome of three strains of H. axyridis, we first mapped PacBio reads to the assembly 5 4 2 generated from the 10x linked-reads using minimap2. Then, we chose PacBio reads mapped 5 4 3 to the scaffold containing pannier gene. These PacBio reads were subjected to gap-filling of 5 4 4 the scaffold with PBjelly. We obtained gap-free nucleotide sequences spanning the entire 5 4 5 pannier locus and the upstream intergenic regions.

4 6
Concerning the genome assembly of C. septempunctata, there was a single gap 5 4 7 estimated to be 15kb long by Supernova programme in the 1st intron of pannier locus. We For the H. axyridis F2-3 sample, trimmed reads of the 15 kbp-mate-pair library were mapped 5 5 2 to the consensus scaffold set of Platanus2 using BWA-MEM 66 (version 0.7.12-r1039) with 5 5 3 default parameters. Next, a consensus scaffold corresponding to pannier-locus was 5 5 4 6 1 search as a query. Exonerate output files were converted to the GFF3 format using our 5 9 6 bug-fixed version of the process_exonerate_gff3.pl 75 with the option '-t EST'. The GFF3 file 5 9 7 and a FASTA format file of each scaffold were converted to a GENBANK format file using 5 9 8 EMBOSS Seqret 76 programme (ver. 6.6.0.0) with the options '-fformat gff -osformat genbank 5 9 9 '. The GENBANK format files corresponding to the 700 kb genomic sequences surrounding 6 0 0 pannier, which were used as input files of Easyfig, were extracted using the 6 0 1 Genbank_slice.py 77 Python script (ver. 1.1.0). 6 0 2 6 0 3 Flexible ddRAD-seq 6 0 4 We newly constructed a flexible ddRAD-seq library preparation protocol to 6 0 5 facilitate high-throughput ddRAD-seq analyses at low cost. We designed all enzymatic 6 0 6 reactions to be completed sequentially without DNA purification in each step to make the 6 0 7 procedures simple. In addition, we designed 96 sets of indexed and forked sequencing 6 0 8 adaptors compatible with Illumina platform sequencers (Supplementary Table 10). 6 0 9 Briefly, 100 ng of genomic DNA was first double-digested with 15 U of EcoRI-HF 6 1 0   T  C  T  T  G  T  C  T  T  G  T  T  T  A  T  G  T  C  G  T