Phylogenomic analysis sheds light on the evolutionary pathways towards acoustic communication in Orthoptera

Acoustic communication is enabled by the evolution of specialised hearing and sound-producing organs. In this study, we performed a large-scale macroevolutionary analysis to understand how both hearing and sound production evolved and affected diversification in the insect order Orthoptera, which includes many familiar singing insects, such as crickets, katydids, and grasshoppers. Using phylogenomic data, we firmly establish phylogenetic relationships among the major lineages and divergence time estimates within Orthoptera, as well as the lineage-specific and dynamic patterns of evolution for hearing and sound-producing organs. In the suborder Ensifera, we infer that forewing-based stridulation and tibial tympanal ears co-evolved, but in the suborder Caelifera, abdominal tympanal ears first evolved in a non-sexual context and were later co-opted for sexual signalling when sound-producing organs evolved. However, we find little evidence that the evolution of hearing and sound-producing organs increased diversification rates in those lineages with known acoustic communication.


Data
A data availability statement has been included in the manuscript. Supplementary Information is available for this paper. All data used for this study have been

This study is a molecular phylogenetic analysis based on transcriptome and mitochondrial genome data generated from RNA-grade and DNA-grade specimens. The taxonomic focus of this study is the insect order Orthoptera, which includes familiar insects such as grasshoppers, katydids, and crickets. This study is an output from the 1000 Insect Transcriptome Evolution (1KITE) consortium.
The samples used for this study are 249 insect specimens (10 polyneopteran outgroups and 239 orthopteran ingroups), chosen to study the phylogenetic relationships within the order Orthoptera. The samples were selected on the basis of phylogenetic diversity to represent all major lineages (all 16 superfamilies and 36 families of extant Orthoptera). We included 60 transcriptomes, of which 39 (all orthopteran) were newly generated either by the 1K Insect Transcriptome Evolution (1KITE) consortium or by the Song Lab at Texas A&M University. The remaining 21 transcriptomes (11 orthopteran and 10 polyneopteran) came from previous publications (see Supplementary Methods 1.1). To increase taxon sampling, we then combined the transcriptome data with mitochondrial genomes (mtgenomes) from 249 taxa, of which 169 were previously published and 80 were newly generated. The taxon sampling information, with accession numbers, is presented in Supplementary Data 1 and 3.
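The sampling figures above can be cross-checked with a few lines of arithmetic. The sketch below only restates the counts given in the text; the variable names are illustrative, and the count of mtgenome-only taxa rests on the assumption (labelled in the code) that every taxon with a transcriptome also has an mtgenome.

```python
# Taxon-sampling counts as stated in the text.
outgroups = 10                     # polyneopteran outgroups
ingroups = 239                     # orthopteran ingroups
total_taxa = outgroups + ingroups

new_transcriptomes = 39            # newly generated, all orthopteran
published_orthopteran = 11         # transcriptomes from previous publications
published_polyneopteran = 10       # transcriptomes from previous publications
total_transcriptomes = (
    new_transcriptomes + published_orthopteran + published_polyneopteran
)

published_mtgenomes = 169
new_mtgenomes = 80
total_mtgenomes = published_mtgenomes + new_mtgenomes

# ASSUMPTION (not stated explicitly in the text): every backbone taxon with a
# transcriptome also has an mtgenome, so the remaining taxa are mtgenome-only.
mtgenome_only_taxa = total_mtgenomes - total_transcriptomes

print("total taxa:", total_taxa)
print("transcriptomes:", total_transcriptomes)
print("mtgenomes:", total_mtgenomes)
print("mtgenome-only taxa (under the assumption above):", mtgenome_only_taxa)
```

Under these figures the totals agree with one another: the specimen count, the number of taxa with mtgenomes, and the sum of published and new mtgenomes all come to 249.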
The samples were selected based on phylogenetic diversity to represent all major lineages, as well as on the availability of RNA-grade and DNA-grade tissue samples. Our sampling strategy, which combines (i) backbone taxa with both transcriptome and mitochondrial genome data and (ii) additional taxa with mitochondrial genome data only, has not previously been attempted in phylogenomic studies of insects. A conceptually similar approach of combining data-rich backbone taxa with additional taxa with less data had been applied previously in

The sequence data were collected between 2016 and 2018. This time frame coincides with the active data generation period of the 1KITE project and of the ongoing project in the Song Lab.
We removed potential sequence contamination before assembling the transcriptomes and mitochondrial genomes. Once the phylogenetic matrix was constructed, no data were excluded.
We have deposited all data used in the study in Dryad and provided detailed methods in the Supplementary Information; we believe that the study is highly reproducible.
This is a phylogenetic analysis, which is based on taxon sampling and character sampling and applies specific models of nucleotide substitution to infer likelihood. The methods used in phylogenetics are fundamentally different from a standard experimental design that requires control and treatment groups with a randomized design.
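To illustrate what inferring likelihood under a nucleotide substitution model means, the sketch below uses the Jukes-Cantor (JC69) model, the simplest such model. It is chosen here purely for illustration and is not the model used in this study; the function names and the toy sequences are hypothetical.

```python
import math


def jc69_log_likelihood(seq_a, seq_b, d):
    """Log-likelihood of two aligned sequences separated by distance d
    (expected substitutions per site) under the JC69 model."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay   # probability a site keeps the same base
    p_diff = 0.25 - 0.25 * decay   # probability of one specific other base
    log_lik = 0.0
    for a, b in zip(seq_a, seq_b):
        p = p_same if a == b else p_diff
        # 0.25 is the equilibrium frequency of the base at the first sequence.
        log_lik += math.log(0.25 * p)
    return log_lik


def jc69_distance(seq_a, seq_b):
    """Closed-form maximum-likelihood distance under JC69."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    n_diff = sum(a != b for a, b in zip(seq_a, seq_b))
    p = n_diff / len(seq_a)        # observed proportion of differing sites
    if p >= 0.75:
        raise ValueError("sequences too divergent for a finite JC69 distance")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy alignment (hypothetical data): one difference in ten sites.
seq_a = "ACGTACGTAC"
seq_b = "ACGTACGTTC"
d_hat = jc69_distance(seq_a, seq_b)
print("ML distance:", round(d_hat, 4))
print("log-likelihood at ML distance:", jc69_log_likelihood(seq_a, seq_b, d_hat))
```

The likelihood is a function of the data and the model parameters alone, which is why the analysis has no treatment or control groups to randomize.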
The types of questions and analyses pursued in this study do not require a blinded experimental design, because there are no participants who could be influenced by treatments. Blinding is not applicable to phylogenetic analyses.