(a) Histogram showing the proportion of known and novel transcripts identified across lens developmental stages in mouse. Only transcripts with expression higher than 1 TPM (Transcripts Per Million reads sequenced) are included in this plot; the proportions of known versus novel transcripts remained stable irrespective of the expression threshold applied (Figure S1). (b) Violin plots showing the distributions of novelty scores of identified transcripts expressed in embryonic and postnatal stages. A violin plot combines a boxplot with a kernel density estimate to show the distribution of a data vector. Novelty scores of transcripts expressed (TPM > 5.0) in at least one stage were used to generate two violin plots, corresponding to the embryonic (E15, E15.5, E18) and postnatal (P0, P3, P6, P9) stages, respectively. The distributions of novelty scores in embryonic and postnatal stages were compared using the Kolmogorov–Smirnov test. Median novelty scores for embryonic (E) and postnatal (P) stages were 10.89 and 9.043, respectively. (c) Distribution of phastCons scores, reflecting the extent of conservation, for known, partially novel (novelty score < 70%) and completely novel (novelty score ≥ 70%) transcripts identified across developmental stages in lens. The phastCons score (PS) provides nucleotide-level conservation of mouse genomic loci across 46 vertebrate genomes. Each pair of these transcript classes differed significantly in extent of conservation (p < 2.2e-16, Wilcoxon rank sum test), with median conservation scores of 0.67, 0.76, and 0.13 for the known, partially novel and completely novel transcript groups, respectively. (d) Gene ontology (GO) enrichment-based functional grouping using annotations for genes corresponding to the high-confidence partially novel transcripts (PS > 0.76). 
Functional grouping of the GO terms based on the GO hierarchy is represented as a clustered GO network generated with the Cytoscape [67] ClueGO [31] plugin. Genes (color coded by the functional annotation group to which they belong) are shown clustered significantly (p < 1e-10) by enriched GO biological processes from the ClueGO analysis, with node size indicating the significance of the association of genes with each GO term. Only selected biological processes and associated networks are shown in this panel; Figure S2 shows the complete set of functional groups identified in this analysis.
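The selection and comparison steps described for panels (b) and (c) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The record layout (`novelty`, `tpm`), the function names, and the `is_known` flag are hypothetical. The sketch also assumes that "expressed in at least one stage" is evaluated within each stage group, which the caption does not state explicitly. Only the two-sample Kolmogorov–Smirnov statistic D is computed, not its p-value.

```python
from bisect import bisect_right

# Stage groups as named in the caption.
EMBRYONIC = ("E15", "E15.5", "E18")
POSTNATAL = ("P0", "P3", "P6", "P9")


def novelty_scores_for(transcripts, stages, min_tpm=5.0):
    """Novelty scores of transcripts with TPM > min_tpm in at least one of `stages`.

    Assumption: each transcript is a dict with a 'novelty' score (percent)
    and a 'tpm' dict mapping stage name to expression (hypothetical layout).
    """
    return [
        t["novelty"]
        for t in transcripts
        if any(t["tpm"].get(stage, 0.0) > min_tpm for stage in stages)
    ]


def classify(novelty, is_known):
    """Assign a transcript class using the 70% novelty cut-off from panel (c).

    Assumption: whether a transcript is 'known' is supplied by the caller,
    because the caption does not define it in terms of the novelty score.
    """
    if is_known:
        return "known"
    return "partially novel" if novelty < 70.0 else "completely novel"


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D (maximum gap between ECDFs)."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The largest gap between two empirical CDFs occurs at an observed value.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Toy records for illustration only; these are not data from the study.
toy = [
    {"novelty": 12.0, "tpm": {"E15": 6.0, "P0": 1.0}},
    {"novelty": 80.0, "tpm": {"E18": 2.0, "P3": 9.5}},
]
embryonic_scores = novelty_scores_for(toy, EMBRYONIC)
postnatal_scores = novelty_scores_for(toy, POSTNATAL)
```

In practice a library routine such as `scipy.stats.ks_2samp` would return both D and a p-value, and `scipy.stats.mannwhitneyu` covers the Wilcoxon rank sum comparison used in panel (c).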