gga-miRNOME, a microRNA-sequencing dataset from chick embryonic tissues

Duarte, Isabel; Carraco, Gil; de Azevedo, Nayara T. D.; Benes, Vladimir; Andrade, Raquel P.

doi:10.1038/s41597-022-01126-7

Data Descriptor
Open access
Published: 31 January 2022

Scientific Data volume 9, Article number: 29 (2022)

Abstract

MicroRNAs (miRNAs) are small non-coding RNA molecules, with sizes ranging from 18 to 25 nucleotides, which are key players in gene expression regulation. These molecules play an important role in fine-tuning early vertebrate embryo development. However, publicly available miRNA datasets from non-mammalian embryos are scarce, including for the chicken (Gallus gallus), a classical model system for studying vertebrate embryogenesis. Here, we performed microRNA-sequencing to characterize the early stages of trunk and limb development in the chick embryo. For this, we profiled three chick embryonic tissues, namely, Undetermined Presomitic Mesoderm (PSM_U), Determined Presomitic Mesoderm (PSM_D) and Forelimb Distal Cyclic Domain (DCD). We identified 926 known miRNAs and 1,141 novel candidate miRNAs, which nearly doubles the number of Gallus gallus entries in the miRBase database. These data will greatly benefit the avian research community, particularly by highlighting new miRNAs potentially involved in the regulation of early vertebrate embryo development that can be prioritized for further experimental testing.

Measurement(s)	miRNA
Technology Type(s)	microRNA profiling assay
Factor Type(s)	embryonic tissue
Sample Characteristic - Organism	Gallus gallus

Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.17173127

Background & Summary

MicroRNAs (miRNAs) are small, single-stranded RNAs with sizes ranging from 18 to 25 nucleotides that are involved in gene expression regulation. This is achieved via post-transcriptional silencing of complementary messenger RNA (mRNA) targets by repression of translation and/or mRNA degradation¹. miRNAs were initially called small temporal RNAs (stRNAs), since they were first described as essential for proper developmental stage transition in the C. elegans life cycle². Today they are recognized to act as gatekeepers of developmental time in many other systems, by mediating cell proliferation-to-differentiation transitions³.

The canonical pathway of miRNA biogenesis starts with the transcription of a primary miRNA (pri-miRNA) by RNA polymerase II. The pri-miRNA forms a hairpin that is recognized by DGCR8, which recruits a Class 2 ribonuclease III enzyme, Drosha. This enzyme cleaves the RNA, releasing the hairpin, called precursor miRNA (pre-miRNA), which is then exported to the cytoplasm via the Exportin-5 transporter. There, it is recognized by a second Class 2 ribonuclease III enzyme, Dicer, which cleaves the loop from the hairpin, releasing a small double-stranded RNA. One of the strands binds to an Argonaute protein from the RNA-induced silencing complex (RISC), while the other is degraded. At this point, the mature miRNA selectively recognizes and binds to the 3′ untranslated region (3′UTR) of its target mRNA through a short seed region (nucleotides 2–7 of the miRNA), leading to RISC-mediated mRNA degradation and/or translational repression¹.
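To make the seed-pairing step concrete, the following minimal Python sketch locates perfect seed matches in a 3′UTR. It is an illustration only, not a target-prediction method used in this study: the function names and the toy sequences are assumptions, and real target prediction also weighs site context and conservation.

```python
def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def seed_sites(mirna, utr):
    """Return 0-based start positions in `utr` that pair perfectly with
    nucleotides 2-7 (1-based) of `mirna`."""
    seed = mirna[1:7]                  # nucleotides 2-7 of the mature miRNA
    site = reverse_complement(seed)    # sequence expected on the mRNA
    width = len(site)
    return [i for i in range(len(utr) - width + 1) if utr[i:i + width] == site]


# Toy example (sequences chosen for illustration only)
toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "AAACUACCUCAGGG"
print(seed_sites(toy_mirna, toy_utr))
```

A perfect 6-nucleotide match is the simplest possible criterion; it will report many sites that are not functional and miss non-canonical ones.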

An essential step in addressing miRNA-mediated regulation of gene expression is to identify and quantify the miRNAs present in the biological system of interest. High throughput miRNA profiling studies have identified thousands of miRNAs in human and mouse samples⁴. However, this effort has been lagging behind in other model organisms, hindering the elucidation of their role in these systems. This is the case for the chicken (Gallus gallus) embryo, a well-established model for studying human embryogenesis due to its extraordinary molecular and morphological similarities in the early stages of development, alongside the ease of experimental manipulation it offers⁵. It was in the chicken embryo that the molecular embryonic clock (EC) underlying the periodic formation of vertebrae precursors was first described⁶. EC genes present cyclic expression maintained by negative feedback regulation in the posterior undetermined presomitic mesoderm (PSM), which gradually slows in the anterior PSM and halts in the segmented somites⁷,⁸. The periodicity of gene expression oscillations in the PSM is species-specific but can also differ between tissues of the same organism. For example, hairy2 gene expression oscillates with a periodicity of 90 min in the chick PSM and 6 h in the distal cyclic domain (DCD) of the developing forelimb bud⁹.

mRNA instability is essential for EC cycles of expression and there is evidence of a miRNA-dependent regulation of EC gene oscillations¹⁰. Namely, miR-125a-5p is required for cyclic LFNG expression in the chick PSM¹¹ and miR-9 drives Hes1 oscillations in mouse neural progenitor cells¹². Additionally, we previously showed that the genes encoding the enzymatic machinery for miRNA biogenesis are expressed in both the chick PSM and forelimb bud¹³, tissues where the EC is oscillating.

A thorough characterization of the role of small RNAs in chick embryo development and in the regulation of the EC has been hampered by the scarcity of miRNA expression datasets in embryonic tissues of this model system. To overcome this limitation, we performed a miRNA profiling analysis (miRNA-Seq) of three different tissues of the developing chick embryo (Fig. 1a,b). Namely, two regions of the PSM - Undetermined Presomitic Mesoderm (PSM_U) and Determined Presomitic Mesoderm (PSM_D) - and the Forelimb Distal Cyclic Domain (Limb). We report the identification of 926 known miRNAs, and 1,141 candidate novel miRNAs, not previously described in chicken. Accordingly, we believe that this will be an invaluable data resource for the research community studying miRNA-mediated gene expression in early vertebrate development, particularly in the chick embryo.

Methods

Embryos

Fertilized Gallus gallus eggs (Pintobar, Portugal) were incubated at 38 °C in a humidified atmosphere for two or four days to obtain embryos in stages HH12–13 and HH20–22¹⁴, respectively.

Sample collection

Presomitic mesoderm (PSM) tissues were isolated from embryos in stages HH12–13. To obtain these samples, embryos were collected from 48h-incubated eggs, placed in a petri dish containing phosphate-buffered saline (PBS) solution and staged according to Hamburger and Hamilton¹⁴. Only the embryos in stages HH12–13 were selected for further use. The embryos were then placed ventral side up in PBS and 4 μL of Pancreatin (25 mg/mL) (Sigma #8049-47-6) was added to the surface of the embryo. After 3 to 5 minutes, pancreatin was inactivated with goat serum (Gibco #16210-072). The mesoderm located on either side of the neural tube was isolated from all surrounding tissues and divided into determined PSM (PSM_D, upper one-third portion) and undetermined PSM (PSM_U, caudal two-thirds) (Fig. 1a,b). Due to the extraordinarily small size of these tissues, 20 pairs of PSM portions were pooled together for RNA extraction from each biological sample. The samples were snap frozen in liquid nitrogen and stored at -80 °C.

Distal Cyclic Domain (DCD Limb) tissues were isolated from embryos at stages HH20–22 (Fig. 1a,b). Embryos were collected from fertilized eggs incubated for four days, placed in PBS and staged according to Hamburger and Hamilton¹⁴. Only the embryos in stages HH20–22 were selected for further use. The limb tissue (distal medial portion of the forelimb bud) was manually dissected using forceps. 20 DCD Limb pairs were pooled together for each sample, snap frozen in liquid nitrogen and stored at -80 °C.

RNA extraction

Biological samples were defrosted on ice. Total RNA was extracted using TRIzol Reagent (Invitrogen #15596-018) according to the manufacturer’s instructions with slight adaptations, namely, the aqueous phase from the first step of extraction was washed once with Phenol:Chloroform (Sigma #P2069) and then with Chloroform:Isoamyl Alcohol (24:1). The aqueous phase was recovered using Phase lock Gel Heavy (5Prime #2302830) and RNA was precipitated by addition of 1/10 volume of 3 M sodium acetate, 2.5 volumes of 100% ethanol and 3 μL per mL of Linear Acrylamide (Ambion #AM9520). After one hour at -80 °C, the RNA was precipitated by centrifugation at 14,000 rpm for 30 minutes at 4 °C. The pellet was washed with 70% ethanol and centrifuged for 15 minutes at 4 °C, briefly air-dried and resuspended in 50 µL of MilliQ (Merck Millipore) purified water. The samples were quantified using NanoDrop 2000 (Thermo Scientific) and stored at -80 °C.

RNA quality control

A first-round of quality control was performed by Reverse Transcription-PCR. 100 ng of RNA was reverse transcribed using iScript™ cDNA Synthesis Kit (BioRad #1708890). Subsequent PCR for GAPDH was done using DreamTaq DNA Polymerase (Thermo Scientific™ #EP0701). In a second instance, RNA quality control was performed using Experion™ RNA StdSens Analysis Kit (BioRad #700-7103) (Table 1). Only samples with an RQI (RNA Quality Indicator) equal to or above 8.5 were sent for sequencing.

Table 1 Total RNA quality control.

Library preparation and miRNA-sequencing

The sequencing libraries were prepared using the NEBNext Multiplex Small RNA Library Prep Set for Illumina (NEB #E7300S/L Version 5.0), starting with 150 ng of total RNA as input. As a first step in the protocol, adaptors ligate directly to the small RNA fragments containing 5′ phosphate and 3′ OH, followed by cDNA generation and PCR amplification. 15 cycles of amplification were performed using specific SR primers for Illumina and index primer of choice for each sample (according to NEB #E7300S/L Version 5.0 protocol).

Size distribution of the final library was assessed on a Bioanalyzer (Agilent Technologies) with a DNA High Sensitivity kit (Agilent Technologies #5067-4626), and concentration was measured with the Qubit® DNA High Sensitivity kit (Life Technologies #Q32854) in a Qubit® 2.0 Fluorometer (Life Technologies). Individual libraries that passed the QC step were pooled equimolarly in a 9-plex, and the final pool was purified with SPRI select beads at a 1.3x bead ratio (Beckman Coulter #B23319). The pool was loaded onto a single lane of an Illumina HiSeq 2000 sequencing instrument (Illumina Inc.) at 6 pM concentration and was sequenced in 50 bp single-read mode¹⁵. Library construction and sequencing were performed at EMBL’s GeneCore facility in Heidelberg, Germany.

Quality control of sequencing reads

Sequencing reads were first evaluated using FastQC (version 0.11.5)¹⁶ to verify the overall read quality of each sample. One library (PSM_U2), from undetermined PSM, did not pass the quality control step, mainly due to its small library size, and was therefore removed from further analyses (Fig. 2a).

Quantification and normalization of annotated miRNAs

For the remaining 8 samples that passed the quality control, annotated microRNA read counts were obtained using the Chimira software (version 1.5)¹⁷. Briefly, the pipeline implemented in Chimira for miRNA-seq analysis comprises the following steps: firstly, the sequences are cleaned, trimmed, and size selected to remove adapters and low quality microRNA reads. Next, the reads passing the previous filters are mapped to Gallus gallus hairpin sequences present in miRBase (release 22)⁴ using BLASTn¹⁸ allowing up to two mismatches. Finally, a count-based miRNA expression dataset is generated¹⁹ and normalized across all samples using DESeq2²⁰. Further data validation, visualization, and statistical analyses were conducted using the normalized log2 expression data.
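As an illustration of the normalization step, the sketch below implements in plain Python the median-of-ratios size-factor estimate that DESeq2 applies by default. It is not the Chimira or DESeq2 code: the function names, the toy counts, and the log2(x + 1) pseudocount are assumptions made for this example.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping sample name -> list of raw counts, with miRNAs in
    the same order for every sample.
    """
    samples = list(counts)
    n_mirnas = len(counts[samples[0]])
    # Log geometric mean per miRNA; miRNAs with a zero in any sample are skipped.
    log_geo_means = []
    for i in range(n_mirnas):
        row = [counts[s][i] for s in samples]
        if all(c > 0 for c in row):
            log_geo_means.append(sum(math.log(c) for c in row) / len(row))
        else:
            log_geo_means.append(None)
    factors = {}
    for s in samples:
        log_ratios = [math.log(counts[s][i]) - g
                      for i, g in enumerate(log_geo_means) if g is not None]
        factors[s] = math.exp(median(log_ratios))
    return factors


def log_normalized(counts):
    """Divide counts by the sample size factor, then apply log2(x + 1)."""
    factors = size_factors(counts)
    return {s: [math.log2(c / factors[s] + 1) for c in values]
            for s, values in counts.items()}


# Toy counts: the second sample was sequenced twice as deeply as the first
toy = {"sample_1": [10, 20, 30, 0], "sample_2": [20, 40, 60, 5]}
print(size_factors(toy))
```

In the toy example the two size factors differ by a factor of two, so the first three miRNAs end up with identical normalized values in both samples.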

Detection, quantification and normalization of novel miRNAs

Detection of novel miRNAs was performed using the Mirnovo tool (v1.0)²¹, which is a machine learning algorithm that predicts novel miRNAs by analysing structural features of miRNA precursor hairpin sequences gathered directly from small RNA-Sequencing data. Briefly, the Mirnovo pipeline entails the following steps: (i) adapter removal followed by sequence de-duplication; (ii) the tallied sequences then enter a series of clustering steps, followed by cluster refinement to obtain consensus sequences; (iii) the prediction step identifies known and novel miRNAs; and (iv) the final step aligns the consensus sequences from all miRNAs (known and novel) to the reference genome. This was done by selecting the most stable hairpins (scored by Delta G free energy) found in a 90-nucleotide window around the consensus sequences, followed by genomic feature calculation²¹.

The specific parameters used for Mirnovo were: Gallus gallus input species, using the Universal prediction model (since there are no models specifically trained for chicken), length filter between 16 and 28 nucleotides, minimum read depth of 5, minimum variants 1, and initial clustering using an alignment identity threshold of 0.9 (vsearch-id parameter).
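The effect of the length and read-depth parameters can be sketched in a few lines of Python. This is only an approximation of the idea, under the assumption that both filters act on de-duplicated (tallied) sequences; the function name and toy reads are hypothetical, and Mirnovo's actual implementation may apply these thresholds at a different stage.

```python
from collections import Counter


def tally_and_filter(reads, min_len=16, max_len=28, min_depth=5):
    """De-duplicate reads, then keep sequences within the length window
    that were observed at least `min_depth` times."""
    tallies = Counter(reads)
    return {seq: n for seq, n in tallies.items()
            if min_len <= len(seq) <= max_len and n >= min_depth}


# Toy reads: one well-supported 20-mer, one rare 20-mer, one short fragment
toy_reads = (["ACGUACGUACGUACGUACGU"] * 6
             + ["UUGGCCAAUUGGCCAAUUGG"] * 2
             + ["ACGUACGU"] * 10)
print(tally_and_filter(toy_reads))
```

Only the first sequence survives: the second fails the depth threshold and the third fails the length filter.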

The candidate novel miRNAs were then quantified and normalized using Chimira¹⁷ with the Mirnovo extension. The analysis was performed as described above for the annotated miRNAs, with the difference that the custom hairpin FASTA files output from Mirnovo for each sample were uploaded together with the corresponding raw FASTQ files.

Data resulting from this identification and quantification (i.e. hairpin sequences, genomic location, and normalized counts) is freely available²².

microRNA expression profiling

Using customized R scripts (R version 3.6)²³, we conducted quality control analysis, and briefly inspected the profile of annotated and novel microRNA expression in each embryo tissue. For this we used R packages for data visualization, namely, Tidyverse²⁴, UpSetR²⁵, Patchwork²⁶, and plot3D²⁷.

Data Records

All sequencing data has been deposited in the ArrayExpress data repository²⁸ with accession number E-MTAB-8176¹⁵. This dataset consists of 8 microRNA expression raw data files in fastq format. Detailed experimental procedures and data analysis are also available there.

Processed data (in tabular text format) containing the log2 normalized counts of the sequencing reads for annotated miRNAs has been deposited in Figshare¹⁹. Similarly, the list of predicted novel miRNAs, with sequence, and log2 normalized counts is available in Figshare²².

All the sequencing data and the normalized miRNA expression counts are open. The R code used for the exploratory data analysis and visualizations is also freely available for consultation in Figshare²⁹.

Technical Validation

Quality control of microRNA-Seq data

The quality control of the raw sequencing reads was performed using FastQC¹⁶ to assess overall read quality and flag potentially poor-quality samples. All samples except one passed the FastQC quality checks. The poor-quality sample presented a variable PHRED score distribution across the read length (Fig. 2a), as well as a very low total number of reads (244,296 reads compared to an average of 3 million reads in the other samples), hinting that the sequencing step was faulty, possibly due to sample degradation prior to or during library preparation. Accordingly, this sample was removed from further data analyses.
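For readers unfamiliar with PHRED scores, the following Python sketch computes the mean quality at each read position from FASTQ quality strings. It is a simplified stand-in for the per-base quality view of FastQC, not its implementation; the function name is hypothetical and the Phred+33 encoding is an assumption that should be checked against the actual files.

```python
def mean_quality_per_position(quality_strings, offset=33):
    """Mean PHRED score at each read position.

    quality_strings: iterable of FASTQ quality lines (one per read).
    offset: ASCII offset of the encoding (33 assumed here).
    """
    totals, counts = [], []
    for line in quality_strings:
        for i, char in enumerate(line):
            if i == len(totals):      # first read reaching this position
                totals.append(0)
                counts.append(0)
            totals[i] += ord(char) - offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


# Toy quality lines: 'I' encodes Q40 and '5' encodes Q20 under Phred+33
print(mean_quality_per_position(["II5", "I5"]))  # [40.0, 30.0, 20.0]
```

A PHRED score Q corresponds to an estimated base-call error probability of 10^(-Q/10), so Q20 means roughly one error in 100 calls.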

Chimira, the software used to quantify the miRNA expression, also performs quality control for the samples, namely read length distribution after trimming, nucleotide distribution per position, and GC content ratio at each position. These were all manually inspected (to ensure that biases and outlier sequences were not present) before accepting the output miRNA quantification values.

Validation of experimental design strategy

Since each tissue sample included pools of 20 embryos, expression variation was expected between the biological replicates for the same tissue. Accordingly, to validate our experimental design and check for sample coherence between replicates, we evaluated sample variance via a Principal Component Analysis (PCA).

The first and second components (17.5% and 12.6% explained variance, respectively) can only distinguish between the Limb and the PSM, but by adding the third component (11.2%) the distinction between determined and undetermined PSM becomes apparent (Fig. 2b). This shows, as expected, that the differences between Limb and PSM, two distinct tissues, are more extensive than the differences between determined and undetermined PSM, two molecular states of the same tissue. Albeit more subtle, such differences within the PSM are visible in the dataset, therefore validating the samples collected for our experimental design (Fig. 1a).
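The explained-variance percentages quoted above can be understood with a small pure-Python sketch of PCA. It assumes samples in rows and miRNAs in columns, and uses power iteration with deflation on the sample-by-sample Gram matrix of the centered data; the function name and toy matrix are hypothetical, and this is not the R code used for Fig. 2b.

```python
import math


def explained_variance(matrix, n_components=3, iters=500):
    """Fraction of total variance captured by the leading principal components.

    matrix: list of samples, each a list of feature values (not all identical).
    """
    n, p = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in matrix]
    # Gram matrix (samples x samples); its eigenvalues are proportional to
    # the variance along each principal component.
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in centered]
            for r1 in centered]
    total = sum(gram[i][i] for i in range(n))
    ratios = []
    for _ in range(n_components):
        v = [1.0 / (i + 1) for i in range(n)]   # deterministic start vector
        for _ in range(iters):
            w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0:
                break
            v = [x / norm for x in w]
        eigenvalue = sum(v[i] * sum(gram[i][j] * v[j] for j in range(n))
                         for i in range(n))
        ratios.append(eigenvalue / total)
        # Deflation: remove the component just found before the next pass.
        gram = [[gram[i][j] - eigenvalue * v[i] * v[j] for j in range(n)]
                for i in range(n)]
    return ratios


# Toy data: four samples spread mostly along the first feature
toy = [[2, 0], [-2, 0], [0, 1], [0, -1]]
print(explained_variance(toy, n_components=2))
```

For the toy matrix the first component captures about 80% of the variance and the second about 20%. Working on the Gram matrix keeps the computation small when there are few samples and many miRNAs.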

Performance measures for novel miRNA predictions

The quality metrics reported by Mirnovo for the novel miRNAs predicted for Gallus gallus show an overall good scoring for all samples, as seen in the ROC curves reported for the Random Forest algorithm applied (Fig. 3).

As shown in Table 2, the method is highly specific (>95% of true negative identification), despite not being very sensitive (circa 50% true positive identification). This means that although many new miRNAs might be missed, the ones reported should be regarded as highly reliable. These results mirror the fact that the prediction had to be run using a general animal model, given the lack of specific models trained with chicken miRNAs.
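The two measures discussed here follow directly from confusion-matrix counts, as this short Python sketch shows. The counts are hypothetical and chosen only to mirror the qualitative picture (high specificity, moderate sensitivity); they are not the values in Table 2.

```python
def sensitivity(true_pos, false_neg):
    """True positive rate: share of real miRNAs that are predicted as miRNAs."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """True negative rate: share of non-miRNAs that are correctly rejected."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only
print(sensitivity(true_pos=50, false_neg=50))   # 0.5
print(specificity(true_neg=96, false_pos=4))    # 0.96
```

A classifier with this profile rarely labels a non-miRNA as a miRNA, at the cost of missing about half of the true ones.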

Table 2 Performance measures from novel miRNA prediction using the Mirnovo algorithm.

This method identified circa 50% novel candidate miRNAs as shown by the novel prediction values presented in Table 2. This represents the addition of 1,141 new candidate miRNAs to the previously existing 1,232 mature miRNAs in the miRBase database for Gallus gallus, further granting relevance to this dataset.

Validation of expression profiling

To validate the read normalization and quantification steps, we evaluated the read distribution before and after normalization, and briefly compared the miRNA expression profile between the three tissues. Known and novel miRNA datasets were independently evaluated, since the analysis was conducted separately, and each dataset represents a different resource for the community.

miRNA-seq Reads distribution and normalization

The distribution of the total number of reads (Fig. 4a,c) shows that there are some differences between the replicates before read count normalization, particularly for the Determined PSM tissue in the known miRNAs set. This is most likely a reflection of the embryo pooling strategy, which might contribute asymmetrically to the total miRNA amount present in each sample. After normalization (Fig. 4b,d), the distributions become more balanced between replicates, and therefore amenable to further expression comparisons between tissues. As expected, the miRNA expression distribution for all three tissues is positively skewed (even after log2 transformation), showing a long tail to the right.

miRNA Expression profile in the different tissues

Looking at the miRNA expression profile in the three tissues helps with uncovering possible experimental errors; for example, large asymmetries in the diversity of miRNAs found for each tissue could indicate faulty sequencing, or a total overlap of miRNA identities between tissues could indicate mislabelling or inadequate experimental design.

The top-20 most expressed miRNAs (Fig. 5a,c) are found in all three tissues, with roughly comparable distributions in both annotated and novel miRNAs. Additionally, the intersection plot (Fig. 5b,d) clearly shows that the majority of miRNAs (637 in known miRNAs and 849 in novel miRNAs) are found in all three tissues. Importantly, each tissue presents exclusive miRNAs, namely 71 in Limb, 35 in determined PSM, and 8 in undetermined PSM for known miRNAs (Fig. 5b); and 51 in Limb, 41 in determined PSM, and 7 in undetermined PSM for novel miRNAs (Fig. 5d), showing that each sample is sufficiently different from the others, allowing for proper differentiation between tissues.
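The counts in an intersection plot are exclusive: each miRNA is assigned to exactly one combination of tissues. The Python sketch below shows one way to compute such exclusive intersections; the function name and the toy miRNA identifiers are hypothetical, and this is not the UpSetR code used to draw Fig. 5b,d.

```python
from itertools import combinations


def exclusive_intersections(sets):
    """Map each combination of set names to the items found in exactly
    those sets and in no others.

    sets: dict mapping a name (e.g. a tissue) to a set of identifiers.
    """
    names = list(sets)
    result = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            inside = set.intersection(*(sets[n] for n in combo))
            outside = set().union(*(sets[n] for n in names if n not in combo))
            members = inside - outside
            if members:
                result[frozenset(combo)] = members
    return result


# Toy identifiers, for illustration only
toy = {
    "Limb": {"mir-a", "mir-b", "mir-c"},
    "PSM_D": {"mir-a", "mir-b", "mir-d"},
    "PSM_U": {"mir-a"},
}
for combo, members in exclusive_intersections(toy).items():
    print(sorted(combo), sorted(members))
```

In the toy example, mir-a is counted only in the three-tissue group, mir-b only in the Limb and PSM_D pair, and mir-c and mir-d as tissue-exclusive.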

Usage Notes

The bioinformatics analysis described here made use of freely available software tools commonly used by the research community (Fig. 1c). There are alternative miRNA-seq analysis pipelines equally applicable to the FASTQ reads from Gallus gallus²², for example, miRDeep2³⁰, QuickMIRSeq³¹, and sRNAnalyzer³². For a recently published miRNA-seq analysis protocol, see Potla et al.³³, which discusses the individual tools available for each step: (i) quality control (adaptor trimming, read quality/length filtering); (ii) read mapping; (iii) annotation (using miRBase); (iv) quantification; and optionally (v) detection of novel miRNAs.

These data can equally be used to seek the complete small RNA’ome, using for example the recently developed platform COMPSRA, which is reported to identify and quantify diverse RNA molecule types, including miRNA, piRNA, snRNA, snoRNA, tRNA, and circRNA³⁴.

The miRNA expression data herein reported¹⁹ will be useful to study gene regulation in the early phases of vertebrate embryo development, for example by performing differential expression and target gene annotation analyses. Some considerations should be taken into account for downstream analyses. First, the RNA was extracted from pools of 20 dissected tissues, meaning that each sample represents a heterogeneous mixture of individuals, whose variability is present in the data. This is even more relevant if we consider that oscillations of clock gene expression occur in the tissues analysed. Thus, care should be taken when using such static sample datasets to contrast tissues with dynamic gene expression. Additionally, some SNPs can potentially interfere with the successful mapping of some miRNA transcripts, which might have been discarded, causing an underrepresentation of expression for those miRNAs. Finally, for differential expression studies involving the PSM_U tissue, since this group comprises only two replicates, the comparison will have lower statistical power to detect small effect sizes. Accordingly, appropriate statistical techniques should be applied to deal with this limitation.

Since the chicken genome annotation is not yet up-to-par with the annotations from other vertebrate genomes, most chicken miRNAs deposited in databases are not yet experimentally validated, and their target genes are based mostly on chicken-specific computational predictions. This study opens the door for new findings specific for birds, and for validation of known vertebrate miRNAs and their respective target genes. Finally, the predicted novel miRNAs²² represent an invaluable resource for the avian research community looking to experimentally validate novel candidate miRNAs acting in early vertebrate development capable of regulating their gene of interest. Additionally, these data coupled with transcriptomics data for the same tissues can help uncover potential regulatory modules active in early vertebrate embryogenesis.

Code availability

Technical validation and data visualization were performed in RStudio (version 1.1.463)³⁵, using R (version 3.6)²³ and Bioconductor (version 3.9)³⁶, with the packages tidyverse (version 1.3.1)²⁴, UpSetR (version 1.4.0)²⁵, patchwork (version 1.1.1)²⁶, and plot3D (version 1.3)²⁷. The R code used for these analyses, in the form of an annotated R notebook, is freely available in Figshare²⁹. Additional software tools used to analyse this miRNA-seq dataset were the following: FastQC (version 0.11.5)¹⁶, Chimira (version 1.5)¹⁷, and Mirnovo (version 1.0)²¹, as described in the Methods section.

References

Bartel, D. P. Metazoan MicroRNAs. Cell 173, 20–51 (2018).
Lee, R., Feinbaum, R. & Ambros, V. The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell 75, 843–854 (1993).
Ambros, V. MicroRNAs and developmental timing. Current Opinion in Genetics & Development 21, 511–517 (2011).
Kozomara, A., Birgaoanu, M. & Griffiths-Jones, S. miRBase: from microRNA sequences to function. Nucleic Acids Research 47, D155–D162 (2018).
Stern, C. The Chick: A Great Model System Becomes Even Greater. Developmental Cell 8, 9–17 (2005).
Palmeirim, I., Henrique, D., Ish-Horowicz, D. & Pourquié, O. Avian hairy Gene Expression Identifies a Molecular Clock Linked to Vertebrate Segmentation and Somitogenesis. Cell 91, 639–648 (1997).
Oates, A., Morelli, L. & Ares, S. Patterning embryos with oscillations: structure, function and dynamics of the vertebrate segmentation clock. Development 139, 625–639 (2012).
Shih, N., François, P., Delaune, E. & Amacher, S. Dynamics of the slowing segmentation clock reveal alternating two-segment periodicity. Development 142, 1785–1793 (2015).
Sheeba, C., Andrade, R. & Palmeirim, I. Joint interpretation of AER/FGF and ZPA/SHH over time and space underlies hairy2 expression in the chick limb. Biology Open 1, 1102–1110 (2012).
Jing, B. et al. Dynamic properties of the segmentation clock mediated by microRNA. Int. J. Clin. Exp. Pathol. 8, 196–206 (2015).
Riley, M., Bochter, M., Wahi, K., Nuovo, G. & Cole, S. mir-125a-5p-Mediated Regulation of Lfng Is Essential for the Avian Segmentation Clock. Developmental Cell 24, 554–561 (2013).
Bonev, B., Stanley, P. & Papalopulu, N. MicroRNA-9 Modulates Hes1 Ultradian Oscillations by Forming a Double-Negative Feedback Loop. Cell Reports 2, 10–18 (2012).
Carraco, G., Gonçalves, A., Serra, C. & Andrade, R. MicroRNA processing machinery in the developing chick embryo. Gene Expression Patterns 16, 114–121 (2014).
Hamburger, V. & Hamilton, H. A series of normal stages in the development of the chick embryo. Journal of Morphology 88, 49–92 (1951).
Carraco, G., Duarte, I. & Andrade, R. P. microRNA-Seq of Gallus gallus embryo tissues: Undetermined Presomitic Mesoderm (PSM), Determined PSM, and Limb bud. ArrayExpress https://identifiers.org/arrayexpress:E-MTAB-8176 (2021).
Babraham Bioinformatics - FastQC A Quality Control tool for High Throughput Sequence Data. https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ (2016).
Vitsios, D. & Enright, A. Chimira: analysis of small RNA sequencing data and microRNA modifications. Bioinformatics 31, 3365–3367 (2015).
Boratyn, G. et al. BLAST: a more efficient report with usability improvements. Nucleic Acids Research 41, W29–W33 (2013).
Duarte, I., Carraco, G. & Andrade, R. P. gga_mirnOME | microRNA-seq | miRNA Expression dataset from chick embryonic tissues. Figshare https://doi.org/10.6084/m9.figshare.14706867 (2021).
Love, M., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology 15 (2014).
Vitsios, D. et al. Mirnovo: genome-free prediction of microRNAs from small RNA sequencing data and single-cells using decision forests. Nucleic Acids Research 45, e177–e177 (2017).
Duarte, I., Carraco, G. & Andrade, R. P. gga_mirnOME | microRNA-seq | Novel Predicted miRNAs and Expression values from chick embryonic tissues. Figshare https://doi.org/10.6084/m9.figshare.14901102 (2021).
R Core Team. R: The R Project for Statistical Computing. R-project.org https://www.R-project.org/ (2017).
Wickham, H. et al. Welcome to the Tidyverse. Journal of Open Source Software 4, 1686 (2019).
Gehlenborg, N. UpSetR: A More Scalable Alternative to Venn and Euler Diagrams for Visualizing Intersecting Sets. R package version 1.4.0. https://CRAN.R-project.org/package=UpSetR (2019).
Pedersen, T. L. patchwork: The Composer of Plots. R package version 1.1.1. https://CRAN.R-project.org/package=patchwork (2020).
Soetaert, K. plot3D: Plotting Multi-Dimensional Data. R package version 1.3. https://CRAN.R-project.org/package=plot3D (2019).
Athar, A. et al. ArrayExpress update – from bulk to single-cell expression data. Nucleic Acids Research 47, D711–D715 (2018).
Duarte, I., Carraco, G. & Andrade, R. P. gga_mirnOME | R notebook | miRNA Expression data analysis. Figshare https://doi.org/10.6084/m9.figshare.14706891 (2021).
Friedländer, M., Mackowiak, S., Li, N., Chen, W. & Rajewsky, N. miRDeep2 accurately identifies known and hundreds of novel microRNA genes in seven animal clades. Nucleic Acids Research 40, 37–52 (2011).
Zhao, S. et al. QuickMIRSeq: a pipeline for quick and accurate quantification of both known miRNAs and isomiRs by jointly processing multiple samples from microRNA sequencing. BMC Bioinformatics 18 (2017).
Wu, X. et al. sRNAnalyzer—a flexible and customizable small RNA sequencing data analysis pipeline. Nucleic Acids Research 45, 12140–12151 (2017).
Potla, P., Ali, S. & Kapoor, M. A bioinformatics approach to microRNA-sequencing analysis. Osteoarthritis and Cartilage Open 3, 100131 (2021).
Li, J. et al. COMPSRA: a COMprehensive Platform for Small RNA-Seq data Analysis. Scientific Reports 10 (2020).
RStudio Team. RStudio: Integrated Development for R. RStudio, Inc., Boston, MA http://www.rstudio.com/ (2015).
Bioconductor. Bioconductor.org. https://www.bioconductor.org/ (2019).

Acknowledgements

This study was supported by the Portuguese Fundação para a Ciência e Tecnologia (FCT) grant PTDC/BEX-BID/5410/2014 to RPA and ID and by the Município de Loulé. GC was supported by the FCT scholarship SFRH/BD/101609/2014.

Author information

These authors contributed equally: Isabel Duarte, Gil Carraco.

Authors and Affiliations

Faculdade de Medicina e Ciências Biomédicas (FMCB), Universidade do Algarve, Campus de Gambelas, 8005-139, Faro, Portugal
Isabel Duarte, Gil Carraco & Raquel P. Andrade
Center for Health Technology and Services Research (CINTESIS), Polo da Universidade do Algarve, 8005-139, Faro, Portugal
Isabel Duarte
ProRegeM-PhD Program in Mechanisms of Disease and Regenerative Medicine, Faro, Portugal
Gil Carraco
ABC-RI, Algarve Biomedical Center Research Institute, Faro, Portugal
Gil Carraco & Raquel P. Andrade
Genomics Core Facility, EMBL, Heidelberg, Germany
Nayara T. D. de Azevedo & Vladimir Benes
Champalimaud Research Program, Champalimaud Center for the Unknown, Lisbon, Portugal
Raquel P. Andrade

Authors

Isabel Duarte
Gil Carraco
Nayara T. D. de Azevedo
Vladimir Benes
Raquel P. Andrade

Contributions

I.D. performed data analysis, managed, and archived the data, and drafted the manuscript. G.C. collected the samples, performed the RNA extraction, and drafted the manuscript. N.T.D.A. and V.B. acquired the data. R.P.A. conceived the study, designed the experiments, coordinated the project, and drafted the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Raquel P. Andrade.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.

About this article

Cite this article

Duarte, I., Carraco, G., de Azevedo, N.T.D. et al. gga-miRNOME, a microRNA-sequencing dataset from chick embryonic tissues. Sci Data 9, 29 (2022). https://doi.org/10.1038/s41597-022-01126-7

Received: 28 July 2021
Accepted: 15 December 2021
Published: 31 January 2022
DOI: https://doi.org/10.1038/s41597-022-01126-7
