Metacyclogenesis defects and gene expression hallmarks of histone deacetylase 4-deficient Trypanosoma cruzi cells

Trypanosoma cruzi—the causative agent of Chagas disease—like other kinetoplastids, relies mostly on post-transcriptional mechanisms for regulation of gene expression. However, trypanosomatids undergo drastic changes in nuclear architecture and chromatin structure along their complex life cycle which, combined with a remarkable set of reversible histone post-translational modifications, indicate that chromatin is also a target for control of gene expression and differentiation signals in these organisms. Chromatin-modifying enzymes have a direct impact on gene expression programs and DNA metabolism. In this work, we have investigated the function of T. cruzi histone deacetylase 4 (TcHDAC4). We show that, although TcHDAC4 is not essential for viability, metacyclic trypomastigote TcHDAC4 null mutants show a thin cell body and a round and less condensed nucleus located very close to the kinetoplast. Sixty-four acetylation sites were quantitatively evaluated, which revealed H2AT85ac, H4K10ac and H4K78ac as potential target sites of TcHDAC4. Gene expression analyses identified three chromosomes with overrepresented regions of differentially expressed genes in the TcHDAC4 knockout mutant compared with the wild type, showing clusters of either up or downregulated genes. The adjacent chromosomal location of some of these genes indicates that TcHDAC4 participates in gene expression regulation during T. cruzi differentiation.

. Analysis of sequence conservation of trypanosomatid HDAC4 orthologs. (A) Multiple sequence alignment of the histone deacetylases 4 from T. cruzi (C4B63_31g166) T. brucei (Tb427.05.2900) and Leishmania major (LmjF.08.1090) The HDAC domain is indicated between red brackets. Sequences are colored according to amino acid identity (black background) or similarity (gray background). Residues involved in zinc binding are indicated with blue stars and the catalytic tyrosine with a red star. (B) percentage of sequence identity between the trypanosomatid HDAC4 orthologs. Figure S3. Generation of TcHDAC4 null mutant parasites. (A) Scheme of TcHDAC4 gene knockout by homologous recombination. Epimastigote T. cruzi cells were initially transfected with the hygromycin selection marker containing the 5' and 3' TcHDAC4 flanking sequences. After selection, cells from the hygromycin-resistant population were transfected with the neomycin selection marker containing the 5' and 3' TcHDAC4 flanking sequence and the cells were selected for resistance to both hygromycin and neomycin. (B) Western blot of wild type (WT) and null mutant (KO) cell extracts using antisera specific to detect TcHDAC4 and TcGAPDH. The latter was used as a loading control. Only the part of the immunoblot corresponding to position of the TcHDAC4 and TcGAPDH bands is shown. Complete image of the western blot is presented at the end of this document as Figure (Table S11).
Amplification of the gene is seen only in the wild type sample. The lanes identified by "locus" (yellow box in Fig. CI-3) correspond to the PCR reactions performed using primers DAC4ExtF + DAC4ExtR (Table S11). The single band in the WT sample indicates the presence of the original genome sequence. The two bands in the KO samples represent each of the selection makers used to knockout the two copies of the TcHDAC4 gene. (D) Scheme of the expected PCR amplification products for each primer set. Figure S4. Dynamic changes in gene expression profile between different stages of wild type T. cruzi Dm28c. (A) Heatmap of 1,213 genes differentially expressed (FDR < 0.001) among the stages in wild type parasites. Each row represents a gene and the expression level of each gene is shown by a color scale of red and blue (z-score), which represents high and low expression, respectively. Each column represents a different sample, and the color above each sample represents the developmental stage. (B) Venn-diagram of genes identified in this study (grey circles) compared to two other studies: Santos and collaborators [8] (blue circle) and Smircich and collaborators [9] (pink circle). Developmental stages (grey circles) comprise all stages, using Epimastigote as reference. (C) Barplot of the percentage of differentially expressed genes with concordant and discordant patterns of change of expression. The y-axis shows the comparisons performed in this study and the x-axis the percentage of discordant (grey) and concordant changes in expression colored in pink or blue, which represent up and down regulated genes, respectively. Figure S5. Clusters of differentially expressed genes in epimastigote stage along 16 overrepresented chromosomes. Bars represent the chromosomes scaled according to their sizes; the ID names of the chromosomes are on the left, with the number of differentially expressed genes indicated below each ID name. Only genes that were identified as differentially expressed are marked as red bars or blue bars, which mean up and down regulation (KO/WT), respectively. Colored bars that are above the chromosome indicate that the genes are encoded in the plus strand, and colored bars underneath the chromosome represent genes encoded in the minus strand. Figure S6. Clusters of differentially expressed genes in stressed epimastigotes along 6 overrepresented chromosomes. Bars represent the chromosomes scaled according to their sizes; the ID names of the chromosomes are on the left, with the number of differentially expressed genes indicated below each ID name. Only genes that were identified as differentially expressed are marked as red bars or blue bars, which means up and down regulation (KO/WT), respectively. Colored bars that are above the chromosome indicate that the genes are encoded in the plus strand, and colored bars underneath the chromosome represent genes encoded in the minus strand. Figure S7. Clusters of differentially expressed genes in 24-h differentiating forms along 16 overrepresented chromosomes. Bars represent the chromosomes scaled according to their sizes; the ID names of the chromosomes are on the left, with the number of differentially expressed genes indicated below each ID name. Only genes that were identified as differentially expressed are marked as red bars or blue bars, which means up and down regulation (KO/WT), respectively. Colored bars that are above the chromosome indicate that the genes are encoded in the plus strand, and colored bars underneath the chromosome represent genes encoded in the minus strand. Figure S8. Clusters of differentially expressed genes in metacyclic trypomastigotes along 11 overrepresented chromosomes. Bars represent the chromosomes scaled according to their sizes; the ID names of the chromosomes are on the left, with the number of differentially expressed genes indicated below each ID name. Only genes that were identified as differentially expressed are marked as red bars or blue bars, which means up and down regulation (KO/WT), respectively. Colored bars that are above the chromosome indicate that the genes are encoded in the plus strand, and colored bars underneath the chromosome represent genes encoded in the minus strand. Figure S9. Unsupervised heatmap clustering of differentially expressed genes in KO vs WT parasites along all developmental stages. Each row represents a gene and, the expression levels are shown by the colored scale from white to blue (z-score scale at right), which represent low and high expression, respectively. Each column represents a sample and, the developmental stage and genetic condition (WT or KO) are assigned in the colored squares at the top.

Genome assembly:
Five genome assembly versions identified as Dm28c 2014, Dm28c 2017, Dm28c 2018, SylvioX10-1-2012 and SylvioX10-1were retrieved from TriTrypDB (release 44) [1] for evaluation using the QUAST (v.5.0.2) [2] algorithm. The genomes were inspect concerning the number of contigs, N50, larger contig and ratio of coverage as shown in Table SM1. This evaluation indicated that the two best genome assemblies were obtained for Dm28c 2018 and SylvioX10-1. These two genomes were initially selected for further analyses. genome. This was also observed using very sensitive parameters, which increased the multi mapping rate. In addition, Dm28c 2018 has 623 annotated SL sequences, while SylvioX10-1 had no SL sequences identified in its annotation file (release 44), which indicates that Dm28c 2018 has a better gene prediction annotation. Also, Dm28c 2018 assembly has 59 rRNA annotations, while SylvioX10-1 presented only four rRNA annotations throughout the genome. Thus, Dm28c 2018 (release-44) assembly and annotation were used as references for further analyses.

Metacyclic Trypomastigote expression analysis:
Next, in order to identify the possible gene targets of TcHDAC4, we have compared the gene expression pattern of wild type with TcHDAC4 knockout cells during the different stages of metacyclogenesis in vitro. First, to confirm TcHDAC4 knockout in the sequenced samples, we checked in the RNA-Seq data if all knockout samples showed no reads mapping at the TcHDAC4 gene locus (C4B63_31g166-t42_1, Dm28c 2018). Indeed, TcHDAC4 reads were absent in all developmental stage samples. However, in the metacyclic trypomastigote stage only one replicate corresponded to the TcHDAC4 knockout ( Figure SM1). Figure SM1. Unsupervised heatmap clustering of TcHDAC4 expression in each sample. Expression levels are shown as z-scores (-1 to 1.5) and are colored according to the scale at right from yellow to blue, which represent low and high expression, respectively. Developmental stage (stage) and genetic condition (condition, WT or KO) are indicated by two colored squares at the left side of each sample. ** samples that are clustered in the wrong group.
Unfortunately, obtaining new samples from knockout metacyclic trypomastigotes with the same method to replace those with problems was no longer possible since the DEAE resin used for parasite purifications was discontinued by the manufacturer (DEAE cellulose,Sigma). Other types of matrix with the DEAE functional group have been tested but they were not as efficient to separate the metacyclic trypomastigotes and, using any other protocol to perform this separation could introduce unknown sources of non-biological bias in the expression analysis. Therefore, the analyses comprise four samples each of the wild type and null mutant from epimastigote, stressed epimastigote and 24 h adhered differentiating parasites whereas for the metacyclic trypomastigotes the samples comprised three wild type and only one null mutant. To perform the gene expression analysis of the metacyclic trypomastigotes, we used the NOISeq (v 2.28.0) [6,7] algorithm, which can handle assays without replicates. To adjust the analysis parameters, we used the biological coefficient of variation (BCV) function from EdgeR to calculate the overall variability using all 30 samples in the analysis, namely all samples of the developmental stages and genetic conditions (WT and KO). To calculate the subsample parameter, we tested the statistical analysis using from 10% up to 80% subsampled reads.
Each test of subsampled reads size was repeated 10 times and the differentially expressed genes with FDR < 0.1 (probability 0.9) were calculated. The coefficient of variation (cv) was calculated for each point of subsampled reads size, and the smaller size that corresponded to a plateau in the graph (stabilized cv) was chosen ( Figure SM2). It is possible to observe that the plateau begins at the 0.2 (20%) size of simulated subsamples. Figure SM2. Distribution of coefficient of variation (cv) as a function of the size of simulated samples. The y-axis at the left side shows the cv in each of 10 simulated tests performed at each size of simulated samples. The right side of the y-axis shows the mean value of number of genes identified as differentially expressed (FDR < 10%). The x-axis shows the size of simulated samples.