Stage-specific transcription activator ESB1 regulates monoallelic antigen expression in Trypanosoma brucei

Variant surface glycoprotein (VSG) coats bloodstream form Trypanosoma brucei parasites, and monoallelic VSG expression underpins the antigenic variation necessary for pathogenicity. One of thousands of VSG genes is transcribed by RNA polymerase I in a singular nuclear structure called the expression site body (ESB), but how monoallelic VSG transcription is achieved remains unclear. Using a localization screen of 153 proteins we found one, ESB-specific protein 1 (ESB1), that localized only to the ESB and is expressed only in VSG-expressing life cycle stages. ESB1 associates with DNA near the active VSG promoter and is necessary for VSG expression, with overexpression activating inactive VSG promoters. Mechanistically, ESB1 is necessary for recruitment of a subset of ESB components, including RNA polymerase I, revealing that the ESB has separately assembled subdomains. Because many trypanosomatid parasites have divergent ESB1 orthologues yet do not undergo antigenic variation, ESB1 probably represents an important class of transcription regulators.


Reporting Summary
Nature Portfolio wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency in reporting. For further information on Nature Portfolio policies, see our Editorial Policies and the Editorial Policy Checklist.

Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.
- The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
- A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
- The statistical test(s) used AND whether they are one- or two-sided. Only common tests should be described solely by name; describe more complex techniques in the Methods section.
- A description of all covariates tested
- A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
- A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)
- For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted. Give P values as exact values whenever suitable.
- For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
- For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
- Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated

Our web collection on statistics for biologists contains articles on many of the points above.

Software and code
Policy information about availability of computer code

Data collection

Data analysis
ImageJ 1.5, BWA-MEM 0.7.17, samtools 1.7, ART 2016-06-05, Trinity 2.11.0, Rcorrector 1.0.4, Trim Galore! 0.6.0

For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors and reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Portfolio guidelines for submitting code & software for further information.

Data
Policy information about availability of data
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:
- Accession codes, unique identifiers, or web links for publicly available datasets
- A description of any restrictions on data availability
- For clinical datasets or third party data, please ensure that the statement adheres to our policy

Note that full information on the approval of the study protocol must also be provided in the manuscript.

Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.

Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences
For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Life sciences study design
All studies must disclose on these points even when the disclosure is negative.

Sample size
Sample sizes were not pre-determined. Sample sizes similar to or exceeding those in previously published work in this field were used.

Replication
RNAseq analyses were repeated 4 times, with additional time points also analysed. ChIP sequencing data were replicated twice. Unless specifically noted, cell biology experiments were replicated three times. Genetic interventions were replicated by independent methods where possible: for ESB1 localisation, 3 independent tagging approaches; for ESB1 knockdown in BSFs, 2 independent knockdown/conditional knockout approaches.

Randomization
Participant allocation/randomisation is not relevant to this study. All work involved reference genome strains of cultured parasites, with experimental intervention applied to large populations of clonal parasites.

Blinding
Blinding during data collection was not possible as it involved sample-specific experimental intervention. Microscopy data was blinded for analysis of nuclear structures, with nuclei cropped and indexed by one researcher, then classified by a second prior to unblinding.
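
The blinded classification workflow described above could be organised along the following lines. This is only an illustrative sketch, not the authors' actual procedure or code: the function names `blind` and `unblind`, the `nucleus_0000`-style identifiers, and the example labels are all hypothetical. One researcher generates the shuffled key and keeps it private, and the second researcher sees only the anonymous identifiers.

```python
import random


def blind(sample_labels, seed=None):
    """Assign anonymous, shuffled identifiers to cropped nuclei.

    Returns (blinded_ids, key), where key maps each anonymous
    identifier back to its true sample label. The key is held by
    the first researcher and withheld from the classifier.
    """
    rng = random.Random(seed)
    order = list(range(len(sample_labels)))
    rng.shuffle(order)  # break any link between index and condition
    key = {f"nucleus_{i:04d}": sample_labels[j] for i, j in enumerate(order)}
    return list(key), key


def unblind(classifications, key):
    """Pair each classifier call with the true label once scoring is done."""
    return [(key[blinded_id], call) for blinded_id, call in classifications.items()]


if __name__ == "__main__":
    # Hypothetical example labels, for illustration only.
    labels = ["control", "control", "knockdown", "knockdown"]
    ids, key = blind(labels, seed=0)
    calls = {blinded_id: "unscored" for blinded_id in ids}  # second researcher's calls
    print(unblind(calls, key))
```

Keeping the key in a separate structure from the classifications means the mapping is only consulted after every nucleus has been scored.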

Reporting for specific materials, systems and methods
We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response.

Methods (n/a or Involved in the study): ChIP-seq

ChIP-seq

Data deposition
Confirm that both raw and final processed data have been deposited in a public database such as GEO.
Confirm that you have deposited or provided access to graph files (e.g. BED files) for the called peaks.

Data access links
May remain private before publication.

NCBI BioProject PRJNA784098
Files in database submission