Network Walking charts transcriptional dynamics of nitrogen signaling by integrating validated and predicted genome-wide interactions

Charting a temporal path in gene networks requires linking early transcription factor (TF)-triggered events to downstream effects. We scale up a cell-based TF-perturbation assay to identify direct regulated targets of 33 nitrogen (N)-early response TFs encompassing 88% of N-responsive Arabidopsis genes. We uncover a duality where each TF is both an inducer and a repressor, and in vitro cis-motifs are typically specific to regulation directionality. Validated TF-targets (71,836) are used to refine the precision of a time-inferred root network, connecting 145 N-responsive TFs and 311 targets. These data are used to chart network paths from direct TF1-regulated targets identified in cells to indirect targets responding only in planta via Network Walking. We uncover network paths from TGA1 and CRF4 to direct TF2 targets, which in turn regulate 76% and 87% of TF1 indirect targets in planta, respectively. These results have implications for N-use, and the approach can reveal temporal networks for any biological system.
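The path-charting idea summarized above can be illustrated with a minimal sketch. It assumes a single dictionary of TF-to-target edges and a set of genes that respond to TF1 only in planta; the function name `walk_network`, the data layout and the toy gene names are hypothetical and are not taken from the study's pipeline.

```python
def walk_network(tf1, direct_targets, indirect_targets):
    """Chart TF1 -> TF2 -> indirect-target paths.

    direct_targets: dict mapping each TF to the set of its direct targets
                    (assumed layout; one dictionary for all edge sources).
    indirect_targets: set of genes responding to tf1 only in planta.
    Returns (paths, fraction): paths maps each intermediate TF2 to the
    indirect targets it directly regulates; fraction is the share of
    indirect targets reached through at least one TF2.
    """
    # Intermediate TF2s are direct targets of TF1 that are themselves TFs
    # with known targets.
    intermediates = {
        gene for gene in direct_targets.get(tf1, set())
        if gene in direct_targets and gene != tf1
    }
    paths = {}
    for tf2 in intermediates:
        reached = direct_targets[tf2] & indirect_targets
        if reached:
            paths[tf2] = reached
    explained = set().union(*paths.values())
    fraction = len(explained) / len(indirect_targets) if indirect_targets else 0.0
    return paths, fraction


# Toy example with made-up identifiers.
edges = {
    "TF1": {"TF2a", "TF2b", "g1"},
    "TF2a": {"g2", "g3"},
    "TF2b": {"g3", "g4", "g9"},
}
indirect = {"g2", "g3", "g4", "g5"}
paths, fraction = walk_network("TF1", edges, indirect)
```

In this toy case three of the four indirect targets are reached through TF2a or TF2b, so `fraction` is 0.75.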


Effects of treatment conditions in root protoplast
To test the effects of cycloheximide (CHX) on gene expression in isolated root cells transfected with an expression construct, we performed the TARGET assay (as described in the main text and Methods) with the following changes. A total of 8-12 million cells were transfected separately with either the GR-only empty vector (EV) or GR-TGA4, in pBeaconRFP_GR 1. Cells transfected with each construct were washed three times and then split into 6 replicate wells of a 24-well plate. Following overnight incubation, transfected root protoplasts were treated as described in the main text (+N/+DEX), except that 20 min before DEX treatment half of the samples (3 replicates) for each construct were treated with 35 µM CHX in DMSO (+CHX) and half with DMSO only (-CHX). Transfected cells were sorted by FACS for RFP-expressing cells 3 hours after DEX-induced nuclear import.
The effect of nitrogen (N) pre-treatments on identification of TF-target genes was tested in the TARGET assay (as above) with the following changes. Following transfection of 12-16 million cells with the same constructs, samples were split into 9 replicate wells in a 24-well plate. After overnight incubation and 2 hours before DEX treatment, root protoplasts transfected with each construct were treated with either 20 mM NH4NO3 + 20 mM NH4KNO3, 5 mM KNO3 + 15 mM KCl, or 20 mM KCl (3 replicates per treatment). All samples were treated with 35 µM CHX for 20 min before DEX treatment.
For the above samples, libraries were generated, sequenced and processed as described in the main text. Genes that showed a 5-fold difference in expression level in the +CHX condition compared to the -CHX condition (Table S1) were excluded from subsequent analyses.
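The exclusion step can be sketched as follows. This is only an illustration: it assumes mean normalized expression values per gene, treats a 5-fold change in either direction as CHX-affected, and adds a pseudocount to avoid division by zero. None of these choices is specified in the text, and the function name and toy values are hypothetical.

```python
def chx_affected_genes(expr_plus_chx, expr_minus_chx, fold_cutoff=5.0, pseudocount=1.0):
    """Return genes whose expression differs by >= fold_cutoff between +CHX and -CHX.

    expr_plus_chx / expr_minus_chx: dicts mapping gene -> mean normalized expression.
    The pseudocount and the two-directional test are assumptions of this sketch.
    """
    affected = set()
    for gene in expr_plus_chx.keys() & expr_minus_chx.keys():
        ratio = (expr_plus_chx[gene] + pseudocount) / (expr_minus_chx[gene] + pseudocount)
        if ratio >= fold_cutoff or ratio <= 1.0 / fold_cutoff:
            affected.add(gene)
    return affected


# Toy values, not real data.
plus_chx = {"geneA": 500.0, "geneB": 100.0, "geneC": 10.0}
minus_chx = {"geneA": 50.0, "geneB": 90.0, "geneC": 120.0}
excluded = chx_affected_genes(plus_chx, minus_chx)
```

Here geneA (about 10-fold up) and geneC (about 11-fold down) would be excluded, while geneB would be retained.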

Nitrogen response in protoplasts compared to whole roots
To compare the N-response of isolated root protoplasts transfected with an expression construct with that of whole roots, we grew and transfected protoplasts with the EV construct, as described in the main text. N-treatment of the protoplast samples was either 20 mM KNO3 + 20 mM NH4NO3 or 20 mM KCl. No CHX or DEX treatments were performed. In parallel, we grew Arabidopsis plants in Phytatrays (Sigma) in liquid media of the same composition (1% w/v sucrose, 0.5 g/L MES, 1X MS basal salts (-CN), 1 mM KNO3, pH 5.7). After 11 days, at the same time that the transfected root protoplasts were being treated with N, we transferred the Arabidopsis plants to fresh Phytatrays containing the basal MS media (1% w/v sucrose, 0.5 g/L MES, 1X MS basal salts (-CN), pH 5.7) with either N-supply (20 mM KNO3 + 20 mM NH4NO3) or 20 mM KCl. Five hours after N-treatment, when the protoplasts were being sorted, roots were harvested and flash frozen in liquid nitrogen. Four independent Phytatray and protoplast samples were collected for each treatment. Libraries were generated, sequenced and processed as described in the main text.

Construction of a nitrogen-response inferred network
We used a machine learning approach called Dynamic Factor Graphs (DFG) 15, which we have previously validated 6,16,17, to derive the TF-target interactions in response to N-treatment in Arabidopsis roots. DFG infers interactions between 145 TFs and 1458 genes that responded to N in the root time-course 6.
We tested the effect that cycloheximide (CHX) had on 25 superior housekeeping genes from Czechowski et al. 5, by performing parallel TARGET assays with and without the CHX treatment (6 treatment replicates). We found that while CHX does have an effect on some housekeeping genes (e.g. [...]
[...] 6. This analysis shows that 80% of genes that respond to N-treatment in transfected root protoplasts respond in the whole root data sets. While a larger number of genes respond to N in intact roots, the overlap between the N-response genes in root protoplast and either whole root experiment (e.g. 56-59% of root protoplast N responsive genes) is similar to the overlap between the two whole root datasets (49-53%).
To ascertain if high levels of overexpression of a TF in the TARGET system leads to more off-target genes responding to the TF, we plotted the level of TF expression relative to the empty vector control against the number of genes identified as differentially expressed. There is no correlation between expression level of the TF and number of target genes. Indeed, we see that two of the most highly over-expressed TFs, CRF4 and BEE2, have the fewest number of targets. Source data are provided as a Source Data file.

Supplementary Figure 6. The TARGET assay most often identified fewer direct regulated targets than the number of TF-targets identified by in vitro or in vivo binding assays.
Barplots of the number of regulated target genes identified for each TF from TARGET, compared to the number of targets found to be bound using DAP-seq 8 (Table 1). Source data are provided as a Source Data file.

Supplementary Figure 7. Validation of the in planta relevance of direct TF-regulated targets identified in cells using the TARGET assay
The targets of TGA1 identified in isolated root cells using the TARGET assay (Supplementary Data 2) overlap significantly with the targets that respond to TGA1 overexpression in whole roots (Supplementary Data 4). The number in each area represents the number of TGA1 targets identified, and the number in parentheses is the p-value (Fisher's exact test) of the overlap between the TGA1 targets identified in root cells using TARGET and those identified by TF overexpression in planta. Genes that respond to the TF perturbation in isolated root cells (in the presence of CHX) are candidate direct targets (blue shading), whereas genes that respond only to TGA1 overexpression in planta are more likely to be indirect targets (red shading).
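For readers who want to reproduce this kind of overlap statistic, the sketch below computes a one-sided Fisher's exact p-value as the upper tail of the hypergeometric distribution, using only the Python standard library. The one-sided form and the choice of gene universe are assumptions of this sketch, since the legend does not specify them, and the function name and toy counts are hypothetical.

```python
from math import comb


def overlap_pvalue(n_universe, n_set_a, n_set_b, n_overlap):
    """One-sided Fisher's exact p-value for an overlap of two gene sets.

    Computes P(X >= n_overlap), where X is the overlap expected by chance
    when n_set_b genes are drawn from a universe of n_universe genes that
    contains n_set_a genes of interest (hypergeometric upper tail).
    """
    max_overlap = min(n_set_a, n_set_b)
    total = comb(n_universe, n_set_b)
    tail = sum(
        comb(n_set_a, k) * comb(n_universe - n_set_a, n_set_b - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Toy counts, not taken from the figure.
p = overlap_pvalue(n_universe=20, n_set_a=5, n_set_b=5, n_overlap=3)
```

With these toy counts the tail sums to 1126 of 15504 possible draws, giving p of roughly 0.073.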

Supplementary Figure 8. Combinatorial influence of the N-early response TFs on N-uptake and N-assimilation pathway
Heatmap displaying the influence of each of the 33 N-early response TFs on the genes involved in N-uptake (ammonia and nitrate transporters), nitrate reduction to ammonia, Glutamine (Gln) and Asparagine (Asn) synthesis, and amino acid metabolism. The dendrogram analysis clusters TFs that regulate shared genes in the pathway. Red and blue shading indicate repression and induction, respectively, relative to the EV control. Source data are provided as a Source Data file.

Supplementary Figure 9. Direct regulated edges to the N metabolism network identified in vivo by TARGET complement TF-binding assays
Heatmap displaying the influence of each of the 33 N-early response TFs on the 98 genes in the N-metabolism network screened by Gaudinier et al. in a recent high-throughput yeast-one-hybrid (Y1H) study 11 . The TARGET assay identifies 425 direct regulated targets of these 33 TFs for this set of genes in Arabidopsis root cells, compared to the 20 edges identified in the Y1H study (open diamonds). We also looked at the number of in vitro bound edges between the 17 TFs with DAP-seq targets 8 (TFs in bold) and the 98 genes of this N-metabolism network and found more edges (529) in the TF-DNA binding assay for these 17 TFs (open grey circles), compared to those identified in root cells using TARGET for all 33 TFs. Red and blue shading indicate repression and induction, respectively, relative to the EV control. Source data are provided as a Source Data file.

Supplementary Figure 10. Heatmap of cis-binding motif clustering for 1,282 Arabidopsis TF motifs into 80 groups
Heatmap of cis-binding motifs was generated by the RSAT matrix-clustering tool 12.

Supplementary Figure 12. A pruned network inferred using Dynamic Factor Graphs predicts high-confidence targets for 145 NxTime TFs
The root NxTime data from Varala et al. 6 were used to infer TF-target influence with a time-based machine learning approach, DFG 15. Using genome-wide validated targets for 29 of the 33 TFs (Fig. 2) identified by the TARGET system in root cells, the edges in the DFG-inferred GRN were pruned to a precision threshold of 0.32, chosen based on AUPR analysis (Fig. 6 and Supplementary Table 2). This means that ~1/3 of the predicted TF-target edges are likely true. The resulting network is displayed in the context of the Just-in-Time bins for each TF (left) and NxTime TF-target genes (right) 6. The size of the nodes is representative of the number of edges for each TF or target. The shading of the edges indicates the edge score from DFG.
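One way to picture precision-based pruning is the sketch below. It assumes that edges for TFs with validation data are ranked by score, that precision is the cumulative fraction of ranked edges found in the validated set, and that the cutoff is the lowest score at which this precision still meets the threshold. This is an illustrative reading, not the documented procedure; the function name and toy edges are hypothetical, and tied scores are not treated specially.

```python
def score_cutoff_for_precision(scored_edges, validated_edges, min_precision=0.32):
    """Find the lowest edge-score cutoff whose cumulative precision >= min_precision.

    scored_edges: dict mapping (tf, target) -> predicted edge score, restricted
                  to TFs that have validation data (assumed layout).
    validated_edges: set of (tf, target) pairs confirmed experimentally.
    Returns the cutoff score, or None if no cutoff reaches min_precision.
    """
    ranked = sorted(scored_edges.items(), key=lambda item: item[1], reverse=True)
    cutoff = None
    true_positives = 0
    for rank, (edge, score) in enumerate(ranked, start=1):
        if edge in validated_edges:
            true_positives += 1
        # Precision among the top-`rank` edges.
        if true_positives / rank >= min_precision:
            cutoff = score
    return cutoff


# Toy predictions, not real DFG output.
predicted = {
    ("TF1", "g1"): 0.9, ("TF1", "g2"): 0.8, ("TF1", "g3"): 0.7, ("TF1", "g4"): 0.6,
    ("TF1", "g5"): 0.5, ("TF1", "g6"): 0.4, ("TF1", "g7"): 0.3, ("TF1", "g8"): 0.2,
}
validated = {("TF1", "g1"), ("TF1", "g4")}
cutoff = score_cutoff_for_precision(predicted, validated)
pruned = {edge: s for edge, s in predicted.items() if s >= cutoff}
```

In this toy case the top six edges contain two validated edges (precision 0.33), so the cutoff is 0.4 and six edges are retained.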