TFvelo: gene regulation inspired RNA velocity estimation

RNA velocity is closely related with cell fate and is an important indicator for the prediction of cell states with elegant physical explanation derived from single-cell RNA-seq data. Most existing RNA velocity models aim to extract dynamics from the phase delay between unspliced and spliced mRNA for each individual gene. However, unspliced/spliced mRNA abundance may not provide sufficient signal for dynamic modeling, leading to poor fit in phase portraits. Motivated by the idea that RNA velocity could be driven by the transcriptional regulation, we propose TFvelo, which expands RNA velocity concept to various single-cell datasets without relying on splicing information, by introducing gene regulatory information. Our experiments on synthetic data and multiple scRNA-Seq datasets show that TFvelo can accurately fit genes dynamics on phase portraits, and effectively infer cell pseudo-time and trajectory from RNA abundance data. TFvelo opens a robust and accurate avenue for modeling RNA velocity for single cell data.


Introduction
Single-cell RNA sequencing (scRNA-seq) [1][2][3] provides a wealth of information about the gene expression profile of individual cells.To expand the scope of scRNA-seq analysis beyond a static snapshot and capture cellular dynamics without tracking alive individual cells over time, many pseudotime and trajectory inference (TI) approaches, like PAGA 4 , Monocle 5 , Slingshot 6 and Palantir 7 have been developed.In recent years, RNA velocity 8 theory was proposed, which describes the time derivative of gene abundance in a physical-informed approach, by modeling the relationship between the unspliced (immature) and spliced (mature) mRNAs.Combining velocities across multiple genes, velocity-based pseudotime and cell fate can be inferred from the transition probability between cells.To estimate the RNA velocity, velocyto 8 is introduced as the vanilla approach with a steady state assumption.ScVelo 9 models the dynamics without the steady state assumption and employs an Expectation-Maximization (EM) approach for better reconstructing the underlying kinetics.
More recently, several approaches have been developed to improve the estimation of RNA velocity.For instance, VeloAE 10 utilized an auto-encoder and low-dimensional space to smooth RNA velocity.Still in the unspliced/spliced space, DeepCycle 11 used an autoencoder to map cell cycle-related genes.UniTVelo 12 directly designed a function of time to model the spliced RNA level.In contrast to deterministic models, VeloVI 13 , Pyrovelocity 14 and veloVAE 15 estimated RNA velocity using Bayesian inference frameworks.LatentVelo 16 embeds the unspliced and spliced expression into the latent space with a variational auto-encoder, so that a low-dimensional representation of RNA velocity could be obtained.CellDancer 17 models the transcription, splicing and degradation rate of a gene as a function of its unspliced and spliced counts with a neural network.Furthermore, apart from the unspliced/spliced data, dynamics can also be modeled with additional information.For instance, Dynamo 18 improve the RNA velocity model by using the new/total labeled RNA-seq 19 .Taking the advantage of single-cell multi-omics technology 20,21 , RNA velocity analysis can be expanded to protein abundances (protaccel 22 ) and single-cell ATAC-seq (MultiVelo 23 ) datasets.
However, these additional omics data or labeling may be cost or even not available in most single cell datasets.
While RNA velocity theory has improved the inference of single-cell trajectories, pseudo-time, and gene regulation inference in numerous studies 24 , current RNA velocity models still face severe challenges.Firstly, RNA velocity models always treat each gene independently and do not take the underlying regulation into consideration 25 , although it is expected to be helpful for cell fate inference by integrating gene regulatory mechanism analysis.Secondly, these models can not fit the gene dynamics well on most genes, which might be because that the transcriptional dynamics of mRNA splicing may not provide sufficient signal in single cell resolution 25 .The high sparsity and noise nature of unspliced mRNA counts 26,27 could be one significant reason.Most conventional approaches assume that the joint plot between unspliced and spliced expression should form a clockwise curve on the phase portrait 8,9,12 because of the phase delay within them, which is, nevertheless, rarely observed from the data.Last but not least, the RNA velocity models can only be applied to scRNA-seq datasets with unspliced/spliced or new/total labels.However, the abundance of unspliced/spliced data may not be available in certain datasets, such as those obtained through Fluorescence In Situ Hybridization (FISH) technologies 28 , and some human sequencing data due to privacy restrictions.These challenges motivate us to construct the gene dynamics based on regulation, instead of only relying on the unspliced/spliced counts.
The gene regulation has been explored a lot in the field of single cell research [29][30][31][32][33] .In this study, we aim at inferring the gene dynamics and cell fate based on the underlying regulatory among genes.To this end, we investigate the relationship between gene regulatory patterns and RNA velocity, which reveals that the RNA velocity can be approximately estimated as a linear combination of the expression levels of transcription factors (TFs).Notably, the TFs with the non-zero weights are significantly enriched with TF set functionally linked to the target gene.In addition, we find that the joint distribution of expressions between a TF and its target could form a clockwise curve on the co-expression plot 34 , indicating the phase delay between them.Those findings imply that the abundance of TFs could also be used to construct dynamics model of RNA velocity.
Here we propose a new RNA velocity estimation method, TFvelo, to address the challenges mentioned above.TFvelo is a novel method that models the RNA velocity, which means the time derivative of the expression level here, based on the expression levels of TFs.The computational framework of TFvelo relies on a generalized EM algorithm, which iteratively updates the learned representation of TFs, the latent time of cells, and the parameters in the dynamic equation.Compared with those TI methods, TFvelo is a velocity-based pseudotime inferenced method that does not require the initial cell state annotation, and offers additional analysis capabilities, such as modeling the dynamics of each gene, identifying best fitted genes and significant TFs.As a result, this approach enables RNA velocity modeling to be robustly applied on those genes where unspliced count is sparse, and even those datasets without splicing information.
Finally, using both synthetic data and real datasets, we demonstrate that the TFvelo model can capture the dynamics on phase portraits of TFs-target, accurately infer the pseudo-time and the developmental trajectories of cells and help explore biological findings.

Results
Findings: RNA velocity can be approximately estimated with TFs' abundance.
The fact that RNA velocity can be modelled by the transcriptional regulation is supported by the following two findings.We find that the RNA velocity model by previous approach can be approximately estimated as a linear combination of TFs' abundance.On the scRNA-Seq pancreas dataset 8 , given a target gene, Least Absolute Shrinkage and Selection Operator (LASSO) regression is applied to predict RNA velocity modelled by scVelo, based on the expression level of TFs and the target gene itself (Fig. S1a).Here we run scVelo pipeline on pancreas dataset with the default parameters to get the velocities, and only apply LASSO regression to those genes which are modelled by scVelo.We feed all TFs 35 into LASSO, and get a sparse weights-vector where only some TFs have non-zero weights.As a result, the RNA velocity of gene  can be estimated as: where  represents the set of TFs,  and  are the expression level of  and gene  respectively,  , and  , are the weights of  and the target gene itself.
Cells are separated into training and test sets to evaluate the LASSO model.The regression results on the test set shows a high correlation between the predicted and labeled velocity, as demonstrated in Fig. S1 b-e, which indicates that velocity of the target gene can be predicted based on the expressions of its TFs.In addition, as shown in Table S1, the TFs with non-zero positive weights are significantly enriched in TFs set functionally linked to the target gene (according to ENCODE TF-target datasets 36 ) with p value of 6.66e-06 (one-sided t-test), further suggesting that velocity could be modelled by TFs.Secondly, due to regulatory relationship between TF and target, the expression levels of them change asynchronously, which could result in a clockwise curve on the phase portraits.And as expected, we find the desired clockwise curve on the co-expression plots between some TF-target pairs 34 (Fig. S2), which is similar to the theoretical prediction of existing velocity models on unspliced-spliced space.These findings suggest that it is possible to construct a novel RNA velocity model based on the expression of TFs, instead of using unspliced/spliced RNA counts or any additional experimental omics data and labels.

Estimate RNA velocity with TFvelo
Here we report TFvelo to estimate the RNA velocity of a target gene using the abundance of the target gene itself and its potential TFs.Fig. 1a  where  and  are two sets of gene-specific time-invariant parameters, which describe the influence of TFs on the target gene, and the shape of phase portraits, respectively.Considering that linear models have been employed to represent the gene regulatory in previous studies [37][38][39] , ℎ    ,   ;  is implemented with a linear model         .The profile function of   can be chosen flexibly from a series of second-order differentiable functions 12 , which is designed as a sine function in implementation (Methods).This is because sine functions can model the nonmonotonic dynamics start from both down-regulation and up-regulation, which is not feasible for those methods relying on one switching time point 8,9,18 .
For optimization, all learnable parameters are divided into three groups, which are cellspecific latent time , TFs' weights   and the shape parameters of phase portraits  , in which  and  represent number of cells and number of genes, respectively.A generalized EM 40,41 algorithm is adopted to minimize the loss function in each iteration, by updating the three groups of parameters alternately (Fig. 1 f-h).Based on the optimized models on all genes, we can obtain a transition matrix and calculate the pseudotime and velocity stream for cell trajectory analysis.Please find the details in Methods.The performance of TFvelo was evaluated using the spearman correlation between the ground truth and the reconstructed values.Under 20 iterations for optimization, the spearman correlation coefficient between the ground truth weights and inferred weights of TFs is 0.86, and that for velocity estimation is 0.95 (Fig. 1n&o).The high consistence shows that TFvelo can effectively reconstruct the underlying dynamics.In addition, the high F1 score (0.951) and high ratio of area under the receiver operating characteristic (ROC) curve (0.968) also demonstrate that TFvelo can correctly recognize whether the weight is positive or negative.Please see Supplementary Information Section 3 for details on the synthetic dataset.

TFvelo can model cell differentiation process on Pancreas dataset.
To evaluate TFvelo on real scRNA-seq data, we first apply it to the dataset of E15.5 mouse pancreas 8 , which has been widely adopted in previous RNA velocity studies.
The pseudo-time and trajectory predicted by TFvelo can identify the differentiation process (Fig. 2a&b).Compared with previous methods, TFvelo shows advantage on phase portrait fitting.Example genes are shown in Fig. 2c and Fig. S7.Fig. 2c compares the phase portrait fitting of different methods, and Fig. S7 provides the gene expression dynamics resolved along gene-specific latent time obtained by scVelo and TFvelo.The expression of H19, which plays an important role in early development 42 , is found to increase at the beginning and then decrease at the stage of Ngn3 high EP by TFvelo.
For MAML3, only TFvelo can correctly detect the process that starts with up-regulation then turns to down-regulation.Other methods can not correctly detect the order between pre-endocrine cells (in green) and those endocrine populations, including alpha, beta, delta and epsilon cells (in blue, light blue, purple and pink).
To quantitatively evaluate the phase portrait fitting, we propose three metrics, including: (1) The intra-class distance on phase portrait, which reflects how cells within the same type gather on the phase portrait (the lower the better).( 2) The inter-class distance on phase portrait, which reflects how cells from different types are separated well on the phase portrait (the higher the better).( 3) The fitting error on phase portrait, which measures the normalized distance between each cell to the constructed model on the phase portrait.We calculate these metrics based on the genes that can be commonly modeled by all methods.For the "intra-class distance" and "inter class distance", we take the Paired Samples T Test between the value obtained on the same gene by TFvelo's phase portrait and the un/spliced phase portrait, which is commonly adopted by all baselines.The TFvelo results (Fig. 2d-f) show advantage with extreme high significance (p=4.23e-51 and p=5.65e-117).Apart from the noise and high sparsity (Fig.

S6
) in splicing data, considering that the splicing can be accomplished within 30 seconds 43 , the too short phase delay between the unspliced and spliced mRNA might also give the phase portrait fitting more challenge (Fig. S12).Compared with those un/spliced based approaches, TFvelo can achieve lower intra-class distance, higher inter-class distance and lower fitting error.To draw a conclusion, TFvelo can address the current issue caused by noisy un/spliced data for RNA velocity modeling, by using the learned feature with transcription regulation.
In addition, we also employed two metrics adopted in VeloAE and UniTVelo, which are "Cross Boundary Direction Correctness (CBDir)" and "In Cluster Coherence (ICCoh)".CBDir assesses the correctness of transitions from one cell type to the next, utilizing boundary cells with ground truth annotations.ICCoh is computed using cosine similarity among cells within the same cluster, measuring the smoothness of velocity in high-dimensional space within clusters.As shown in Fig. 2g&h, TFvelo achieves similar median value compared to UniTVelo, and significantly outperforms the other methods.Also TFvelo can even achieve a high velocity confidence (Fig. 2i), which indicates a higher consistency of RNA velocities across neighboring cells in the UMAP space 9 .The whiskers extend to 1.5× the interquartile range of the distribution from the hinge.
Data beyond the end of the whiskers are determined to be outliers and plotted individually as diamonds.
Then we conduct KEGG pathways enrichment analysis based on the best fitted genes, and observe a strong enrichment associated with insulin secretion and the glucagon signaling pathway (Fig. 3a).Additionally, we find that REST and HMGN3 consistently exhibit high absolute weights on modeling most target genes (Fig. 3b).When examining the UMAP distribution, it is clear that REST expression decreases during differentiation, while HMGN3 expression increases (Fig. 3c).Previous research has established REST as a key negative regulator of endocrine differentiation during pancreas organogenesis 44,45 .Earlier studies have also report HMGN3 to be a regulator for insulin secretion in pancreatic cells 45 .We further analyze the weights of these two TFs on modeling genes within the insulin secretion pathway.REST consistently has a negative weight, while HMGN3 consistently exhibits a positive weight (Fig. 3d), which is consistent with the previous findings.
The comparison of phase between TFs and target are shown in Fig. 3e, which shows clear phase delay between the learned representation of TFs () and target gene's expression.We also analyze the phase delay between a TF and the individual target.
Among the two genes, HMGN3 has a positive weight on modeling SURF, where a phase delay between them can be observed.On the contrary, REST has a negative one on modeling ECE1, where their changing shows negative correlation.

TFvelo can model cell differentiation process on gastrulation erythroid dataset.
The Gastrulation Erythroid dataset is selected from the transcriptional profile for mouse embryos 46 , and has be adopted in RNA velocity studies 12 .Using TFvelo, the pseudotime and cell development flow visualized by streamlines are consistent with the development in the erythroid lineage, from blood progenitors to erythroid cells (Figs. 4a and 4b).By comparison, several of the previous approaches cannot capture such developmental dynamics, as shown in Fig. 4d.
The sparsity in data is a common challenge for scRNA-seq studies.Considering that only about 20% of reads contained unspliced intronic sequences 8 , the sparse unspliced abundance is a large obstacle for dynamics fitting in phase portrait.Our quantitative analysis of the sparsity in unspliced, spliced and total mRNA counts is shown in Fig. 4c, which verifies the high sparsity in unspliced counts.Fig. 4e shows the comparison on two genes with very sparse unspliced counts to illustrate the advantage of TFvelo for addressing the issue of sparsity.Although these genes can pass the filtering and selection during preprocessing, they are still too sparse to provide sufficient information for fitting the spiced-based models well.Fig. 4f provides more example genes of the comparison between TFvelo and baselines.TACC1 is annotated as a gene involved in promoting cell division prior to the formation of differentiated tissues 47 .In our experiment, the expression of TACC1 initially increases, reaching its highest value in blood progenitors 2, which is just before differentiation into erythroid cells.
Subsequently, TACC1's expression decreases with pseudo-time from erythroid cells 1 to erythroid cells 3. By comparison, previous approaches fail to detect a dynamic starting from blood progenitors 1 (in red).As for HSP90AB1, only TFvelo can capture the correct dynamics from the learned phase portrait.
We next take the GO term enrichment analysis based on the best fitted genes, and find that the most significant GO terms include porphyrin-containing compound biosynthetic process (GO:0006779) and heme biosynthetic process (GO:0006783), which are directly related to erythroid development (Fig. 4g).In addition, the TFs having high weights on most target genes are GATA1, GATA2 and LMO2 (Fig. 4h), all of which has been reported to be involved in erythroid differentiation.GATA1 plays a significant role in regulating the transcriptional aspects of erythroid maturation and function 48 .GATA2 is verified to regulates hematopoietic stem cell proliferation and differentiation 49 .LMO2 is a bridging molecule assembling an erythroid, DNA-binding complex 50 .These findings demonstrate the potential of TFvelo in detecting important regulatory genes from single cell data.Next we apply TFvelo to a 10x multi-omics embryonic mouse brain dataset, where both Assay for Transposase-Accessible Chromatin with sequencing (ATAC-Seq) 51  The pseudotime inferred by TFvelo closely resemble Multivelo's results (Fig. 5a and     b), and both methods correctly identify the differentiation direction in most cells.
Notably, TFvelo achieves this performance without using ATAC-seq data.Fig. 5c shows the phase portrait fitting of both methods, where Multivelo could show an additional c-u phase portrait, reflecting the joint plot between ATAC-seq and unspliced mRNA.
Regarding the AHI1 gene, Multivelo mistakenly constructs a cyclic dynamic, in contrast, TFvelo correctly captures the dynamic, with expression increasing consistently from V-SVZ cells (in green).For the NTRK2 gene, TFvelo captures such dynamics correctly, while Multivelo misses the decreasing process at the beginning.On the GRIN2B gene, both TFvelo and Multivelo detect the same dynamics, with expression increasing from the beginning to the terminal stage.These results suggest that by extracting features from multiple TFs, TFvelo can achieve a better fitting for the dynamic of individual gene than Multivelo, without using ATAC-seq.Perhaps the high sparse of ATAC-seq data 52   TFvelo can predict the cell fate on dataset without splicing information.
Next, TFvelo was applied to another single-cell RNA-seq dataset comprising 1,529 cells obtained from 88 human preimplantation embryos ranging from embryonic day 3 to 7 53 .Since the raw sequencing file is not provided on the online dataset server (see Data Availability Statement), those RNA velocity methods relying on the spliced/unspliced information are not feasible.By contrast, TFvelo still successfully identifies the pseudo-time and developmental trajectory of the cells, as depicted in Fig. 6a and Fig. 6b.Additionally, dynamic models successfully fit the expression patterns of target genes, which are illustrated in Fig. 6d as examples.In details, the dynamics predicted on PPP1R14A reveals a clockwise curve on TFs-target plot, and finds the changing point between the increasing and decreasing of expression level at day 5 postfertilization.Consistent with the TFvelo results, it has been reported that the expression of PPP1R14A can be detected with high dynamics that it increases at the beginning and declines throughout the development in the early stage of zebrafish embryos 54 .TFvelo finds that the expression of RGS2 is the highest at the beginning, and decreases with the pseudo-time, which can be supported by the early findings that RGS2 plays a critical role in early embryo development 55 .
Apart from scRNA-seq, In Situ Hybridization (ISH) techniques can also detect RNA abundance at single-cell resolution.However, the splicing information is not provided in the ISH-based datasets, which prevents the application of existing RNA velocity models requiring the splicing input.To show the capability of TFvelo on these data, we then perform TFvelo modelling on a dataset obtained by Multiplexed Error Robust Fluorescence In Situ Hybridization (MERFISH) 28,[56][57][58] , which is a spatial transcriptomics method based on ISH.On a cultured U-2 OS MERFISH dataset 28 , previous studies have identified distinct cell cycle states based on marker genes and GO term analysis 28,59 .They applied scVelo to this dataset, where nuclear and cytoplasmic counts mimic unspliced and spliced transcripts 28,60 .As shown in Fig. 6e, without using the nuclear and cytoplasmic counts, which is not always provided in single-cell datasets, TFvelo can also detect the cyclic dynamics from the pseudo-time on the MERFISH dataset.Fig. 6f shows genes that are best fitted by TFvelo.

Discussion
The recent development in RNA velocity approaches provide new insights for temporal analysis of single cell data.However, the existing methods often fails to fit the cell dynamics well in the unspliced/spliced space, which is partially because of the insufficient signal and high sparsity and noise.Motivated by the insight that the phase delay between TFs and target can also provide temporal information, we propose TFvelo to estimate the gene dynamics based on TFs abundance.Different from previous approaches which require the splicing information from the data, TFvelo models the dynamics of a target gene with the expression level of TFs and the target gene itself.In the computation framework, a generalized EM algorithm is adopted, so that the latenttime, the weight of TFs and those learnable parameters in the dynamic function can be updated alternately in each iteration.
Experiments on a synthetic dataset validate that TFvelo could detect the underlying dynamics from the data.Results on various scRNA-seq datasets, a MERFISH dataset and a 10x multi-omics dataset further demonstrate that TFvelo can fit the gene dynamics well, showing a desired clockwise curve on the phase portrait between the TFs representation and the target.Furthermore, TFvelo can be used to infer the pseudo time, cell trajectory and detect key TFs which play important role in the differentiation.
In addition, compared with the previous RNA velocity model relying on splicing information, TFvelo can achieve better performance on those scRNA-seq datasets, and be flexibly applied to more categories of datasets without splicing information.
In the future, more complex models may be explored to replace the current linear model in TFvelo for estimating the weight of each TFs.The employed sine function for describing expression dynamics could also be replaced by other functional forms flexibly.In addition, those recently proposed ideas for improving unspliced/splicedbased RNA velocity models can also be adopted in the further development of TFvelo, including stochastics modeling with Bayesian inference 14 , multi-omics integration 22 and universe time inference 12 .

Data preprocessing.
We utilize the data preparation procedure with the default setting in scVelo 9 , where the difference is that TFvelo is applied to the total mRNA counts, which is the sum of unspliced and spliced counts.Additionally, after filtering out those genes that can be detected from fewer than 2% of cells, and selecting the top 2,000 highly variable genes (HVGs), we normalize the expression profile of these HVGs by the total count in each cell.A nearest-neighbor graph (with 30 neighbors by default) was calculated based on Euclidean distances in principal component analysis space (with 30 principal components by default) on log-transformed spliced counts.After that, first-order and second-order moments (means and uncentered variances) are computed for each cell across its nearest neighbors.These steps are implemented with scv.pp.filter_and_normalize() and scv.pp.moments().
Additionally, to find potential transcriptional relationship within the selected 2,000 highly variable genes, we refer to the TF-target pairs annotated in ENCODE TF-target dataset 36 and ChEA TF-target dataset 61 .If the regulatory relationship between a TFtarget pair is labeled in at least one of these two datasets, the TF will be included in the involved TFs set when modeling the dynamic of the target gene.

The computational framework of TFvelo
Given a gene  , TFvelo assumes that the RNA velocity is determined by the expression level of involved TFs   and the target gene itself  , which is ℎ    ,   ;  .Using a top-down strategy 12 , TFvelo directly designs a profile function of target gene's expression level to relax the gene dynamics to more flexible profiles, that is    ;  . and  are two sets of gene-specific, timeinvariant parameters, which control the influence of TFs on the target gene, and the shape of target gene's phase portraits, respectively.Our findings and early studies 37 have shown that it is feasible to infer the regulation relationship according to the learned parameter in linear regression models based on expression data.Additionally, linear models always have high interpretability, and rarely suffer from over-fitting.As a result, where  ∈ 0,1 is the latent time assigned to each cell,  is the degradation rate.
Besides, The profile function of   can be chosen flexibly from those second-order differentiable functions 12 .For simplification,   is designed with a sine function in our implementation, which is high-order differentiable and suitable for capture the curves on phase portrait.In addition, different with those models which assume the gene expression should follow a pattern of initial improvement followed by a decrease (or part of the whole process), TFvelo is more flexible and can also model gene dynamics where expression initially decreases and later increases with the sine function.
;        , 5 where  ,  ,  are gene-specific parameters to be learned. is fixed as 2π, so that the   is unimodal within  ∈ 0,1 , which follows the common assumption shared by most current approaches.By employing the  function, TFvelo can model the dynamics of each gene without defining a switching time point between the of increasing and decreasing periods 8 9 14 18 .To optimize the parameters in TFvelo, a generalized EM 40,41 algorithm is adopted, as discussed in the following.

Optimization of generalized EM algorithm
To simplify the mathematical representation, here we denote the   ,   ,  ,  ,  ,  ,  and  as , , , , , ,  and , respectively.As a result, the equation ( 5) and ( 4) can be written as .7 From the dynamics model in equation ( 6) and (7) In standard EM algorithms, the E-step computes the expected value of likelihood function given the observed data and the current estimated parameters to update the latent variables, and the M-step consists of maximizing that expectation computed in E step by optimizing the parameters.While under some circumstances, there is an intractable problem in M step, which could be addressed by generalized EM algorithms.
Expectation conditional maximization is one form of generalized EM algorithms, which makes multiple constraint optimizations within each M step 40 .In detail, the parameters can be separated into more subsets, and the M step consists of multiple steps as well, each of which optimizes one subset of parameters with the reminder fixed 41 .
In TFvelo, all learnable parameters can be divided into three groups, which are cell- By iteratively running the three steps, parameters of TFvelo for each target gene can be optimized.Furthermore, multiple downstream analysis could be conducted by combining the learned dynamics of all target genes.

Initialization of the generalized EM algorithm.
Since

Inference of Pseudo-time
After constructing the dynamic model for each gene, TFvelo will infer a gene-agnostic pseudo-time for each cell based on the combination of RNA velocities of all genes.For this purpose, we first compute the cosine similarities between velocities and potential cell state transitions to get the transition matrix.Then the end points and root cells are obtained as stationary states of the velocity-inferred transition matrix and its transposed, respectively.After inferring the root cells' location on the embedding space, velocity pseudo-time, which measures the average number of steps it takes to reach a cell after start walking from root cells, can be computed.The strategy of calculating pseudo-time is the similar with that in scVelo 9 .The different is that in scVelo, the transition matrix is obtained based on the abundance and velocity of spliced mRNA, while in TFvelo that is based on the abundance and velocity of total mRNA.

Velocity streams on 2D embedding space
To illustrate cell transitions, we create stream plots in a 2D space using UMAP visualization as the default.To this end, we select genes whose loss is low and genespecific time aligns with the pseudotime to construct velocity streams.
For the single cell datasets, we first select genes with modeling error in the bottom 50%.
Since sine functions exhibit periodicity, we need to determine the order of cells instead of using latent time directly.We achieve this by analyzing the histogram of latent time.
If there are several consecutive blank bins in the cell distribution of latent time, we set the minimum value of the next non-blank bin as the initial state (normalized latent time = 0).Conversely, the maximum value of the last non-blank bin will be set as the final state (normalized latent time = 1).The normalized latent time of all other cells will be mapped within the range of 0 to 1. Subsequently, we select genes where the normalized latent time aligns with the pseudotime based on the Spearman correlation coefficient between them.
For the MERFISH dataset of human osteosarcoma (U-2 OS) cells, since our focus is on analyzing the cell cycle process, we choose genes that exhibit periodic dynamics.To achieve this, we analyze the histogram of latent time and only select genes with no nonblank bins, indicating that their dynamics cover the entire periodic curve on the phase portrait.We then perform clustering within the selected genes, choosing the largest gene clusters as the best-fitted genes, by which the velocity graph following velocity embedding stream are generated in the same strategy as that in scvelo.

Metrics for evaluating in phase portrait.
We propose three metrics for phase portrait fitting.
(1) The intra-class distance on phase portrait, which measures the normalized distance between cells within the same type on the phase portrait.This reflects whether cells within the same type gather on the phase portrait, which is the lower the better.On unspliced/spliced based methods, the intra-class distance is defined as  ∑ ∑ ∈ , where  ,  means the center of cell type k, std(.)means the standard variance and  means the index of a cell.For TFvelo, the distance can be defined by substituting  with  and  with .
(2) The inter-class distance on phase portrait, which measures the normalized distance between the distribution centers of different cell types on the phase portrait, which is defined as  ∑ .This reflects whether cells from different types are separated well on the phase portrait, which is the higher the better.
(3) The fitting error on phase portrait, which measures the normalized distance between each cell to the constructed model on the phase portrait.This reflects the fitting accuracy and calculated in the following way.On the phase portrait of each selected gene, calculate the normalized mean square error of the distance between each cell to the inferred model curve.For the TFvelo data,  ∑ , where  means the index of a cell ,  means the latent time modeled for cell  and N is the total number of cells.  and  refer to the location of cell on the phase portrait.  and   refers to the model given by TFvelo.Similarly, the error for un/spliced data-based methods is defined as  ∑ .

GO term and KEGG pathway enrichment analysis
To perform the GO term and KEGG pathway enrichment analysis, we utilized the "gseapy.enrichr()"function in Python package "gseapy" with using Fisher's exact test by default.For each dataset, the background gene list consists of all genes within the original dataset before preprocessing.

Weights Normalization and TFs selection
Considering that the abundance of different TFs could vary a lot, the quantitative comparison between the weights of TFs is conducted after normalizing them.For each TF on modeling a target, we multiply the mean expression level by the learned weights to get the average influence of the TF to the target gene.For instance, the average influence of TF  is  * ̅ .Then the normalized weights are defined as: . The TFs with high weights are select according to these normalized weights with default threshold of 0.5.

Figure 1 .
Figure 1.The overview of TFvelo.(a) The comparison among constructed dynamics on LITAF gene in Pancreas dataset by TFvelo and previous approaches.Cells are colored in the same way as those in Fig. 2a.(b) The workflow of TFvelo.(c-e) The phase portrait plot on WX-y space, the plot of WX with respect to pseudotime, and the plot of y with respect to pseudotime.The color bar indicating the RNA velocity is share by the three panels.For simplification,  ,  ,  and  , are written as W, X, y and , respectively.(f-h) The steps within each iteration of the generalized EM algorithm.Here the purple line represents the current prediction by the model.(f) Assignment of latent time to each cell (in green) to find the corresponding target point (shown in red), which is the nearest point to the cell on the purple line.(g) Optimization of the weights  to move all cells along the  axis, minimizing the mean loss over all cells.(h) Optimization of the parameters in the dynamical function, to move the target point on the curve closer to each cell.(i-k) Visualization and analysis of TFvelo's results, including phase portrait fitting pseudo-time inference, and cell trajectory prediction.(lo) Simulation on synthetic dataset.(l) An example of the generated synthetic dynamics.(m) Reconstructed dynamics by TFvelo of the sample shown in (l) with MSE loss.(n) The joint plot of ground truth velocity and inferred velocity with their spearman correlation.The red reference line refers to that the ground truth is equal to the inferred value.(o) The joint plot of ground truth weights and inferred weights with their spearman correlation.

Figure 2 .
Figure 2. Results of TFvelo on pancreas dataset.(a) Pseudo-time inferred by TFvelo in the UMAP-based embedding space.(b) The stream plot projected to UMAP.The Ngn3 low/high EP denotes the neurogenin3 low/high epithelial cells.(c) Two example genes for illustrating the dynamics fitting in phase portrait.The cells are colored in the same way as panel (b).(d-f) The quantitative comparisons in phase portrait fitting.(d) Intra class distance.(e) Inter class distance.(f) Fitting loss.(g-h) The quantitative comparisons in high-dimensional space.(g) Cross boundary direction correctness.(h) In cluster coherence.(i) Velocity confidence.For each box-plot, the down, central and up hinges correspond to the first quartile, median value and third quartile, respectively.

Figure 3 .
Figure 3. Biological analysis based on the results of TFvelo on pancreas dataset.(a) The KEGG pathway enrichment from the genes best fitted by TFvelo on pancreas dataset.(b) List of TFs having high weights on multiple target genes.The counts mean the number of target genes where the TF have a normalized weight whose absolute value is larger than 0.5.(c) The distribution of REST and HMGN3 on UMAP.(d) The weights distribution of TFs REST and HMGN3 on modeling in the genes that in the insulin secretion pathway.The down, central and up hinges correspond to the first quartile, median value and third quartile, respectively.The whiskers extend to 1.5× the interquartile range of the distribution from the hinge.(e) Phase analysis on two example genes.On each row, from the left to the right: the phase portrait fitting, the distribution of abundance on UMAP, the value of learned TFs representation (WX) and the target gene along with pseudotime, and the abundance of a TF and the target gene along with pseudotime.

Figure 4 .
Figure 4. Results of TFvelo on gastrulation erythroid dataset.(a) Pseudo-time inferenced by TFvelo projected into a UMAP-based embedding.(b) The stream plot obtained by TFvelo.(c) The sparsity for unspliced, spliced and total mRNA counts.The sparsity of a gene is defined as: Sparsity .(d) The stream plot obtained by baseline approaches.(e) The phase portrait fitting on genes with sparse unspliced counts.(f) The phase portrait fitting on TACC1 and HSP90AB1 with sparse unspliced counts.(g) The GO term biological process enrichment based on the best fitted genes.(h) The number of target genes where the absolute value of the weight larger than 0.5 for each TF.
could lead to mixing of cells from different types and become a challenge for phase portrait fitting.As shown in Fig.5c, TFvelo also provides 3D phase portrait plotting, by combining the spliced and unspliced counts with the learned representation of TFs.The TFvelo's 3D phase portrait can clearly show the correct dynamics.

Figure 5 .
Figure 5. Results of TFvelo on 10x multi-omics embryonic mouse brain dataset.(a) Pseudo-time and trajectory inferenced by Multivelo projected into a UMAP-based embedding.(b) Pseudo-time and trajectory inferenced by TFvelo projected into a UMAP-based embedding.(c) The comparison between Multivelo and TFvelo on three example genes, which are AHI1, NTRK2 and GRIN2B, respectively.For Multivelo plot, c means the chromatin accessibility in ATAC-seq.For TFvelo the 3D phase portrait is obtained by combining the learned representation of TFs (WX) with the spliced and unspliced abundance.

Figure 6 .
Figure 6.Results of TFvelo on dataset without splicing information.(a-d) Results on human preimplantation embryos dataset.(a) Pseudo-time inferenced by TFvelo projected into a UMAP-based embedding.(b) The streamplot of UMAP.(c) The heatmap which shows the dynamics of gene expression resolved along pseudo-time in the top 300 likelihood-ranked genes.(d) The constructed dynamic models on three example genes with high likelihoods, which are PPP1R14A, RGS2 and GSTP1, respectively.From the left to the right within each row, the three subfigures show phase specific latent time  ∈  , weight of each TF  ∈  and scalars [, , ,  ] respectively, where  and  represent the number of cells and the number of involved TFs.Generalized EM algorithm is adopted so that the three group of parameters will be updated alternately to minimize the loss function in each iteration: a) Assign  within [0,1) for each cell by grid search.For each cell, on the phase plot, the closest point located on the curve described by the dynamic model is defined as the target point.The step b) and c) aim to minimize the average distance between cells and their corresponding target points.b) Update  by linear regression with bounds on these weights to minimize the loss function.Trust region reflective algorithm 62 , which is an interior-point-like method is adopted to solve this linear least-squares problem with inequality constraint.In our implementation, the weights are bounded within (-20, 20) by default.c) Update [, , ,  ] with constraints of ∈ (0, ∞),  ∈ (−, ) and ∈ (0, ∞) to minimize the loss function.
show the capability of modeling RNA velocity based on gene regulatory relationship, we simply adopt a linear model as ℎ    ,   ;  , which can map the expression , we can derive that,   .Aiming at finding the phase trajectory ̂  that best fits the observations for all cells, we define the loss function as:Where ̂ , , , ,  represents ̂  ; , , , ,  .Finally, the optimization algorithm aims to minimize the negative log-likelihood function: EM algorithms could be trapped in local minimum, parameters are initialized in the following steps.Firstly, those cells in which expression of target gene is 0 are  can be initialized by  | ⟶ .At the points of maximal and minimal  value, we have    and y α  .As a result,  and  can be initialized by  and  , respectively.These assumptions are only used for initialization and in the following optimization process, all parameters will be updated iteratively.The hyper-parameter  is set as 2 and parameter  isintialized with various values, so that the generalized EM algorithm will run at different start points.After that, the model with the lowest loss is finally selected.
filtered out.Then ,  are normalized by  /  ，  /  , where  • represents the standard deviation function.The weights  is initialized with the value of spearman correlation of each TF-target pair.For parameters ,  and  , according to the steady-state assumption that  | 0,