Introduction

Multiple outcomes are often encountered in fields such as the social sciences, economics, and biomedicine, where the goal is to characterize the effect of a covariate or to investigate the association between several outcomes and the variables of interest. In many cases these outcomes are of mixed types, in the sense that some are continuous and others are ordinal. For example, in a mental health study1 in Florida, USA, the collected variables consist of mental impairment, a life events index, and socioeconomic status (SES), which are of different data types. Mental impairment is ordinal with 4 categories (1 = well, 2 = mild symptom, 3 = moderate symptom formation, 4 = impaired). The life events index is a composite measure of the number and severity of important life events that occurred to the subject within the past 3 years, such as the birth of a child, a new job, a divorce, or a death in the family; it can be treated as a continuous response. SES is binary and can be treated as a covariate. Investigators want to judge whether mental impairment or the life events index is associated with SES.

Various approaches have been developed in the literature to model mixed outcomes. A direct procedure is to ignore the correlations among the outcomes and fit each outcome separately with a model that best suits its type. Dale2 proposed global cross-ratio models as a measure of association for bivariate, discrete and ordinal responses. To test jointly the association between a covariate and a bivariate trait consisting of a continuous outcome and a binary (0/1) outcome, a widely used class of approaches is latent variable models. Catalano and Ryan3 proposed a bivariate latent variable model for clustered discrete and continuous outcomes, with the joint distribution being the product of a standard random effects model for the continuous variable and a probit model for the discrete variable. Fitzmaurice and Laird4 proposed a model for a correlated binary outcome and a continuous outcome based on the factorization of the joint distribution of the outcomes. Sammel et al.5 presented a latent variable model by assuming that the observed outcomes are physical manifestations of a latent variable. Gueorguieva et al.6 proposed a correlated probit model to model clustered binary and continuous responses jointly. Teixeira-Pinto et al.7 provided a new joint model for a binary outcome and a continuous outcome. In the field of genome-wide association analysis, Liu et al.8 developed an extended generalized estimating equation method for bivariate association analyses of continuous and binary traits. Their simulation results demonstrated that, compared with univariate analysis, bivariate analysis could substantially improve power while having comparable type I error rates under certain situations. Yuan et al.9 extended the joint linkage analysis of multivariate qualitative and quantitative traits described by Williams et al.10, 11 to association analysis. Yuan et al.9 also assumed that the two latent variables specified for the qualitative and quantitative traits follow a bivariate normal distribution. With such modeling, likelihood-based inference procedures are introduced to test for pleiotropic genetic effects.

For a single ordinal outcome (taking values 1, 2, …), the proportional odds model12 is usually adopted. The proportional odds model, in which the outcome belongs to a set of ordered categories, can be regarded as an extension of the logistic regression model for binary outcomes: it can be expressed as a series of logistic regression models for dependent binary variables with common regression parameters reflecting the proportional odds assumption13. When the outcome is continuous, the ordinary linear model is most often used. To test jointly the association between some covariates and a bivariate trait consisting of a continuous outcome and an ordinal outcome, we also apply a latent variable model and propose a statistical approach that models the ordinal outcome and the continuous outcome simultaneously. We begin with two ordinary linear models, \({Y}_{1}={\alpha }_{0}+{G}^{\tau }\alpha +{\varepsilon }_{1}\) and \(Z={\beta }_{0}+{G}^{\tau }\beta +{\varepsilon }_{2}\), where \({Y}_{1}\) is a continuous response, \(Z\) is a latent continuous variable, \(G={({G}^{(1)},\cdots ,{G}^{(p)})}^{\tau }\) is a p-dimensional covariate, \({\alpha }_{0}\) and \({\beta }_{0}\) are the intercept parameters, \(\beta ={({\beta }_{1},\cdots ,{\beta }_{p})}^{\tau }\) and \(\alpha ={({\alpha }_{1},\cdots ,{\alpha }_{p})}^{\tau }\) are regression parameters, and \({({\varepsilon }_{1},{\varepsilon }_{2})}^{\tau }\) follows a bivariate normal distribution with mean vector \({(0,0)}^{\tau }\) and a symmetric covariance matrix with diagonal elements \({\sigma }_{1}^{2}\) and \({\sigma }_{2}^{2}\) and off-diagonal elements both equal to \(\rho {\sigma }_{1}{\sigma }_{2}\). The standard deviations \({\sigma }_{1}\), \({\sigma }_{2}\) and the correlation coefficient \(\rho \) are unknown. In addition, suppose that \(Z\) is a latent variable underlying an observed ordinal response \({Y}_{2}\), which is observed in the following manner: \({Y}_{2}=1\) if \(-\infty  < Z\le {\gamma }_{1}\); \({Y}_{2}=2\) if \({\gamma }_{1} < Z\le {\gamma }_{2}\); \(\cdots \); \({Y}_{2}=k\) if \({\gamma }_{k-1} < Z\le {\gamma }_{k}\); \({Y}_{2}=k+1\) if \({\gamma }_{k} < Z < \infty \), where \(k+1\) is the number of categories of \({Y}_{2}\) and \({\gamma }_{1} < {\gamma }_{2} < \cdots  < {\gamma }_{k}\) are k ordered cutpoints. Note that the latent variable \(Z\) itself is unobserved.
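The data-generating mechanism just described can be illustrated with a short simulation. The following is a minimal sketch in Python, assuming a single covariate (p = 1) drawn from a standard normal distribution and unit error variances; the function name `simulate_mixed_outcomes` and the parameter values are hypothetical choices for illustration only.

```python
import math
import random


def simulate_mixed_outcomes(n, alpha0, alpha1, beta0, beta1, rho, cutpoints, seed=0):
    """Draw (G, Y1, Y2) from the latent-variable model.

    Assumptions of this sketch: p = 1, G ~ N(0, 1), sigma1 = sigma2 = 1.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        g = rng.gauss(0.0, 1.0)
        # Correlated standard normal errors with corr(e1, e2) = rho.
        e1 = rng.gauss(0.0, 1.0)
        e2 = rho * e1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        y1 = alpha0 + alpha1 * g + e1   # observed continuous response
        z = beta0 + beta1 * g + e2      # latent continuous variable
        # Y2 = j when gamma_{j-1} < Z <= gamma_j: count the cutpoints below Z.
        y2 = 1 + sum(z > c for c in cutpoints)
        data.append((g, y1, y2))
    return data


# Three ordinal categories (k = 2) with illustrative cutpoints 0 and 1.
sample = simulate_mixed_outcomes(500, 0.2, 0.05, 0.2, 0.15, 0.7, [0.0, 1.0])
```

Only `y1` and `y2` would be available to the analyst; `z` is discarded inside the function to mirror the fact that the latent variable is never observed.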

Suppose n independent observations \(({Y}_{1i},{Y}_{2i},{G}_{i}^{\tau })\) are available for (Y 1, Y 2, G τ), where G i  = (g 1i , \(\cdots \), g pi )τ, i = 1, \(\cdots \), n. Based on the foregoing assumption, (Y 1i , Z i )τ are independently distributed as a bivariate normal distribution with mean vector \({({\alpha }_{0}+{G}_{i}^{\tau }\alpha ,{\beta }_{0}+{G}_{i}^{\tau }\beta )}^{\tau }\). The conditional distribution of Z i given Y 1i is normal with mean \(({\beta }_{0}+{G}_{i}^{\tau }\beta )+({Y}_{1i}-{G}_{i}^{\tau }\alpha -{\alpha }_{0})\rho \frac{{\sigma }_{2}}{{\sigma }_{1}}\) and variance \({\sigma }_{2}^{2}\mathrm{(1}-{\rho }^{2})\). By some algebra, for j = 1, \(\cdots \), k, and i = 1, …, n, it can be obtained that

$$\begin{array}{rcl}{\rm{\Pr }}({Z}_{i}\le {\gamma }_{j}|{Y}_{1i}) & = & Pr\{\frac{{Z}_{i}-[({\beta }_{0}+{G}_{i}^{\tau }\beta )+({Y}_{1i}-{\alpha }_{0}-{G}_{i}^{\tau }\alpha )\rho \frac{{\sigma }_{2}}{{\sigma }_{1}}]}{\sqrt{{\sigma }_{2}^{2}\mathrm{(1}-{\rho }^{2})}}\le \frac{{\gamma }_{j}-[({\beta }_{0}+{G}_{i}^{\tau }\beta )+({Y}_{1i}-{\alpha }_{0}-{G}_{i}^{\tau }\alpha )\rho \frac{{\sigma }_{2}}{{\sigma }_{1}}]}{\sqrt{{\sigma }_{2}^{2}\mathrm{(1}-{\rho }^{2})}}|{Y}_{1i}\}\\ & = & {\rm{\Phi }}\{\frac{{\gamma }_{j}-[({\beta }_{0}+{G}_{i}^{\tau }\beta )+({Y}_{1i}-{\alpha }_{0}-{G}_{i}^{\tau }\alpha )\rho \frac{{\sigma }_{2}}{{\sigma }_{1}}]}{\sqrt{{\sigma }_{2}^{2}\mathrm{(1}-{\rho }^{2})}}\}\\ & = & {\rm{\Phi }}[\frac{{\gamma }_{j}}{{\sigma }_{2}\sqrt{1-{\rho }^{2}}}-(\frac{{\beta }_{0}}{{\sigma }_{2}\sqrt{1-{\rho }^{2}}}-\frac{{\alpha }_{0}\rho }{{\sigma }_{1}\sqrt{1-{\rho }^{2}}})-{G}_{i}^{\tau }(\frac{\beta }{{\sigma }_{2}\sqrt{1-{\rho }^{2}}}-\frac{\alpha \rho }{{\sigma }_{1}\sqrt{1-{\rho }^{2}}})-{Y}_{1i}\frac{\rho }{{\sigma }_{1}\sqrt{1-{\rho }^{2}}}],\end{array}$$
(1)

where Φ(·) is the standard normal distribution function.

Denote \({\beta }_{0}^{\ast }=\frac{{\beta }_{0}}{{\sigma }_{2}\sqrt{1-{\rho }^{2}}}\), \({\beta }^{\ast }=\frac{\beta }{{\sigma }_{2}\sqrt{1-{\rho }^{2}}}\), \({\gamma }_{j}^{\ast }=\frac{{\gamma }_{j}}{{\sigma }_{2}\sqrt{1-{\rho }^{2}}}\), and \({\theta }^{\ast }=\frac{\rho }{{\sigma }_{1}\sqrt{1-{\rho }^{2}}}\). We have

$${{\rm{\Phi }}}^{-1}[Pr({Z}_{i}\le {\gamma }_{j}|{Y}_{1i})]={\gamma }_{j}^{\ast }-{\beta }_{0}^{\ast }-{G}_{i}^{\tau }{\beta }^{\ast }-({Y}_{1i}-{\alpha }_{0}-{G}_{i}^{\tau }\alpha ){\theta }^{\ast },$$
(2)

for j = 1, \(\cdots \), k, and i = 1, …, n.
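As a numerical sanity check on the reparameterization, the sketch below evaluates the conditional probability in two ways for a scalar covariate: directly from the conditional normal distribution of the latent variable, and through the starred parameters of (2). The function names and the numerical values are hypothetical and serve only to illustrate that the two forms agree.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def cond_prob_direct(gamma_j, y1, g, alpha0, alpha, beta0, beta, sigma1, sigma2, rho):
    """Pr(Z <= gamma_j | Y1 = y1) from the conditional normal law of Z given Y1."""
    mean = beta0 + g * beta + (y1 - alpha0 - g * alpha) * rho * sigma2 / sigma1
    sd = sigma2 * math.sqrt(1.0 - rho ** 2)
    return STD_NORMAL.cdf((gamma_j - mean) / sd)


def cond_prob_reparam(gamma_j, y1, g, alpha0, alpha, beta0, beta, sigma1, sigma2, rho):
    """The same probability written with the starred parameters of Eq. (2)."""
    scale = sigma2 * math.sqrt(1.0 - rho ** 2)
    gamma_star = gamma_j / scale
    beta0_star = beta0 / scale
    beta_star = beta / scale
    theta_star = rho / (sigma1 * math.sqrt(1.0 - rho ** 2))
    eta = gamma_star - beta0_star - g * beta_star - (y1 - alpha0 - g * alpha) * theta_star
    return STD_NORMAL.cdf(eta)


# Illustrative values only.
args = dict(gamma_j=0.6, y1=1.3, g=-0.4, alpha0=0.2, alpha=0.05,
            beta0=0.2, beta=0.15, sigma1=1.5, sigma2=0.8, rho=0.7)
p_direct = cond_prob_direct(**args)
p_reparam = cond_prob_reparam(**args)
```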

In this paper, we propose a joint test for the association between the covariates of interest and a bivariate outcome consisting of an ordinal response and a continuous response. We derive the asymptotic properties of the estimators of the covariate parameters in the joint model. Extensive simulations are conducted to compare the performance of the proposed method with that of the existing combined p-values method. An application to the aforementioned mental impairment study in Florida further demonstrates the good performance of the new method.

Results

Joint Model for a Bivariate Outcome with a Continuous Response and an Ordinal Response and a Covariate

For \(i=1,\cdots ,n\), \({Y}_{1i}-{\alpha }_{0}-{G}_{i}^{\tau }\alpha \) is the error term of the linear model for \({Y}_{1}\) and has expectation 0, so \({\beta }^{\ast }\) in (2) can be taken as a measure of the association between the ordinal response \({Y}_{2}\) and the covariates G after adjusting for the effect of the continuous response \({Y}_{1}\). However, \({Y}_{1i}-{\alpha }_{0}-{G}_{i}^{\tau }\alpha \) is unobserved for \(i=1,\cdots ,n\), because \({\alpha }_{0}\) and \(\alpha \) are unknown. A more common procedure is therefore to use, instead of (2),

$${{\rm{\Phi }}}^{-1}[{\rm{\Pr }}({Z}_{i}\le {\gamma }_{j}|{Y}_{1i})]={\gamma }_{j}^{\ast }-{\beta }_{0}^{\ast \ast }-{G}_{i}^{\tau }{\beta }^{\ast \ast }-{Y}_{1i}{\theta }^{\ast },$$
(3)

where \({\beta }^{\ast \ast }={\beta }^{\ast }-{\alpha }^{\ast }\), \({\beta }_{0}^{\ast \ast }={\beta }_{0}^{\ast }-{\alpha }_{0}^{\ast }\), \({\alpha }^{\ast }=\alpha {\theta }^{\ast }\), and \({\alpha }_{0}^{\ast }={\alpha }_{0}{\theta }^{\ast }\). Model (3) links the ordinal response \({Y}_{2}\) with G and \({Y}_{1}\) in a manner similar to the proportional odds model. The main difference is that (3) uses the probit link, that is, the inverse of the standard normal distribution function Φ, whereas the proportional odds model uses the logit link.

With α = α */θ *, the joint model of continuous variable Y 1 and ordinal variable Y 2 can be constructed as follows:

as j = 1,

$$\{\begin{array}{rcl}{\rm{\Pr }}({Y}_{1} < {y}_{1}) & = & {\int }_{-\infty }^{{y}_{1}}\frac{1}{\sqrt{2\pi {\sigma }_{1}^{2}}}\exp [\frac{-{(v-\frac{{\alpha }_{0}^{\ast }}{{\theta }^{\ast }}-{G}^{\tau }\frac{{\alpha }^{\ast }}{{\theta }^{\ast }})}^{2}}{2{\sigma }_{1}^{2}}]dv\\ {\rm{\Pr }}({Y}_{2}=j|{Y}_{1}) & = & {\rm{\Phi }}({\gamma }_{1}^{\ast }-{\beta }_{0}^{\ast \ast }-{G}^{\tau }{\beta }^{\ast \ast }-{Y}_{1}{\theta }^{\ast })\end{array};$$
(4)

as j = 2, \(\cdots \), k,

$$\{\begin{array}{rcl}{\rm{\Pr }}({Y}_{1} < {y}_{1}) & = & {\int }_{-\infty }^{{y}_{1}}\frac{1}{\sqrt{2\pi {\sigma }_{1}^{2}}}\exp [\frac{-{(v-\frac{{\alpha }_{0}^{\ast }}{{\theta }^{\ast }}-{G}^{\tau }\frac{{\alpha }^{\ast }}{{\theta }^{\ast }})}^{2}}{2{\sigma }_{1}^{2}}]dv\\ {\rm{\Pr }}({Y}_{2}=j|{Y}_{1}) & = & {\rm{\Phi }}({\gamma }_{j}^{\ast }-{\beta }_{0}^{\ast \ast }-{G}^{\tau }{\beta }^{\ast \ast }-{Y}_{1}{\theta }^{\ast })-{\rm{\Phi }}({\gamma }_{j-1}^{\ast }-{\beta }_{0}^{\ast \ast }-{G}^{\tau }{\beta }^{\ast \ast }-{Y}_{1}{\theta }^{\ast })\end{array};$$
(5)

as j = k + 1,

$$\{\begin{array}{rcl}{\rm{\Pr }}({Y}_{1} < {y}_{1}) & = & {\int }_{-\infty }^{{y}_{1}}\frac{1}{\sqrt{2\pi {\sigma }_{1}^{2}}}\exp [\frac{-{(v-\frac{{\alpha }_{0}^{\ast }}{{\theta }^{\ast }}-{G}^{\tau }\frac{{\alpha }^{\ast }}{{\theta }^{\ast }})}^{2}}{2{\sigma }_{1}^{2}}]dv\\ {\rm{\Pr }}({Y}_{2}=j|{Y}_{1}) & = & 1-{\rm{\Phi }}({\gamma }_{k}^{\ast }-{\beta }_{0}^{\ast \ast }-{G}^{\tau }{\beta }^{\ast \ast }-{Y}_{1}{\theta }^{\ast })\end{array}.$$
(6)
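The three cases (4) to (6) can be evaluated together as differences of consecutive cumulative probit probabilities. The following is a minimal sketch for a scalar covariate; the function name `category_probs` and the parameter values are hypothetical.

```python
from statistics import NormalDist


def category_probs(y1, g, gammas_star, beta0_ss, beta_ss, theta_star):
    """Return [Pr(Y2 = j | Y1 = y1) for j = 1, ..., k + 1] for a scalar covariate g.

    gammas_star holds the k ordered cutpoints gamma*_1 < ... < gamma*_k.
    """
    cdf = NormalDist().cdf
    mu = beta0_ss + g * beta_ss + y1 * theta_star
    # Cumulative probabilities Pr(Y2 <= j | Y1); the last one equals 1.
    cumulative = [cdf(c - mu) for c in gammas_star] + [1.0]
    probs = [cumulative[0]]
    for j in range(1, len(cumulative)):
        probs.append(cumulative[j] - cumulative[j - 1])
    return probs


# Four categories (k = 3) with illustrative parameter values.
probs = category_probs(y1=0.5, g=1.0, gammas_star=[-0.6, 0.0, 0.6],
                       beta0_ss=0.1, beta_ss=0.2, theta_star=0.3)
```

Because the cutpoints are ordered, every entry is positive and the entries sum to one, which is a quick check that the three cases are mutually consistent.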

Testing Association between a Bivariate Outcome and a Covariate

The maximum likelihood estimates (MLEs) of the parameters \({\alpha }_{0}^{\ast }\), \({\beta }_{0}^{\ast \ast }\), \({\alpha }^{\ast }\), \({\beta }^{\ast \ast }\), \({\theta }^{\ast }\), \({\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast }\), and \({\sigma }_{1}^{2}\) are obtained by maximizing the log-likelihood function, which is implemented by solving the score equations given in the Methods section. Denote \(a={({\alpha }_{0}^{\ast },{\beta }_{0}^{\ast \ast },{\alpha }^{\ast },{\beta }^{\ast \ast },{\theta }^{\ast },{\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast },{\sigma }_{1}^{2})}^{\tau }\) and denote the corresponding MLE by \(\hat{a}={(\widehat{{\alpha }_{0}^{\ast }},\widehat{{\beta }_{0}^{\ast \ast }},\widehat{{\alpha }^{\ast }},\widehat{{\beta }^{\ast \ast }},\widehat{{\theta }^{\ast }},\widehat{{\gamma }_{1}^{\ast }},\cdots ,\widehat{{\gamma }_{k}^{\ast }},\widehat{{\sigma }_{1}^{2}})}^{\tau }\). By standard asymptotic theory14, under the null hypothesis that neither the continuous response \({Y}_{1}\) nor the ordinal response \({Y}_{2}\) is associated with the covariates G, \(\sqrt{n}(\hat{a}-a)\) asymptotically follows a normal distribution with mean vector \({0}_{(2p+k+4)\times 1}\) and covariance matrix \({I}^{-1}(a)\), where \(I(a)\) is the Fisher information matrix (see the Methods section). As mentioned earlier, \({\beta }^{\ast }\) in (2) measures the association between the ordinal response \({Y}_{2}\) and G after adjusting for the effect of the continuous response \({Y}_{1}\). Denote \(\widehat{{\beta }^{\ast }}=\widehat{{\beta }^{\ast \ast }}+\widehat{{\alpha }^{\ast }}\), let V be the submatrix formed by the 3rd to the (2p + 2)th rows and columns of \({I}^{-1}(a)\), and let \(B=({I}_{p},{I}_{p})\), where \({I}_{p}\) is the p × p identity matrix. Then \(\widehat{{\beta }^{\ast }}\) is an asymptotically unbiased estimator of \({\beta }^{\ast }\), and by multivariate normal distribution theory15 it asymptotically follows a normal distribution with mean vector \({\beta }^{\ast }={\beta }^{\ast \ast }+{\alpha }^{\ast }\) and covariance matrix \(BV{B}^{\tau }\) as the sample size n goes to infinity.

The goal of this paper is to test whether the bivariate outcome of mixed types (\({Y}_{1}\) and \({Y}_{2}\)) is associated with the covariates G of interest. The null hypothesis is that neither the continuous response \({Y}_{1}\) nor the ordinal response \({Y}_{2}\) is associated with G. Denote \(\widehat{\eta }={(\widehat{{\alpha }_{1}^{\ast }},\cdots ,\widehat{{\alpha }_{p}^{\ast }},\widehat{{\beta }_{1}^{\ast }},\cdots ,\widehat{{\beta }_{p}^{\ast }})}^{\tau }\) and \(W={\rm{var}}(\widehat{\eta })\). We obtain \(W=HV{H}^{\tau }\), where \(H={\left(\begin{array}{cc}{I}_{p} & {0}_{p}\\ {I}_{p} & {I}_{p}\end{array}\right)}_{2p\times 2p}\) and \({0}_{p}\) is the p × p zero matrix. The detailed derivation of W is given in the Methods section. With this, we propose a joint test denoted by

$${\rm{JT}}={(\widehat{\eta })}^{\tau }{(\widehat{W})}^{-1}\widehat{\eta },$$
(7)

which asymptotically follows a chi-squared distribution with 2p degrees of freedom under the null hypothesis, where \(\widehat{W}=H\widehat{V}{H}^{\tau }\) is a consistent estimate of the covariance matrix W, and \(\widehat{V}\) is a consistent estimate of the covariance matrix V, formed by the 3rd to the (2p + 2)th rows and columns of \({I}^{-1}(\hat{a})\).
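For a single covariate (p = 1) the statistic reduces to a 2 × 2 quadratic form whose chi-squared reference distribution has 2 degrees of freedom, so its survival function is simply exp(−JT/2). The sketch below illustrates this special case; it assumes that `v_hat` is already the estimated covariance matrix of the two estimators (any scaling by the sample size is taken to be applied beforehand), and the function name and numerical inputs are hypothetical.

```python
import math


def joint_test_p1(alpha_star_hat, beta_ss_hat, v_hat):
    """Return (JT, p-value) for p = 1.

    v_hat: 2x2 estimated covariance matrix of (alpha*_hat, beta**_hat),
    given as nested sequences [[v11, v12], [v21, v22]].
    """
    (v11, v12), (v21, v22) = v_hat
    # eta_hat = H (alpha*_hat, beta**_hat)^T with H = [[1, 0], [1, 1]],
    # so the second component is beta*_hat = alpha*_hat + beta**_hat.
    eta = (alpha_star_hat, alpha_star_hat + beta_ss_hat)
    # W = H V H^T written out entry by entry for the 2x2 case.
    w11 = v11
    w12 = v11 + v12
    w21 = v11 + v21
    w22 = v11 + v12 + v21 + v22
    det = w11 * w22 - w12 * w21
    # Quadratic form eta^T W^{-1} eta using the closed-form 2x2 inverse.
    jt = (w22 * eta[0] ** 2 - (w12 + w21) * eta[0] * eta[1] + w11 * eta[1] ** 2) / det
    # Survival function of the chi-squared distribution with 2 df.
    p_value = math.exp(-jt / 2.0)
    return jt, p_value


# Illustrative inputs only; they are not estimates from any data set.
jt, p_value = joint_test_p1(0.3, 0.5, [[0.04, 0.01], [0.01, 0.09]])
```

Since H is invertible, the statistic equals the quadratic form of the untransformed estimates in the inverse of `v_hat`, which gives an independent way to check the arithmetic.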

Simulation Results

In this section, we explore the performance of JT by comparing it with a combined p-values method16 (denoted by CP). In our setting, CP is implemented as follows: first, we fit an ordinary linear model regressing \({Y}_{1}\) on G and calculate the p-value, denoted by \(p{v}_{1}\), for testing the association between \({Y}_{1}\) and G; second, we fit a proportional odds model regressing the ordinal response \({Y}_{2}\) on G and denote the resulting p-value by \(p{v}_{2}\); finally, we use \({\rm{CP}}=-2\,\mathrm{log}(p{v}_{1})-2\,\mathrm{log}(p{v}_{2})\) as the test statistic. The p-value of CP is calculated by a permutation method with 200 iterations. We set p = 1 and n = 500. We generate a 1-dimensional covariate G of size n from the standard normal distribution and keep it fixed across simulation replicates. The correlated error terms are sampled from a bivariate normal distribution with mean vector \({(0,0)}^{\tau }\) and a covariance matrix whose diagonal elements both equal 1 and whose off-diagonal element equals \(\rho \). We generate the continuous variable \({Y}_{1}\) and the latent continuous variable Z from the two linear models in the Introduction with parameters \({\alpha }_{0}\), \({\beta }_{0}\), \({\alpha }_{1}\) and \({\beta }_{1}\). We consider 2 scenarios: 3 ordinal categories (k = 2) and 4 ordinal categories (k = 3). Tables 1 and 2 show the empirical type I error rates for k = 2 and k = 3, respectively. Tables 3 and 4 show the empirical powers for k = 2 and k = 3, respectively.
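The CP procedure can be sketched as follows. This is an illustrative outline only: `pvalue_fn` is a hypothetical user-supplied callable that returns the two marginal p-values (from the linear model and the proportional odds model), permuting the covariate is one common way to generate the null distribution, and the "+1" correction in the permutation p-value is a convention assumed here rather than a detail stated in the text.

```python
import math
import random


def cp_statistic(pv1, pv2):
    """Combined statistic CP = -2 log(pv1) - 2 log(pv2)."""
    return -2.0 * math.log(pv1) - 2.0 * math.log(pv2)


def permutation_p_value(g, y1, y2, pvalue_fn, n_perm=200, seed=0):
    """Permutation p-value of CP.

    pvalue_fn(g, y1, y2) -> (pv1, pv2) is a hypothetical callable that fits
    the two marginal models and returns their p-values.
    """
    rng = random.Random(seed)
    observed = cp_statistic(*pvalue_fn(g, y1, y2))
    g_perm = list(g)
    exceed = 0
    for _ in range(n_perm):
        # Shuffling G breaks any association with (Y1, Y2) while keeping
        # the dependence between Y1 and Y2 intact.
        rng.shuffle(g_perm)
        if cp_statistic(*pvalue_fn(g_perm, y1, y2)) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


# Illustration of the statistic alone, with made-up p-values.
cp_example = cp_statistic(0.05, 0.05)
```

A permutation reference distribution is needed because the two marginal p-values are computed on correlated outcomes, so they cannot be treated as independent.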

Table 1 Empirical type-1 error rates for tests JT and CP with 3 ordinal categories(k = 2).
Table 2 Empirical type-1 error rates for tests JT and CP with 4 ordinal categories(k = 3).
Table 3 Empirical powers for tests JT and CP with 3 ordinal categories(k = 2).
Table 4 Empirical powers for tests JT and CP with 4 ordinal categories(k = 3).

For the first scenario with k = 2, the ordinal outcome \({Y}_{2}\) is obtained by categorizing Z into 1, 2, 3 with the thresholds \({\gamma }_{1}=0\) and \({\gamma }_{2}=1\). The nominal significance level is set to 0.05 for calculating the empirical type I error rates and powers. All results are based on 500 replicates. From Table 1, both JT and CP control the type I error rate, since the empirical values are close to the nominal significance level of 0.05. For example, when \({\alpha }_{0}=0.2\), \({\beta }_{0}=0.2\), and \(\rho =0.3\), the empirical type I error rates of JT and CP are 0.047 and 0.052, respectively; when \({\alpha }_{0}=1.2\), \({\beta }_{0}=1.2\), and \(\rho =0.9\), they are 0.048 and 0.053, respectively. From Table 3, the performance of CP is unsatisfactory when the correlation coefficient \(\rho \) is large, whereas the proposed test JT has higher power than CP when \({\alpha }_{1}\) is much smaller than \({\beta }_{1}\) and the correlation between \({Y}_{1}\) and \({Y}_{2}\) is not weak. For example, when \({\alpha }_{0}=0.2\), \({\beta }_{0}=0.2\), \({\alpha }_{1}=0.05\), \({\beta }_{1}=0.15\), and \(\rho =0.9\), the powers of JT and CP are 0.922 and 0.57, respectively. When \({\alpha }_{0}=1.2\), \({\beta }_{0}=1.2\), \({\alpha }_{1}=0.05\), \({\beta }_{1}=0.15\), and \(\rho =0.8\), the powers of JT and CP are 0.876 and 0.674, respectively. When \({\alpha }_{0}=0.2\), \({\beta }_{0}=0.2\), \({\alpha }_{1}=0.05\), \({\beta }_{1}=0.15\), and \(\rho =0.7\), the powers of JT and CP are 0.756 and 0.582, respectively. When \({\alpha }_{0}=1.2\), \({\beta }_{0}=1.2\), \({\alpha }_{1}=0.05\), \({\beta }_{1}=0.15\), and \(\rho =0.5\), the powers of JT and CP are 0.716 and 0.658, respectively. When \({\alpha }_{1}\) is larger than or equal to \({\beta }_{1}\), the powers of JT and CP are nearly the same. For example, when \({\alpha }_{0}=0.2\), \({\beta }_{0}=0.2\), \({\alpha }_{1}=0.15\), \({\beta }_{1}=0\), and \(\rho =0.3\), the powers of JT and CP are 0.8 and 0.792, respectively, and when \({\alpha }_{0}=1.2\), \({\beta }_{0}=1.2\), \({\alpha }_{1}=0.15\), \({\beta }_{1}=0.15\), and \(\rho =0.9\), they are 0.832 and 0.866, respectively.

For the second scenario with k = 3, the ordinal outcome \({Y}_{2}\) is obtained by categorizing Z into 1, 2, 3, 4 with the thresholds \({\gamma }_{1}=-0.6\), \({\gamma }_{2}=0\) and \({\gamma }_{3}=0.6\). The nominal significance level is again set to 0.05, and all results are based on 500 replicates. From Table 2, both JT and CP control the type I error rate, since the empirical values are close to the nominal significance level of 0.05. For example, when \({\alpha }_{0}=0.2\), \({\beta }_{0}=0.2\), and \(\rho =0.4\), the empirical type I error rates of JT and CP are 0.05 and 0.046, respectively; when \({\alpha }_{0}=1.2\), \({\beta }_{0}=1.2\), and \(\rho =0.8\), they are 0.046 and 0.049, respectively. From Table 4, the performance of CP is unsatisfactory when the correlation coefficient \(\rho \) is large, whereas the proposed test JT has higher power than CP when \({\alpha }_{1}\) is much smaller than \({\beta }_{1}\) and the correlation between \({Y}_{1}\) and \({Y}_{2}\) is not weak. For example, when \({\alpha }_{0}=0.2\), \({\beta }_{0}=0.2\), \({\alpha }_{1}=0\), \({\beta }_{1}=0.15\), and \(\rho =0.9\), the powers of JT and CP are 0.996 and 0.636, respectively. When \({\alpha }_{0}=1.2\), \({\beta }_{0}=1.2\), \({\alpha }_{1}=0.05\), \({\beta }_{1}=0.15\), and \(\rho =0.8\), the powers of JT and CP are 0.774 and 0.608, respectively. When \({\alpha }_{0}=0.2\), \({\beta }_{0}=0.2\), \({\alpha }_{1}=0\), \({\beta }_{1}=0.15\), and \(\rho =0.7\), the powers of JT and CP are 0.968 and 0.692, respectively. When \({\alpha }_{0}=1.2\), \({\beta }_{0}=1.2\), \({\alpha }_{1}=0\), \({\beta }_{1}=0.15\), and \(\rho =0.5\), the powers of JT and CP are 0.744 and 0.586, respectively. When \({\alpha }_{1}\) is larger than or equal to \({\beta }_{1}\), the powers of JT and CP are nearly the same. For example, when \({\alpha }_{0}=0.2\), \({\beta }_{0}=0.2\), \({\alpha }_{1}=0.15\), \({\beta }_{1}=0\), and \(\rho =0.3\), the powers of JT and CP are 0.872 and 0.856, respectively, and when \({\alpha }_{0}=1.2\), \({\beta }_{0}=1.2\), \({\alpha }_{1}=0.15\), \({\beta }_{1}=0.15\), and \(\rho =0.9\), they are 0.87 and 0.916, respectively.

Real Data Analysis

To further compare JT and CP in testing the association between multiple outcomes of mixed types and covariates of interest, we apply them to the mental health study1 in Florida, USA; the data can be downloaded from www.stat.ufl.edu/aa/glm/data. Mental impairment is the ordinal response, the life events index (a composite measure of the number and severity of important life events) is the continuous response, and socioeconomic status (SES) is a 1-dimensional covariate. There are 40 observations in total, which merely reflect patterns found in the much larger sample of the study mentioned in the Introduction. Mental impairment has 4 categories, coded 1, 2, 3 and 4, so k = 3. Our aim is to test jointly whether mental impairment or the life events index is associated with SES. The p-values of JT and CP for this test, computed in R, are 0.042 and 0.46, respectively, so only the proposed method identifies the association at the 0.05 significance level.

Discussion

In this paper, we propose a joint model, based on a latent variable, for the association between covariates and a bivariate outcome consisting of a continuous response and an ordinal response. We conduct statistical inference on the parameters of the proposed joint model and propose a test of whether the continuous response or the ordinal response is associated with the covariates. Extensive simulations are conducted to assess the performance of the proposed test procedure. In the simulations, the proposed method outperforms the combined p-values method when the correlation between the continuous response and the ordinal response is not weak and the covariate effect on the continuous response is much smaller than that on the ordinal response. The real data analysis further illustrates the advantage of the new method. When \({Y}_{1}\) and \({Y}_{2}\) are genetic traits and G represents genotypes, the proposed joint model and test can be applied in genome-wide association studies, as in Hu et al.17. When the dimension p and the number of cutpoints k become large, JT has to solve a large number of equations, which is difficult. A feasible approach for ordinal data is to build the test on ranks18, 19. Our analysis also assumes that the observations come from a bivariate normal distribution. When this assumption is not satisfied, a robust method is more appealing, which deserves further study.

Methods

Statistical Inference on Joint Model

In this part, we carry out statistical inference based on maximum likelihood theory. From the joint model in the Results section, the likelihood function of the unknown parameters is given by

$$\begin{array}{rcl}{L}_{n}({\alpha }_{0}^{\ast },{\beta }_{0}^{\ast \ast },{\alpha }^{\ast },{\beta }^{\ast \ast },{\theta }^{\ast },{\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast },{\sigma }_{1}^{2}) & = & \prod _{i=1}^{n}{s}_{i}({\beta }_{0}^{\ast \ast },{\beta }^{\ast \ast },{\theta }^{\ast },{\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast })\frac{1}{\sqrt{2\pi {\sigma }_{1}^{2}}}\\ & & \times \exp [-\frac{{({Y}_{1i}-{\alpha }_{0}^{\ast }/{\theta }^{\ast }-{G}_{i}^{\tau }{\alpha }^{\ast }/{\theta }^{\ast })}^{2}}{2{\sigma }_{1}^{2}}],\end{array}$$
(8)

where

$$\begin{array}{rcl}{s}_{i}({\beta }_{0}^{\ast \ast },{\beta }^{\ast \ast },{\theta }^{\ast },{\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast }) & = & {\rm{\Phi }}({\gamma }_{1}^{\ast }-{\beta }_{0}^{\ast \ast }-{G}_{i}^{\tau }{\beta }^{\ast \ast }-{Y}_{1i}{\theta }^{\ast }){I}_{\{{Y}_{2i}=1\}}\\ & & +[1-{\rm{\Phi }}({\gamma }_{k}^{\ast }-{\beta }_{0}^{\ast \ast }-{G}_{i}^{\tau }{\beta }^{\ast \ast }-{Y}_{1i}{\theta }^{\ast })]{I}_{\{{Y}_{2i}=k+1\}}\\ & & +\sum _{j=2}^{k}[{\rm{\Phi }}({\gamma }_{j}^{\ast }-{\beta }_{0}^{\ast \ast }-{G}_{i}^{\tau }{\beta }^{\ast \ast }-{Y}_{1i}{\theta }^{\ast })\\ & & -{\rm{\Phi }}({\gamma }_{j-1}^{\ast }-{\beta }_{0}^{\ast \ast }-{G}_{i}^{\tau }{\beta }^{\ast \ast }-{Y}_{1i}{\theta }^{\ast })]{I}_{\{{Y}_{2i}=j\}}.\end{array}$$
(9)

The log-likelihood function is

$$\begin{array}{rcl}{l}_{n}({\alpha }_{0}^{\ast },{\beta }_{0}^{\ast \ast },{\alpha }^{\ast },{\beta }^{\ast \ast },{\theta }^{\ast },{\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast },{\sigma }_{1}^{2}) & = & \sum _{{Y}_{2i}=1}\mathrm{log}[{\rm{\Phi }}({\gamma }_{1}^{\ast }-{\beta }_{0}^{\ast \ast }-{G}_{i}^{\tau }{\beta }^{\ast \ast }-{Y}_{1i}{\theta }^{\ast })]\\ & & +\sum _{{Y}_{2i}=k+1}\mathrm{log}[1-{\rm{\Phi }}({\gamma }_{k}^{\ast }-{\beta }_{0}^{\ast \ast }-{G}_{i}^{\tau }{\beta }^{\ast \ast }-{Y}_{1i}{\theta }^{\ast })]\\ & & +\sum _{j=2}^{k}\sum _{{Y}_{2i}=j}\mathrm{log}\,[{\rm{\Phi }}({\gamma }_{j}^{\ast }-{\beta }_{0}^{\ast \ast }-{G}_{i}^{\tau }{\beta }^{\ast \ast }-{Y}_{1i}{\theta }^{\ast })\\ & & -{\rm{\Phi }}({\gamma }_{j-1}^{\ast }-{\beta }_{0}^{\ast \ast }-{G}_{i}^{\tau }{\beta }^{\ast \ast }-{Y}_{1i}{\theta }^{\ast })]\\ & & -\sum _{i=1}^{n}[\frac{{({Y}_{1i}-\frac{{\alpha }_{0}^{\ast }}{{\theta }^{\ast }}-{G}_{i}^{\tau }\frac{{\alpha }^{\ast }}{{\theta }^{\ast }})}^{2}}{2{\sigma }_{1}^{2}}+\frac{\mathrm{log}({\sigma }_{1}^{2})}{2}+\frac{\mathrm{log}(2\pi )}{2}].\end{array}$$
(10)
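The log-likelihood (10) is straightforward to evaluate numerically. The following is a minimal sketch for a scalar covariate, with each observation stored as a tuple (g, y1, y2); the function name, argument names and the toy observation are hypothetical, and no safeguards against zero category probabilities are included.

```python
import math
from statistics import NormalDist


def log_likelihood(data, alpha0_s, alpha_s, beta0_ss, beta_ss, theta_s, gammas_s, sigma1_sq):
    """Evaluate the log-likelihood (10) for p = 1.

    data: iterable of (g, y1, y2) with y2 in {1, ..., k + 1};
    gammas_s: list of the k ordered cutpoints gamma*_1, ..., gamma*_k.
    """
    cdf = NormalDist().cdf
    k = len(gammas_s)
    total = 0.0
    for g, y1, y2 in data:
        mu = beta0_ss + g * beta_ss + y1 * theta_s
        # Ordinal part: Phi(gamma*_j - mu) - Phi(gamma*_{j-1} - mu), with the
        # conventions Phi(gamma*_0 - mu) = 0 and Phi(gamma*_{k+1} - mu) = 1.
        upper = 1.0 if y2 == k + 1 else cdf(gammas_s[y2 - 1] - mu)
        lower = 0.0 if y2 == 1 else cdf(gammas_s[y2 - 2] - mu)
        total += math.log(upper - lower)
        # Continuous part: normal log-density of Y1 with mean (alpha0* + g alpha*) / theta*.
        resid = y1 - alpha0_s / theta_s - g * alpha_s / theta_s
        total -= resid ** 2 / (2.0 * sigma1_sq)
        total -= 0.5 * math.log(sigma1_sq) + 0.5 * math.log(2.0 * math.pi)
    return total


# One toy observation in category 1, chosen so the value is easy to verify by hand.
toy_value = log_likelihood([(0.0, 0.0, 1)], 0.0, 0.0, 0.0, 0.0, 1.0, [0.0, 1.0], 1.0)
```

For the toy observation the ordinal part contributes log Φ(0) = log 0.5 and the continuous part contributes −log(2π)/2, which matches the returned value.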

The MLEs of the parameters \({\alpha }_{0}^{\ast }\), \({\beta }_{0}^{\ast \ast }\), \({\alpha }^{\ast }\), \({\beta }^{\ast \ast }\), \({\theta }^{\ast }\), \({\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast }\), and \({\sigma }_{1}^{2}\) are obtained by maximizing the log-likelihood function, which is implemented by solving the following system of score equations:

$$\{\begin{array}{cc}\frac{\partial }{\partial {\alpha }_{0}^{\ast }}{l}_{n}({\alpha }_{0}^{\ast },{\beta }_{0}^{\ast \ast },{\alpha }^{\ast },{\beta }^{\ast \ast },{\theta }^{\ast },{\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast },{\sigma }_{1}^{2}) & =0\\ \frac{\partial }{\partial {\beta }_{0}^{\ast \ast }}{l}_{n}({\alpha }_{0}^{\ast },{\beta }_{0}^{\ast \ast },{\alpha }^{\ast },{\beta }^{\ast \ast },{\theta }^{\ast },{\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast },{\sigma }_{1}^{2}) & =0\\ \frac{\partial }{\partial {\alpha }^{\ast }}{l}_{n}({\alpha }_{0}^{\ast },{\beta }_{0}^{\ast \ast },{\alpha }^{\ast },{\beta }^{\ast \ast },{\theta }^{\ast },{\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast },{\sigma }_{1}^{2}) & =0\\ \frac{\partial }{\partial {\beta }^{\ast \ast }}{l}_{n}({\alpha }_{0}^{\ast },{\beta }_{0}^{\ast \ast },{\alpha }^{\ast },{\beta }^{\ast \ast },{\theta }^{\ast },{\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast },{\sigma }_{1}^{2}) & =0\\ \frac{\partial }{\partial {\theta }^{\ast }}{l}_{n}({\alpha }_{0}^{\ast },{\beta }_{0}^{\ast \ast },{\alpha }^{\ast },{\beta }^{\ast \ast },{\theta }^{\ast },{\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast },{\sigma }_{1}^{2}) & =0\\ \frac{\partial }{\partial {\gamma }_{1}^{\ast }}{l}_{n}({\alpha }_{0}^{\ast },{\beta }_{0}^{\ast \ast },{\alpha }^{\ast },{\beta }^{\ast \ast },{\theta }^{\ast },{\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast },{\sigma }_{1}^{2}) & =0\\ \vdots & \vdots \\ \frac{\partial }{\partial {\gamma }_{k}^{\ast }}{l}_{n}({\alpha }_{0}^{\ast },{\beta }_{0}^{\ast \ast },{\alpha }^{\ast },{\beta }^{\ast \ast },{\theta }^{\ast },{\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast },{\sigma }_{1}^{2}) & =0\\ \frac{\partial }{\partial ({\sigma }_{1}^{2})}{l}_{n}({\alpha }_{0}^{\ast },{\beta }_{0}^{\ast \ast },{\alpha }^{\ast },{\beta }^{\ast \ast },{\theta }^{\ast },{\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast },{\sigma }_{1}^{2}) & =0\end{array}$$
(11)

Denote \({\mu }_{i}={\beta }_{0}^{\ast \ast }+{G}_{i}^{\tau }{\beta }^{\ast \ast }+{Y}_{1i}{\theta }^{\ast }\). With \(a={({\alpha }_{0}^{\ast },{\beta }_{0}^{\ast \ast },{\alpha }^{\ast },{\beta }^{\ast \ast },{\theta }^{\ast },{\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast },{\sigma }_{1}^{2})}^{\tau }\) as defined in the Results section, we can write \({l}_{n}({\alpha }_{0}^{\ast },{\beta }_{0}^{\ast \ast },{\alpha }^{\ast },{\beta }^{\ast \ast },{\theta }^{\ast },{\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast },{\sigma }_{1}^{2})\) in a simpler form as follows:

$$\begin{array}{rcl}{l}_{n}(a) & = & \sum _{{Y}_{2i}=1}\mathrm{log}[{\rm{\Phi }}({\gamma }_{1}^{\ast }-{\mu }_{i})]+\sum _{{Y}_{2i}=k+1}\mathrm{log}[1-{\rm{\Phi }}({\gamma }_{k}^{\ast }-{\mu }_{i})]\\ & & +\sum _{j=2}^{k}\sum _{{Y}_{2i}=j}\mathrm{log}\,[{\rm{\Phi }}({\gamma }_{j}^{\ast }-{\mu }_{i})-{\rm{\Phi }}({\gamma }_{j-1}^{\ast }-{\mu }_{i})]\\ & & -\sum _{i=1}^{n}[\frac{{({Y}_{1i}-\frac{{\alpha }_{0}^{\ast }}{{\theta }^{\ast }}-{G}_{i}^{\tau }\frac{{\alpha }^{\ast }}{{\theta }^{\ast }})}^{2}}{2{\sigma }_{1}^{2}}+\frac{\mathrm{log}({\sigma }_{1}^{2})}{2}+\frac{\mathrm{log}(2\pi )}{2}].\end{array}$$
(12)

The detailed expressions of the score functions (the left-hand sides of (11)) are as follows:

$$\frac{\partial }{\partial {\alpha }_{0}^{\ast }}{l}_{n}(a)=\sum _{i=1}^{n}[\frac{({Y}_{1i}-\frac{{\alpha }_{0}^{\ast }}{{\theta }^{\ast }}-\frac{{G}_{i}^{\tau }{\alpha }^{\ast }}{{\theta }^{\ast }})}{{\theta }^{\ast }{\sigma }_{1}^{2}}],$$
(13)
$$\frac{\partial }{\partial {\beta }_{0}^{\ast \ast }}{l}_{n}(a)=-\sum _{{Y}_{2i}=1}\frac{\varphi ({\gamma }_{1}^{\ast }-{\mu }_{i})}{{\rm{\Phi }}({\gamma }_{1}^{\ast }-{\mu }_{i})}-\sum _{{Y}_{2i}=k+1}\frac{-\varphi ({\gamma }_{k}^{\ast }-{\mu }_{i})}{1-{\rm{\Phi }}({\gamma }_{k}^{\ast }-{\mu }_{i})}-\sum _{j\mathrm{=2}}^{k}\sum _{{Y}_{2i}=j}\frac{\varphi ({\gamma }_{j}^{\ast }-{\mu }_{i})-\varphi ({\gamma }_{j-1}^{\ast }-{\mu }_{i})}{{\rm{\Phi }}({\gamma }_{j}^{\ast }-{\mu }_{i})-{\rm{\Phi }}({\gamma }_{j-1}^{\ast }-{\mu }_{i})},$$
(14)
$$\frac{\partial }{\partial {\alpha }^{\ast }}{l}_{n}(a)=\sum _{i=1}^{n}[\frac{({Y}_{1i}-\frac{{\alpha }_{0}^{\ast }}{{\theta }^{\ast }}-\frac{{G}_{i}^{\tau }{\alpha }^{\ast }}{{\theta }^{\ast }})}{{\theta }^{\ast }{\sigma }_{1}^{2}}{G}_{i}],$$
(15)
$$\frac{\partial }{\partial {\beta }^{\ast \ast }}{l}_{n}(a)=-\sum _{{Y}_{2i}\mathrm{=1}}\frac{\varphi ({\gamma }_{1}^{\ast }-{\mu }_{i})}{{\rm{\Phi }}({\gamma }_{1}^{\ast }-{\mu }_{i})}{G}_{i}-\sum _{{Y}_{2i}=k+1}\frac{-\varphi ({\gamma }_{k}^{\ast }-{\mu }_{i})}{1-{\rm{\Phi }}({\gamma }_{k}^{\ast }-{\mu }_{i})}{G}_{i}-\sum _{j=2}^{k}\sum _{{Y}_{2i}=j}\frac{\varphi ({\gamma }_{j}^{\ast }-{\mu }_{i})-\varphi ({\gamma }_{j-1}^{\ast }-{\mu }_{i})}{{\rm{\Phi }}({\gamma }_{j}^{\ast }-{\mu }_{i})-{\rm{\Phi }}({\gamma }_{j-1}^{\ast }-{\mu }_{i})}{G}_{i},$$
(16)
$$\begin{array}{rcl}\frac{\partial }{\partial {\theta }^{\ast }}{l}_{n}(a) & = & -\sum _{{Y}_{2i}=1}\frac{\varphi ({\gamma }_{1}^{\ast }-{\mu }_{i})}{{\rm{\Phi }}({\gamma }_{1}^{\ast }-{\mu }_{i})}{Y}_{1i}-\sum _{{Y}_{2i}=k+1}\frac{-\varphi ({\gamma }_{k}^{\ast }-{\mu }_{i})}{1-{\rm{\Phi }}({\gamma }_{k}^{\ast }-{\mu }_{i})}{Y}_{1i}\\ & & -\sum _{j=2}^{k}\sum _{{Y}_{2i}=j}\frac{\varphi ({\gamma }_{j}^{\ast }-{\mu }_{i})-\varphi ({\gamma }_{j-1}^{\ast }-{\mu }_{i})}{{\rm{\Phi }}({\gamma }_{j}^{\ast }-{\mu }_{i})-{\rm{\Phi }}({\gamma }_{j-1}^{\ast }-{\mu }_{i})}{Y}_{1i}\\ & & -\sum _{i=1}^{n}[\frac{({Y}_{1i}-\frac{{\alpha }_{0}^{\ast }}{{\theta }^{\ast }}-\frac{{G}_{i}^{\tau }{\alpha }^{\ast }}{{\theta }^{\ast }})({\alpha }_{0}^{\ast }+{G}_{i}^{\tau }{\alpha }^{\ast })}{{\sigma }_{1}^{2}{({\theta }^{\ast })}^{2}}],\end{array}$$
(17)
$$\frac{\partial }{\partial {\gamma }_{1}^{\ast }}{l}_{n}(a)=\sum _{{Y}_{2i}=1}\frac{\varphi ({\gamma }_{1}^{\ast }-{\mu }_{i})}{{\rm{\Phi }}({\gamma }_{1}^{\ast }-{\mu }_{i})}+\sum _{{Y}_{2i}=2}\frac{-\varphi ({\gamma }_{1}^{\ast }-{\mu }_{i})}{{\rm{\Phi }}({\gamma }_{2}^{\ast }-{\mu }_{i})-{\rm{\Phi }}({\gamma }_{1}^{\ast }-{\mu }_{i})},$$
(18)
$$\begin{array}{rcl}\frac{\partial }{\partial {\gamma }_{j}^{\ast }}{l}_{n}(a) & = & \sum _{{Y}_{2i}=j}\frac{\varphi ({\gamma }_{j}^{\ast }-{\mu }_{i})}{{\rm{\Phi }}({\gamma }_{j}^{\ast }-{\mu }_{i})-{\rm{\Phi }}({\gamma }_{j-1}^{\ast }-{\mu }_{i})}\\ & & +\sum _{{Y}_{2i}=j+1}\frac{-\varphi ({\gamma }_{j}^{\ast }-{\mu }_{i})}{{\rm{\Phi }}({\gamma }_{j+1}^{\ast }-{\mu }_{i})-{\rm{\Phi }}({\gamma }_{j}^{\ast }-{\mu }_{i})},\,{\rm{for}}\,j=\mathrm{2,}\cdots ,k-\mathrm{1,}\end{array}$$
(19)
$$\frac{\partial }{\partial {\gamma }_{k}^{\ast }}{l}_{n}(a)=\sum _{{Y}_{2i}=k}\frac{\varphi ({\gamma }_{k}^{\ast }-{\mu }_{i})}{{\rm{\Phi }}({\gamma }_{k}^{\ast }-{\mu }_{i})-{\rm{\Phi }}({\gamma }_{k-1}^{\ast }-{\mu }_{i})}+\sum _{{Y}_{2i}=k+1}\frac{-\varphi ({\gamma }_{k}^{\ast }-{\mu }_{i})}{1-{\rm{\Phi }}({\gamma }_{k}^{\ast }-{\mu }_{i})},$$
(20)
$$\frac{\partial }{\partial {\sigma }_{1}^{2}}{l}_{n}(a)=\sum _{i=1}^{n}[\frac{{({Y}_{1i}-\frac{{\alpha }_{0}^{\ast }}{{\theta }^{\ast }}-\frac{{G}_{i}^{\tau }{\alpha }^{\ast }}{{\theta }^{\ast }})}^{2}}{2{\sigma }_{1}^{2}}-\frac{1}{2{\sigma }_{1}^{2}}].$$
(21)
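A threshold score such as equation (18) can be checked numerically against a finite-difference derivative of the log-likelihood. The sketch below is illustrative only. It assumes the ordinal part of \({l}_{n}(a)\) has the usual ordered-probit form with cell probabilities \({\rm{\Phi }}({\gamma }_{j}^{\ast }-{\mu }_{i})-{\rm{\Phi }}({\gamma }_{j-1}^{\ast }-{\mu }_{i})\), takes k = 2, and uses made-up values for \({\mu }_{i}\) and \({Y}_{2i}\); the function names are hypothetical.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ordinal_loglik(gammas, mu, y):
    """Assumed ordered-probit log-likelihood; y[i] takes values 1..k+1."""
    cuts = [-math.inf] + list(gammas) + [math.inf]
    total = 0.0
    for m, cat in zip(mu, y):
        total += math.log(Phi(cuts[cat] - m) - Phi(cuts[cat - 1] - m))
    return total

def score_gamma1(gammas, mu, y):
    """Analytical score for the first threshold, following equation (18)."""
    g1, g2 = gammas[0], gammas[1]
    s = 0.0
    for m, cat in zip(mu, y):
        if cat == 1:
            s += phi(g1 - m) / Phi(g1 - m)
        elif cat == 2:
            s -= phi(g1 - m) / (Phi(g2 - m) - Phi(g1 - m))
    return s

# Hypothetical inputs: two thresholds (k = 2), six subjects.
gammas = [-0.4, 0.7]
mu = [0.1, -0.3, 0.5, 0.0, 0.2, -0.6]
y = [1, 2, 3, 2, 1, 3]

# Central finite difference in the first threshold.
h = 1e-6
numeric = (ordinal_loglik([gammas[0] + h, gammas[1]], mu, y)
           - ordinal_loglik([gammas[0] - h, gammas[1]], mu, y)) / (2.0 * h)
analytic = score_gamma1(gammas, mu, y)
print(analytic, numeric)
```

The two printed values should agree to several decimal places if the analytical score matches the assumed log-likelihood. The same check can be repeated for the other thresholds.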

We can obtain the MLE \(\hat{a}={(\widehat{{\alpha }_{0}^{\ast }},\widehat{{\beta }_{0}^{\ast \ast }},\widehat{{\alpha }^{\ast }},\widehat{{\beta }^{\ast \ast }},\widehat{{\theta }^{\ast }},\widehat{{\gamma }_{1}^{\ast }},\cdots ,\widehat{{\gamma }_{k}^{\ast }},\widehat{{\sigma }_{1}^{2}})}^{\tau }\) of \(a={({\alpha }_{0}^{\ast },{\beta }_{0}^{\ast \ast },{\alpha }^{\ast },{\beta }^{\ast \ast },{\theta }^{\ast },{\gamma }_{1}^{\ast },\cdots ,{\gamma }_{k}^{\ast },{\sigma }_{1}^{2})}^{\tau }\) by solving the system of equations (11) with Newton's method in R. According to statistical asymptotic theory14, under the null hypothesis that neither the continuous response \({Y}_{1}\) nor the ordinal response \({Y}_{2}\) is associated with the covariates \(G\), \(\sqrt{n}({\hat{a}}^{\tau }-{a}^{\tau })\) asymptotically follows a normal distribution with mean vector \({0}_{(2p+k+4)\times 1}\) and covariance matrix \({I}^{-1}(a)\), where \(I(a)\) is the Fisher information matrix, which takes the following form

$$I(a)=-{(\begin{array}{ccccc}E[\frac{{\partial }^{2}{l}_{n}(a)}{{(\partial {\alpha }_{0}^{\ast })}^{2}}] & E[\frac{{\partial }^{2}{l}_{n}(a)}{\partial {\alpha }_{0}^{\ast }\partial {\beta }_{0}^{\ast \ast }}] & E[\frac{{\partial }^{2}{l}_{n}(a)}{\partial {\alpha }_{0}^{\ast }\partial {\alpha }_{1}^{\ast }}] & \cdots & E[\frac{{\partial }^{2}{l}_{n}(a)}{\partial {\alpha }_{0}^{\ast }\partial ({\sigma }_{1}^{2})}]\\ E[\frac{{\partial }^{2}{l}_{n}(a)}{\partial {\beta }_{0}^{\ast \ast }\partial {\alpha }_{0}^{\ast }}] & E[\frac{{\partial }^{2}{l}_{n}(a)}{{(\partial {\beta }_{0}^{\ast \ast })}^{2}}] & E[\frac{{\partial }^{2}{l}_{n}(a)}{\partial {\beta }_{0}^{\ast \ast }\partial {\alpha }_{1}^{\ast }}] & \cdots & E[\frac{{\partial }^{2}{l}_{n}(a)}{\partial {\beta }_{0}^{\ast \ast }\partial ({\sigma }_{1}^{2})}]\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ E[\frac{{\partial }^{2}{l}_{n}(a)}{\partial ({\sigma }_{1}^{2})\partial {\alpha }_{0}^{\ast }}] & E[\frac{{\partial }^{2}{l}_{n}(a)}{\partial ({\sigma }_{1}^{2})\partial {\beta }_{0}^{\ast \ast }}] & E[\frac{{\partial }^{2}{l}_{n}(a)}{\partial ({\sigma }_{1}^{2})\partial {\alpha }_{1}^{\ast }}] & \cdots & E[\frac{{\partial }^{2}{l}_{n}(a)}{{(\partial ({\sigma }_{1}^{2}))}^{2}}]\end{array})}_{(2p+k+4)\times (2p+k+4)}.$$
(22)
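To make the Newton iteration and the role of the information concrete, the sketch below works through a deliberately simplified one-parameter case: a single threshold γ with \({\mu }_{i}=0\) and two ordinal categories. This is not the full model and not the R implementation referred to in the text. The counts n1 and n2 and all function names are hypothetical, and the observed information (the negative derivative of the score at the solution) stands in for the Fisher information.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def score(gamma, n1, n2):
    """Score for one threshold: n1 subjects in category 1, n2 in category 2."""
    return n1 * phi(gamma) / Phi(gamma) - n2 * phi(gamma) / (1.0 - Phi(gamma))

def score_derivative(gamma, n1, n2):
    """Derivative of the score with respect to gamma (negative information)."""
    r1 = phi(gamma) / Phi(gamma)
    r2 = phi(gamma) / (1.0 - Phi(gamma))
    return -n1 * r1 * (gamma + r1) + n2 * r2 * (gamma - r2)

def newton(n1, n2, gamma=0.0, tol=1e-12, max_iter=50):
    """Newton's method: gamma <- gamma - score / score'."""
    for _ in range(max_iter):
        step = score(gamma, n1, n2) / score_derivative(gamma, n1, n2)
        gamma -= step
        if abs(step) < tol:
            break
    return gamma

# Hypothetical counts: 30 subjects in category 1, 70 in category 2.
n1, n2 = 30, 70
gamma_hat = newton(n1, n2)

# Observed information and the implied standard error of gamma_hat.
observed_info = -score_derivative(gamma_hat, n1, n2)
standard_error = math.sqrt(1.0 / observed_info)
print(gamma_hat, standard_error)
```

In this special case the score equation has the closed-form solution \({\rm{\Phi }}(\hat{\gamma })={n}_{1}/({n}_{1}+{n}_{2})\), which gives a direct check on the iteration. In the full model the scalar derivative is replaced by the matrix of second derivatives, and each step solves a linear system.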

Derivation of Covariance Matrix W

Denote by V the sub-matrix of \({I}^{-1}(a)\) formed by its 3rd to (2p + 2)th rows and columns, and let \(H={(\begin{array}{cc}{I}_{p} & {0}_{p}\\ {I}_{p} & {I}_{p}\end{array})}_{2p\times 2p}\), where \({I}_{p}\) is the p × p identity matrix and \({0}_{p}\) is the p × p zero matrix. According to asymptotic theory14, as the sample size n goes to infinity, \({(\begin{array}{c}\widehat{{\alpha }^{\ast }}\\ \widehat{{\beta }^{\ast \ast }}\end{array})}_{2p\times 1}\mathop{\longrightarrow }\limits^{D}N((\begin{array}{c}{\alpha }^{\ast }\\ {\beta }^{\ast \ast }\end{array}),V).\) By some algebra, we have \(\widehat{\eta }=H(\begin{array}{c}\widehat{{\alpha }^{\ast }}\\ \widehat{{\beta }^{\ast \ast }}\end{array})\). It follows that \(\widehat{\eta }\) is an asymptotically unbiased estimator of the parameter η, and, by multivariate normal distribution theory15, \(\widehat{\eta }-\eta \,\mathop{\longrightarrow }\limits^{D}N(\mathrm{0,}\,HV{H}^{\tau })\) as the sample size n goes to infinity.
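The transformation \(HV{H}^{\tau }\) can be illustrated with a small numerical sketch. It takes W to be \(HV{H}^{\tau }\), as the section heading suggests, and uses p = 1 with made-up entries for V. The helper names are hypothetical, and plain nested lists are used so that the block has no external dependencies.

```python
def block_H(p):
    """Build H = [[I_p, 0_p], [I_p, I_p]] as a 2p x 2p nested list."""
    H = [[0.0] * (2 * p) for _ in range(2 * p)]
    for i in range(p):
        H[i][i] = 1.0          # upper-left identity block
        H[p + i][i] = 1.0      # lower-left identity block
        H[p + i][p + i] = 1.0  # lower-right identity block
    return H

def transpose(A):
    """Transpose of a nested-list matrix."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def covariance_W(V):
    """Return H V H^T for a 2p x 2p covariance matrix V."""
    p = len(V) // 2
    H = block_H(p)
    return matmul(matmul(H, V), transpose(H))

# Hypothetical 2 x 2 covariance of the two estimators (p = 1).
V = [[0.04, 0.01],
     [0.01, 0.09]]
W = covariance_W(V)
print(W)
```

With p = 1 the second component of \(\widehat{\eta }\) is the sum of the two estimators, so its variance in W is the sum of both variances plus twice their covariance, and its covariance with the first component is the first variance plus the covariance.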