Introduction

In many countries, especially developing countries, agricultural production still relies on an extensive model of high input and high consumption, and technical efficiency is low. This has not only limited the development of local agriculture, making it difficult to ensure the quantity and quality of food, but has also put great pressure on the ecological environment (Bongaarts, 2019). Excessive input of chemical fertilizers and pesticides leads to groundwater pollution, which poses a great threat to drinking water and agricultural irrigation in China (Yuan et al., 2021), Bangladesh (Huq et al., 2019), Iran (Ostad-Ali-Askari et al., 2017) and other countries. Only by moving away from this extensive mode of production, from inefficient to efficient agriculture, can agricultural production develop in coordination with resources and the environment.

How can the transformation to efficient agriculture be achieved? Existing research has proposed possible solutions from two aspects. On the one hand, institutional innovation can create a favorable external environment for agricultural production, for example through secure land ownership systems (Paltasingh et al., 2022; Leta et al., 2021), effective agricultural credit policies (Twumasi et al., 2022), and grain subsidy policies (Chen et al., 2023). On the other hand, production services or new production technologies can raise the marginal output of input factors, for example agricultural machinery services (Tang et al., 2018), climate adaptation technologies (Pangapanga-Phiri and Mungatana, 2021), and green production technologies (Li and Lin, 2023).

However, the improvement of agricultural technology efficiency is essentially a production decision problem for farmers based on their circumstances (such as preferences, cognition, resource constraints, and cost-benefit considerations). Neglecting the dominant position of farmers in agricultural production and relying solely on improving the institutional environment or promoting production technologies may not effectively solve the issue of enhancing technical efficiency. Firstly, farmers’ production behavior exhibits a certain degree of inertia. Even if institutional innovations provide better opportunities, farmers still need to collect various information to adapt to the new institutional environment and assess the new opportunities and risks (Granderson, 2014; Tompkins and Adger, 2005). Therefore, in the short term, farmers’ production behavior may not adjust immediately or only adjust to a limited extent. Secondly, the adoption of agricultural technologies usually requires farmers to have the necessary knowledge and input of production factors. However, many farmers lack channels to access information and capital, making it difficult for them to meet the conditions for adjusting production decisions. This issue is particularly pronounced for smallholders, as they often face challenges such as inadequate skills, limited financial resources, and information asymmetry (Donovan and Poole, 2014; Wouterse and Badiane, 2019).

As an emerging technology with significant implications for the future, the Internet offers rich information, rapid dissemination, and the ability to overcome spatial boundaries, and it has profoundly transformed agriculture. Research on the Internet’s impact on agriculture mainly focuses on how to enhance the market efficiency of agricultural products (Mitra et al., 2018; Tadesse and Bahiigwa, 2015; Jensen, 2007). For example, Reddy (2018) focuses on electronic agricultural markets and systematically analyzes the impact of market electronization on market participants, market prices, product quality, and trade processes. The impact of the Internet is not limited to the agricultural product market, however; it also plays a crucial role on the production side. Its widespread adoption has provided agricultural producers with abundant information, broader markets, and more efficient collaboration platforms, thereby reducing information search costs (Aker et al., 2016), accelerating agricultural technology diffusion (Zheng et al., 2022), and increasing agricultural output levels (Khan et al., 2022).

However, research on how Internet use affects agricultural technical efficiency, especially for grain crops, is still in its infancy. The study of mobile phone communication and corn technical efficiency by Mwalupaso et al. (2019) provides a reference for answering this question. However, their study did not take into account that mobile phone use is a self-selected outcome for farmers and thus did not address sample selection bias. Zhu et al. (2021) and Zheng et al. (2021) analyzed the impact of Internet use on the technical efficiency of apple and banana production in China, respectively, and found that Internet use is conducive to improving the technical efficiency of smallholders. For grain crops, however, these findings may not apply. Firstly, the technical requirements and management needs of grain crops differ from those of cash crops, which leads to different limitations and obstacles in applying the Internet to grain crop production. Secondly, the market prices of grain crops are often lower, which may not provide sufficient economic incentives for farmers to use the Internet to change their production methods. As a result, grain farmers may rely more on traditional agricultural techniques and farming experience, limiting the positive impact of Internet use on technical efficiency. Therefore, further research is needed to investigate the actual effects of Internet use on the technical efficiency of grain crop production. A deeper understanding of the potential mechanisms through which Internet use affects agricultural technical efficiency is also needed to formulate relevant policies.

In this study, we construct a theoretical framework to investigate the impact of Internet use on the technical efficiency of grain production. We conduct empirical tests using a bias-corrected stochastic frontier model and a propensity score matching (PSM) model based on grain production data from 1699 households in China. The possible innovations of this paper are as follows. Firstly, this paper incorporates Internet technology into the research framework of the technical efficiency of grain production and conducts empirical tests, expanding and supplementing existing research in this area. Secondly, by combining a bias-corrected stochastic frontier model with a PSM model, this paper addresses sample selection bias and self-selection bias to obtain an unbiased estimate of the impact of Internet use on technical efficiency. Thirdly, this paper conducts a comprehensive analysis of the potential channels through which Internet use affects the technical efficiency of grain production, focusing on social capital, financial constraints, and mechanized farming. This in-depth exploration helps to understand and realize the enhancing effect of Internet use on technical efficiency.

The rest of the paper is organized as follows. Section “Background” briefly introduces the development of the Internet in rural China. Section “Research analysis” builds a theoretical framework for the impact of Internet use on the technical efficiency of food crops. The research data, methods, and variables are introduced in the section “Methods”. Section “Results and discussion” analyzes and discusses the empirical results. The section “Conclusion and recommendation” presents conclusions and relevant policy recommendations.

Background

As the largest developing country in the world, China has long relied on high inputs of pesticides, fertilizers, and other factors to obtain high grain yields (Wang et al., 2017). According to statistics, the average application of chemical fertilizer on China’s crops is 2.74 times the world average (Footnote 1), and the average application of chemical pesticides is 2.5–5 times that of developed countries (Jin et al., 2017). Fertilizers and pesticides have changed from a tool for increasing production into an important cause of damage to the ecological environment (Yu et al., 2021). Moreover, with the rapid rise in production costs in recent years, such as higher prices of pesticides and fertilizers, the comparative advantage of China’s agricultural products is gradually being lost. According to statistics from the Ministry of Commerce, China’s trade deficit in major agricultural products increased from USD 1.46 billion in 2005 to USD 71.28 billion in 2018. Therefore, improving the technical efficiency of agricultural production has become the only way to achieve the coordinated development of agriculture and the environment in China. However, the China Agricultural Sector Development Report 2021 pointed out that the technical efficiency of China’s agricultural production is declining, and that the application and promotion of technology are facing challenges. Improving the technical efficiency of grain production is therefore an urgent problem for China, and doing so can also provide experience and reference for agricultural development in other developing countries.

The Internet has developed rapidly in China since the country was fully connected to it in 1994. At the initial stage, however, rural Internet development progressed slowly because of a lack of networking knowledge, lagging network infrastructure, and poor access conditions. After the Chinese government officially wrote “Promoting rural informatization construction and actively supporting rural internet infrastructure construction” into the first policy statements released by central authorities each year, the rural information infrastructure was continuously improved, and the Internet was rapidly popularized in rural areas. The 51st statistical report on China’s Internet development showed that by the end of 2022, there were 308 million Internet users in rural China, with a penetration rate of 61.90%. With its continuous development, the Internet has also played an important role in agricultural production. In particular, after China proposed the vigorous promotion of ‘Internet +’ modern agriculture in 2016, the Internet exerted a profound influence on agricultural production, operation, management, services, and other aspects of the agricultural industry chain, providing new impetus for agricultural modernization. By 2019, the overall level of digital agriculture and rural development in China’s counties reached 36.0%, of which the level of digitalization of agricultural production reached 23.8% (Footnote 2). It is of great practical significance to understand and clarify the influence of the Internet on technical efficiency in agricultural production.

Research analysis

Following the existing literature, we propose a simple theoretical framework, as shown in Fig. 1. This theoretical framework consists of two parts: Internet use decision and the channels through which Internet use affects technical efficiency. First, whether to use the Internet is a self-chosen behavior of smallholders. The decision is influenced by many factors, such as personal characteristics, family characteristics, and village characteristics (Zheng et al., 2022). In terms of individual characteristics, factors such as the age, gender, and education level of the household head directly impact their understanding and acceptance of the Internet. The younger generation and farmers with higher levels of education may possess stronger technological adaptability, making it easier for them to grasp and utilize Internet resources. In terms of family characteristics, factors such as economic income and population size can affect their ability to invest resources in using the Internet. Families with lower incomes may find it difficult to bear the related costs of internet use, and households with smaller populations may find it difficult to form the network effect of internet use. In terms of village characteristics, factors such as geographical location and topography directly influence the construction and penetration level of the village’s Internet infrastructure, thereby affecting the accessibility of Internet services for farmers (Cui et al., 2022). In summary, according to the user demand and their endowments, households will apply internet technology at different time points and different intensity levels.

Fig. 1 The theoretical framework for the impact of Internet use on technical efficiency.

As for the channels through which Internet use affects household technical efficiency, this paper analyzes three: alleviating financial constraints, strengthening social capital, and promoting mechanized farming. For a long time, agriculture has been a weak link in China’s economic development, and smallholders are a representative group with limited resource endowment (Qiu et al., 2021). Due to low agricultural economic returns and vulnerability to external shocks such as climate conditions, it is difficult for smallholders to afford advanced agricultural technologies through their own capital accumulation alone, which limits the improvement of technical efficiency. Credit financing can effectively alleviate the financial constraints faced by farmers in adopting technologies, but smallholders have difficulty obtaining loans, collateral, and guarantees from formal financial institutions (Tchamyou et al., 2019). The Internet serves as a good information flow channel among financial market participants (Boateng et al., 2018), reducing the information asymmetry between banks and farmers (Andonova, 2006). This helps to overcome the issues of “financial discrimination” and “financial mismatch” in the traditional financial system. Firstly, the use of the Internet improves farmers’ ability to access and distinguish financial information. With the help of Internet technology, smallholders can obtain loan policies and requirements in a timely and accurate manner and make the most favorable credit choice for production according to their conditions and needs (Parlasca et al., 2022). Secondly, the Internet assists rural financial institutions in analyzing credit data and the economic conditions of low-income groups, and thus in accurately targeting intended customers. Financial institutions can even use the Internet to track borrowing households, ensuring that agricultural credit funds are mainly used in agricultural production. In summary, Internet technology promotes the improvement of technical efficiency by alleviating the financing constraints of smallholders.

Another channel through which Internet use affects technical efficiency is strengthening the social capital of households. Communicating with acquaintances is an important pathway for smallholders to learn technology (Leta et al., 2018). This type of communication allows smallholders to learn from acquaintances about the problems encountered in actual production and the corresponding solutions. Due to the high level of trust between acquaintances, this intimate communication is more effective in promoting the transfer of knowledge and the application of technology. The rapid development of the Internet provides channels and platforms for smallholders to communicate and interact with friends without being limited by geographical distance. This significantly reduces the cost of exchanging information among smallholders, enabling them to broaden their social networks and accumulate social capital (Bauernschuster et al., 2014). It also provides an important source of technical information for smallholders and helps to accelerate the pace and rate of agricultural technology adoption (Zheng et al., 2022). Furthermore, the improvement of smallholders’ social capital can expand their access to informal loans, which further alleviates the cost pressure of adopting agricultural technologies. In summary, Internet use promotes technical efficiency by improving the social capital of smallholders.

Internet use can also improve the technical efficiency of smallholders by promoting mechanized farming. The network effect of Internet technology broadens the off-farm employment channels for household members, thereby changing the quantity and structure of the labor force employed by households in agricultural production (Min et al., 2020). The transfer of rural labor, especially young and middle-aged labor, leads to a structural mismatch between labor and land in agricultural production. In this case, households will increase mechanization inputs to replace the missing labor. Smallholders constrained by operational scale, considering the scale inefficiency of purchasing agricultural machinery, are more inclined to achieve mechanized production through agricultural machinery services (Diao et al., 2014). The Internet can help farmers reduce transaction costs and increase their opportunities to access agricultural mechanization services through outsourcing (Daum et al., 2021). Through the Internet, smallholders can compare the prices and service quality of different service providers, choose mechanization services that offer better value for money, and pay as needed. This reduces the investment and maintenance costs of mechanization. Mechanized operations are stable in quality and lower in cost, which is conducive to improving technical efficiency (Shi et al., 2021).

Methods

Model setting

The main purpose of this study is to clarify the impact of Internet use on the technical efficiency of smallholders. Following the previous research (Asmare et al., 2022; Villano et al., 2015), we adopt a multi-stage procedure to address selection biases caused by observed and unobserved factors, thereby making up for some shortcomings of existing research. Firstly, we use a sample selection-corrected stochastic production frontier (SPF) model proposed by Greene (2010) to estimate the technical efficiency of smallholders, addressing the selection bias caused by unobserved factors. Then, we take the estimated technical efficiency as the dependent variable and use the PSM method to identify the differences in technical efficiency between samples of Internet users and non-users based on observable characteristics. In addition, we also use an endogenous switching regression model (ESR) to account for the effects of both observed and unobserved factors to verify the robustness of the regression results by effectively addressing endogeneity caused by sample selection.

Selection-corrected SPF

Using the stochastic frontier framework, we applied Heckman’s correction to solve the sample selection problem in this study. It assumes that the sample selection bias is caused by the correlation between the error term in the stochastic frontier model and the error term in the sample selection model. The stochastic frontier model with sample selection correction can be specified as follows:

$${{{\mathrm{Sample}}}}\,{{{\mathrm{selection}}}}\!:d_i^ \ast = \alpha ^\prime z_i + \varepsilon _i,\varepsilon _i \sim N\left( {0,1} \right)$$
$${{{\mathrm{Stochastic}}}}\,{{{\mathrm{production}}}}\,{{{\mathrm{frontier}}}}\,{{{\mathrm{model}}}}\!:y_i = \beta ^\prime x_i + v_i - u_i$$

(yi, xi) are observed only when di = 1.

$$u_i = \sigma _u\left| {U_i} \right|\,{{{\mathrm{where}}}}\,U_i\sim N\left( {0,1} \right)$$
$$v_i = \sigma _vV_i\,{{{\mathrm{where}}}}\,V_i\sim N\left( {0,1} \right)$$
$$\left( {\varepsilon _i,v_i} \right)\sim N_2\left[ {\left( {0,0} \right),\left( {1,\rho \sigma _v,\sigma _v^2} \right)} \right]$$

Where di is a binary variable, which is assigned a value of 1 when the family uses the Internet, and a value of 0 otherwise. zi is the set of explanatory variables for the sample selection model. εi is the unobserved error term. In the SPF model, yi is the output variable, xi is the input variable, vi is the stochastic error term, and ui is the inefficiency term. In particular, a statistically significant ρ indicates the presence of selection bias due to unobservable factors.
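To make the error structure above concrete, the following minimal Python sketch draws synthetic data from the model. It is illustrative only and is not the estimation procedure of this study: the function name `simulate_selection_spf`, the single standard-normal regressor in each equation, and the rule that di = 1 when the latent index is positive are assumptions made here.

```python
import numpy as np

def simulate_selection_spf(n, alpha, beta, sigma_u, sigma_v, rho, seed=0):
    """Draw one synthetic sample from the selection-corrected frontier model."""
    rng = np.random.default_rng(seed)
    # Hypothetical covariates: a constant plus one standard-normal regressor.
    z = np.column_stack([np.ones(n), rng.standard_normal(n)])
    x = np.column_stack([np.ones(n), rng.standard_normal(n)])
    # Selection error eps ~ N(0, 1); V ~ N(0, 1) with corr(eps, V) = rho.
    eps = rng.standard_normal(n)
    V = rho * eps + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    v = sigma_v * V                               # noise, cov(eps, v) = rho * sigma_v
    u = sigma_u * np.abs(rng.standard_normal(n))  # half-normal inefficiency
    # Assumed selection rule: d = 1 when the latent index is positive.
    d = (z @ alpha + eps > 0).astype(int)
    y = x @ beta + v - u
    # Output is recorded only for selected households (d = 1).
    y_obs = np.where(d == 1, y, np.nan)
    return d, z, x, y_obs, u
```

When ρ differs from zero, the noise term of the selected households no longer has mean zero, which is why a frontier fitted on the selected sample alone can be biased.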

In this study, the SPF model is estimated using the translog production function. The specific form of the model is as follows:

$$\begin{array}{l}\ln y_i = \beta _0 + \beta _1\ln L_i + \beta _2\ln K_i + \beta _3\ln T_i + \beta _4\left( {\ln L_i} \right)^2 + \beta _5\left( {\ln K_i} \right)^2 + \beta _6\left( {\ln T_i} \right)^2 + \beta _7\ln L_i\ln K_i\\ \quad \quad + \,\beta _8\ln L_i\ln T_i + \beta _9\ln K_i\ln T_i + v_i - u_i\end{array}$$
(1)

Where yi is the total output value of grain cultivation by smallholder i; Li is the labor input, expressed by the number of laborers engaged in agricultural production for more than three months in a year; Ki is the capital input, including expenditure on cultivation, fertilizer, pesticides, machinery, and other production stages; Ti is the land input, that is, the planting area of grain crops. The formula for calculating the technical efficiency (TEi) of smallholders’ production is as follows:

$${\rm{TE}}_i = E\left( {Y_i\left| {u_i,Q_i} \right.} \right)/E\left( {Y_i\left| {u_i = 0,Q_i} \right.} \right)$$
(2)

Where TEi represents technical efficiency, that is, the ratio of actual output to the output frontier; Qi is the inputs for smallholder production; \(E\left( {Y_i\left| {u_i,Q_i} \right.} \right)\) represents the expected value of the actual output; \(E\left( {Y_i\left| {u_i = 0,Q_i} \right.} \right)\) represents the expected value on the output frontier in the absence of technical inefficiency.
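As a rough illustration of how Eq. (1) and Eq. (2) fit together, the sketch below builds the translog regressors and converts frontier residuals into efficiency scores. It assumes a conventional normal/half-normal frontier without selection correction and uses the Jondrow et al. (JLMS) conditional mean E[u | v − u]; the selection-corrected estimator used in this study is different, and the function names are hypothetical. Because the frontier is written in logs, the ratio in Eq. (2) reduces to exp(−ui), which is approximated here by exp(−E[u | v − u]).

```python
import math
import numpy as np

def translog_design(L, K, T):
    """Columns of Eq. (1): constant, three logs, three squares, three cross terms."""
    lL, lK, lT = np.log(L), np.log(K), np.log(T)
    return np.column_stack([
        np.ones_like(lL), lL, lK, lT,
        lL**2, lK**2, lT**2,
        lL * lK, lL * lT, lK * lT,
    ])

def jlms_efficiency(resid, sigma_u, sigma_v):
    """exp(-E[u | v - u]) for a normal/half-normal frontier (no selection correction)."""
    s2 = sigma_u**2 + sigma_v**2
    sigma_star = sigma_u * sigma_v / math.sqrt(s2)
    scores = []
    for e in resid:
        mu_star = -e * sigma_u**2 / s2
        r = mu_star / sigma_star
        pdf = math.exp(-0.5 * r * r) / math.sqrt(2.0 * math.pi)
        cdf = 0.5 * (1.0 + math.erf(r / math.sqrt(2.0)))
        expected_u = mu_star + sigma_star * pdf / cdf  # mean of the truncated normal
        scores.append(math.exp(-expected_u))
    return np.array(scores)
```

A household whose composed residual is more negative is assigned a larger expected inefficiency and therefore a lower score, and every score lies strictly between 0 and 1.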

Propensity score matching

In the sample, smallholders choose whether or not to use the Internet, so Internet users and non-users are not randomly assigned, and there is a problem of “self-selection”. In this study, the PSM method is used to control for self-selection bias caused by observable characteristics. In addition, the technical efficiency that Internet users would have achieved had they not used the Internet cannot be observed, so directly comparing the technical efficiency of users and non-users will cause endogeneity problems. By constructing a counterfactual scenario from matched non-users, the PSM method can address this problem. The specific analysis steps are as follows:

First, according to the observed characteristic variables, use the logit model to estimate the conditional probability fitting value of using the Internet, that is, the propensity score value. The specific expression is as follows:

$$P\left( X \right) = Pr\left( {D = 1\left| X \right.} \right) = \frac{{\exp \left( {\beta X} \right)}}{{1 + \exp \left( {\beta X} \right)}}$$
(3)

Where X is the multidimensional characteristic variable for matching, and β is the coefficient vector. Second, the treatment group is matched with the control group, and a common support test and a balance test are performed. Finally, the difference in smallholder technical efficiency between the treatment and matched control groups, the average treatment effect on the treated (ATT), is calculated to obtain the effect of Internet use on the technical efficiency of smallholders. The expression of ATT is as follows:

$$ATT = E\left( {Y_1\left| {D = 1} \right.} \right) - E\left( {Y_0\left| {D = 1} \right.} \right) = E\left( {Y_1 - Y_0\left| {D = 1} \right.} \right)$$
(4)

Where D is the treatment variable for whether smallholders use the Internet, Y1 is the technical efficiency of smallholders using the Internet, and Y0 is the technical efficiency of smallholders not using the Internet.
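The matching steps can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the matching procedure used in this study: it fits the logit of Eq. (3) by Newton-Raphson, uses one-to-one nearest-neighbour matching with replacement, and omits the common support and balance tests. The function names `fit_logit` and `psm_att` are hypothetical.

```python
import numpy as np

def fit_logit(X, d, iters=25):
    """Newton-Raphson for the logit coefficients; X must include a constant column."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (d - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta += np.linalg.solve(hess, grad)
    return beta

def psm_att(X, d, y):
    """ATT from one-to-one nearest-neighbour matching on the propensity score."""
    beta = fit_logit(X, d)
    score = 1.0 / (1.0 + np.exp(-X @ beta))  # Eq. (3)
    treated = np.flatnonzero(d == 1)
    control = np.flatnonzero(d == 0)
    # For each treated unit, pick the control unit with the closest score.
    gaps = np.abs(score[treated][:, None] - score[control][None, :])
    match = control[gaps.argmin(axis=1)]
    # Matched controls stand in for the unobserved Y0 of treated units.
    return float(np.mean(y[treated] - y[match]))
```

On simulated data in which the treatment probability depends on a covariate that also raises the outcome, the raw difference in means overstates the effect, while the matched estimate is typically much closer to the true value.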

Endogenous switching regression

The ESR model adopts the idea of two-stage estimation and considers the influence of observable and unobservable factors at the same time, which can effectively solve the self-selection bias of households using the Internet. First, regress the decision equation for whether smallholders use the Internet:

$$d_i = \alpha z_i + {\it{\epsilon }}_i$$
(5)

In the formula, di represents whether the farmer i uses the Internet; zi is the various control variables that affect the use of the Internet by smallholders; ϵi is the error term. At the same time, two outcome equations for households using the Internet and those not using the Internet are established as follows:

$$Y_{1i} = \beta _1X_{1i} + \varepsilon _{1i}$$
(6)
$$Y_{0i} = \beta _0X_{0i} + \varepsilon _{0i}$$
(7)

Where Y1i and Y0i represent the technical efficiency of households with and without the Internet, respectively; X1i and X0i are various control variables that affect the technical efficiency; ε1i and ε0i are the error terms. Then, a counterfactual framework is constructed using the model-estimated coefficients to estimate the average treatment effect of Internet use on smallholders’ technical efficiency by comparing the expected differences in technical efficiency between farmers who use the Internet and farmers who do not use the Internet under realistic and counterfactual conditions.

The expected values for the technical efficiency of households using the Internet are as follows:

$$E\left( {Y_{1i}\left| {d_i = 1} \right.} \right) = \beta _1X_{1i} + \sigma _{1{\it{\epsilon }}}\lambda _{1i}$$
(8)

The expected values for the technical efficiency of households not using the Internet are as follows:

$$\begin{array}{*{20}{c}} {E\left( {Y_{0i}\left| {d_i = 0} \right.} \right) = \beta _0X_{0i} + \sigma _{0{\it{\epsilon }}}\lambda _{0i}} \end{array}$$
(9)

The expected value of the technical efficiency of households using the Internet when they do not use the Internet is as follows:

$$\begin{array}{*{20}{c}} {E\left( {Y_{0i}\left| {d_i = 1} \right.} \right) = \beta _0X_{1i} + \sigma _{0{\it{\epsilon }}}\lambda _{1i}} \end{array}$$
(10)

The expected value of technical efficiency when using the Internet in households that do not use the Internet is as follows:

$$\begin{array}{*{20}{c}} {E\left( {Y_{1i}\left| {d_i = 0} \right.} \right) = \beta _1X_{0i} + \sigma _{1{\it{\epsilon }}}\lambda _{0i}} \end{array}$$
(11)

From Eq. (8) and Eq. (10), the average treatment effect on the treated (ATT), that is, the effect of Internet use on the technical efficiency of households that use the Internet, is obtained as:

$$\begin{array}{*{20}{c}} {ATT_i = E\left( {Y_{1i}\left| {d_i = 1} \right.} \right) - E\left( {Y_{0i}\left| {d_i = 1} \right.} \right) = \left( {\beta _1 - \beta _0} \right)X_{1i} + \left( {\sigma _{1{\it{\epsilon }}} - \sigma _{0{\it{\epsilon }}}} \right)\lambda _{1i}} \end{array}$$
(12)

From Eq. (9) and Eq. (11), the average treatment effect on the untreated (ATU), that is, the effect that Internet use would have on the technical efficiency of households that do not use the Internet, is obtained as:

$$\begin{array}{*{20}{c}} {ATU_i = E\left( {Y_{1i}\left| {d_i = 0} \right.} \right) - E\left( {Y_{0i}\left| {d_i = 0} \right.} \right) = \left( {\beta _1 - \beta _0} \right)X_{0i} + \left( {\sigma _{1{\it{\epsilon }}} - \sigma _{0{\it{\epsilon }}}} \right)\lambda _{0i}} \end{array}$$
(13)
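The arithmetic of Eq. (12) and Eq. (13) can be illustrated with a short Python sketch. The text does not define the selection-correction terms λ1i and λ0i, so the sketch assumes the usual probit-based inverse Mills ratios, λ1 = φ(α'z)/Φ(α'z) for users and λ0 = −φ(α'z)/(1 − Φ(α'z)) for non-users; the function names and all numerical inputs are hypothetical.

```python
import math

def _pdf(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def _cdf(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def mills_ratios(index):
    """Assumed lambda_1 (users) and lambda_0 (non-users) at selection index alpha'z."""
    lam1 = _pdf(index) / _cdf(index)
    lam0 = -_pdf(index) / (1.0 - _cdf(index))
    return lam1, lam0

def esr_effect(x, index, beta1, beta0, sigma1, sigma0, user):
    """ATT_i (Eq. 12) when user is True, ATU_i (Eq. 13) otherwise."""
    lam1, lam0 = mills_ratios(index)
    lam = lam1 if user else lam0
    xb_gap = sum((b1 - b0) * xi for b1, b0, xi in zip(beta1, beta0, x))
    return xb_gap + (sigma1 - sigma0) * lam
```

Averaging `esr_effect` over Internet users gives the sample ATT, and averaging it over non-users gives the sample ATU.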

Data and variables

The data in this article come from the China Labor Dynamics Survey (CLDS) conducted by the Social Science Survey Center of Sun Yat-sen University in 2018. The survey adopts a probability sampling method, which involves multi-stage, multi-level sampling proportional to the size of the labor force. The survey data cover three levels: individual labor force members, families, and communities. The survey covers individual basic characteristics, household income, household consumption, rural household production, and community conditions. The research object of this paper is smallholders engaged in grain production, so only the samples involved in grain production are retained. In the data processing, the family questionnaire, the individual household head questionnaire, and the village questionnaire were matched and merged, and samples with zero actual cultivated land area or missing key variables were removed. Finally, we obtained data from 1699 households covering 25 provinces. The locations of the provinces covered by the survey are shown in Fig. 2. In terms of basic characteristics, household heads are mainly aged 50 or above, their education level is mostly junior high school or below, and the family size is mainly 3–4 people. The sample is in line with the reality of an aging and less educated rural labor force in China and is reasonably representative.

Fig. 2 Location of the study area.

Internet use is the core variable of this paper. Referring to existing studies (Shimamoto et al., 2015; Hou et al., 2019; Zheng et al., 2021), we construct a dummy variable to measure it: a value of 1 is given if the household has access to the Internet using a mobile phone, tablet, or computer, and 0 otherwise. In our sample, 59.4% of smallholders use the Internet, which is the mean of the Internet use variable.

Several variables are selected to perform the PSM and the bias-corrected SPF. Based on existing studies, this paper controls for characteristics that affect smallholders’ adoption of the Internet at three levels: individual, household, and village. Whether or not to access the Internet is a household decision, and the head of the family often plays a prominent role in decision-making (Jenkins et al., 2019). Therefore, at the individual level, the age, gender, education level, ethnicity, and health status of the household head were selected as control variables. At the household level, this paper selects household size and economic income to control for the impact of household characteristics on Internet use decisions. Household Internet use is also closely related to local infrastructure. The topography, scale, and geographical location of the village can all affect the construction of the village’s network facilities (Forman et al., 2005; Shaffril et al., 2010). Therefore, at the village level, this paper selects the topography of the village, the scale of the village, and the distance from the village to the county seat as control variables. The descriptive statistics of these variables are shown in Table 1.

Table 1 The descriptive statistics of variables.

According to Table 1, the household heads of small farmers are mainly male, and the average age is 55, which is consistent with the current aging trend of the rural labor force in China. The average education level of the household heads is 2.503. It can be seen that, on average, the education level of the sample is below junior high school, indicating a low education level. The average household size is 4.419, indicating that the total population of smallholder households averaged around 4–5 people. In general, the characteristics of the sample are basically in line with the basic situation of China’s rural areas at this stage and are representative to a certain extent.

Results and discussion

Technical efficiency estimation

A likelihood ratio (LR) test was used to compare the stochastic frontier estimates under the Cobb–Douglas (C–D) production function and the translog production function. The LR statistic was 59.22, so the null hypothesis that the squared and interaction terms are jointly zero was rejected. Therefore, the translog production function is more suitable.
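As a quick check of this test, the sketch below computes the chi-square p-value for the reported statistic. The LR statistic is −2[ln L(C–D) − ln L(translog)]; the degrees of freedom are taken here to be 6, on the assumption that the C–D form drops the three squared and three cross terms of Eq. (1). The text does not report the degrees of freedom or the critical value used, so both are assumptions of this illustration.

```python
import math

def chi2_sf_even(x, df):
    """Upper-tail probability of a chi-square variable with an even df (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half**j / math.factorial(j) for j in range(df // 2))

lr_stat = 59.22    # LR statistic reported in the text
restrictions = 6   # assumed: three squared and three cross terms set to zero
p_value = chi2_sf_even(lr_stat, restrictions)
print(f"p-value = {p_value:.3g}")
```

Under these assumptions the p-value is far below 0.01, which is consistent with the rejection of the C–D form.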

The results of the maximum likelihood estimation of the production function using the conventional SPF method and the bias-corrected SPF method are shown in Table 2. The estimation results of the bias-corrected SPF model show that the selection bias term, ρ, is statistically significant at the five percent level, indicating that selection bias does exist. Therefore, it is necessary to use the bias-corrected SPF method for technical efficiency estimation. The estimated coefficients of the input variables show that labor and land inputs have a significant impact on output, and that labor contributes more to output. This could be due to China’s population aging and the outflow of high-quality labor from the countryside, which results in a large output elasticity of labor (Li and Sicular, 2013). In addition, the extensive production mode of high input and high consumption among smallholders in China makes output growth more dependent on basic factors such as labor and land (Yao and Liu, 1998).

Table 2 Parameter estimation results of the SPF model.

The mean difference in technical efficiency between Internet users and non-users, estimated using the conventional SPF and the bias-corrected SPF, is presented in Fig. 3. On average, households using the Internet have higher technical efficiency than households that do not. Furthermore, the gap between the two groups is larger when technical efficiency is calculated using the conventional SPF. Therefore, it is necessary to correct for selection bias to obtain accurate estimation results.

Fig. 3 Descriptive analysis of technical efficiency by Internet use.

The determinants of Internet use

This paper employs a logit regression model to estimate the propensity score for household Internet use. The regression results reported in Table 3 show that the probability of smallholders using the Internet is affected by many factors, including the individual characteristics of the household head, the characteristics of the household, and the characteristics of the village. The age of the household head has a significant negative impact on the probability of households using the Internet: for each additional year of the household head’s age, the probability of a household accessing the Internet decreases by 1.1%. The education level of the household head has a significant positive impact on Internet use: the probability of a household being connected to the Internet increases by 4.2% for each additional level of education. This is because farmers with higher education possess greater learning abilities and cognitive skills, making them more receptive to new ideas and technologies (Jung, 2008). In addition, the health status of the household head has a significant positive impact on Internet adoption, which aligns with the findings of Charness and Holley (2004).

Table 3 Estimated results of factors influencing Internet use decisions.

At the level of household characteristics, household size has a significant positive impact on Internet use. For every additional person in a household, the probability of having Internet access increases by 1.7%. A possible reason is that members of larger families find it more difficult to stay in touch with one another and are therefore more likely to use Internet technology to communicate (Sharaievska, 2017). Household income also has a significant positive impact on Internet use. For each unit increase in income, the probability of Internet access increases by 9.0%. This may be because households with higher income can afford the cost of Internet access and have more funds available to improve their operating efficiency through Internet technology, so they are more likely to adopt it.

At the level of village characteristics, village topography has a significant impact on farmers’ use of Internet technology. When the village is located on a plain, the probability of farmers accessing the Internet increases by 6.0%. This is because household Internet use is closely related to the quality of local network infrastructure. In villages with flat terrain, the cost of infrastructure such as optical fiber cables is lower and Internet coverage is higher. Thus, smallholders are more likely to use the Internet.
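The percentage effects reported above read as marginal effects of a logit model. As a minimal sketch of how such a figure can be obtained, the Python snippet below computes propensity scores and an average marginal effect from a fitted logit. The coefficients, the two covariates, and the four households are hypothetical values chosen for illustration; they are not the estimates in Table 3, and the paper does not state which type of marginal effect it reports.

```python
import math

def logistic(z):
    """Standard logistic function: maps a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def propensity_scores(rows, intercept, coefs):
    """Predicted probability of Internet use for each household."""
    return [
        logistic(intercept + sum(b * x for b, x in zip(coefs, row)))
        for row in rows
    ]

def average_marginal_effect(rows, intercept, coefs, k):
    """Sample average of dP/dx_k = beta_k * p * (1 - p) for covariate k."""
    scores = propensity_scores(rows, intercept, coefs)
    return sum(coefs[k] * p * (1.0 - p) for p in scores) / len(scores)

# Hypothetical households: (age of household head, education level).
households = [(40, 3), (55, 2), (62, 2), (70, 1)]
# Hypothetical logit coefficients, NOT the paper's estimates.
intercept, coefs = 1.5, [-0.05, 0.20]

scores = propensity_scores(households, intercept, coefs)
ame_age = average_marginal_effect(households, intercept, coefs, 0)
ame_edu = average_marginal_effect(households, intercept, coefs, 1)
print("propensity scores:", [round(s, 3) for s in scores])
print("average marginal effect of age:", round(ame_age, 4))
print("average marginal effect of education:", round(ame_edu, 4))
```

Because the logit probability is nonlinear, a coefficient translates into a different change in probability for each household; averaging over the sample gives a single summary number per covariate.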

Average treatment effect on the impact of Internet use on smallholders’ technical efficiency

Common support domain test

To ensure the validity of the PSM estimation, the data must be tested for common support. That is, most of the propensity scores of both the treatment group and the control group must fall within the common range. Otherwise, too many observations are dropped, which can affect the accuracy of the matching results. Figure 4 displays the common range of the propensity scores after radius matching. There is substantial overlap between the treatment group and the control group after matching, with only a small loss of observations. This indicates that the common support condition is met and the matching results are reliable.

Fig. 4 Common range of propensity scores (radius matching).
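The common support check can be illustrated with the widely used minima-maxima rule. The sketch below assumes this rule and uses hypothetical propensity scores; the paper does not state which trimming rule was applied.

```python
def common_support(treated_scores, control_scores):
    """Overlap region under the minima-maxima rule (an assumed rule)."""
    lower = max(min(treated_scores), min(control_scores))
    upper = min(max(treated_scores), max(control_scores))
    return lower, upper

def split_by_support(scores, lower, upper):
    """Separate scores inside the overlap region from those outside it."""
    kept = [s for s in scores if lower <= s <= upper]
    dropped = [s for s in scores if not (lower <= s <= upper)]
    return kept, dropped

# Hypothetical propensity scores, for illustration only.
treated = [0.35, 0.50, 0.62, 0.80, 0.93]
control = [0.10, 0.22, 0.35, 0.48, 0.70, 0.85]

lower, upper = common_support(treated, control)
kept_t, dropped_t = split_by_support(treated, lower, upper)
kept_c, dropped_c = split_by_support(control, lower, upper)
print("common support:", (lower, upper))
print("treated dropped:", dropped_t)
print("controls dropped:", dropped_c)
```

Observations outside the overlap region have no comparable counterpart in the other group, which is why a large number of dropped observations would undermine the matching estimates.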

Balance test

To ensure that the covariates do not differ significantly between the treatment group and the control group after matching, this study further conducts a balance test. The test results for the three matching methods are shown in Table 4. After matching, the Pseudo-R2 values decreased significantly, and all were below 0.01; the LR statistic decreased from 330.290 to below 15; the mean and median deviations also decreased significantly. In addition, the B value, the standardized mean difference of the matched models, dropped below the 25% threshold proposed by Rubin (2001) in all cases. In conclusion, after matching, every indicator satisfies the balance test. There is thus no statistically significant difference in any covariate between the two groups, and the data are well balanced.

Table 4 Balance test results of explanatory variables before and after matching.
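One ingredient of such a balance test is the standardized bias of each covariate, whose mean and median across covariates are commonly reported as the mean and median deviation. The sketch below computes it for a single covariate before and after matching, assuming the usual pooled-variance definition. The ages are hypothetical, and the snippet does not reproduce the Pseudo-R2, LR, or B statistics in Table 4.

```python
import math
import statistics

def standardized_bias(treated, control):
    """Percentage standardized difference in means for one covariate."""
    pooled_sd = math.sqrt(
        (statistics.variance(treated) + statistics.variance(control)) / 2.0
    )
    return 100.0 * (statistics.mean(treated) - statistics.mean(control)) / pooled_sd

# Hypothetical ages of household heads, for illustration only.
age_treated = [50, 52, 54, 56, 58]
age_control_before = [58, 60, 62, 64, 66]  # all controls, before matching
age_control_after = [51, 52, 55, 56, 57]   # matched controls only

bias_before = standardized_bias(age_treated, age_control_before)
bias_after = standardized_bias(age_treated, age_control_after)
print("standardized bias before matching (%):", round(bias_before, 2))
print("standardized bias after matching (%):", round(bias_after, 2))
```

A sharp fall in the absolute bias after matching indicates that the matched control households resemble the treated households on that covariate.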

The estimated result of the PSM model

In this study, three matching methods, k-nearest neighbor matching, radius matching, and kernel matching, are used to estimate the impact of Internet use on smallholders’ technical efficiency. The results are shown in Table 5. The estimates obtained by the three matching methods are consistent, indicating that the results are robust. Regardless of whether the conventional SPF or the bias-corrected SPF is used to measure technical efficiency, the PSM estimates show that Internet use has a significant positive impact on smallholders’ technical efficiency. That is, the use of the Internet increases the technical efficiency of grain production by smallholders, which is consistent with previous studies on bananas (Zheng et al., 2021) and apples (Zhu et al., 2021). The last row of Table 5 presents the average of the estimates obtained from the three matching methods. After correcting for bias, the technical efficiency of smallholders using the Internet is 0.011 higher than that of smallholders who do not use the Internet. Furthermore, regardless of the matching method used, the impact of Internet use on households’ technical efficiency is underestimated when the conventional SPF is used to measure technical efficiency. Therefore, it is reasonable to use the bias-corrected SPF model to measure technical efficiency.

Table 5 Average treatment effect of Internet use on technical efficiency.
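To make the matching step concrete, the sketch below estimates the average treatment effect on the treated (ATT) by radius matching. It is a simplified illustration under stated assumptions, not the procedure behind Table 5: the propensity scores, technical efficiency values, and the radius of 0.05 are all hypothetical.

```python
def att_radius(treated, controls, radius):
    """ATT by radius matching.

    treated, controls: lists of (propensity_score, outcome) pairs.
    Each treated unit is compared with the mean outcome of all controls
    whose propensity score lies within `radius` of its own score.
    Returns the ATT and the number of treated units that were matched.
    """
    gaps = []
    for p_t, y_t in treated:
        matches = [y_c for p_c, y_c in controls if abs(p_c - p_t) <= radius]
        if matches:  # treated units with no control inside the radius are dropped
            gaps.append(y_t - sum(matches) / len(matches))
    if not gaps:
        raise ValueError("no treated unit has a control within the radius")
    return sum(gaps) / len(gaps), len(gaps)

# Hypothetical (propensity score, technical efficiency) pairs.
users = [(0.30, 0.82), (0.55, 0.85), (0.90, 0.88)]
non_users = [(0.28, 0.80), (0.33, 0.82), (0.52, 0.84), (0.58, 0.82)]

att, n_matched = att_radius(users, non_users, radius=0.05)
print("ATT:", round(att, 4), "matched treated units:", n_matched)
```

Nearest neighbor matching would instead keep only the k closest controls, and kernel matching would weight all controls by their distance to the treated unit; the three methods differ only in how the comparison outcome is constructed.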

Robustness test

Sensitivity analysis

The PSM method matches mainly on observable variables, so omitting important unobservable variables causes the problem of “hidden bias”. To further test the robustness of the results, this paper uses the boundary method to assess the sensitivity of the PSM estimation results to hidden bias. The results are shown in Table 6. The parameter Gamma measures the influence of unobserved factors on whether farmers use the Internet. If the conclusion becomes insignificant when Gamma is close to 1, the PSM results are considered less robust. In this paper, the hidden bias analysis is carried out for the three matching methods. As Gamma increases from 1 to 2, the results become sensitive only when Gamma exceeds 1.5, 1.6, and 1.8, respectively. This shows that the hidden bias problem in the PSM estimation can be ignored and that the estimation results based on the PSM model are robust.

Table 6 Boundary sensitivity analysis.
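The role of Gamma can be stated formally if the boundary method is taken to be the Rosenbaum bounds approach, which is an assumption made here for illustration. For two matched households \(i\) and \(j\) with identical observed covariates, let \(\pi_i\) and \(\pi_j\) denote their probabilities of using the Internet (notation introduced here). Gamma then bounds the odds ratio of treatment:

```latex
\[
\frac{1}{\Gamma} \;\le\; \frac{\pi_i \,(1 - \pi_j)}{\pi_j \,(1 - \pi_i)} \;\le\; \Gamma,
\qquad \Gamma \ge 1
\]
```

When \(\Gamma = 1\), the two households are equally likely to use the Internet and there is no hidden bias. Larger values of \(\Gamma\) allow unobserved factors to shift the odds of treatment by a larger factor, so a conclusion that remains significant at higher \(\Gamma\) is less sensitive to hidden bias.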

Estimated results of the ESR model

The ESR model accounts for the influence of both observed and unobserved factors, which can effectively address selection bias. Therefore, this study uses the ESR model to re-estimate the impact of Internet use on households’ technical efficiency and to confirm the robustness of the PSM results. This paper uses the proportion of Internet users in the same village as the instrumental variable to address the endogeneity issue. First, the proportion of Internet users in the same village reflects, to a certain extent, the quality and coverage of network facilities in the village, so it is related to household Internet use. Second, Internet use by other households does not affect the technical efficiency of a given smallholder, which supports the exogeneity of the instrumental variable. Furthermore, the Cragg-Donald Wald F statistic is 129.670, indicating that the instrumental variable does not suffer from weak instrument bias. Therefore, the instrumental variable selected in this paper is reasonable and effective.

Table 7 reports the treatment effect of Internet use on technical efficiency obtained from the ESR model. The average treatment effects for households with and without Internet access are 0.004 and 0.003, respectively, both significant at the 1% level. This indicates that Internet use has a positive treatment effect on technical efficiency: Internet use raises the technical efficiency of households that use the Internet by 0.004, and households that do not use the Internet would raise their technical efficiency by 0.003 if they did. This is consistent with the average treatment effect obtained from the PSM model above, indicating that the estimated effect of Internet use on technical efficiency is relatively stable.

Table 7 Average treatment effect of Internet use on smallholders’ technical efficiency.
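The two effects in Table 7 correspond to the usual counterfactual comparisons in a switching regression framework. In the sketch below, the notation is an assumption introduced for illustration: \(D_i = 1\) if household \(i\) uses the Internet, and \(TE_{1i}\) and \(TE_{0i}\) are its technical efficiency with and without Internet use.

```latex
\begin{align}
ATT &= E\left[TE_{1i} \mid D_i = 1\right] - E\left[TE_{0i} \mid D_i = 1\right] \\
ATU &= E\left[TE_{1i} \mid D_i = 0\right] - E\left[TE_{0i} \mid D_i = 0\right]
\end{align}
```

The ATT compares the observed efficiency of Internet users with what they would have achieved without the Internet, while the ATU compares what non-users would achieve with the Internet against their observed efficiency. Under this reading, the reported values of 0.004 and 0.003 are the ATT and the ATU, respectively.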

Mechanism analysis

The above regression results show that Internet use can significantly improve the technical efficiency of households, but they do not explain how Internet use affects the technical efficiency of farmers. This paper therefore empirically tests the three potential channels proposed in the above analysis. The availability of financial loans for smallholders is measured by whether the household borrows. The social capital of households is measured by the total amount of household gift payments. The mechanized farming of smallholders is measured by whether machinery is used in the process of grain production.

Table 8 reports the PSM estimation results for the above three channels, and the results of the three matching methods are consistent. Specifically, Internet use significantly increases the probability of borrowing, which indicates that Internet technology improves the availability of financial lending. With the advantages of Internet finance, households can obtain funds promptly, thereby alleviating the cost pressure of applying new technology. Internet use has a positive effect on household gift expenditure at the 1% significance level, indicating that Internet use expands the social network of households. As an efficient and convenient communication tool, Internet technology reduces the cost of communication among smallholders and promotes the dissemination and application of production technology. Internet use also significantly increases the likelihood that households use machinery for production. With the help of Internet technology, smallholders’ demand for production services can be met, so that those who cannot afford to buy machinery can still use it. They thereby benefit from the division of labor and the economies of scale brought about by external service providers, which improves technical efficiency. To sum up, the role of the Internet in alleviating financial constraints, enhancing social capital, and promoting mechanized farming has been verified.

Table 8 Estimated results of potential mechanisms of Internet use on technical efficiency.

Conclusion and policy recommendation

Improving the technical efficiency of agricultural production is an urgent issue in China, as is the case in many developing countries. In this study, we examine the impact of Internet use on smallholders’ technical efficiency using data from a sample of 1699 smallholders engaged in grain crop production in China. We found that the decision of households to access the Internet was significantly influenced by the individual characteristics of the head of the household, the characteristics of the household, and the characteristics of the village. The education level and health status of the household head, the income and size of the household, and the plain terrain of the village significantly increased the likelihood of households using the Internet. An increase in the age of the head of the household will reduce the probability of households using the Internet.

We also found that Internet use is positively associated with the technical efficiency of grain production. Sample selection bias arising from unobservable factors such as risk appetite affects the accuracy of the estimates. After correcting for selection bias, the estimated effect of Internet use on the technical efficiency of grain production becomes larger. That is, failure to address selection bias would result in underestimating the impact of Internet use on smallholders’ technical efficiency. Our results obtained using an endogenous switching regression model also confirm the effect of Internet use on technical efficiency. Although the research topics are different, the findings are consistent with Reddy’s (2018) study, both supporting some positive transformative trends of the Internet in the agricultural sector.

In addition, we also verified the potential channels through which the use of the Internet affects the technical efficiency of grain production. The results show that the use of the Internet affects technical efficiency through three channels. First, the Internet can improve the availability of financial loans and alleviate the cost pressure of applying new technologies. Second, Internet technology can expand the social capital of households, which is conducive to the dissemination and application of production technology. Third, Internet technology has improved the mechanized farming level of farmers, effectively resolved the structural contradiction of the allocation of human and land factors, and thus improved technical efficiency.

The following policy implications are drawn from the results of this study. Firstly, our research emphasizes the positive role of Internet use in improving the technical efficiency of grain production. In practice, policymakers should not only increase capital investment in rural Internet infrastructure but also formulate policies to reduce network access costs, so as to increase Internet penetration in rural areas. Secondly, policies aimed at promoting the use of emerging technologies should fully consider the heterogeneous characteristics of smallholders and determine specific promotion measures suitable for different groups. Smallholders should be encouraged to access information through various channels, including traditional agricultural production services and farmers’ organizations, as well as agricultural information platforms provided through social media and applications. Thirdly, policymakers can establish technical training centers to provide technical advisory services to farmers, improve rural education and training opportunities, and assist farmers in adopting Internet technology and new agricultural production techniques. Finally, in the process of improving the technical efficiency of grain production, we cannot rely solely on the Internet. The government can cooperate with banks, research institutions, cooperatives, agricultural enterprises, and other entities to develop diverse policy measures that provide financial, technological, and production support to smallholders. Examples include establishing agricultural development funds to offer credit support, promoting technology research and dissemination, and establishing modern agricultural production service systems.

This article has some limitations. First, limited by the survey data, the measure of Internet use in this paper covers only basic network access, so it cannot accurately capture the extent to which the Internet provides farmers with effective information to help them make production decisions. If more detailed Internet usage data can be collected in the future, it will provide more reliable support for the analysis. Second, the development of the Internet is a dynamic process. This paper uses cross-sectional data to analyze the impact of Internet use on the technical efficiency of grain crop production, so it cannot measure the dynamic impact of the Internet. Further research using panel data is needed.