Introduction

Petroleum is generated via thermal alteration of buried organic matter in source rocks, followed by oil expulsion (primary migration) out of those source rocks. Petroleum accumulations are formed mainly via subsequent secondary migration from source rocks through carrier beds to traps. Information about the directions, pathways and distances of secondary petroleum migration is required in the search for new petroleum resources. However, secondary petroleum migration remains the least understood of the processes involved in petroleum accumulation. Because fractionation of polar organic molecules can result from preferential sorption of these compounds in migrating oils onto immobile mineral surfaces or from partitioning into formation water, molecular indices that correlate solely with the absolute or relative distances migrated by oils have been sought for decades1,2, but with limited success.

Nitrogen-, sulfur- and oxygen-containing compounds exhibit strong sorption on minerals and/or high solubility in water due to their polarities and thus variations in the distribution of these molecules are used to study oil migration processes3,4. However, those compounds with high solubility in water such as alkylphenols can be easily affected by water saturation, water washing and injection for enhancement of oil production and are sensitive to any oil-water interactions in the subsurface environment5. Concentrations and ratios of carbazoles (nitrogen-containing compounds) were previously used as proxies of secondary migration distances2,6,7,8,9,10,11,12,13,14, based on their low solubility in water15. However, recent studies show that these empirical indicators do not solely reflect migration-related fractionations and thus do not actually correlate with migration distance, because their concentrations and ratios can also be affected by variations in organic facies (such as marine, lacustrine, or terrigenous organics; anoxic or suboxic depositional environment; carbonate or shale lithology) and thermal maturation of source rocks as well as biodegradation of oils8,10,16,17,18,19,20. In addition, it appears that properties of migration systems, such as porosity, sorption coefficients, oil saturation and oil volume, may also influence the utility of these tracers4.

Among these influences, the biodegradation effect on carbazoles is negligible when biodegradation levels are less than 3 on the scale of Peters and Moldowan (1993)20,21. On the other hand, source input influences due to variations in source facies and maturity of organic matter, a parameter related to the maximum temperature experienced by source rocks at the time of oil expulsion, are significant and thus cannot be ignored8,10,16,17,19,22. The influence due to the variability of source facies can be minimized by grouping oils according to their source facies22. However, the maturity effect has been difficult to evaluate and has impeded studies of secondary petroleum migration.

It is difficult, if not impossible, to find an oil component in nature that is independent of source input influences. Nevertheless, it is feasible to set up a secondary migration fractionation index (SMFI) that is independent of source input influences, reflects only migration-related fractionation and thus correlates directly with migration distance. Here, we advance the concept of SMFI as a reliable measure of migration fractionation and migration distances for a uniform migration system, where porosity, density of solids, sorption coefficients, migration velocity of oil and oil saturation are kept constant. More realistic migration systems with variable properties could be treated by dividing them into subsections with constant properties. The SMFI is defined as the ratio of the concentration of a large polar compound (heavier than 160 Dalton) present at low concentration (e.g. carbazoles) to its initial concentration at a reference point for each source facies. In other words, the concentration of the large polar compound is the product of the initial concentration (controlled by source input influences) and the index that characterizes fractionation solely as a function of secondary migration distance (see Equations (1), (4) and (5) in the methods section). Neither the oil volume passing through a carrier bed nor multiple charging events affect the validity of the ratio when appropriate compounds with very low concentrations in petroleum are selected as tracers (see details in the multiple charging and oil volume section in the online Supplementary Information). This new SMFI is fully described in the methods section and mathematically derived in the Supplementary Information based on the mass balance principle and advection-reaction-dispersion theory.

We then apply and test the validity of the new index in the Ordos and Western Canada Sedimentary basins, where neither concentrations nor ratios of carbazoles effectively reflect migration-related fractionations and thus the distances of secondary petroleum migration. The narrow, long, continuous and clay-rich sand body of the Xifeng Oilfield in the Ordos Basin provides an excellent opportunity to test the index. The Rimbey-Meadowbrook reef trend in the Western Canada Sedimentary Basin is a classical example used to develop the Gussow theory23 of differential petroleum entrapment involving long distance migration along the reef trend in the up-dip direction. However, this theory is still debated, mainly because the empirical indicators do not show obvious migration fractionations for most oils along the reef trend. We demonstrate that our SMFI fits the actual data well and is a reliable odometer for the distance of secondary petroleum migration, and we provide supporting evidence for the Gussow theory. The new index represents a significant step forward in petroleum geoscience as it can be used to reveal the distribution patterns of petroleum accumulations in sedimentary basins and to study theories on petroleum accumulation.

Results

We first tested the utility of the SMFI in the Xifeng Oilfield in the southwest part of the Ordos Basin in China (Fig. 1a). Much of the basin lacks well-developed fault systems except in areas along the basin margins24. The reservoirs of this field are distributed in a narrow, long, continuous and clay-rich sand-body in the Eighth member of the Upper Triassic Yanchang Formation with low porosity (5.4–16.6%, 9.9% on average) and low permeability (0.1–36.9 millidarcy (mD), mainly 0.6–3.0 mD)25,26. The sand bodies in this member in the southwest part of the basin (Fig. 1) were deposited in the delta front of a braided river system27. The sedimentary microfacies of these sand bodies include submerged distributary channels and mouth bars28. The sand bodies have undergone various stages of diagenesis29, leading to low porosity and permeability of reservoirs in the field. The main source rocks are dark oil shales in the Seventh member of the Yanchang Formation30, with the peak oil generation and migration occurring at the end of the early Cretaceous31. The source kitchen is mainly distributed to the northeast of the Xifeng Oilfield (Fig. 1b).

Figure 1

Distributions of oilfields and the main source rocks in the Ordos Basin, China.

(a) Structural contour map for the top of the Eighth member of the Upper Triassic Yanchang Formation (modified from Chen et al., 2006 and Zhang et al., 2009)45,46 and the oilfield distribution in the basin. The contours are in meters below and above sea level as indicated by plus and minus signs in front of numbers. (b) Distributions of source rocks in the Seventh member of the formation (thickness in meters) and sand bodies in the Eighth member in the southwest part of the basin (modified from Yang and Zhang, 2005)30, together with sampling locations.

Nineteen crude oil samples were collected from the producing wells in the Xifeng Oilfield (Fig. 1b). These samples were analyzed for saturated and aromatic hydrocarbons and carbazoles, following the procedures described previously7,32 (data are presented in Supplementary Tables S1–S4). The relative uncertainties of the data are typically <10%.

These data show that oils in the Xifeng Oilfield have consistent source facies (see Supplementary Figs. S1–S2 and relevant discussion in the Supplementary Information) and thus there is no need to separate these oils into genetic groups. The extent of biodegradation in the Xifeng Oilfield is below level 1 on the biodegradation scale of Peters and Moldowan21 and thus the biodegradation effect on carbazoles can be safely ignored. This is further demonstrated in detail in the Supplementary Information (the biodegradation level subsection).

To calculate the relative migration distance, a reference point is needed (see details in the methods section). The reference point determined for the Xifeng Oilfield is located at the northeastern edge of the sand body, close to the source kitchen, as shown in Fig. 1b. The relative distance is defined as the length of the trend line of the sand body from the reference point to the projected point of a sampling well on the trend line. The relative distance was calculated for all samples in this manner and is listed in Supplementary Table S1.
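The projection step described above can be sketched in a few lines. This is a minimal illustration, assuming the trend line is digitized as a polyline in map coordinates (km) whose first vertex is the reference point; the function name `distance_along_trend` and all coordinates are hypothetical and are not taken from the study.

```python
import math

def distance_along_trend(trend, well):
    """Project `well` onto the polyline `trend` and return the along-line
    distance (same units as the coordinates) from trend[0] to the projection."""
    best = None  # (perpendicular offset, along-line distance)
    travelled = 0.0
    for (x1, y1), (x2, y2) in zip(trend, trend[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg_len = math.hypot(dx, dy)
        if seg_len == 0.0:
            continue  # skip duplicated vertices
        # Fractional position of the projection on this segment, clamped to [0, 1].
        t = ((well[0] - x1) * dx + (well[1] - y1) * dy) / seg_len ** 2
        t = max(0.0, min(1.0, t))
        px, py = x1 + t * dx, y1 + t * dy
        offset = math.hypot(well[0] - px, well[1] - py)
        if best is None or offset < best[0]:
            best = (offset, travelled + t * seg_len)
        travelled += seg_len
    if best is None:
        raise ValueError("trend line needs at least one non-degenerate segment")
    return best[1]

# Hypothetical coordinates in km; trend_line[0] plays the role of the reference point.
trend_line = [(0.0, 0.0), (30.0, 0.0), (30.0, 40.0)]
print(distance_along_trend(trend_line, (10.0, 5.0)))   # well beside the first segment
print(distance_along_trend(trend_line, (33.0, 20.0)))  # well beside the second segment
```

The well is assigned to the nearest segment, so a bend in the sand body is handled by accumulating the lengths of the segments already passed.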

Many maturity parameters vary along the sand body of the Xifeng Oilfield (Supplementary Figs. S3–S4). Vitrinite reflectance (Ro) is a commonly used thermal maturity indicator of organic matter in source rocks. Maturity levels of oils are quantitatively determined as a vitrinite reflectance equivalent (Ro(equiv.)). To constrain the thermal maturity range of the studied oils, Ro(equiv.) values were calculated using aromatic hydrocarbons33,34 as detailed in the thermal maturity subsection of the Supplementary Information. The calculated Ro(equiv.) values (Ro(equiv.) = 0.14(4,6-DMDBT/1,4-DMDBT) + 0.57; DMDBT = dimethyl dibenzothiophene)34 are in a narrow range of 0.69% to 0.77% and show a clear decreasing trend with increasing distance throughout the oilfield (Fig. 2).
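As a small worked illustration of the calibration quoted above, the following sketch evaluates Ro(equiv.) from a pair of DMDBT abundances. The function name and the input values are hypothetical; only the coefficients 0.14 and 0.57 come from the text.

```python
def ro_equivalent(dmdbt_46, dmdbt_14):
    """Vitrinite reflectance equivalent (%) from the 4,6-/1,4-DMDBT ratio."""
    if dmdbt_14 <= 0:
        raise ValueError("1,4-DMDBT abundance must be positive")
    return 0.14 * (dmdbt_46 / dmdbt_14) + 0.57

# Hypothetical peak areas (any consistent unit), not data from the study.
print(round(ro_equivalent(1.2, 1.0), 3))
```

Under this relation, the reported range of 0.69% to 0.77% corresponds to 4,6-DMDBT/1,4-DMDBT ratios of roughly 0.86 to 1.43.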

Figure 2

Distribution of equivalent vitrinite reflectance (%Ro) calculated from 4,6-DMDBT/1,4-DMDBT ratio along the sand body of the Xifeng Oilfield.

Ro(equiv.) = 0.14(4,6-DMDBT/1,4-DMDBT) + 0.57 (Luo et al., 2001)34. DMDBT = dimethyl dibenzothiophene.

The carbazoles of the studied oils display a predominance of alkylcarbazoles over benzocarbazoles. The ratios of alkylcarbazoles/(alkyl- + benzocarbazoles) are all close to unity (Supplementary Table S4). Therefore, we focused on alkylcarbazoles. The concentrations of alkylcarbazoles decrease with increasing migration distance (Figs. 3a–f) and were thought to reflect secondary petroleum migration6,7,8,9,10,11,12,13,14. If so, their ratios should also be distance indicators. However, as shown in Figs. 3g–l, this expectation is not supported by the ratios of N–H exposed/partially exposed, exposed/shielded and partially exposed/shielded dimethylcarbazole isomers. Clearly, other factors are involved and must be separated out before these tracers can be used to track secondary migration distance. The differences in chemical sorption activity of alkylcarbazole isomers for hydrogen bond formation arise mainly from steric effects related to alkylation position35. The sorption of N–H exposed alkylcarbazole isomers (e.g., 2,7-dimethylcarbazole) is stronger than that of alkylcarbazole isomers with partially exposed N–H (e.g., 1,7-dimethylcarbazole). The sorption of partially exposed isomers is in turn stronger than that of N–H shielded alkylcarbazole isomers (e.g., 1,8-dimethylcarbazole)7,35,36,37,38,39. Owing to this shielding effect, the ratios in Figs. 3g–l should decrease with increasing relative migration distance if fractionations without source input influences occurred during secondary petroleum migration. However, this decreasing trend is not evident in the data, likely because maturity variations among these oils influence the alkylcarbazoles in this oilfield.

Figure 3

Distributions of MCA, DMCA, EDMCA, PEDMCA, SDMCA and their ratios of the studied oils along the sand body of the Xifeng Oilfield.

MCA: methyl carbazoles; DMCA: dimethyl carbazoles; EDMCA, PEDMCA and SDMCA: exposed, partially exposed and shielded DMCA; SDMCA: 1,8-DMCA; 2,7/1,4-DMCA: 2,7-DMCA/1,4-DMCA; 2,7/1,8-DMCA: 2,7-DMCA/1,8-DMCA; 1,7/1,8-DMCA: 1,7-DMCA/1,8-DMCA.

To isolate the maturity influence, we set up a maturity influence index that quantifies the maturity effect (Equation (3) in the methods section). The values of the maturity influence index are calculated from the derivative $dR_o/dx$, where $x$ is the relative migration distance, and from the constants $a_2$ and $a_3$, where $a_2$ is the rate of change of initial concentration with $R_o$ and $a_3$ is related to sorption and oil migration velocity (Equations (S5), (S8) and (S20) in the online Supplementary Information). Note that $a_1$ cancels out in Equation (3). Although the constants $a_2$ and $a_3$ have fixed geochemical meanings, their values are determined by the properties of a specific migration system, as represented by the geochemical data of the system. To determine the values of $a_2$ and $a_3$, non-linear regression analysis was performed using Equation (1) in the methods section, with the relative distances in Supplementary Table S1, the $R_o$(equiv.) values in Supplementary Table S3 and the concentrations of alkylcarbazoles in Supplementary Table S4 as input data. The results are listed in Supplementary Table S5. The $a_2$ values vary from <0 to >50. Values less than zero indicate that the concentrations of these alkylcarbazoles in the studied oils decrease with maturity. Studies of source rocks have also revealed that benzocarbazole concentrations decrease with maturity over a similar maturity range of 0.68% to 0.78% in $R_o$ (ref. 16). This may represent a stage of dilution of some carbazoles due to a preferential increase of other components.
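The regression step can be illustrated with a dependency-free sketch. It assumes that Equation (1) has the form C = a1 (1 + a2 Ro) exp(a3 x), and it replaces the non-linear regression used in the study with a simple profile search: a2 is scanned over a grid and, for each trial value, ln(a1) and a3 follow from ordinary linear regression in log space. The function name, the grid and the synthetic data are all hypothetical, and the log-space fit weights residuals differently from a direct non-linear least-squares fit.

```python
import math

def fit_profile(x, ro, conc, a2_grid):
    """Fit C = a1 * (1 + a2*Ro) * exp(a3*x) by scanning a2 and solving the
    remaining log-linear problem ln(C) - ln(1 + a2*Ro) = ln(a1) + a3*x."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    best = None
    for a2 in a2_grid:
        factors = [1.0 + a2 * r for r in ro]
        if min(factors) <= 0.0:
            continue  # the log transform needs a positive initial concentration
        y = [math.log(c) - math.log(f) for c, f in zip(conc, factors)]
        mean_y = sum(y) / n
        a3 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
        ln_a1 = mean_y - a3 * mean_x
        sse = sum((yi - ln_a1 - a3 * xi) ** 2 for xi, yi in zip(x, y))
        if best is None or sse < best[0]:
            best = (sse, math.exp(ln_a1), a2, a3)
    if best is None:
        raise ValueError("no admissible a2 value in the grid")
    return best[1:]  # (a1, a2, a3)

# Synthetic, noise-free data generated from assumed parameters (not study data).
true_a1, true_a2, true_a3 = 10.0, 2.0, -0.04
x_km = [0.0, 8.0, 15.0, 22.0, 30.0, 39.0]
ro_pct = [0.77, 0.74, 0.76, 0.71, 0.72, 0.69]
conc = [true_a1 * (1 + true_a2 * r) * math.exp(true_a3 * d)
        for d, r in zip(x_km, ro_pct)]
grid = [round(-1.0 + 0.1 * i, 1) for i in range(61)]  # trial a2 from -1.0 to 5.0
a1, a2, a3 = fit_profile(x_km, ro_pct, conc, grid)
print(a1, a2, a3)
```

With real data the grid would need to span the reported range of a2 (from below 0 to above 50), and a2 and a3 are only well constrained when Ro is not perfectly correlated with distance.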

The sand body of the Xifeng Oilfield can be divided into two sections according to $dR_o/dx$ (Fig. 2). The $dR_o/dx$ value in the section from 62 to 90 km is very low (Fig. 2), resulting in small values of the maturity influence index (<5%, Supplementary Table S5). In Figs. 3g–i and k–l, the ratios of alkylcarbazoles in the section between 62 and 90 km show a weak decreasing trend with increasing relative migration distance. In the section from 51 to 62 km, however, the values of the maturity influence index reach 13.3–51.1% (Supplementary Table S5) and fractionation of alkylcarbazoles is not apparent (Figs. 3g–l). These observations illustrate that when the maturity influence index is <5%, the maturity influence is not evident, whereas when it is ≥5%, the maturity influence needs to be addressed.

Because the information on maturity influence is carried by the initial concentration (see the methods section for details), the maturity influence is accommodated by using the SMFI (Equation (4) in the methods section). Therefore, the SMFI of alkylcarbazoles is independent of maturity influence and offers an effective means of assessing the migration fractionation along the sand body (Equation (5) in the methods section). Figs. 4a–d show the relationships between SMFIs of individual carbazoles and relative migration distances. The samples in the section from 51 to 62 km are close to the regression lines, and these regression lines all pass, within analytical uncertainty, through the model value of 100% at the reference point.

Figure 4

Correlations showing inferred relative migration distances from SMFIs of alkylcarbazoles, the geometric SMFI means of different kinds of dimethyl carbazoles and their ratios in the Xifeng Oilfield.

SMFI: secondary migration fractionation index; MCA: methylcarbazoles; DMCA: dimethylcarbazoles; EDMCA, PEDMCA and SDMCA: exposed, partially exposed and shielded DMCA; SDMCA: 1,8-DMCA; GM(EDMCA), GM(PEDMCA) and GM(SDMCA): geometric means of SMFIs of EDMCA, PEDMCA and SDMCA, respectively; 2,7/1,4-DMCA SMFI: ratio of SMFI of 2,7-DMCA to SMFI of 1,4-DMCA (4j); 2,7/1,8-DMCA SMFI: ratio of SMFI of 2,7-DMCA to SMFI of 1,8-DMCA (4k); 1,7/1,8-DMCA SMFI: ratio of SMFI of 1,7-DMCA to SMFI of 1,8-DMCA (4l). The SMFI value of 100 (%) and SMFI ratio of 1 at the reference point (x = 0 km) are the model values. All the regression lines were obtained using only the actual data points, without forcing them through the reference point. Therefore, they are derived only from the data.

Sums and ratios of concentrations of carbazoles, often used in the literature, cannot be directly used as migration indicators. This is obvious from Equation (1), where each compound has its own coefficients $a_1$, $a_2$ and $a_3$, and $R_o$ varies. Geometric means of SMFI values for different types of dimethylcarbazoles, on the other hand, can be used as tracers for secondary migration distance, as shown by Equation (6) in the methods section. As shown in Figs. 4e and f, the geometric means of SMFIs are strongly correlated with relative migration distances. The absolute values of $a_3$ in Equation (1) in the methods section, which are inversely proportional to the absolute values of migration velocity of these carbazoles (Supplementary Equation (S20)), can be used to reflect migration fractionations of dimethylcarbazoles (DMCA). The regression equations in Figs. 4e, 4f and 4d show that the absolute value of $a_3$ decreases in the order N–H exposed DMCA (0.046 km−1) > partially exposed DMCA (0.036 km−1) > shielded DMCA (i.e. 1,8-DMCA) (0.028 km−1). The migration sequence inferred is that shielded DMCA migrated faster than partially exposed DMCA, and partially exposed DMCA faster than exposed DMCA. This sequence corresponds to the retardation differences of dimethylcarbazoles determined by their respective sorption coefficients arising from steric effects related to the alkylation position35 (Supplementary Equations (S5), (S8) and (S20)).
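To make the effect of these three attenuation constants concrete, the sketch below evaluates the SMFI of each DMCA type at one distance, assuming the exponential form SMFI = 100% × exp(a3 x) with a3 equal to minus the absolute values quoted above. The distance of 30 km is a hypothetical choice for illustration, not a value from the study.

```python
import math

# Absolute a3 values (km^-1) quoted in the text; a3 itself is negative under sorption.
abs_a3 = {
    "exposed DMCA": 0.046,
    "partially exposed DMCA": 0.036,
    "shielded DMCA": 0.028,
}

def smfi_percent(a3, distance_km):
    """SMFI (%) under the assumed form 100 * exp(a3 * x)."""
    return 100.0 * math.exp(a3 * distance_km)

distance = 30.0  # hypothetical relative migration distance in km
remaining = {name: smfi_percent(-value, distance) for name, value in abs_a3.items()}
for name, value in sorted(remaining.items(), key=lambda item: item[1]):
    print(f"{name}: SMFI = {value:.1f}%")
```

The most strongly sorbed (exposed) isomers retain the smallest SMFI at a given distance, which is the ordering the migration sequence in the text implies.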

We further show that the ratios of geometric means of SMFIs of N–H exposed/partially exposed, exposed/shielded and partially exposed/shielded dimethylcarbazoles, as well as the corresponding SMFI ratios of individual dimethylcarbazoles, can also serve as odometers for secondary petroleum migration, based on Equations (7) and (8) in the methods section. In Figs. 4g–l, these ratios all decrease with migration distance and their regression lines all pass through the model value of 1 at the reference point within analytical uncertainty.

We have thus demonstrated that: (1) the SMFI fits the real data; (2) the higher the sorption coefficients of molecules, the slower their migration velocities and the more evident the fractionations; and (3) the petroleum in the Xifeng Oilfield migrated along the sand body from the source kitchen into the field in the SW direction (Fig. 1b).

We now apply the SMFI concept to oils in the carbonate reservoirs of the Rimbey-Meadowbrook reef trend in the Western Canada Sedimentary Basin to further evaluate its validity. As mentioned in the introduction, the Gussow theory23 was derived from this trend, but convincing evidence for long distance migration along the trend has not yet been obtained. The Ro(equiv.) values of the oils along this trend vary from 0.68% to 0.86% (Supplementary Table S6). Sorption capabilities of carbazoles on minerals in carbonate reservoirs are very low compared to clastic reservoirs19. Therefore, benzocarbazoles were examined in these oils, as they are more easily adsorbed than alkylcarbazoles7. The results show that the maturity influence index of benzocarbazoles in the reef trend can reach 85.8% (Supplementary Table S7). The SMFIs and the ratio between SMFIs of benzocarbazoles, computed from the data in Supplementary Table S6, clearly show fractionations consistent with long distance migration along the reef trend in the up-dip direction, with remarkably high correlation coefficients (Supplementary Figs. S5 and S6), providing basic evidence for the Gussow theory23. This is in good agreement with the results of oil-source correlation studies that include maturity information8,9. The various lines of evidence suggest that the Gussow theory is generally applicable. Further details are discussed in the Supplementary Information.

Discussion

Carbazoles not only have stronger sorption capabilities than nonpolar compounds but also hold information about their source inputs including source facies and maturity variations. Our study shows that small maturity variations of less than 0.2% in Ro (vitrinite reflectance) can contribute to over 50% of the concentration variations of alkyl- and benzocarbazoles. Given that the bulk of petroleum generation/expulsion occurs over the maturity range of 0.6% to 1.0% in Ro (ref. 2), the concentrations and ratios of carbazoles cannot be used directly as proxies for secondary petroleum migration distance in most basins where there exist significant influences of source variability. The secondary migration fractionation index, established in this paper, offers an effective solution to this problem and can serve as a distance indicator for secondary migration, as it eliminates the source maturity effect on oils grouped according to source facies and only reflects migration fractionation. This approach can be applied to other low concentration, large polar compounds with different sorption coefficients between isomers, although it is shown in this study for alkyl- and benzocarbazoles.

The ability of our index to reliably monitor secondary migration distances may lead to many applications in fundamental and applied petroleum geoscience studies. The index outlined here is a step towards correctly interpreting the behavior of low concentration, polar organic compounds in petroleum and thus it can help resolve many important questions in organic geochemistry and petroleum geology. Moreover, secondary petroleum migration in many basins around the world is poorly understood and yet the information about this process is most important for petroleum exploration2,8. Our index provides a new tool that can aid in the discovery of new resources via accurate assessment of the directions, pathways and distances of petroleum migration. The method established for calculation of the SMFI in this paper may or may not be universally applicable to oil accumulations whose geometry is not simple and linear. The method for complex petroleum migration systems is the subject of future investigation.

Methods

Knowing the direction, pathway and distance of lateral secondary migration is essential in searching for new petroleum accumulations. In the following we develop a new methodology to track the distances of lateral secondary migration of oil through porous strata (such as sand bodies) or unconformities (erosional or non-depositional surfaces separating two strata of different ages).

From the mass balance principle, a general advection-reaction-dispersion equation40,41 (Supplementary Equation (S1)) can be established for secondary petroleum migration in a one-dimensional pathway. The properties and types of pathways were studied by Yang et al. (2005)4. To investigate the source input influence, we focus here on a uniform migration system, in which the properties of the system, including porosity, density of solids, sorption coefficients, migration velocity of oil and oil saturation, are constant.

The general advection-reaction-dispersion equation can be simplified under the following conditions. When large polar compounds such as carbazoles are selected for a secondary migration study, molecular diffusion is insignificant42 and can be safely neglected4. Lateral migration is very slow, especially in cratonic basins such as the Ordos Basin. Precisely because of slow migration, the effect of mechanical dispersion (caused by differences in microscopic migration velocities on a pore scale) is smaller than that of molecular diffusion and thus can be omitted40. Therefore, dispersion, including molecular diffusion and mechanical dispersion, can be neglected (see discussions in paragraphs following Supplementary Equation (S1)). Partitioning between oil and water is neglected because adsorbable compounds with low solubilities in water must be selected for a secondary migration study. Secondary petroleum migration in carrier beds in the up-dip direction results in decreases in temperature and thus holds back or slows down the thermal evolution of oils if the basin does not subside substantially. In this scenario, we assume that only sorption occurs during secondary migration, to reveal migration fractionation. Thus, the general advection-reaction-dispersion equation reduces to an advection-sorption equation (Supplementary Equation (S4)).

Sorption of carbazoles in migration systems can approach equilibrium on geological time scales4,43. In sorption equilibrium theory, the linear isotherm model (Supplementary Equation (S5)) is valid for natural systems where concentrations of adsorbable compounds are low44. From the advection-sorption equation and linear isotherm model, we can derive the source-dependent migration model that describes how the concentration of a carbazole (or any other large, polar compound present at trace concentration levels) varies with maturity and migration distance for any given type of source facies (see the detailed derivation from Supplementary Equations (S1) to (S21)):

$$C = C_0\,e^{a_3 x} = a_1\,(1 + a_2 R_o)\,e^{a_3 x} \qquad (1)$$

where $C$ is the concentration of a carbazole at migration distance $x$; $C_0$ is the initial concentration of a carbazole at the filling point (i.e. starting point of secondary petroleum migration); $R_o$ (vitrinite reflectance) as a maturity variable is a function of time for a given type of source facies; both vitrinite reflectance ($R_o$) and its equivalent ($R_o$(equiv.)) quantitatively indicate the maturity levels with the same units and thus are represented by one variable ($R_o$) in the equations; $C$ and $C_0$ can also be expressed as $C(x, R_o)$ and $C_0(R_o)$, respectively (refer to Supplementary Equations (S21) and (S22)); $a_1$, $a_2$ and $a_3$ are constants, which can be determined through non-linear regression analysis of Equation (1). The parameter $a_1$ is dictated by geochemical processes of hydrocarbon generation and fractionations in primary migration or migration before the reference point as defined in the next paragraph. It has units of concentration (μg/g). $a_2$ is the rate of change of initial concentration with $R_o$, defined in Supplementary Equation (S12). It is a dimensionless constant. If $a_2$ > 0, $C_0$ increases with $R_o$; if $a_2$ < 0, $C_0$ decreases with $R_o$. Oils from different source facies will have different values of the constants $a_1$ and $a_2$ in Equation (1), reflecting different source input influences due to maturity variations among different source facies. The parameter $a_3$ is proportional to the ratio of the retardation factor of an adsorbable compound to oil migration velocity, or is inversely proportional to the migration velocity of the compound (Supplementary Equation (S20)). It has units of km−1. The value of $a_3$ is always negative when sorption occurs without any other reactions. Equation (1) indicates that low values of $a_3$ (i.e. large absolute values) cause rapid concentration decreases of the large polar compounds with migration distance if only the sorption effect is considered.
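One way to read the attenuation constant is through the distance over which sorption alone halves the concentration, ln(2)/|a3|. The sketch below assumes Equation (1) has the form C = a1 (1 + a2 Ro) exp(a3 x); the function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math

def concentration(x_km, ro, a1, a2, a3):
    """Assumed Equation (1) form: C = a1 * (1 + a2*Ro) * exp(a3*x)."""
    return a1 * (1.0 + a2 * ro) * math.exp(a3 * x_km)

def halving_distance(a3):
    """Distance (km) over which sorption alone halves the concentration (a3 < 0)."""
    if a3 >= 0:
        raise ValueError("a3 must be negative for sorption-only attenuation")
    return math.log(2.0) / abs(a3)

# Hypothetical parameter values: a1 in ug/g, a2 dimensionless, Ro in %.
a1, a2, ro = 5.0, 1.5, 0.75
for a3 in (-0.02, -0.05):
    d = halving_distance(a3)
    ratio = concentration(d, ro, a1, a2, a3) / concentration(0.0, ro, a1, a2, a3)
    print(f"a3 = {a3} km^-1: halves every {d:.1f} km (check ratio = {ratio:.2f})")
```

A more negative a3 gives a shorter halving distance, which is the sense in which low values of a3 cause rapid concentration decreases.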

In the above, the filling point was used to define the absolute migration distance in the theoretical analysis (see details in the model section of the Supplementary Information). However, because it is difficult to determine the filling point in practice, a reference point is often used to determine the relative migration distance. The reference point is usually located behind the filling point in a pathway, which results in an offset $\Delta x$ representing the distance between the filling point and the reference point. Nonetheless, the relative migration distances can be used directly for estimating $a_1$, $a_2$ and $a_3$ (without any correction for $\Delta x$) via non-linear regression analysis of Equation (1), as $a_2$ and $a_3$ do not change with $\Delta x$. Although $a_1$ varies with $\Delta x$, this does not affect the study of the relative migration distance, as both are directly correlative.
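The claim that an unknown offset between the filling point and the reference point only rescales the first constant can be checked numerically. The sketch assumes Equation (1) has the form C = a1 (1 + a2 Ro) exp(a3 x); the offset and all parameter values are hypothetical.

```python
import math

def concentration(x_km, ro, a1, a2, a3):
    """Assumed Equation (1) form: C = a1 * (1 + a2*Ro) * exp(a3*x)."""
    return a1 * (1.0 + a2 * ro) * math.exp(a3 * x_km)

# Hypothetical values: the reference point lies `offset` km along the pathway
# from the filling point.
a1, a2, a3, offset = 8.0, 1.2, -0.03, 12.0
a1_relative = a1 * math.exp(a3 * offset)  # only a1 absorbs the offset

checks = []
for x_rel, ro in [(0.0, 0.76), (10.0, 0.73), (25.0, 0.70)]:
    absolute = concentration(x_rel + offset, ro, a1, a2, a3)   # measured from the filling point
    relative = concentration(x_rel, ro, a1_relative, a2, a3)   # measured from the reference point
    checks.append(math.isclose(absolute, relative))
print(checks)
```

Because exp(a3 (x + offset)) factors into exp(a3 offset) × exp(a3 x), a regression on relative distances returns the same a2 and a3 and a rescaled a1.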

Equation (1) has the form of an exponential attenuation law in which the initial concentration varies as a function of maturity. The derivation of Equation (1) is detailed in Supplementary Equations (S1) to (S22). This functional form is derived under the conditions of linear isotherm sorption, very low dispersion and uniform pathways.

The initial concentration $C_0$ incorporates the source input information of a carbazole, its generation from a source rock and fractionation during primary migration (oil expulsion). $C_0$ was shown to vary steadily with maturity ($R_o$) in the range of 0.45–1.3% (ref. 8), so that most of the variation can be described by a quadratic equation that becomes linear over a narrow range such as 0.7–0.8% (Supplementary Table S3) in the Xifeng Oilfield (i.e. $C_0 = a_1(1 + a_2 R_o)$) (see Equation (S12) and its relevant discussion in the Supplementary Information).

As sorption equilibrium is achieved during secondary migration43 and the thermal evolution of the oil either stops or slows down after expulsion, provided that the basin does not subside substantially, the present concentrations of a carbazole and $R_o$ values of oils can be used to represent the $C$ and $R_o$ values during secondary migration in Equation (1).

Our model (Equation (1)) was derived for uniform migration systems. More realistic migration systems with variable properties could be treated by dividing them into subsections with constant properties. To ensure the validity of the model, the selected compounds must have sufficiently low concentrations in oil, low solubilities in water and sufficiently strong sorption capacity (see further details in the multiple charging and oil volume section of the Supplementary Information). With these compounds, our model can also be applied to carrier systems with multiple charging, which is demonstrated via linearization of the Langmuir isotherm model (Supplementary Equations (S27–S32)). The geochemical conditions for valid application of the model and the selected compounds are: (1) the thermal evolution of oils expelled from source rocks ceases or the oil migrates in the up-dip direction without substantial basin subsidence after expulsion; (2) the primary migration fractionation index is nearly constant; (3) the relationship between $R_o$ and the initial concentrations at the filling point or reference point is linear or can be described by a quadratic equation (see Supplementary Information for more details); and (4) oil biodegradation levels are <1 on the biodegradation scale of Peters and Moldowan (1993)21 or the effect of oil biodegradation is quantitatively removed.

In a previous study, the quantitative models on factors influencing the distribution of phenol and carbazole compounds4 did not address the issue of source input influences. In their pivotal model (Equation (33) in Yang et al. (2005)4), the geotracer concentration during secondary migration is constant and the same as the initial concentration at the filling point. This model (Equation (33) in Yang et al. (2005)4) can also be derived in our work as a special case (Supplementary Equation (S18)). In natural migration systems, however, the geotracer concentration during migration and initial concentration are all variable. Our new model (Equation (1)) is developed to address such complexities that are present in real systems.

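Quantitative evaluation of source input influences, including organic source facies and maturity, is a necessary first step in order to eliminate source input influences. Here we begin with the total differential of Equation (1):

$$dC = \frac{\partial C}{\partial R_o}\,dR_o + \frac{\partial C}{\partial x}\,dx \qquad (2)$$

where $(\partial C/\partial R_o)\,dR_o$ represents the concentration variation of a carbazole caused by maturity variation for oils from a given type of source facies and $(\partial C/\partial x)\,dx$ represents that caused by migration fractionation. The maturity influence index (MII) for a given type of source facies is defined as

$$\mathrm{MII} = \frac{\left|\frac{\partial C}{\partial R_o}\,dR_o\right|}{\left|\frac{\partial C}{\partial R_o}\,dR_o\right| + \left|\frac{\partial C}{\partial x}\,dx\right|} \times 100\% = \frac{\left|a_2\,\frac{dR_o}{dx}\right|}{\left|a_2\,\frac{dR_o}{dx}\right| + \left|a_3\,(1 + a_2 R_o)\right|} \times 100\% \qquad (3)$$

The migration fractionation contribution index (MFCI) is equal to 100 − MII (%), based on Equations (2) and (3).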

The maturity influence index quantitatively indicates the maturity influence in the source input information for a given type of source facies. Before using large polar compounds (e.g. carbazoles) to study secondary migration, the maturity influence index should be calculated to check whether the maturity influence is significant and thus must be removed. The case studies (see the results section) illustrate that when the maturity influence index is ≥5%, the distributions of concentrations and ratios of large polar compounds (e.g. carbazoles) do not solely reflect migration distance and thus the maturity influence must be removed.
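The screening step can be sketched as follows. The code assumes that Equation (1) has the form C = a1 (1 + a2 Ro) exp(a3 x) and that the maturity influence index is the maturity share of the summed absolute terms of the total differential, so that a1 cancels; both the functional form and the function name are assumptions, and the input values are hypothetical. Only the 5% threshold comes from the text.

```python
def maturity_influence_index(a2, a3, ro, dro_dx):
    """Share (%) of the local concentration change attributable to maturity,
    for C = a1 * (1 + a2*Ro) * exp(a3*x); a1 cancels out of the ratio."""
    maturity_term = abs(a2 * dro_dx)
    migration_term = abs(a3 * (1.0 + a2 * ro))
    total = maturity_term + migration_term
    if total == 0.0:
        raise ValueError("both terms are zero; the index is undefined")
    return 100.0 * maturity_term / total

# Hypothetical inputs: a2 (dimensionless), a3 (km^-1), Ro (%), dRo/dx (% per km).
mii = maturity_influence_index(a2=5.0, a3=-0.04, ro=0.73, dro_dx=-0.004)
print(f"maturity influence = {mii:.1f}%, migration contribution = {100 - mii:.1f}%")
print("maturity correction needed" if mii >= 5.0 else "maturity influence minor")
```

A flat maturity profile (dRo/dx close to zero) drives the index towards zero regardless of a2, which matches the behaviour described for the 62 to 90 km section of the Xifeng sand body.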

To illustrate the net migration fractionation of a carbazole during secondary petroleum migration without maturity influence, we introduce the concept of a secondary migration fractionation index () for oils from a given type of source facies

Substitution of Equation (1) into (4) yields

Evidently, if a reference point is used instead of the filling point, Equations (3 and 5) are still applicable. Since the as defined above only reflects migration fractionation, it serves as an odometer for secondary migration in a uniform pathway. equals 100% at the reference point, which is defined as the model value. Where oils derive from more than one source facies, they are first grouped according to their source facies. The , and are then estimated separately based on their respective source facies. This minimizes the influence arising from variations in source facies.

From Equation (5), we can derive

where is the geometric mean of and is the arithmetic mean of for one type of alkylcarbazoles. For example, represents the geometric mean of of N-H exposed dimethylcarbazoles (EDMCA).

From Equation (6), we can further obtain

where and are the geometric means of for two types of alkylcarbazoles, respectively. The indicates the arithmetic mean of of alkylcarbazoles of type one; , type two.

Similarly, the ratios of of different types of individual alkyl carbazoles can also help identify migration fractionation. From Equation (5), we can derive

where is the secondary migration fractionation index of an alkylcarbazole of type one; , type two. The represents of an alkylcarbazole of type one; , type two.

Evidently, , , and are all functions of migration distance , thus can all serve as odometers for secondary migration in a uniform pathway and can be used to identify migration fractionation and to further reveal migration directions or pathways. Both and decrease with migration distance when and are calculated from the compounds of the type with comparatively low sorption capacities or sorption coefficients. At the filling or reference point, equals 100% and both and equal 1.
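As a rough illustration of the group statistics described above, the sketch below computes the geometric mean of per-compound index values for one type of alkylcarbazole and the ratio of the geometric means of two types. The function names and all numbers are hypothetical, and the per-compound index values (in percent) are assumed to have been computed beforehand. The only property taken from the text is that at the filling or reference point every index value is 100%, so each geometric mean is 100% and each ratio is 1.

```python
from math import isclose
from statistics import geometric_mean


def group_geomean(index_values_pct):
    """Geometric mean of the per-compound index values (%) for one compound type."""
    return geometric_mean(index_values_pct)


def geomean_ratio(type_one_pct, type_two_pct):
    """Ratio of the geometric means of two compound types (type one / type two)."""
    return group_geomean(type_one_pct) / group_geomean(type_two_pct)


# At the reference point all index values are 100 %, whatever the compound type.
reference_type_one = [100.0, 100.0, 100.0]
reference_type_two = [100.0, 100.0]
ratio_at_reference = geomean_ratio(reference_type_one, reference_type_two)
reference_ok = isclose(ratio_at_reference, 1.0)

# Hypothetical values for a sample away from the reference point (illustrative only).
sample_type_one = [40.0, 50.0, 45.0]
sample_type_two = [80.0, 75.0]
ratio_in_sample = geomean_ratio(sample_type_one, sample_type_two)
```

The geometric mean is used here because the text defines the group statistic that way. The sample values are invented and carry no geochemical meaning.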

The general approach of using our model and is summarized below:

  1. Classify oils according to their source facies. For each type of source facies, conduct the following analyses;

  2. Select a possible migration pathway and calculate the relative migration distance (the distance calculation is outlined in the results section); conduct non-linear regression analysis of Equation (1) with the data of the relative migration distance, concentrations of geotracers and , to derive the constants of , and in Equation (1);

  3. Conduct linear or polynomial regression analysis between and migration distance , calculate and then compute the maturity influence index () from values of , and , by using Equation (3);

  4. If the maturity influence index is <5%, the maturity influence may be ignored. If it is ≥5%, values of geotracers (such as carbazoles) are computed using Equation (4) with the data of concentrations, , and , as shown in the case of the Xifeng Oilfield;

  5. For carbazoles, calculate the geometric means of exposed, partially exposed and shielded DMCAs (dimethylcarbazoles), ratios of geometric means of of N-H exposed/shielded, exposed/partially exposed, partially exposed/shielded DMCAs, the corresponding ratios of individual dimethylcarbazoles and the ratio of benzo[a]/benzo[c]carbazole;

  6. Analyze the correlation of the values, geometric means and ratios against relative migration distance; if a correlation is evident, identify migration fractionation on the basis of and the ratios calculated at the fifth step; if the correlation and migration fractionation do not support the selected pathway, other possible pathways should be investigated by going back to the second step; if migration fractionation exists, the migration pathway, distance and direction are confirmed further with comprehensive analysis of geological and geochemical data.
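The correlation check in the sixth step can be sketched as below. The text does not specify a correlation measure or a cut-off for an "evident" correlation, so the use of the Pearson coefficient here is an assumption, and the function names, pathway labels and data are hypothetical. The sketch only compares candidate pathways by the strength of the correlation between an index (or ratio) and relative migration distance. It does not replace the geological and geochemical confirmation that the procedure requires.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def rank_pathways(candidates):
    """Sort candidate pathway names by |r| between distance and index, strongest first.

    candidates maps a pathway name to (relative_distances, index_values).
    """
    strengths = {
        name: abs(pearson_r(distances, values))
        for name, (distances, values) in candidates.items()
    }
    return sorted(strengths, key=strengths.get, reverse=True)


# Hypothetical data: the same four oils ordered along two candidate pathways.
candidates = {
    "pathway_a": ([0.0, 10.0, 20.0, 30.0], [100.0, 80.0, 65.0, 50.0]),
    "pathway_b": ([0.0, 10.0, 20.0, 30.0], [100.0, 50.0, 80.0, 65.0]),
}
ranking = rank_pathways(candidates)
```

A pathway that ranks poorly would send the analysis back to the second step with a different candidate. A rank-based coefficient could be substituted if the relationship is expected to be monotonic but not linear.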

Where maturity varies widely, as in the Rimbey-Meadowbrook reef trend, a quadratic term () needs to be added inside the parentheses in Supplementary Equation (S12) and Equation (1), and Equations (2–4) are adjusted accordingly.