Quantitative colocalization studies suffer from the lack of unified approach to interpret obtained results. We developed a tool to characterize the results of colocalization experiments in a way so that they are understandable and comparable both qualitatively and quantitatively. Employing a fuzzy system model and computer simulation, we produced a set of just five linguistic variables tied to the values of popular colocalization coefficients: “Very Weak”, “Weak”, “Moderate”, “Strong”, and “Very Strong”. The use of the variables ensures that the results of colocalization studies are properly reported, easily shared, and universally understood by all researchers working in the field. When new coefficients are introduced, their values can be readily fitted into the set.
Fluorescence-based techniques revolutionized cell and molecular biological research by becoming its most indispensable tools1. Applicability of fluorescence methodology made a leap forward with introduction of quantitative approaches. With the help of quantification, it became possible to interpret fluorescent observations objectively and analyze them statistically2. Quantification of fluorescence ensured meaningful comparisons of the results between different labs and enabled development of informative mathematical simulations of studied processes3,4. Quantification is particularly important in colocalization observations when fluorophores of different colours, employed to label respective molecules in specific cellular locations, overlap and produce new colours as a mixture of the used5,6. The degree of this overlap is crucial to detect precise locations of the molecules of interest as well as envision the possibility of their interaction7,8.
One major limitation of quantitative colocalization studies is the lack of unified approach for interpretation of results. This is important because even after obtaining numerical values of colocalization coefficients, researchers need to describe the degree of colocalization using natural language with subjective qualifiers, such as “Weak”, “Moderate”, “Strong”, etc. This is understandable not only because natural language is the most expressive way to convey the information, but also because scientific results are usually presented in comparative terms. However, it can be also dangerously misleading, since it disconnects qualitative and quantitative aspects of observations: “Strong” colocalization in the case of overlap coefficient (standard values are from 0 to 1.0) may mean 0.99 to one researcher and 0.51 to another. This discrepancy can cause significant confusions and create errors. In addition, some researchers tend to describe colocalization using their own custom terminology, such as “Relatively Low”, “Slightly High”, etc., which is understandable to them, but may not be so to others. Thus, a solution that properly relates the numerical values of colocalization coefficients to their qualitative estimations, while maintaining the objectiveness of quantification, can be highly beneficial.
To address this issue, we report our findings on the use of the model of a fuzzy linguistic system9,10 to interpret the results of quantitative colocalization studies. A fuzzy system connects numeric crisp values of colocalization coefficients to fuzzy propositions that use fuzzy values such as "Weak" and "Strong", which are more descriptive and understandable to human users. A crisp value is related to fuzzy propositions by the membership functions, which assigns degrees of truth (between 0 and 1) of fuzzy propositions for the given crisp value. The final fuzzy proposition output is selected among all possible ones by essentially maximizing the degree of truth. Our aim was to provide a simple, consistent, and objective set of variables, tied to the ranges of values of respective coefficients used to estimate colocalization, which is very easy to understand and use.
Selection of primary values
We started with primary values such as “Weak”, “Moderate”, and “Strong”, each assuming a Gaussian membership function (Fig. 1). The system ought to produce a "reasonable" description, meaning that the fuzzy predicates it generates should have sounded right to cell and molecular biologists. For example, if the actual value of colocalization is 0.8, then the predicate was presented as "The degree of colocalization is Strong", which indeed sounded right. Table 1 shows fuzzy predicates for the actual values of colocalization ranging from 0 to 0.9 (according to 0 to 1.0 scale). Initially, “Very” and “More or Less” modifiers were used in addition to the primary values to add more flexibility.
Generation of computer simulated images and quantification of colocalization on them
Since we could not know the actual values of colocalization in any given image in advance, we generated synthetic images with values of colocalization exactly known as a source of reference and quantified popular coefficients on them. Images were created with the help of the original software when virtual “molecules” were placed in a synthetic image with number of colocalized molecules precisely controlled (see Methods). The degree of colocalization in the images ranged from 0 to 0.9 (according to 0 to 1.0 scale) (Figure 2). The coefficients included Protein Proximity Index (PPI), Pearson's correlation coefficient (Rr), overlap coefficient (R), overlap coefficients k1(k2), and colocalization coefficients m1(m2) (Table 2). Calculations of coefficients showed gradual increases of their values strictly within the ranges of standard numbers, thus indicating excellent suitability of our synthetic images.
To demonstrate the applicability of our approach on biological imagery, we also created computer-simulated images modeled on a real biological image (see Methods). Figure 3 shows a panel of computer-simulated images with predefined values of colocalization modeled after a real biological image. Colocalization gradually increased in them, as indicated by respective scatter grams, eventually revealing structures with colocalization.
Construction of fuzzy systems
After performing calculations of coefficients on the images, we constructed corresponding fuzzy systems for every coefficient to make them relevant (Table 3). To do so, we adjusted the width of Gaussian member functions to ensure that for each given image the fuzzy system produces the same fuzzy predicates for each coefficient as it does for actual colocalization values (Tables 1 and 2). For example, for an image with an actual value of colocalization 0.8, the fuzzy system for this value was presented as "The degree of colocalization is Strong". By adjusting the width of Gaussian membership functions, we ensured that the same predicate ought to be produced for all coefficients (Table 3). Given the nature of our simulation (virtual “molecules” had the same intensity regardless of the channel), R and k1(k2) coefficients yielded equal values.
Results of our study show that degrees of colocalization, presented as linguistic variables, can be tied to the ranges of the respective coefficients values (Table 4). Our approach is advantageous to the recently reported attempt to systematize descriptions of quantitative colocalization observations based on terminology found in cell biological literature11. Authors of the report provided non-matching and inconsistent variables for different coefficients, thus making them very hard to use in practice, as well as did not use any controls. Simplicity of our approach is based on the use of three primary values, such as “Weak”, “Moderate”, and “Strong”. Among the two initially employed modifiers, “Very” and “More or Less”, we used only “Very” as the most preferable one. Its use ensured consistency and flexibility of the set and brought the total number of variables to just five: “Very Weak”, “Weak”, “Moderate”, “Strong”, and “Very Strong”. To ensure that these variables are used correctly, they were applied to the ranges of values of coefficients obtained using computer-simulated images with exactly known degrees of colocalization. Thus, these variables are useable with precise understanding of what they represent.
Importance of the described approach is not only in providing a framework for correct description of the results of colocalization studies in qualitative and quantitative terms for the currently used coefficients, but also in serving as a tool that allows accommodation of new ones. Since new and improved algorithms to quantify colocalization continue to be developed12, computer-simulated images with known degrees of colocalization generated by us and shown in this study can be employed to obtain values of new coefficients, which can then be easily fitted into the set and Table 4 can be extended further. The images are available for download as Supplementary Information to this article. In the emerging era of bioimage informatics13, with increasing importance on standardization of collected image data5,14, the use of the variables will ensure that the results of quantitative colocalization studies are properly reported, easily shared, and universally understood by all researchers working in the field. Importantly, the use of our approach also represents a paradigm shift in colocalization studies since the results of quantifications now become presentable in familiar qualitative terms while still maintaining the objectiveness of calculations.
To conclude, our approach helps to bridge the gap between qualitative and quantitative aspects of colocalization detection. Given simplicity and consistency, as well as the fact that linguistic variables maintain the objectiveness of quantification, they can serve cell and molecular biologists as a community-wide standard for describing results of quantitative colocalization studies.
Design of a fuzzy system
The design of a fuzzy system started from a crisp system, such as a variable called "colocalization value" that can take any precise values on (0, 1). Then, we introduced fuzzy values, such as "Weak", "Moderate" and "Strong". A crisp proposition like "colocalization value is x" is either true (truth value 1) or false (truth value 0), whereas a fuzzy proposition like "colocalization value is Strong" has a truth value between 0 and 1, which was calculated by a membership function μSTRONG(x). The fuzzy proposition with the largest truth value was then used as the output of the fuzzy system. Fuzzy values were modified using an adverb “Very”. The value “Very Strong” differs from “Strong” in that its membership function is μSTRONG2(x).
Generation of synthetic computer-simulated images
Images with predefined values of colocalization were generated by merging pairs of green and red computer-simulated images for the red/green pair of channels. With the help of BioSim simulation computer software (MATLAB source code is available at www.anes.ucla.edu/~wuyong/biosim.zip), virtual “molecules” were placed in a synthetic image7. The number of colocalized molecules was precisely controlled via the software. The images were free of background noise. The degree of colocalization in the images ranged from 0 to 0.9 (according to 0 to 1.0 scale) (Figure 3). The images can be downloaded and used to obtain the ranges of values of newly-introduced colocalization coefficients, which can then be fitted into the set of linguistic variables shown on Table 4.
Generation of computer-simulated images modeled after a real biological image
Original images were acquired as described in the fluorescence microscopy section below. Prior to be used for modeling, they were processed for background correction using “Average Contrast and Fluorescence” settings with the help of CoLocalizer Pro software. Protein clusters, treated as point sources, were randomly positioned in a representative image according to biological structures. Each of the clusters generated an intensity distribution according to a Gaussian point spread function (PSF). The degree of colocalization was precisely controlled by knowing the exact number of clusters generated by BioSim software. Specifically labeled clusters were distinguishable from nonspecifically labeled ones by being significantly brighter. The degree of colocalization in the images ranged from 0 to 0.9 (according to the 0 to 1.0 scale).
Images of fluorescence of liver bile canaliculi stained for multidrug resistance protein 2 (Mrp2) (red fluorescence) and bile salt export pump (Bsep) (green fluorescence), known to be colocalized15 were acquired using a confocal microscope LSM 410 (Carl Zeiss). Primary anti-Mrp2 and anti-Bsep antibodies were obtained commercially. Alexa 488 and Alexa 594 secondary antibodies (Invitrogen) were used for labeling Bsep and Mrp2, respectively. Dual-stained images were obtained using an immersion-oil Plan-Neofluar 40/0.75 objective and acquired by sequential laser scanning to minimize bleedthrough. Images were saved in lossless TIFF format to ensure reliability of quantification with a dimension of 512 × 512 pixels.
Quantification of colocalization
Colocalization was quantified using protein proximity index (PPI) and various coefficients. Protein proximity analysis (PPA) software (www.anes.ucla.edu/~wuyong/) was used to estimate PPI7. Coefficients included Pearson's correlation coefficient (Rr), overlap coefficient (R), overlap coefficients k1(k2), and colocalization coefficients m1(m2) and were calculated using CoLocalizer Pro 2.7.1 software (CoLocalization Research Software, www.colocalizer.com)8.
This work was partially supported by American Heart Association Postdoctoral Fellowship grant 10POST4230081 to Y.W.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/