A real-time interactive, fully automated, low-cost and scalable biology cloud experimentation platform could provide access to scientific experimentation for learners and researchers alike.
Many access barriers to life-science experimentation exist for academic and commercial research, mainly due to professional training needs, equipment purchase and operation costs, and safety considerations1. Computational cloud and time-sharing paradigms2,3 have recently inspired the development and deployment of cloud-based experimentation labs for biology research, such as commercial platforms that can execute experiments semiautomatically1,4 and the browser-based puzzle game EteRNA, which provides experimental feedback for citizen scientists5. However, these platforms still face limitations: they rely on batch processing with no opportunity for real-time interaction while the experiment is running, which hinders the exploration that hands-on experimentation allows, and they take days to return results because of long experimental turnaround times.
Cloud labs are also poised to help solve significant educational challenges. Familiarity with advanced scientific practices and 'authentic inquiry'6,7,8 are imperative for K–12 and college education (for example, Next Generation Science Standards)8,9 but are difficult to achieve in real-world classrooms given logistics and cost6,10. In addition to traditional physical hands-on labs, virtual and remote labs have recently been successfully deployed, with each modality having its distinct advantages given educational goals and situational context11,12,13,14,15. Physical remote labs for life science education are comparably underdeveloped12, in large part because of the associated logistics of specimen handling. We have previously developed, demonstrated and deployed the first educational biology cloud lab with slime mold chemotaxis experiments16, which was suited for non-real-time interactions but did not scale cost-effectively, given back-end logistics and turnaround time.
Here we conceptualize, implement and validate a biology cloud experimentation platform (Fig. 1) that (i) enables the types of inquiry mandated for professional science and educational purposes; (ii) has a low entry barrier and can be used even at the middle-school level; (iii) is real-time interactive; (iv) has a fast result turnaround time (within minutes); (v) is fault tolerant against biological variability and failure; (vi) scales to millions of users worldwide from a design as well as an economic viewpoint; (vii) has a large exploration and discovery space; and (viii) generalizes to many other experiment types.
Interactive biology experimentation online
Our cloud platform focuses on the photoresponsive behavior of Euglena gracilis, a single-celled organism ∼50 μm long (Supplementary Text 1–3). While swimming forward, it rolls and wobbles around its long axis to scan all directions for light with its single eye spot (Fig. 1a). Euglena are commonly used in hands-on biology education17,18,19,20 and are relevant for basic research21,22,23; for food, chemical and fuel production24; and as biosensors25.
Experiments are executed on a cluster of biotic processing units (BPUs)16, instruments that combine sensors, biological material, actuators and a microcontroller (Fig. 1b and Supplementary Figs. 1–5). Each BPU consists of a webcam microscope containing a microfluidic chip with four attached light-emitting diodes (LEDs) that provide directional light stimuli to Euglena (Supplementary Video 1), which are cultured in reservoirs and supplied to the microfluidic chips via automated valves as needed. The microcontroller controls the LEDs, streams live video, postprocesses data and communicates with the central server (Supplementary Text 2 and Supplementary Fig. 5). We adopted the task scheduling concepts of high-performance computing26 to design the central server. This server assigns BPUs and remote users according to a nonexclusive group allocation policy (Fig. 1c and Supplementary Text 2.4), handles distinct BPU types, routes experiments to the best-suited BPU and optimizes wait times through load balancing.
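The routing step described above can be pictured as a selection over per-BPU status records. The following is a minimal sketch under stated assumptions, not the platform's scheduler: the `Bpu` fields, the `min_responsiveness` health cutoff and the shortest-queue tie-break are hypothetical choices made only for illustration (the actual allocation policy is described in Supplementary Text 2.4).

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Bpu:
    """Hypothetical status record for one biotic processing unit."""
    name: str
    bpu_type: str          # assumed label for the instrument type
    responsiveness: float  # latest monitoring score on a 0..1 scale
    queue_length: int      # experiments currently waiting on this BPU


def route(bpus: Sequence[Bpu], wanted_type: str,
          min_responsiveness: float = 0.4) -> Optional[Bpu]:
    """Pick a healthy BPU of the requested type with the shortest queue.

    Ties on queue length are broken in favor of higher responsiveness.
    Returns None when no BPU of that type meets the health cutoff.
    """
    candidates = [
        b for b in bpus
        if b.bpu_type == wanted_type and b.responsiveness >= min_responsiveness
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda b: (b.queue_length, -b.responsiveness))


cluster = [
    Bpu("eug1", "euglena", 0.7, 3),
    Bpu("eug2", "euglena", 0.5, 1),
    Bpu("eug3", "euglena", 0.2, 0),  # idle, but below the health cutoff
]
chosen = route(cluster, "euglena")
```

In this toy cluster the idle unit is skipped because its score is below the cutoff, so the experiment goes to the healthy unit with the shorter queue.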
Via a web interface, users choose a specific BPU or are autorouted (Fig. 2a) to execute experiments in real-time live mode or in asynchronous batch mode (Supplementary Video 2). The live mode user interface (Fig. 2b) employs a virtual analog joystick to control intensities of four LEDs (Fig. 1b) to induce directional light stimuli to Euglena; two live video streams show the microscopic Euglena responses and the macroscopic LED actuation. In batch mode (Fig. 2c), the user designs and uploads a program that contains instructions for time sequences of LED intensities. The back-end server automatically tracks shape and motion of all motile cells and overlays these data on the captured videos (Fig. 2d and Supplementary Text 2.5). Video, stimulus and track data are stored for future download and analysis.
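A batch program of the kind described above is essentially a timed sequence of intensities for the four LEDs. The sketch below shows one way such a sequence could be assembled and serialized; the LED names, the `duration_s` field, the 0 to 100 intensity scale and the JSON layout are all assumptions made for illustration and are not the platform's actual file format.

```python
import json

# Hypothetical LED names and intensity scale; the real format may differ.
LEDS = ("top", "right", "bottom", "left")
MAX_INTENSITY = 100


def make_step(duration_s, **intensities):
    """Build one program step: a duration plus an intensity for every LED."""
    unknown = set(intensities) - set(LEDS)
    if unknown:
        raise ValueError(f"unknown LED name(s): {sorted(unknown)}")
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    for led, value in intensities.items():
        if not 0 <= value <= MAX_INTENSITY:
            raise ValueError(f"intensity for {led} out of range: {value}")
    step = {"duration_s": duration_s}
    # LEDs that are not mentioned stay off for this step.
    step.update({led: intensities.get(led, 0) for led in LEDS})
    return step


def on_off_program(led, intensity, on_s, off_s, cycles):
    """Periodic light on/off sequence using a single LED."""
    steps = []
    for _ in range(cycles):
        steps.append(make_step(on_s, **{led: intensity}))
        steps.append(make_step(off_s))
    return steps


program = on_off_program("left", 100, on_s=15, off_s=15, cycles=2)
program_json = json.dumps(program, indent=2)
```

Validating each step before upload keeps malformed programs out of the experiment queue, which matters when results arrive only after the run has finished.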
Live mode enables open-ended, real-time-interactive exploration of Euglena biophysics followed by quantitative substantiation in batch mode. A user can test Euglena's response to changes in light direction and intensity and then observe variability among traces (Fig. 2d). The prevalent behavior is negative phototaxis, but localized tumbling and changes in cell morphology are also observable (Supplementary Video 3). We characterized the system by executing periodic light on–off experiments in batch mode, measuring the time constants (τ) for cell alignment with light on, τ1 = 6.7 ± 2.4 s (n = 6; mean ± standard deviation throughout), and subsequent light-off orientation decay, τ2 = 9.9 ± 2.6 s (Fig. 2e). We defined responsiveness to quantify how well Euglena aligned with light after 15 s of light exposure (on a scale of 0–1, for random (0) to perfect (1) alignment; Supplementary Text 2.2 and Supplementary Video 4). This responsiveness score also depends on light intensity and exhibits Hill-equation-type characteristics (Fig. 2f). Hence, experimenters can investigate Euglena's response to changes in light direction and intensity on the time scale of seconds, study its long-term behavior over weeks (Fig. 3), and record and download this data for offline analysis (Supplementary Video 2).
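To give a feel for the reported numbers, the sketch below treats alignment as a first-order (single-exponential) process with the measured time constants and models the intensity dependence with a Hill function. The single-exponential form is an assumption suggested by the use of time constants, and the Hill parameters `r_max`, `k_half` and `n` are placeholders, since fitted values are not given here.

```python
import math

TAU_ON_S = 6.7   # reported time constant for alignment with light on
TAU_OFF_S = 9.9  # reported time constant for orientation decay with light off


def alignment_during_light(t_s, plateau=1.0, tau_s=TAU_ON_S):
    """Assumed exponential approach to a plateau after the light turns on."""
    return plateau * (1.0 - math.exp(-t_s / tau_s))


def alignment_after_light(t_s, start, tau_s=TAU_OFF_S):
    """Assumed exponential decay back toward random orientation."""
    return start * math.exp(-t_s / tau_s)


def hill(intensity, r_max, k_half, n):
    """Hill-type saturation curve; all three parameters are placeholders."""
    if intensity <= 0:
        return 0.0
    return r_max * intensity ** n / (k_half ** n + intensity ** n)


# Fraction of the plateau reached after the 15 s exposure used for scoring.
fraction_at_15_s = alignment_during_light(15.0)
```

Under this assumption about 89% of the final alignment is reached after 15 s, which is consistent with scoring responsiveness at that time point.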
Robust, cost-effective and dynamic scaling of BPU clusters
To make this BPU cluster cost-effectively scalable and tolerant against failures in hardware, software and 'bioware' (Supplementary Fig. 5), we extended high-performance computing concepts to include biology by automonitoring its state: the system submits batch experiments to each BPU every hour (Fig. 2e) to measure three variables—cell density, motility and light responsiveness. Population density and responsiveness monitored over 10 d can be stable (Fig. 3a), undergo microecological fluctuations (Fig. 3b) or be susceptible to external ambient light cycles (Fig. 3c). This biological variability emphasizes a key challenge of any cloud lab, i.e., to consistently provide a prespecified experimentation experience. User testing revealed that responsiveness above ∼0.4 was easily recognizable (Supplementary Text 2.2 and Supplementary Table 1), providing a quantitative target for good BPU performance. (Even lower responsiveness highlights interesting and noticeable Euglena behavior; Fig. 2f). BPUs not meeting specifications can often be recovered by automated flushing (Fig. 3b); organisms and chips are replaced every ∼4 weeks, leading to a maintenance burden of ∼10 min per week per BPU (Supplementary Text 3.2). Under current maintenance protocols, individual BPU performance was good ∼61% of the time, and continued experimentation did not decrease BPU performance or Euglena responsiveness (Supplementary Text 3.1 and Supplementary Fig. 6). Each BPU can handle >100,000 experiments per year for ∼$0.01 per experiment (∼4 min per experiment; setup and maintenance cost of ∼$1,000 per BPU per year), with negligible wait times for randomly accessing users. Dynamic addition of BPUs ('hot swapping'; Fig. 3d) or queuing of batch experiments increases throughput (Supplementary Text 3.3). Running a cluster with six BPUs guarantees the availability of at least one good BPU 99.5% of the time at an average availability of 3.6 BPUs, with users automatically routed to good BPUs.
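The throughput, cost and availability figures above can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch that assumes back-to-back experiments for the throughput ceiling and statistically independent BPUs for the availability estimate; neither assumption is stated in the text.

```python
MINUTES_PER_YEAR = 365 * 24 * 60      # 525,600
MINUTES_PER_EXPERIMENT = 4            # reported experiment duration
COST_PER_BPU_PER_YEAR_USD = 1000      # reported setup and maintenance cost
P_GOOD = 0.61                         # reported fraction of time a BPU is good
CLUSTER_SIZE = 6

# Upper bound on experiments per BPU per year if it runs without idle time.
max_experiments_per_year = MINUTES_PER_YEAR / MINUTES_PER_EXPERIMENT

# Cost per experiment at that ceiling (below one cent).
cost_per_experiment_usd = COST_PER_BPU_PER_YEAR_USD / max_experiments_per_year

# Probability that at least one of six independent BPUs is good.
p_at_least_one_good = 1 - (1 - P_GOOD) ** CLUSTER_SIZE

# Expected number of good BPUs in the cluster.
expected_good_bpus = CLUSTER_SIZE * P_GOOD
```

Under these assumptions the ceiling is 131,400 experiments per year at roughly $0.008 each, the chance of at least one good BPU is about 99.6%, and the mean number of good BPUs is about 3.7, all in line with the reported values.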
Educational use cases
We evaluated the platform in three educational contexts encompassing and illustrating various aspects of future usage in education and even research (Fig. 4). We primarily assessed (i) whether the technology works robustly, (ii) whether it can be operated even by middle-school students and (iii) whether it achieves the key elements of best laboratory practice as described in America's Lab Report8, i.e., integration into the flow of instruction, alignment with process and content learning goals, and engagement of students in reflection and discussion. The cloud lab was embedded into regular instruction and scaffolded along the main phases of the inquiry cycle7; more details on study design and outcomes are provided in Supplementary Text 4.
First, we studied whether university students taking a professor-led theory-based biophysics class could successfully carry out experiments and sophisticated quantitative data analysis from home in a self-paced manner (Fig. 4a,b and Supplementary Video 2). Working individually over 14 days, ten students completed a homework project focusing on concepts regarding microswimmers, diffusion and low-Reynolds-number hydrodynamics27. Using the live mode (Fig. 2d), students explored Euglena light-response behavior and made cells swim along geometric paths (Fig. 4a). Students were able to self-discover semiquantitative relationships, for example, reporting that the “fraction of Euglena participating in the directed motion seems to increase as you hold the joystick longer, and depending on the intensity of the light.” They performed back-of-the-envelope analyses regarding Euglena size (∼50 μm), speed (∼50 μm/s), and drag and propulsion forces (∼10 pN)27, experimentally confirming theoretical lecture content. Students then analyzed self-generated large-scale batch data (Fig. 2c; typically hundreds of autotracked cell traces in a 1-min movie; Supplementary Video 2) in Matlab, testing two hypotheses. (i) Euglena behave like active particles as opposed to passive Brownian particles. Ninety percent of students found that the expected relationship of root-mean-square displacement versus time was violated and the apparent diffusion coefficients (D) were too high given cell size (student example: expected D ∼0.01 μm²/s; measured D ∼2,000 μm²/s; Fig. 4b). (ii) The population average velocity changes between dark and light conditions. Sixty percent reported that cells slowed when the light was on, 10% reported that cells sped up and the others found no significant differences (student example: 26 ± 12 μm/s (n = 389) for light off versus 13 ± 10 μm/s (n = 431) for light on; Fig. 4b).
A decrease in velocity for increased light is expected22, but results may vary given experimental conditions. These results demonstrate that 1-min experiments provide students with hundreds of autotraced cells supporting sophisticated statistical analysis. The logged data revealed that students accessed the system at their own convenience, day and night (Supplementary Fig. 7), and engaged in different modes of experimentation, from “playful” (as self-described by multiple students) to more systematic testing of one or multiple light directions and intensities (Supplementary Figs. 8 and 9). Students' feedback and the fact that they each ran 11 ± 6 experiments (three were sufficient for the assignment) indicated that the platform affords ease of experimentation and incentivizes self-driven exploration. Students' feedback also captured many items that motivated this project, including ease of exploration and of gaining intuition (30%); ease of obtaining and analyzing large batch data sets (30%); and minimal manual labor, logistic effort and need for technical understanding, which allowed more focus on thinking (50%). Examples of feedback include, “It was fun to play around with real organisms ... didn't require thinking about the set up”; “Playing for a few minutes gave me some intuition”; “Text mode allows more detailed and controlled tests”; “Very little rote labor time, spent most time thinking!”
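The students' back-of-the-envelope estimates and the active-versus-Brownian test can be sketched in a few lines. The example below uses Python rather than the Matlab used in the study, and it rests on assumptions that are not in the text: the cell is approximated as a sphere of radius 25 μm (half the ∼50 μm cell length), and the medium is water at 293 K with a viscosity of 1 mPa·s.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def stokes_einstein_d_um2_per_s(radius_um, temp_k=293.0, viscosity_pa_s=1.0e-3):
    """Passive diffusion coefficient of a sphere, in square micrometers per second."""
    radius_m = radius_um * 1e-6
    d_m2_per_s = K_B * temp_k / (6 * math.pi * viscosity_pa_s * radius_m)
    return d_m2_per_s * 1e12  # convert m^2/s to um^2/s


def stokes_drag_pn(radius_um, speed_um_per_s, viscosity_pa_s=1.0e-3):
    """Stokes drag on a sphere at low Reynolds number, in piconewtons."""
    force_n = 6 * math.pi * viscosity_pa_s * (radius_um * 1e-6) * (speed_um_per_s * 1e-6)
    return force_n * 1e12  # convert N to pN


def mean_squared_displacement(track, lag):
    """MSD of one track of (x, y) positions sampled at a uniform time step."""
    if not 0 < lag < len(track):
        raise ValueError("lag must be between 1 and len(track) - 1")
    squared = [
        (x2 - x1) ** 2 + (y2 - y1) ** 2
        for (x1, y1), (x2, y2) in zip(track, track[lag:])
    ]
    return sum(squared) / len(squared)


expected_d = stokes_einstein_d_um2_per_s(25.0)   # passive Brownian expectation
drag_pn = stokes_drag_pn(25.0, 50.0)             # drag at a swimming speed of 50 um/s

# Idealized straight swimmer: 50 um/s sampled once per second.
straight_track = [(50.0 * i, 0.0) for i in range(10)]
msd_ratio = (mean_squared_displacement(straight_track, 2)
             / mean_squared_displacement(straight_track, 1))
```

With these assumptions the passive diffusion coefficient comes out near 0.009 μm²/s and the drag force at a few tens of piconewtons, matching the orders of magnitude quoted by the students. For the straight swimmer the mean squared displacement quadruples when the lag doubles, whereas pure diffusion would only double it; this is the kind of violation the students observed.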
Second, we studied whether real-time experimentation could be integrated into middle-school classroom settings and whether it could be combined with simulation-based platforms to support sophisticated model exploration practices as prescribed by the Next Generation Science Standards9,28 (Fig. 4c,d and Supplementary Video 5). During a 50-min class period, 27 students (7th and 8th grades, three classes total) working in pairs executed the following activities (Supplementary Table 2). In one class all pairs ran their own live experiments, while in two classes the live experiment was projected onto the front wall and operated by one student while the whole class discussed and suggested joystick movements. This generated the hypothesis that Euglena move away from light. Then, student pairs tested this hypothesis by measuring the percentage of Euglena cells moving away from light in previously recorded movies. The entire class discussed possible mechanisms by which Euglena may perceive and respond to light. Student pairs then engaged with a 3D biophysics modeling environment (Fig. 4c) in which a Euglena cell was represented as a cuboid surging along and rolling around its long axis. This model had three user-defined parameters (Surge, Coupling, Roll), with the instantaneous pitch velocity being proportionally coupled to the amount of light entering through one body side. Depending on parameter choices, the model mimics many light-responsive behaviors, including positive and negative phototaxis, straight travel versus meandering swimming paths, and even chaotic behavior (Supplementary Video 5). Students explored the function of these three parameters by iterating among self-chosen parameter configurations and then running and stimulating their model through joystick operations, with the overall goal of matching a prerecorded swimming path. Students ran 19 ± 4 simulation experiments; all students found fitting parameter configurations29.
Cluster analysis of the activity logs (Fig. 4d) suggests three dominant strategies of students' model exploration: (i) systematic change of one parameter at a time followed by exactly one test experiment (40%); (ii) alternating between multiple cycles in this systematic stage, followed by extended experimentation with a fixed parameter configuration (30%); and (iii) unstructured transition between changing zero, one or multiple parameters simultaneously (30%). These patterns are consistent with the literature on students' productive model explorations30. Students engaged in generative and productive discussions, which led to content-aligned discoveries such as that the roll parameter is required for the cell to “[see] in every direction” or methodological discussion about how the real Euglena differs from its model (Supplementary Tables 3,4,5). Post-tests revealed that students learned the concept of Euglena phototaxis (90% correct) and engaged in scientific argumentation.
Third, we studied whether this cloud lab could be operated and curated through existing third-party educational content management systems that would allow its wider dissemination (Fig. 4e,f, Supplementary Data 1, and Supplementary Video 6) and whether the batch mode feature would be suitable for middle school. We chose the iLabStudio.org (http://www.ilabstudio.org/labjournal) platform31, which enables teachers to create personalized lesson content around online physics and chemistry experiments and to manage student progress. We implemented a general application programming interface (API) and a corresponding iLab batch interface (Fig. 4e). During two 50-min class periods on successive days, 34 students working individually or in pairs (8th grade, two classes) carried out the following activities. Students watched prerecorded videos of interactive experiments and then engaged in an open classroom discussion, generating hypotheses about how Euglena would react to student-generated light stimuli. Students responded with “moving to the light” (60%) or moving “away from light” (20%), or described more complex behaviors (20%); some provided an explanation, such as the “need for photosynthesis” or that the “light might cook them” (both are correct depending on light intensity). To test their hypotheses, students then designed and ran batch experiments (29 total), i.e., entering intensity, duration and direction of light stimulus. The chosen stimulus sequences revealed versatile experimental designs, including systematic variation of light direction or intensity, testing of multiple variables in sequence and seemingly less-structured designs (Fig. 4f and Supplementary Fig. 10). Students provided justifications for their rationales, ranging from “raise the intensity” to “put random numbers.” We characterized 60% of the designs as sufficiently systematic to test for the influence of light intensity, direction or both.
Students and teacher discussed experimental designs and results as they were delivered sequentially from the experimental queue. Based on their own data, students reported moving to the light (25%); away from light (45%); and no directional response (30%). These heterogeneous results arose in part because some students did not choose high enough light intensity levels to induce noticeable negative phototaxis. When students afterwards considered how to improve their experimental designs, 50% suggested investigating the effect of light intensity more closely. When asked their opinion of this experiment, 85% expressed liking it, and 30% explicitly mentioned Euglena or living organisms.
From these use cases, we conclude that this platform ran robustly and that we successfully deployed an experimentation model that did not exist in the classroom before, i.e., real-time interaction with microscopic cells on a timescale of seconds, which additionally supported complex, quantitative data analysis and modeling. This should be contrasted with current instructional standards and school lab practices, i.e., passive and qualitative observation of living cells under a microscope, with fixed slide samples, videos or pictures being even more common; in the most sophisticated and rare scenario, students observe a population-level aggregation of Euglena in a petri dish under external light over the course of 15–30 min18. Ideally, five to ten live and batch mode experiments could be combined to enable initial free-form exploration followed by controlled experimental design. We note that new opportunities for mining of educational data sets are emerging ('learning analytics')32,33 as logging user activity data on such platforms is easier and more scalable than in traditional physical labs. For example, revealing differences in student strategy and systematicity (Figs. 4d,f) is useful for instructors to help their students and also for educational research in general. The user numbers in our studies are too small to draw more specific conclusions, but this work only marks the beginning of future extended design-based research and wider dissemination34.
The experiment throughput and cost of this platform scale to serve massive user numbers and diverse curricular demands, from middle school to college and massive open online courses35. There are more than 15 million high-school students in the United States alone36, and hundreds of millions in developing countries or remote locations could access such platforms via increasingly ubiquitous smartphones37. We estimate that providing lesson plans similar to Study 1 (Fig. 4a,b) to 1 million users per year could be achieved with ∼250 BPUs, a modest back-end footprint of ∼10 m² and a regular 1 Gb/s internet connection; cloud lab access for all students in a class at ∼1 cent per experiment would cost instructors less than the price of one living Euglena sample (Supplementary Text 3.4).
This technology also has significant potential for primary life-science research. It already supports complex investigations of microswimmers (Fig. 2e,f) and microecology (Fig. 3) of current interest to the biophysics community22,23,38. Image data are information rich; for example, we unexpectedly captured cell-division events (Supplementary Video 3). Given that the stimulus space is also rich, many phenomena can be identified and systematically studied. Because of its domain-specific design39, this platform is expandable beyond Euglena and light stimuli to a general class of increasingly automated and low-cost, high-throughput experiments, such as experiments involving valve-switching in microfluidic devices40 and cloud chemistry41. The ability to support theoreticians carrying out their own investigations, as well as large-scale citizen science5, is within reach.
In conclusion, we demonstrate a new online access and scientific inquiry model that turns observational microbiology into an interactive experience. This enables (i) interaction with living cells in real time, (ii) complex microscopic inquiry practices, (iii) learning analytics for life science experimentation and (iv) improved in-class time use, logistics, costs and safety. The key technical contribution was to extend the distributed computing concept to include unreliable biological specimens while maintaining quality of service. This approach makes complex biological experiments and modern biotechnology accessible to and interactive for multiple currently underserved audiences, such as students, teachers, scientists and the general public. Although the needs for education and research are not identical, they may synergistically drive technology development and its economics. All code and BPU designs are released open source (Supplementary Text 5 and Supplementary Figs. 1–5), enabling wider dissemination and development, and we invite the life-science community to adapt its protocols and technology to make them interactive and available online.
Editor's note: This article has been peer-reviewed.
I.H.R.-K.: project idea and coordination. I.H.R.-K.: engineering conceptualization; P.B., I.H.R.-K.: educational conceptualization. Z.H., A.M.C., C.L., I.H.R.-K.: hardware, biology and experiments; Z.H.: software system architecture; Z.H., A.M.C., C.L.: software implementation at Stanford site; S.N.P.: software implementation at Northwestern site. User study design, execution, and data evaluation: Study 1: I.H.R.-K., H.K., Z.H.; Study 2: P.B., E.W.B., I.H.R.-K.; Study 3: K.J., A.D.W., I.H.R.-K. Manuscript preparation: I.H.R.-K. and Z.H. with creative input from all authors.
We are grateful to the members of the Riedel-Kruse and Blikstein Labs, N. Cira, G. Harrison and the teachers and students who participated. This project was supported by an NSF Cyberlearning grant (#1324753) and NSF awards IIS-1216389, OCI-0753324 and DUE-0938075.
Illustration of interactive joystick experiment on the platform: A user visits the cloud lab website and runs a live experiment on a particular BPU ('eug15'). In the live view, the user tests the Euglena response to four LEDs one at a time with a virtual joystick while watching a live video feed of the actual LED being actuated. The Euglena exhibit negative phototaxis by swimming away from each LED in turn (compare also Fig. 4a in the main paper).
Batch mode experimentation and workflow on the cloud lab platform from a user's point of view (for example, as in the user study of Fig. 4a,b): A user uploads two batch experiments as text scripts (both JSON and CSV formats) at the same time. The system routes these experiments to the best available BPUs while avoiding the apparently suboptimal ones. The user then downloads the data from a previously run experiment and investigates a preprocessed video in which Euglena and their tracks are automatically traced. This video has a corresponding data file in JSON format that can be processed in Matlab through an API that we provide. This API can export track information in an MS Excel-compatible format (CSV) for easier manipulation.
Examples of the variety of Euglena behaviors that can be observed on this platform (passive observation as well as active experimentation): A. Euglena, seen through a 10x objective, responding to all four LED directions applied sequentially. B. Euglena, seen through a 4x objective, responding to all four LED directions applied sequentially. C. Euglena responding to light shone at an angle. D. This clip shows how a Euglena can be virtually controlled to follow a path with our joystick interface. E. The microfluidic chip getting overpopulated as seen through a 10x objective. F. The microfluidic chip getting overpopulated as seen through a 4x objective. G. In some scenarios, the linear motility of the Euglena population tends to decrease while they spin vigorously in response to light. H. Cell division events captured during a time lapse.
Average orientation (acute angle, in degrees) of the Euglena population in response to different LED and no-light conditions: No light stimulus was provided during the first 60 s, when the Euglena were randomly oriented, leading to an average acute angle close to 45°. Each LED was then switched on by itself for 30 s in sequence, and the Euglena moved away from the light every time. The average orientation of all Euglena per frame is plotted against time, which shows clear, measurable, alternating Hill-type signals. No light was shone during the last 60 s, when the cell population converged back to random orientations. We ultimately use this orientation to measure responsiveness of a BPU as discussed in section 2.2.
Illustration of the modeling interface as used in the second study (Fig. 4c,d): 7th and 8th grade students investigated three parameters (surge, coupling and roll) that drive a model Euglena to follow a predefined path upon light stimulus with a joystick. Only the name of the surge parameter was exposed, while the other two were left unnamed for students to work out as an exercise. The video shows different combinations of parameters to demonstrate their effects on the model as well as to highlight the overall descriptive power of this model (compare also Supplementary Video 3 for related real behaviors): A. The simulation is run without changing the initial parameter values, which set only surge to a non-zero number. The model Euglena propels itself without responding to any light. B. The coupling parameter is set to a positive number (15). This time the model Euglena exhibits positive phototaxis, i.e., it moves toward light. C. Coupling is set to a negative number (−15); the Euglena exhibits negative phototaxis as expected but does not respond to the “Right” LED because, with no spin, the model Euglena samples light only from the left. D. Roll is set to a small positive number (2), which lets the Euglena see light in all directions, but the response is slow, which results in a wobbly path with large amplitude upon light changes. E. Roll is set to 4 and the surge is decreased, which corresponds to a near-optimal setting. In this case, the Euglena responds to light stimulus in a manner that is consistent with reality. F. Roll is set to 5 and coupling to a large negative number, which makes the Euglena tumble and spin uncontrollably.
Illustration of the iLab user study (Fig. 4e,f): This video demonstrates how users can operate the cloud lab from a third-party education content management website, in this case iLab 6. A student logs in with her iLab credentials and chooses one of the tasks assigned by her teacher. A task contains lessons about Euglena and accompanying quizzes. The images used in this lesson were taken from Wikipedia (https://en.wikipedia.org/wiki/Euglena). On page 3 of this lesson, the student uses a simple interface to design an experiment with light stimulus and timing. Before submitting it as a batch experiment directly to the cloud lab through the iLab interface, the student can get an estimate of how long the experiment will take to run. iLab then fetches the data when the experiment is over and annotates the data with light and timing information, which the student can investigate and use to answer further test questions. Due to screen recording, the video player view on page 4 flickered; the flicker was filtered out for clarity. The student can run as many experiments as she wants.