Critical dynamics arise during structured information presentation within embodied in vitro neuronal networks

Understanding how brains process information is an incredibly difficult task. Amongst the metrics characterising information processing in the brain, observations of dynamic near-critical states have generated significant interest. However, theoretical and experimental limitations associated with human and animal models have precluded a definite answer about when and why neural criticality arises with links from attention, to cognition, and even to consciousness. To explore this topic, we used an in vitro neural network of cortical neurons that was trained to play a simplified game of ‘Pong’ to demonstrate Synthetic Biological Intelligence (SBI). We demonstrate that critical dynamics emerge when neural networks receive task-related structured sensory input, reorganizing the system to a near-critical state. Additionally, better task performance correlated with proximity to critical dynamics. However, criticality alone is insufficient for a neuronal network to demonstrate learning in the absence of additional information regarding the consequences of previous actions. These findings offer compelling support that neural criticality arises as a base feature of incoming structured information processing without the need for higher order cognition.


MEA Setup and Plating
MaxOne Multielectrode Arrays (MEA; Maxwell Biosystems, AG, Switzerland) were used. The MaxOne is a high-resolution electrophysiology platform featuring 26,000 platinum electrodes arranged over an 8 mm² surface. The system is based on complementary metal-oxide-semiconductor (CMOS) technology and allows recording from up to 1024 channels. MEAs were coated with either polyethylenimine (PEI) in borate buffer for primary culture cells or Poly-D-Lysine for cells from an iPSC background, before being coated with either 10 µg/ml mouse laminin or 10 µg/ml human 521 Laminin (Stemcell Technologies Australia, Melbourne, Australia), respectively, to facilitate cell adhesion. Approximately 10⁶ cells were plated on each MEA after preparation as per [1]. Cells were allowed approximately one hour to adhere to the MEA surface before the well was flooded. The day after plating, cell culture media was changed for all culture types to BrainPhys™ Neuronal Medium (Stemcell Technologies Australia, Melbourne, Australia) supplemented with 1% penicillin-streptomycin. Cultures were maintained in a low O2 incubator kept at 5% CO2, 5% O2, 36°C and 80% relative humidity. Every two days, half the media from each well was removed and replaced with fresh media. Media changes always occurred after all recording sessions.

DishBrain platform and electrode configuration
The current DishBrain platform is configured as a low-latency, real-time MEA control system with on-line spike detection and recording software. The DishBrain software runs at 20 kHz, which allows recording at a fine timescale (one sample every 50 µs). Spikes can optionally be recorded to binary files; regardless of recording, they are counted over a period of 10 milliseconds (200 samples), at which point the game environment is provided with the number of spikes detected on each electrode in each predefined motor region, as described below. Depending on the motor region in which the spikes occurred, they are interpreted as motor activity, moving the 'paddle' up or down in the virtual space. The 'Pong' game is also updated at every 10 ms interval, as the ball moves around the play area at a fixed speed and bounces off the edges of the play area and the paddle. Once the ball hits the edge of the play area behind the paddle, one rally of 'Pong' has come to an end. The game environment then determines which type of feedback to apply at the end of the rally: random, silent, or none. Feedback is also provided when the ball contacts the paddle under the standard stimulus condition. A 'stimulation sequencer' module tracks the location of the ball relative to the paddle during each rally and encodes it as stimulation to one of eight stimulation sites. The stimulation sequencer is updated each time a sample is received from the MEA (20,000 times a second); once the previous batch of MEA commands has completed, it constructs a new sequence of MEA commands from the information it has been configured to transmit, using both place codes and rate codes. The stimulations take the form of a short square bi-phasic pulse: a positive voltage followed by a negative voltage. This pulse sequence is read and applied to the electrode by a Digital to Analog Converter (DAC) on the MEA. A real-time interactive version of the game visualiser is available at https://spikestream.corticallabs.com/. Alternatively, cells could be recorded at 'rest' in a gameplay environment where activity was recorded to move the paddle but no stimulation was delivered, with corresponding outcomes still recorded. This spontaneous activity alone was used as a baseline to determine the gameplay characteristics of a culture. Low-level code for interacting with the Maxwell API was written in C to minimize processing latencies, so packet processing latency was typically <50 µs. High-level code, including configuration setups and general instructions for game settings, was written in Python. A 5 ms spike-to-stim latency was achieved, which was substantially due to MaxOne's inflexible hardware buffering. Figure S1 shows a schematic view of the software components and data flow in the DishBrain closed-loop system.
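The 10 ms binning described above (200 samples at 20 kHz) can be sketched as follows. This is a minimal illustration, not the DishBrain implementation: the class `SpikeBinner`, the function `paddle_step`, and the rule that the more active motor region decides the paddle direction are all assumptions made for the example.

```python
SAMPLE_RATE_HZ = 20_000
SAMPLES_PER_GAME_TICK = 200  # 10 ms at 20 kHz, as stated in the text


class SpikeBinner:
    """Accumulate per-region spike counts and release them every 200 samples."""

    def __init__(self, n_regions=2):
        self.counts = [0] * n_regions
        self.samples_seen = 0

    def push(self, spikes_by_region):
        """Add one sample's spikes; return the binned counts on every 200th sample."""
        for region, n in enumerate(spikes_by_region):
            self.counts[region] += n
        self.samples_seen += 1
        if self.samples_seen < SAMPLES_PER_GAME_TICK:
            return None  # the game does not advance on this sample
        binned = self.counts
        self.counts = [0] * len(binned)
        self.samples_seen = 0
        return binned


def paddle_step(binned, step=1):
    """Hypothetical rule: the more active motor region sets the direction."""
    up, down = binned
    if up > down:
        return step
    if down > up:
        return -step
    return 0
```

Feeding one value per incoming sample keeps the game loop decoupled from the sampling rate: the environment only sees a result once per 10 ms bin.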

Input Configuration
As introduced in [1], stimulation is delivered at specific locations, frequency, and voltage to key electrodes in a topographically consistent manner in the sensory area relative to the current position of the paddle (see Figure S2). This mimics retinotopic and topographic representations found in many neural systems, which represent the external world [3,4]. It was possible to deliver five types of input: either the 'Sensory Stimulus' encoding the position of the ball, or one of four feedback protocols explained below: Unpredictable, Predictable, Silent, or No-feedback.

Sensory Stimulus: Because neurons appeared robust to voltage stimulation, the voltage level was determined based on evidence of neurological function. Accordingly, 75 mV was chosen as the sensory stimulation voltage to avoid forcing hyperpolarised cells to fire. This stimulation was applied to key electrodes according to where the ball was relative to the paddle. Place coding was combined with rate coding: when the ball was closest to the opposing wall, stimuli were delivered at 4 Hz, increasing linearly to 40 Hz when the ball reached the paddle wall.

Unpredictable Stimulus: For the standard stimulus feedback condition, cultures received unpredictable stimulation when they missed connecting the paddle with the 'ball', i.e. when a 'miss' occurred. This feedback stimulus was delivered at a voltage of 150 mV and a frequency of 5 Hz, adding unpredictable external stimulus to the system. Random stimulation took place at random sites over the 8 predefined input electrodes at random timescales for a period of four seconds, followed by a configurable rest period of four seconds where stimulation paused, after which the next rally began. In theory, the higher voltage than that used for the Sensory Stimulus would force action potentials regardless of the state the cell was in, causing even greater disruption.

Predictable Stimulus: For the standard stimulus feedback condition, cultures were exposed to predictable stimulation when a 'hit' was registered, that is, when the 'paddle' connected successfully with the 'ball'. This stimulation was delivered at 75 mV and 100 Hz, and replaced other sensory information for 100 ms. All 8 stimulation electrodes simultaneously received predictable stimulation at this frequency and period.

Silent Feedback: During the Silent Feedback period, the Unpredictable Stimulus described above was replaced with no stimulation for the same length of time. Predictable Stimulus feedback was also removed during Silent Feedback sessions. This condition still differs from the No-feedback condition described below: it is associated with culture activity in a closed-loop manner and therefore constitutes feedback.
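The rate-coding component of the Sensory Stimulus (4 Hz at the opposing wall, rising linearly to 40 Hz at the paddle wall) amounts to a linear interpolation. The sketch below assumes the ball's distance is normalised to a fraction between 0 and 1; that normalisation, the clamping, and the function names are illustrative choices rather than details taken from the source.

```python
MIN_RATE_HZ = 4.0    # ball closest to the opposing wall
MAX_RATE_HZ = 40.0   # ball at the paddle wall


def stim_rate_hz(distance_fraction):
    """Linear rate code.

    distance_fraction is assumed to be 0.0 at the opposing wall and
    1.0 at the paddle wall; values outside that range are clamped.
    """
    x = min(max(distance_fraction, 0.0), 1.0)
    return MIN_RATE_HZ + (MAX_RATE_HZ - MIN_RATE_HZ) * x


def inter_stim_interval_samples(rate_hz, sample_rate_hz=20_000):
    """Samples between stimuli at the 20 kHz update rate given in the text."""
    return round(sample_rate_hz / rate_hz)
```

Under these assumptions, a ball halfway across the play area would be encoded at 22 Hz, and the interval between stimuli shrinks from 5000 samples at 4 Hz to 500 samples at 40 Hz.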

No-feedback:
As an open-loop condition, this assessment was designed to determine whether sensory stimulation alone is sufficient to drive learning in cultures. In other words, no feedback of any kind was provided to the cultures based on their actions or outcomes. The cultures were given the same sensory stimulation as described above, and the outcome was determined by the same metric. However, when the ball passed the paddle, an event that would otherwise end the rally, it instead bounced off the wall behind the paddle and continued its trajectory; this was still recorded as a 'miss'. A 'hit' was recorded whenever the 'ball' connected with the simulated paddle. So, under No-feedback, the entire gameplay session is essentially one rally in which the simulated ball's final position can be predicted from its initial vector, but scoring occurs as usual. Figure S3 presents a schematic showing the different phases of stimulation and the information delivery to the culture, while Figure S4 demonstrates different simulated gameplay environments employing the different feedback protocols described above.
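The feedback protocols described above can be summarised as a lookup from experimental condition and rally outcome to the stimulation applied. This is a sketch under stated assumptions: the condition labels (`"stimulus"`, `"silent"`, `"no_feedback"`), the `Feedback` record, and the reading that the silent period lasts the same four seconds as the unpredictable stimulation are illustrative, not taken from the DishBrain code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Feedback:
    kind: str            # "unpredictable", "predictable" or "silent"
    voltage_mv: float
    frequency_hz: float
    duration_s: float
    ends_rally: bool


# Parameter values are those given in the text; the silent duration is assumed.
UNPREDICTABLE = Feedback("unpredictable", 150.0, 5.0, 4.0, True)
PREDICTABLE = Feedback("predictable", 75.0, 100.0, 0.1, False)
SILENT = Feedback("silent", 0.0, 0.0, 4.0, True)


def feedback_for(condition: str, outcome: str) -> Optional[Feedback]:
    """Return the feedback for a 'hit' or 'miss', or None if nothing is applied."""
    if outcome not in ("hit", "miss"):
        raise ValueError(f"unknown outcome: {outcome!r}")
    if condition == "stimulus":
        return PREDICTABLE if outcome == "hit" else UNPREDICTABLE
    if condition == "silent":
        # Misses still end the rally, but with a pause instead of stimulation.
        return SILENT if outcome == "miss" else None
    if condition == "no_feedback":
        # Open loop: the ball bounces on and the culture receives nothing.
        return None
    raise ValueError(f"unknown condition: {condition!r}")
```

The distinction between Silent and No-feedback shows up in the `ends_rally` flag: a silent miss still interrupts gameplay as a consequence of the culture's activity, whereas a No-feedback miss changes nothing the culture can sense.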

Output Configuration
A total of 1024 electrodes were routed on the HD-MEA to record activity. The 'Sensory' area, where stimulation electrodes were embedded as described above, consisted of 626 electrodes. The remaining output electrodes were divided into predefined motor regions on the MEA, consisting of four regions that were defined either as motor region 1 or motor region 2, as shown in Figure S2.
Since it was technically difficult to cultivate neurons that displayed perfectly symmetrical activity in both these regions, 'gain' was added to the system. The gain was a real-time multiplier computed from the mean firing in each motor region, scaling it to achieve a target value of 20 Hz across the entire region. Consequently, changes in activity in each of the two regions could affect the position of the paddle, even if the regions displayed different latent spontaneous activity.
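One way to read the gain described above is as a per-region multiplier that maps each region's mean firing rate onto the 20 Hz target. The sketch below follows that reading; the function names and the small floor that guards against division by zero in a silent region are assumptions, not details from the source.

```python
TARGET_RATE_HZ = 20.0  # target value stated in the text


def region_gain(mean_rate_hz, floor_hz=1e-3):
    """Multiplier that scales a region's mean firing rate to the target.

    floor_hz is a hypothetical guard so a silent region does not
    produce a division by zero.
    """
    return TARGET_RATE_HZ / max(mean_rate_hz, floor_hz)


def normalised_activity(rate_hz, mean_rate_hz):
    """Instantaneous rate of a region after applying its gain."""
    return rate_hz * region_gain(mean_rate_hz)
```

With this scaling, a region averaging 5 Hz and a region averaging 40 Hz both sit at 20 Hz when firing at their own mean, so a doubling of activity in either region has the same effect on the paddle.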

Culture and experiment statistics
For a combination of reasons there were differences in the number of experiments each culture performed. Tables in Figure S5.a-d represent the details of the number of Gameplay sessions performed by each culture in each condition and the corresponding dates of the experiments.

Fig. S1 a, b) Schematics of software used for DishBrain. a) Software components and data flow in the DishBrain closed loop system. Voltage samples flow from the MEA to the 'Pong' environment, and sensory information flows from the 'Pong' environment back to the MEA, forming a closed loop. The blue rectangles mark proprietary pieces of hardware from MaxWell, including the MEA well which may contain a live culture of neurons. The green MXWServer is a piece of software provided by MaxWell which is used to configure the MEA and Hub, using a private API directly over the network. The red rectangles mark components of the 'DishServer' program, a high-performance program consisting of four components designed to run asynchronously, despite being run on a single CPU thread. The 'LAN Interface' component stores network state, for talking to the Hub, and produces arrays of voltage values for processing. Voltage values are passed to the 'Spike Detection' component, which stores feedback values and spike counts, and passes recalibration commands back to the LAN Interface. When the 'Pong' environment is ready to run, it updates the state of the paddle based on the spike counts, the state of the ball based on its velocity and collision conditions, and reconfigures the stimulation sequencer based on the relative position of the ball and current state of the game. The stimulation sequencer stores and updates indices and countdowns relating to the stimulations it must produce and converts these into commands each time the corresponding countdown reaches zero, which are finally passed back to the LAN Interface, to send to the MEA system, closing the loop. The procedures associated with each component are run one after the other in a simple loop control flow, but the 'Pong' environment only moves forward every 200th update, short-circuiting otherwise. Additionally, up to three worker processes are launched in parallel, depending on which parts of the system need to be recorded. They receive data from the main thread via shared memory and write it to file, allowing the main thread to continue processing data without having to hand control to the operating system and back again. b) Numeric operations in the real-time spike detection component of the DishBrain closed loop system, including multiple IIR filters. Running a virtual environment in a closed loop imposes strict performance requirements, and digital signal processing is the main bottleneck of this system, with close to 42 MB of data to process every second. Simple sequences of IIR digital filters are applied to incoming data, storing multiple arrays of 1024 feedback values in between each sample. First, spikes in the incoming data are detected by applying a high pass filter to determine the deviation of the activity and comparing that to the MAD, which is itself calculated with a subsequent low pass filter. Then, a low pass filter is applied to the original data to determine whether the MEA hardware needs to be re-calibrated, affecting future samples. This system was able to keep up with the incoming data on a single thread of an Intel Core i7-8809G. Figures adapted from [1]. This article was published in Neuron, 110.23, Kagan, Brett J., Andy C. Kitchen, Nhi T. Tran, Forough Habibollahi, Moein Khajehnejad, Bradyn J. Parker, Anjali Bhat, Ben Rollo, Adeel Razi, and Karl J. Friston., "In vitro neurons learn and exhibit sentience when embodied in a simulated game-world.", 3952-3969, Copyright The Author(s). Published by Elsevier Inc. (2022).
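The filter chain outlined in panel b of Fig. S1 can be illustrated for a single channel with one-pole IIR filters. This is a rough sketch, not the published implementation: the filter coefficients, the threshold multiplier, the recalibration limit, and the use of a low-passed absolute deviation as a running stand-in for the MAD are all assumptions made for the example.

```python
class ChannelDetector:
    """Per-channel spike detector built from one-pole IIR filters."""

    def __init__(self, alpha_baseline=0.01, alpha_dev=0.001,
                 threshold=5.0, recal_limit=500.0, init_dev=1.0):
        self.alpha_baseline = alpha_baseline  # low-pass coefficient (assumed)
        self.alpha_dev = alpha_dev            # deviation smoothing (assumed)
        self.threshold = threshold            # multiples of the deviation
        self.recal_limit = recal_limit        # hypothetical drift limit
        self.baseline = 0.0                   # feedback value: slow baseline
        self.dev = init_dev                   # feedback value: deviation

    def step(self, x):
        """Process one voltage sample; return True if it counts as a spike."""
        # Low-pass the raw signal to track slow drift.
        self.baseline += self.alpha_baseline * (x - self.baseline)
        # High-pass output: the sample minus its slow baseline.
        hp = x - self.baseline
        is_spike = abs(hp) > self.threshold * self.dev
        # Low-pass the absolute deviation as a running deviation estimate.
        self.dev += self.alpha_dev * (abs(hp) - self.dev)
        return is_spike

    @property
    def needs_recalibration(self):
        """True when the slow baseline has drifted past the assumed limit."""
        return abs(self.baseline) > self.recal_limit
```

Each channel only needs two stored feedback values between samples, which is consistent with the caption's description of arrays of 1024 feedback values being kept from one sample to the next.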
Fig. S3 Schematic showing the different phases of stimulation that provide information about the environment to the culture. Aligned with this are the corresponding input voltage and how that voltage appears on the raster plot over 100 seconds. The appearance of random stimulation after a missed ball versus system-wide predictable stimulation upon a successful hit is apparent across all three representations. These correspond to the images on the right, which show the position of the ball on both the x and y axes relative to the paddle and back wall, as a percentage of total distance, on the same timescale. Figure adapted from [1]. This article was published in Neuron, 110.23, Kagan, Brett J., Andy C. Kitchen, Nhi T. Tran, Forough Habibollahi, Moein Khajehnejad, Bradyn J. Parker, Anjali Bhat, Ben Rollo, Adeel Razi, and Karl J. Friston., "In vitro neurons learn and exhibit sentience when embodied in a simulated game-world.", 3952-3969, Copyright The Author(s). Published by Elsevier Inc. (2022).

Table S1. Multivariate statistical tests and all results for tests done.

Table S2. Follow up main text post-hoc tests for multivariate tests, including means, standard error (SE), t-scores, degrees of freedom and exact p-values with hedges.