Multi-frequency steady-state visual evoked potential dataset

Mu, Jing; Liu, Shuo; Burkitt, Anthony N.; Grayden, David B.

doi:10.1038/s41597-023-02841-5

Data Descriptor
Open access
Published: 04 January 2024

Scientific Data volume 11, Article number: 26 (2024)

Abstract

The Steady-State Visual Evoked Potential (SSVEP) is a widely used modality in Brain-Computer Interfaces (BCIs). Existing research has demonstrated the capabilities of SSVEP-based BCIs that use a single frequency for each target in various applications that require relatively small numbers of commands. Multi-frequency SSVEP has been developed to extend the capability of single-frequency SSVEP to tasks that involve large numbers of commands. However, the development of multi-frequency SSVEP methodologies is falling behind, with far fewer studies than for single-frequency SSVEP. This dataset was constructed to promote research in multi-frequency SSVEP by making SSVEP signals collected with different frequency stimulation settings publicly available. In this dataset, SSVEPs were collected from 35 participants using single-, dual-, and tri-frequency stimulation and with three different multi-frequency stimulation variants.

Background & Summary

Brain-Computer Interfaces (BCIs), also called Brain-Machine Interfaces (BMIs), translate brain activity into commands to control external devices, such as computers, wheelchairs, or assistive robots¹. BCIs can detect human intention in the absence of physical inputs so they can be used to assist people with movement disorders and provide an additional communication channel between humans and machines².

Among the modalities that can be captured and decoded from the brain, the Steady-State Visual Evoked Potential (SSVEP) is one of the most widely used as it can be captured non-invasively using electroencephalography (EEG) with relatively high signal-to-noise ratio and requires minimal user training³. The SSVEP is an automatic response of the visual cortex in reaction to periodic visual stimulation^4,5. The SSVEP responses show the same frequencies as the stimulation frequencies as well as the harmonics of the frequencies^4,6. The existence of harmonics enables higher SSVEP classification accuracy⁷ but, at the same time, limits the selection of frequencies when constructing SSVEP-based BCIs⁸. Human brains respond to a constrained range of frequencies, with the optimal range identified as 12–18 Hz^4,9,10, so in cases where a large number of commands need to be shown at once (i.e., a large number of frequencies need to be selected for stimulation), the frequencies become very close to each other, which makes decoding very challenging.

To increase the capacity of SSVEP-based BCIs to produce large numbers of commands, multi-frequency SSVEP was first proposed in 2010¹¹ and different multi-frequency stimulation methods have since been developed^{12,13,14,15,16}. The multi-frequency stimulation methods combine multiple frequencies in each stimulus. Therefore, by using different combinations of input frequencies, more stimuli can be represented with a smaller number of input frequencies. This makes multi-frequency SSVEP superior to single-frequency SSVEP when the number of targets becomes large because multi-frequency SSVEP does not require as many frequencies as single-frequency SSVEP, so a larger frequency interval can be maintained¹⁷. However, it was not until 2020 that studies on understanding better frequency selection in multi-frequency SSVEP were conducted^18,19 and, in 2021, the first training-free decoding algorithm for multi-frequency SSVEP was developed²⁰. It was demonstrated that multi-frequency SSVEP has more complex frequency components compared to traditional single-frequency SSVEP, including the existence of interactions between the input frequencies^{12,13,14,15,20}. This feature creates redundancy in the information carried by the signal that can be used in decoding. Even though multi-frequency SSVEP has demonstrated its potential in delivering large numbers of commands, the research in this field is still lagging behind that of single-frequency SSVEP.

One way to facilitate research and encourage more people to study a topic is to create relevant datasets and make them widely accessible. The benchmark dataset²¹ for SSVEP-based BCIs has been used in over 200 scholarly works (based on Google Scholar citations) since its publication in 2017. Following this, more SSVEP datasets have been published that focus on collecting SSVEP across multiple days and in multiple frequency bands²², SSVEP collected in a closer-to-real-world application setting²³, comparing SSVEPs collected with wet and dry EEG electrodes²⁴, SSVEP in the ageing population²⁵, and feature-based selective attention in SSVEP²⁶. BCI datasets that include SSVEP components were assembled to investigate BCI illiteracy²⁷, facilitate the development of BCIs when users are in mobile situations²⁸, and support hybrid BCIs combining EEG and other biosignals²⁹. However, all of these datasets use single-frequency SSVEP, and there is not yet a publicly available dataset for multi-frequency SSVEP.

In this work, we constructed the first Open Access dataset for multi-frequency SSVEP³⁰. Our dataset includes SSVEP collected from 35 participants with dry EEG electrodes. All participants were presented with single-frequency, dual-frequency, and tri-frequency visual stimulation using up to three different stimulation methods for each modality. EEG data is provided in complete session format to allow everyone to have access to all details in the recordings and thus simulate a real-time experiment experience. Matlab scripts are included to assist in separating data into trial lengths.

Methods

Participants

Thirty-five volunteers (aged 19 to 42 years, mean 25.91 ± 5.30 standard deviation) who were free of neurological or facial muscle conditions participated in this experiment. Of the 35 participants, 25 were naïve to BCIs (had never previously participated in a BCI experiment) and 27 were naïve to SSVEP-based BCIs (had never previously participated in an SSVEP experiment). For the eight participants with SSVEP-based BCI experience, the time since they last participated in an SSVEP experiment ranged from 4 to 36 months (11.36 ± 10.38 months, mean ± standard deviation). Two participants were left-handed and the rest were right-handed. All participants had normal or corrected-to-normal vision (e.g., with glasses or contact lenses).

This study was approved by the University of Melbourne Human Research Ethics Committee (Project ID 24178). Written consent was collected from each participant. Each participant was compensated with an AUD $20 gift card.

EEG Setup

EEG was recorded with g.USBamp and g.SAHARA dry electrodes (g.tec medical engineering GmbH, Austria) inside a Faraday shielded room. Brain activities were measured from six channels, PO3, POz, PO4, O1, Oz, and O2, according to the international 10-10 system. Reference and ground electrodes were positioned on left and right mastoids, respectively. Dry electrodes were selected due to the ease of setting up without gelling, which makes it more convenient in real-world applications.

During data acquisition, a 0.5–100 Hz band-pass filter and a 50 Hz notch filter were applied to all channels in g.USBamp settings. Data was recorded at a sampling rate of 512 Hz.

Stimulation setup

An Alienware monitor AW2518HF (24.5 inch, 1920 × 1080, DELL Technologies, USA) was used to present all visual stimulation in this study. Participants sat in a chair at a distance of 70 cm from the screen measured from their eyes and with their head centred to the screen.

Stimulation was delivered through an interface programmed in Unity (Unity Technologies, USA) that ran on an EliteBook 840 G5 laptop (Hewlett-Packard, USA) with Core i7-8550U CPU @ 1.80 GHz, 16 GB RAM (Intel, USA) and UHD Graphics 620 integrated graphics unit (Intel, USA). The programmed interface displayed stimuli (targets) as white squares of size 108 × 108 pixels on a black background, with 108-pixel gaps between adjacent targets in both vertical and horizontal directions. The interface was set to a 120 Hz refresh rate.

The frequencies used in this study were 7, 11, 13, 17, 19, and 23 Hz. These are prime numbers that are in the most responsive range of SSVEP⁴. The combinations of these six frequencies made six single-frequency targets, 15 dual-frequency targets (${C}_{2}^{6}=15$), and 20 tri-frequency targets (${C}_{3}^{6}=20$). Table 1 lists the frequencies and frequency combinations used in single-, dual-, and tri-frequency stimulation.
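The target counts quoted above can be reproduced by enumerating the unordered combinations of the six frequencies. The following is a minimal sketch; the names `FREQUENCIES_HZ` and `targets` are illustrative, and the lexicographic ordering it produces is an assumption, so Table 1 remains the authority for which target index maps to which combination.

```python
from itertools import combinations

# The six stimulation frequencies (Hz) listed in the text.
FREQUENCIES_HZ = (7, 11, 13, 17, 19, 23)

def targets(n_frequencies):
    """Return every unordered choice of n_frequencies out of the six frequencies."""
    return list(combinations(FREQUENCIES_HZ, n_frequencies))

single, dual, tri = targets(1), targets(2), targets(3)
print(len(single), len(dual), len(tri))  # prints: 6 15 20
```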

Table 1 Frequencies (in Hz) used in each target in single-, dual-, and tri-frequency stimulation.

The stimulus layouts in single-, dual-, and tri-frequency tests are shown in Fig. 1. Simple flickering was used to deliver single-frequency stimulation. Three different stimulation methods were used in dual-frequency stimulation and two were used in tri-frequency stimulation. Details are given below; visual representations of the stimulation methods can be found in Fig. 2.

Single-frequency stimulation

Single-frequency stimulation was delivered as square waves flickering at full brightness. The six targets were laid out in a 2 × 3 matrix, as shown in Fig. 1a. The signal for each frequency was generated using

$${\rm{u}}=\frac{1}{2}{\rm{sgn}}(\sin (2\pi ft))+\frac{1}{2},$$

(1)

where f is the stimulation frequency and sgn() is the sign function.
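As an illustration of Eq. (1), the sketch below evaluates the square wave once per displayed frame at the 120 Hz refresh rate. Sampling at frame times `k / 120` is an assumption made here for illustration; the text does not describe how the Unity interface schedules brightness updates, and the function names are hypothetical.

```python
import math

REFRESH_HZ = 120  # monitor refresh rate stated in the text

def sgn(x):
    """Sign function: -1, 0 or 1."""
    return (x > 0) - (x < 0)

def square_wave(f_hz, n_frames, refresh_hz=REFRESH_HZ):
    """Evaluate Eq. (1) at each frame time k / refresh_hz (assumed sampling scheme)."""
    return [0.5 * sgn(math.sin(2 * math.pi * f_hz * k / refresh_hz)) + 0.5
            for k in range(n_frames)]

# One second of an 11 Hz target; each value is a brightness between 0 and 1.
frames = square_wave(11, REFRESH_HZ)
```

Because 120 is not an integer multiple of 11, the number of ON and OFF frames per cycle varies slightly from one cycle to the next under this sampling scheme.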

Dual-frequency stimulation

Three different methods were used in dual-frequency stimulation: two frequency superposition methods (OR and ADD)¹⁵ and checkerboard¹².

In dual-frequency superposition, two square waves are superimposed,

$${{\rm{S}}}_{{\rm{OR,2}}}={{\rm{u}}}_{{\rm{1}}}\vee {{\rm{u}}}_{{\rm{2}}}$$

(2)

$${{\rm{S}}}_{{\rm{ADD,2}}}=\frac{1}{2}{{\rm{u}}}_{{\rm{1}}}+\frac{1}{2}{{\rm{u}}}_{{\rm{2}}},$$

(3)

where u₁ and u₂ are signals generated from Eq. (1) using two different frequencies f₁ and f₂. In frequency superposition OR with two stimulation frequencies (S_{OR, 2}), the OR logic is applied to the two square waves as shown in Eq. (2), where the stimulation is ON (1) when either (or both) of the signals is ON, and OFF (0) when both of the signals are OFF. Frequency superposition ADD with two frequencies (S_{ADD, 2}) is achieved by reducing the brightness of each signal by half, then summing the brightness from the two signals, as described in Eq. (3).
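The difference between Eqs. (2) and (3) is easiest to see from the four possible ON/OFF states of the two square waves. This is a small sketch with hypothetical function names `s_or` and `s_add`, treating each input as a binary value (0 or 1).

```python
def s_or(u1, u2):
    """Eq. (2): the stimulus is ON when either input signal is ON."""
    return u1 | u2

def s_add(u1, u2):
    """Eq. (3): each input signal contributes half of the full brightness."""
    return 0.5 * u1 + 0.5 * u2

# Enumerate all ON/OFF states of the two input signals.
for u1, u2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(u1, u2, s_or(u1, u2), s_add(u1, u2))
```

OR therefore produces only two brightness levels (0 and 1), whereas ADD produces three (0, 0.5, and 1).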

The checkerboard method delivers the two stimulation signals separately, with its two patterns represented in the alternating squares. In this study, 8-by-8 checkerboards are used in place of each solid square stimulus.

The fifteen dual-frequency targets were shown in a 3 × 5 layout (Fig. 1b).

Tri-frequency stimulation

Similar to dual-frequency stimulation, frequency superposition OR and ADD were used in presenting tri-frequency stimulation. However, the checkerboard method was excluded as it does not support more than two frequencies shown at a time.

$${{\rm{S}}}_{{\rm{OR,3}}}={{\rm{u}}}_{{\rm{1}}}\vee {{\rm{u}}}_{{\rm{2}}}\vee {{\rm{u}}}_{{\rm{3}}}$$

(4)

$${{\rm{S}}}_{{\rm{ADD,}}3}=\frac{1}{3}{{\rm{u}}}_{1}+\frac{1}{3}{{\rm{u}}}_{2}+\frac{1}{3}{{\rm{u}}}_{3},$$

(5)

where u₃ is the signal generated with a third frequency. In tri-frequency stimulation with frequency superposition, the formulations are similar to those in dual-frequency stimulation. In OR (S_{OR, 3}), instead of two signals, we now add a third signal u₃ as shown in Eq. (4). In ADD (S_{ADD, 3}), the brightness of each signal is reduced to one third, as shown in Eq. (5).

The twenty tri-frequency targets were laid out in a 4 × 5 grid (Fig. 1c).

Experimental protocol

Experiment structure

The experiment consisted of nine sessions, with session 1 testing single-frequency stimulation, sessions 2–5 testing dual-frequency, and sessions 6–9 testing tri-frequency. Three-minute breaks were provided between the sessions. A 10-minute break was placed between sessions 5 and 6, when the participant had finished all single- and dual-frequency sessions and before they started the tri-frequency sessions. All breaks were adjusted to the participant’s needs to minimise fatigue. Figure 3 depicts the structure of the experiment. The whole experiment required 2 hours to complete, including the preparation and clean-up time (dry electrodes were used, so the experimenter only needed to remove the cap from the participant during clean-up). The experiment was completed in one sitting.

Each setup was tested four times. In session 1, the single-frequency setup (T1) was tested four times in a row. In sessions 2–5, the three dual-frequency setups (T21: frequency superposition OR; T22: frequency superposition ADD; T23: checkerboard) were tested once in each session. Therefore, each session included three tests in a balanced randomised sequence. Table 2 lists all sequences used in the experiment and the participants that used each sequence. Sessions 6–9 tested the two tri-frequency setups (T31: frequency superposition OR; T32: frequency superposition ADD) with each session running each setup once. The tri-frequency sessions followed an AB-BA-BA-AB sequence alternating with BA-AB-AB-BA between the participants (participants with odd indices followed AB-BA-BA-AB, even indices followed BA-AB-AB-BA). Figure 4 shows the structure of each session. Test sequences for participant 1 are labelled in this figure as an example.

Table 2 Sequences for dual-frequency sessions (sessions 2–5) and the list of participants that used each dual-frequency sequence.

Trial structure

Trials were the smallest components in this experiment. Each trial started with a 1 s cue (green frame) to show the participant which target they should attend to. This was followed by a 5 s stimulation period with a fixation point provided to help them maintain attention on the target. Visual feedback (solid green or red block for correct or incorrect, respectively) was shown to the participant for 0.5 s after stimulation. Then the screen turned to solid black for 0.5 s as a resting period. Each trial was 7 s in total. Note that, in tri-frequency tests, the feedback and rest were swapped to allow sufficient time for the decoder to produce an output. Figure 2 shows the structure of the trials.

Test structure

A test refers to the action of going through all targets on the screen once each. In a single-frequency test (T1), one test has six trials as there are six targets. In a dual-frequency test (T21, T22, T23), one test has 15 trials. In a tri-frequency test (T31, T32), one test has 20 trials. In a test, participants went through the targets in a fixed sequence: from left to right and top to bottom. However, the stimuli were randomly shuffled on the screen to reduce undesirable bias. At the end of each test, a score was shown on the screen informing the participant of the number of correctly decoded trials, as shown in Fig. 2.
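The trial and test structure described above fixes the number of trials recorded per participant. The sketch below works through that arithmetic; it counts only the 7 s trials themselves and excludes breaks, score screens, and setup time, so the resulting duration is a lower bound on session time rather than a reported figure.

```python
TRIAL_SECONDS = 1.0 + 5.0 + 0.5 + 0.5  # cue + stimulation + feedback + rest
REPEATS = 4                            # each setup was tested four times

# Test name -> number of targets, i.e. trials per test.
TRIALS_PER_TEST = {"T1": 6, "T21": 15, "T22": 15, "T23": 15, "T31": 20, "T32": 20}

total_trials = REPEATS * sum(TRIALS_PER_TEST.values())
total_minutes = total_trials * TRIAL_SECONDS / 60
print(total_trials, round(total_minutes, 1))  # prints: 364 42.5
```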

Online decoding

Data were processed online with four training-free decoders operating in parallel to keep the experiment compact while minimising the effect of inaccurate modelling of each individual’s SSVEP responses in the decoding process. Canonical Correlation Analysis (CCA)³¹ for single frequency only, Multi-Frequency CCA (MFCCA)²⁰ for multi frequency only, and Linear Diophantine Equation (LDE) decoding algorithms³² were used. The EEG recorded during the 5 s stimulation period was used in decoding.

CCA

Canonical Correlation Analysis (CCA)³¹ is a decoding algorithm that compares the time-domain correlation ρ of the recorded multi-channel EEG X and predefined templates Y based on knowledge of the set of frequencies used. CCA looks for the weight vectors W_X and W_Y that construct x = X^TW_X and y = Y^TW_Y and maximise the correlation between x and y,

$$\mathop{max}\limits_{{{\bf{W}}}_{{\bf{X}}},{{\bf{W}}}_{{\bf{Y}}}}\rho ({\bf{x}},{\bf{y}})=\frac{E\left[{{\bf{x}}}^{T}{\bf{y}}\right]}{\sqrt{E\left[{{\bf{x}}}^{T}{\bf{x}}\right]E\left[{{\bf{y}}}^{T}{\bf{y}}\right]}}=\frac{E\left[{{\bf{W}}}_{{\bf{X}}}^{T}{{\bf{XY}}}^{T}{{\bf{W}}}_{{\bf{Y}}}\right]}{\sqrt{E\left[{{\bf{W}}}_{{\bf{X}}}^{T}{{\bf{XX}}}^{T}{{\bf{W}}}_{{\bf{X}}}\right]E\left[{{\bf{W}}}_{{\bf{Y}}}^{T}{{\bf{YY}}}^{T}{{\bf{W}}}_{{\bf{Y}}}\right]}},$$

(6)

where E is the mathematical expectation. The template Y in CCA is constructed with the sine and cosine signals at the stimulation frequency f and its harmonics,

$${{\bf{Y}}}_{CCA}(t)=\left[\begin{array}{c}\sin (2\pi ft)\\ \cos (2\pi ft)\\ \sin (2\pi 2ft)\\ \cos (2\pi 2ft)\\ \vdots \\ \sin (2\pi {N}_{h}\,ft)\\ \cos (2\pi {N}_{h}\,ft)\end{array}\right],$$

(7)

where N_h is the number of harmonics included in the formulation. For each stimulation frequency, a template Y is constructed and corresponding correlation calculated. The frequency that results in the highest correlation between x and y is selected as the decoder output.
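The template in Eq. (7) can be built directly from its definition. The sketch below constructs only the reference matrix, using the 512 Hz sampling rate and 5 s stimulation window from the text as defaults; the function name `cca_template` is hypothetical, and the canonical correlation itself (Eq. 6) is not implemented here because it would normally rely on a linear algebra library.

```python
import math

def cca_template(f_hz, n_harmonics, fs_hz=512, duration_s=5.0):
    """Rows of Eq. (7): sin and cos at f, 2f, ..., N_h * f, sampled at fs_hz."""
    n_samples = int(round(fs_hz * duration_s))
    rows = []
    for h in range(1, n_harmonics + 1):
        omega = 2 * math.pi * h * f_hz  # angular frequency of the h-th harmonic
        rows.append([math.sin(omega * n / fs_hz) for n in range(n_samples)])
        rows.append([math.cos(omega * n / fs_hz) for n in range(n_samples)])
    return rows

# Template for an 11 Hz target with two harmonics: 4 rows of 2560 samples.
template = cca_template(11, n_harmonics=2)
```

A decoder would build one such template per candidate frequency and select the frequency whose template yields the highest canonical correlation with the EEG.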

In this work, two CCA configurations were used with N_h = 1 (decoder 1) and N_h = 2 (decoder 2).

MFCCA

Multi-Frequency Canonical Correlation Analysis (MFCCA)²⁰ extends CCA to include the interactions between input frequencies into the template formulation. The templates Y in MFCCA are constructed with the sine and cosine signals at the stimulation frequencies and the integer linear combinations of the stimulation frequencies. Instead of bounding the size of Y with N_h as in CCA, it is bounded in MFCCA by order N_O, defined as the sum of absolute values of the coefficients in the combination. For example, in dual-frequency SSVEP with stimulation frequencies f₁ and f₂, the linear integer combination of the two frequencies ${c}_{1}{f}_{1}+{c}_{2}{f}_{2}$, ${c}_{1},{c}_{2}\in {\mathbb{Z}}$ has order ${N}_{{\rm{O}}}=| {c}_{1}| +| {c}_{2}| $.

An example of the template formulation with two input frequencies up to order 2 is

$${{\bf{Y}}}_{\mathrm{MFCCA},{N}_{{\rm{O}}}=2}(t)=\left[\begin{array}{c}\sin (2\pi {f}_{1}t)\\ \cos (2\pi {f}_{1}t)\\ \sin (2\pi {f}_{2}t)\\ \cos (2\pi {f}_{2}t)\\ \sin (2\pi (2{f}_{1})t)\\ \cos (2\pi (2{f}_{1})t)\\ \sin (2\pi (2{f}_{2})t)\\ \cos (2\pi (2{f}_{2})t)\\ \sin (2\pi ({f}_{1}+{f}_{2})t)\\ \cos (2\pi ({f}_{1}+{f}_{2})t)\\ \sin (2\pi |{f}_{1}-{f}_{2}|t)\\ \cos (2\pi |{f}_{1}-{f}_{2}|t)\end{array}\right].$$

(8)
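The frequencies that enter an MFCCA template follow from the order definition above. The sketch below enumerates all integer linear combinations up to a given order and keeps the lowest order at which each positive frequency appears; the function name and the choice to take absolute values and discard zero are assumptions made for illustration, chosen so that the result matches the rows of Eq. (8).

```python
from itertools import product

def combination_frequencies(freqs, max_order):
    """Map each positive frequency |sum(c_i * f_i)| to its lowest order sum(|c_i|)."""
    found = {}
    coefficient_range = range(-max_order, max_order + 1)
    for coeffs in product(coefficient_range, repeat=len(freqs)):
        order = sum(abs(c) for c in coeffs)
        if order == 0 or order > max_order:
            continue
        value = abs(sum(c * f for c, f in zip(coeffs, freqs)))
        if value == 0:
            continue  # a 0 Hz component carries no oscillation
        if value not in found or order < found[value]:
            found[value] = order
    return found

orders = combination_frequencies((7, 11), max_order=2)
print(sorted(orders))  # prints: [4, 7, 11, 14, 18, 22]
```

For 7 and 11 Hz this yields six frequencies, which correspond to the twelve sine and cosine rows in Eq. (8).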

The two configurations selected for MFCCA in this work are N_O = 1 and N_O = 2 (the best performing settings as identified by Mu and colleagues³²).

LDE

The Linear Diophantine Equation (LDE) decoder³² is capable of decoding both single-frequency and multi-frequency SSVEP. In LDE, the top N_p frequency peaks in the recorded SSVEP are first identified, then the coefficient(s) of the identified peak frequency in relation to the input stimulation frequency/frequencies are calculated through solving the formulated LDE. The frequency/frequency pair that has the highest number of integer solutions in solving the LDEs and lowest sum of orders is regarded as the decoder output.

The two LDE configurations for both single-frequency and multi-frequency decoding are selected as N_p = 9, N_O = 4 (the best performing setting identified previously³²) and N_p = 12, N_O = 2 (decoders 3 and 4, respectively).

Data Records

The dataset³⁰ can be accessed on Figshare from https://doi.org/10.26188/22015694.

The dataset includes raw EEG data collected from 35 participants accompanied by metadata containing non-identifiable details of the participants. Data are available in both .mat and .csv formats. Matlab scripts are provided to assist users in preparing the data (.mat) in a more accessible form.

All nine sessions of EEG data from all participants are included in the dataset. Sessions are in data files named “P##_Ses#” where ## is a two-digit index for the participant, e.g. “P01”, and the last # is the session number (1–9). All data in .csv format are included in data_in_csv.zip.

EEG Data

EEG data were recorded in Simulink and Matlab 2015a (MathWorks Inc., USA). In all recordings, the data has 10 rows. The first row records timestamps, rows 2 to 7 are the six EEG channels (PO3, POz, PO4, O1, Oz, O2, respectively), row 8 contains triggers, row 9 is processed from row 8 to show stimulation periods, and row 10 has decoder outputs in the online experiment.

The trigger signal in row 8 labels both onsets and offsets of the visual stimulation, where a positive integer labels the onset of each stimulation period and −1 labels the end of stimulation in each trial. The value of each onset trigger is the frequency index in each trial: 1–6 in single-frequency, 1–15 in dual-frequency, and 1–20 in tri-frequency. The decoder output in row 10 is an 8-digit integer (it sometimes appears as a 7-digit integer because the leading ‘0’ is omitted) where the first two digits are the index (01–20) of the target decoded by decoder 1; the third and fourth digits are the index of the target decoded by decoder 2, etc.
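The packed decoder output described above can be unpacked by restoring the leading zero and reading two digits per decoder. This is a minimal sketch with a hypothetical function name; it assumes the value is a non-negative number with at most eight digits.

```python
def split_decoder_output(value, n_decoders=4):
    """Split a packed decoder output into one two-digit target index per decoder."""
    digits = f"{int(value):0{2 * n_decoders}d}"  # restore an omitted leading zero
    return [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]

print(split_decoder_output(1020304))   # prints: [1, 2, 3, 4]
print(split_decoder_output(12051520))  # prints: [12, 5, 15, 20]
```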

The script dataset_processData.m splits the session data (.mat) into trials based on the trigger information. The extracted data are stored in a separate folder, with one folder per participant. Trials are named “P##_T#_R#_#”, where T# is the test name (e.g. T21), R# is the repetition number of the test (R1–R4), and the last # is the trial number/frequency index, which is equal to the trigger value.
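For readers working outside Matlab, trial boundaries can be recovered from the trigger row along the following lines. This is a sketch under stated assumptions, not the logic of dataset_processData.m: it assumes the trigger row is zero (or non-positive) outside events, and the function name `split_trials` is hypothetical.

```python
def split_trials(trigger):
    """Return (target_index, onset_sample, offset_sample) for each trial.

    A positive value marks a stimulation onset and -1 marks its end.
    Values held over several samples are treated as a single event.
    """
    trials = []
    onset = None
    target = None
    for i, value in enumerate(trigger):
        if value > 0 and onset is None:
            onset, target = i, int(value)
        elif value == -1 and onset is not None:
            trials.append((target, onset, i))
            onset = None
    return trials

# Toy trigger row: target 3 from sample 2 to 6, target 5 from sample 8 to 10.
example = [0, 0, 3, 0, 0, 0, -1, 0, 5, 0, -1]
print(split_trials(example))  # prints: [(3, 2, 6), (5, 8, 10)]
```

The EEG for a trial is then the slice of rows 2 to 7 between the onset and offset samples.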

Metadata

Metadata dataset_metadata.xlsx includes non-identifiable participant information: sex and gender, age, dominant hand, and whether they have previous experience with EEG-based BCI and SSVEP-based BCI.

Technical Validation

The data quality was validated through inspection of both time domain and frequency domain signal profiles, order distribution in multi-frequency tests, signal-to-noise ratios, and decoding accuracies.

Signal profile in time and frequency domains

Figure 5 shows examples of the averaged time domain waveforms of the recorded SSVEPs overlaid on the waveforms of the stimulation signals. Here, 11 Hz in the single-frequency test T1 (Fig. 5a), 7 and 11 Hz in dual-frequency tests T21, T22, and T23 (Fig. 5b–d), and 7, 11, and 17 Hz in tri-frequency tests T31 and T32 (Fig. 5e,f) are shown as examples. The averaged waveforms were obtained by averaging the SSVEP from all participants in all four repeats, then band-pass filtering between 5 and 45 Hz, and finally cutting the 5 s data into five 1 s epochs with no overlap and averaging across all epochs.
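The final step of the averaging procedure, cutting a recording into non-overlapping epochs and averaging them, can be sketched as follows. The function name is hypothetical, the band-pass filtering and cross-participant averaging are omitted, and any incomplete trailing epoch is dropped by assumption.

```python
def average_epochs(signal, epoch_len):
    """Average non-overlapping epochs of epoch_len samples, sample by sample."""
    n_epochs = len(signal) // epoch_len
    if n_epochs == 0:
        raise ValueError("signal is shorter than one epoch")
    return [sum(signal[e * epoch_len + i] for e in range(n_epochs)) / n_epochs
            for i in range(epoch_len)]

# With 512 Hz data, a 5 s trial (2560 samples) gives five 512-sample epochs.
trial = [float(n % 512) for n in range(2560)]
mean_epoch = average_epochs(trial, epoch_len=512)
```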

In single-frequency SSVEP, the waveform matches the stimulation signal very well, with harmonics visible. Dual-frequency and tri-frequency SSVEP waveforms still follow the corresponding stimulation signals in general, but the patterns are less prominent due to the additional stimulation frequencies and the complex interactions between them. It can also be observed from the plots that different stimulation methods trigger different SSVEP responses. The responses from frequency superposition ADD and the checkerboard pattern seem to follow the stimulation signal more closely than the responses from frequency superposition OR.

Figure 6 shows the frequency domain components in the recorded SSVEPs. It is expected that clear peaks should be observed at the stimulation frequencies as well as their harmonics in single-frequency SSVEP⁴, and harmonics and integer linear combinations between the stimulation frequencies in multi-frequency SSVEP¹⁵. We can see by the markers labelling input frequencies and their harmonics and interactions that the expected frequency domain features are clearly visible for all stimulation types.

The frequency domain characteristics are further shown as estimated power spectral density (PSD) in Fig. 7. The PSDs were calculated using the short-time Fourier transform with the 5 s SSVEP recordings averaged across all participants in all repetitions. Each subplot in Fig. 7 shows the PSDs of each trial, or target, in each test. The stimulation frequencies of each trial/target in each test can be found in Table 1. Apart from the peaks at the stimulation frequencies and their harmonics and interactions, the figure shows that the power distribution across the spectrum is relatively consistent, with slightly higher power in the alpha and low-beta ranges as the number of stimulation frequencies increases. This is partially due to the more complex frequency characteristics in multi-frequency SSVEPs, where the integer linear combinations of the input frequencies can also be found in the recorded SSVEP.

From the above observations, we can conclude that the recorded EEG have the expected SSVEP responses.

Order profile

One important feature in multi-frequency SSVEP is the order of the interactions, which is defined as the sum of absolute values of the coefficients of interactions²⁰. Fig. 8 shows the distribution of orders in the top 10 peaks in each trial in the five multi-frequency tests. Plots on the left hand side show the number of times each order was observed in the top 10 peaks (left/blue axis) and the total number of possible combinations (right/red axis). The plots on the right then show the percentage of the combinations observed out of all possibilities at each order using the information in the left hand side plots. The percentage of occurrence on the right agrees with the previous observations that harmonics and interactions at lower order have higher chance of being observed in the top peaks²⁰.

The order distributions across the different stimulation methods show only slight variations when the number of stimulation frequencies is the same. However, the distributions in dual-frequency and tri-frequency show a clear difference. The decrease in the observed higher order peaks in tri-frequency may be attributed to the large number of overlapped frequencies in the harmonics and interactions, as can be seen in Fig. 6, with those peaks being identified and labelled with a lower order and so excluded from the higher order bins.

Signal-to-Noise Ratio (SNR)

Narrow-band and wide-band SNRs²³ were calculated to further demonstrate signal quality. The narrow-band SNR is the ratio between the power at the stimulation frequencies and the sum of powers in the ten neighbours of the stimulation frequencies on the spectrum (five on each side). Wide-band SNR considers the whole spectrum by taking the ratio between the sum of powers at the stimulation frequencies along with their harmonics as well as interactions (in multi-frequency) and the sum of powers of the rest of the frequencies in the spectrum. The mathematical formulations of the SNRs (in dB) in n (narrow-band) or w (wide-band) and SF (single-frequency) or MF (multi-frequency) scenarios are provided below.

Single-frequency narrow-band SNR:

$${{\rm{SNR}}}_{{\rm{n,SF}}}=1{0\log }_{10}\frac{P(F)}{{\sum }_{k=1}^{5}\left[P\left(F-k\Delta f\right)+P\left(F+k\Delta f\right)\right]},$$

(9)

where F is the frequency of interest (stimulation frequency), P is the Power Spectral Density (PSD) of the signal, and Δf is the frequency resolution of the PSD.

Single-frequency wide-band SNR:

$${{\rm{SNR}}}_{{\rm{w,SF}}}=1{0\log }_{10}\frac{{\sum }_{k=1}^{{N}_{h}}P(kF)}{{\sum }_{f=0}^{{f}_{s}/2}P(f)-{\sum }_{k=1}^{{N}_{h}}P(kF)},$$

(10)

where N_h is the number of harmonics to be considered, f_s is the sampling frequency, and f_s/2 denotes the Nyquist frequency.

Multi-frequency narrow-band SNR:

$${{\rm{SNR}}}_{{\rm{n,MF}}}=1{0\log }_{10}\frac{{\sum }_{i=1}^{{N}_{f}}P({F}_{i})}{{\sum }_{i=1}^{{N}_{f}}{\sum }_{k=1}^{5}\left[P({F}_{i}-k\Delta f)+P({F}_{i}+k\Delta f)\right]},$$

(11)

where N_f is the number of frequencies in the stimulation (dual-frequency N_f = 2, tri-frequency N_f = 3) and F_i then denotes the i^th stimulation frequency.

Multi-frequency wide-band SNR:

$${{\rm{SNR}}}_{{\rm{w,MF}}}=1{0\log }_{10}\frac{{\sum }_{i=1}^{n}P({\mathscr{F}}(i))}{{\sum }_{f=0}^{{f}_{s}/2}P(f)-{\sum }_{i=1}^{n}P({\mathscr{F}}(i))},$$

(12)

where ${\mathscr{F}}$ is the set of frequencies including the stimulation frequencies, their harmonics and integer linear combinations up to order N_O, ${{\mathscr{F}}}_{{N}_{{\rm{O}}}}=\{{f}_{1},{f}_{2},\cdots \,,{f}_{n}\}$.
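The four SNR definitions in Eqs. (9) to (12) reduce to two computations on a PSD. The sketch below assumes the PSD is supplied as a list whose first bin is 0 Hz and whose bins are spaced by `df`; the function names are hypothetical, and no guard is included for frequencies within five bins of either end of the spectrum.

```python
import math

def narrow_band_snr(psd, df, stim_freqs, n_neighbours=5):
    """Eqs. (9) and (11): power at the stimulation frequencies over neighbour power."""
    signal = 0.0
    noise = 0.0
    for f in stim_freqs:
        centre = round(f / df)  # PSD bin closest to the stimulation frequency
        signal += psd[centre]
        for k in range(1, n_neighbours + 1):
            noise += psd[centre - k] + psd[centre + k]
    return 10.0 * math.log10(signal / noise)

def wide_band_snr(psd, df, signal_freqs):
    """Eqs. (10) and (12): power at all signal frequencies over the remaining power."""
    bins = {round(f / df) for f in signal_freqs}
    signal = sum(psd[b] for b in bins)
    noise = sum(psd) - signal
    return 10.0 * math.log10(signal / noise)

# Toy PSD from 0 to 256 Hz at 0.2 Hz resolution with a single peak at 11 Hz.
toy_psd = [1.0] * 1281
toy_psd[55] = 100.0
print(narrow_band_snr(toy_psd, 0.2, [11]))  # 10 dB for this toy spectrum
```

Passing one frequency gives the single-frequency forms; passing two or three stimulation frequencies, or the full set of harmonics and combinations, gives the multi-frequency forms.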

Before calculating SNRs, signals were filtered with a second-order Infinite Impulse Response (IIR) notch filter at 100 Hz with quality factor 35 to remove the harmonic of power line noise, and were averaged across all channels. All trials were considered in producing the SNR histograms.

We first compared the SNRs in T1 (single-frequency) with existing SSVEP datasets. To make the comparison as fair as possible, all trials in T1 were band-pass filtered under the same conditions (between 3 and 100 Hz, using the Matlab function “bandpass” with ‘ImpulseResponse’ set to ‘iir’, 0.85 ‘Steepness’, and 60 dB ‘StopbandAttenuation’), 5 s of data were used in calculating the PSD to obtain a consistent 0.2 Hz frequency resolution, and the number of harmonics was set to N_h = 5. Figure 9 directly compares the narrow-band and wide-band SNRs in T1 and two publicly available SSVEP datasets: the BETA dataset²³ and the Benchmark dataset²¹. It can be seen from the figures that the wide-band SNR in this study is similar to that in the BETA dataset, but the narrow-band SNR is around 10 dB lower than that in the two datasets. With the differences in setup taken into consideration, the results demonstrate a satisfactory quality of the signals recorded in this dataset. The studies differ in three respects. First, in this work, a wider and, on average, higher frequency range was used compared to the other two studies, which may lower the SNR because SNR decreases as frequency increases²³. Second, dry EEG electrodes were used in data collection compared to wet electrodes used in the two existing datasets. Dry electrodes are known to be more sensitive to artefacts and lead to lower decoding accuracy²⁴; however, they are more practical because of the simplified setup procedure (no gelling). Third, different recording devices, sampling rates, and channel selections were used in the studies.

To examine the SNRs in multi-frequency SSVEPs, we compared the SNRs in single-frequency SSVEPs with those in dual- and tri-frequency SSVEPs. Different from above, we applied a band-pass filter between 5 and 120 Hz in Matlab to cover the 5^th harmonic of 23 Hz. Considering that complex interactions in multi-frequency SSVEP may result in adjacent integer frequencies both being considered as signal, all trials were zero-padded to 10 s for a 0.1 Hz frequency resolution. This guarantees that the 10 neighbours in the narrow-band SNR calculation do not land on the signal frequencies. Figure 10 presents the distributions of narrow-band and wide-band SNRs in all single-, dual-, and tri-frequency trials. The figures show a similar narrow-band SNR distribution in all cases, with a small reduction in negative skewness as the number of stimulation frequencies increases. In wide-band SNR, however, a clear reduction in variance with a positive shift in mean can be observed as the number of stimulation frequencies increases. Overall, the SNRs fall within a reasonable range and demonstrate the quality of the signals in this dataset.
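The reason for zero-padding can be checked with a short calculation: the reach of the five neighbouring bins on each side depends on the frequency resolution. The sketch below uses the 512 Hz sampling rate from the text; the function names are illustrative.

```python
FS_HZ = 512  # sampling rate stated in the text

def frequency_resolution(n_samples, fs_hz=FS_HZ):
    """Spacing between PSD bins for a transform over n_samples points."""
    return fs_hz / n_samples

def neighbour_reach(n_samples, n_each_side=5, fs_hz=FS_HZ):
    """Distance in Hz from the centre bin to the farthest neighbour bin."""
    return n_each_side * frequency_resolution(n_samples, fs_hz)

raw_reach = neighbour_reach(5 * FS_HZ)      # 5 s of data, 0.2 Hz bins
padded_reach = neighbour_reach(10 * FS_HZ)  # zero-padded to 10 s, 0.1 Hz bins
print(raw_reach, padded_reach)  # prints: 1.0 0.5
```

With 0.2 Hz bins the outermost neighbour sits 1 Hz from the centre and can coincide with an adjacent integer signal frequency, whereas with 0.1 Hz bins all ten neighbours stay within 0.5 Hz of the centre.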

Decoding accuracy

In addition to the analyses of the signal characteristics, decoding accuracies were also investigated. Figure 11 summarises the decoding accuracies from all participants in all tests. Each participant is labelled with a different colour. Boxes show 25–75 percentiles, whiskers show maximum and minimum values excluding outliers, red plus signs mark outliers that are more than 1.5 times the interquartile range (box size) away from the boxes, solid magenta lines label median values, and cyan dashed lines label mean values. Asterisks label statistical significance between the two groups at 5% level (p < 0.05). Comparisons were done only between different stimulation methods with the same number of input frequencies (i.e., 2 F OR vs. 2 F ADD, 2 F OR vs. 2 F CB, 2 F ADD vs. 2 F CB, 3 F OR vs. 3 F ADD) with the Wilcoxon signed rank test.

Detailed accuracies from each participant in each test are listed in Table 3. Each listed accuracy is the participant's average over the four repeats of that test, calculated as the number of trials correctly identified in online decoding divided by the total number of trials.
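As a minimal sketch of this calculation, the following Python snippet pools correct and total trial counts over the four repeats and then computes a column mean and standard error of the mean (SEM) of the kind reported in Table 3. All counts and accuracies are hypothetical, the number of trials per repeat is an assumption, and the SEM uses the sample standard deviation (n − 1 in the denominator), which is also an assumption.

```python
import math
import statistics


def participant_accuracy(correct_per_repeat, trials_per_repeat):
    """Correctly identified trials divided by total trials, pooled over repeats."""
    return sum(correct_per_repeat) / sum(trials_per_repeat)


def mean_and_sem(accuracies):
    """Mean and standard error of the mean across participants."""
    mean = statistics.mean(accuracies)
    sem = statistics.stdev(accuracies) / math.sqrt(len(accuracies))
    return mean, sem


# Hypothetical counts for one participant in one test (four repeats).
accuracy = participant_accuracy([10, 9, 11, 10], [12, 12, 12, 12])

# Hypothetical accuracies of three participants in the same test.
column_mean, column_sem = mean_and_sem([0.5, 0.75, 1.0])
```

When every repeat contains the same number of trials, the pooled ratio equals the average of the four per-repeat accuracies.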

Table 3 Accuracies of each participant in each test, with the mean and standard error of the mean (SEM) shown at the bottom of each column.

Fig. 11 and Table 3 show that the average accuracy decreases as the number of input frequencies increases. Overall, 85.7% (30/35) of participants achieved single-frequency accuracy over 60%, 48.6% (17/35) over 80%, 28.6% (10/35) over 90%, and 20% (5/35) over 95%. This is comparable to previous participant performances with dry electrodes²⁴.

It is worth noting that the tasks are of similar difficulty for the participants, since each one involves visually fixating on flickering blocks presented on the computer screen. A major contributor to the differences in decoding accuracy is therefore how accurately the decoding algorithms model the multi-frequency SSVEP. This is also one of the reasons why we created this dataset. By making it public, we invite others to join this research field, uncover the fundamentals of this complex response, and improve decoding performance.

For future research, training-based decoding algorithms could also be explored to advance multi-frequency SSVEP, especially those that were shown to work well in single-frequency SSVEP decoding such as Task-Related Component Analysis (TRCA)³³ and Task-Discriminant Component Analysis (TDCA)³⁴.

Usage Notes

The most straightforward way to use the data is to load the .mat files in Matlab (MathWorks Inc., USA). When working with the data in .csv format, please keep in mind that these files contain large matrices, so the file-reading step needs to be handled with care.

The code dataset_processData.m provided with the dataset cuts data (.mat) into trials for easier access to the SSVEP recordings. Options are provided in the code to select participant(s) and session(s). A pdf version of the code is also included (dataset_processData.pdf).

Data may be used in part or in full session form to simulate an online BCI at 512 Hz sampling rate.
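One way to simulate online operation is to replay a recorded session in fixed-size blocks, as in the Python sketch below. The array layout (samples × channels), the channel count, and the block length are assumptions made for illustration; `session` is a zero-filled stand-in, not data loaded from the repository.

```python
import numpy as np

FS = 512  # sampling rate in Hz, as stated above


def stream_blocks(session, block_seconds=0.5, fs=FS):
    """Yield consecutive blocks of a (n_samples, n_channels) array in recorded order.

    A trailing remainder shorter than one block is dropped.
    """
    block = int(round(block_seconds * fs))
    for start in range(0, session.shape[0] - block + 1, block):
        yield session[start:start + block]


# Stand-in for a loaded session: 3 s of data on 4 hypothetical channels.
session = np.zeros((FS * 3, 4))
blocks = list(stream_blocks(session))
```

A decoder under test can then consume the blocks one at a time, which mimics the arrival of data from an amplifier running at 512 Hz.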

The metadata file dataset_metadata.xlsx is also included and provides general non-identifiable participant information.

Code availability

The provided code, dataset_processData.m, can be found in the same repository as the dataset³⁰. It was written and tested in Matlab R2020a, and no additional toolbox is required to run it. Options at the top of the code set the folder name and path (variable folderName) and select the participants and sessions of interest (variables Participants and Sessions) for processing, that is, for cutting whole-session data into trials. A pdf version of the code is also included (dataset_processData.pdf).

References

1. Wolpaw, J. R., Birbaumer, N., McFarland, D. J., Pfurtscheller, G. & Vaughan, T. M. Brain-computer interfaces for communication and control. Clinical Neurophysiology 113, 767–791 (2002).
2. Zabcikova, M., Koudelkova, Z., Jasek, R. & Lorenzo Navarro, J. J. Recent advances and current trends in brain-computer interface research and their applications. International Journal of Developmental Neuroscience 82, 107–123 (2022).
3. Nicolas-Alonso, L. F. & Gomez-Gil, J. Brain computer interfaces, a review. Sensors 12, 1211–1279 (2012).
4. Regan, D. Human Brain Electrophysiology: Evoked Potentials and Evoked Magnetic Fields in Science and Medicine (Elsevier, 1989).
5. Zander, T. O., Kothe, C., Welke, S. & Rötting, M. Utilizing secondary input from passive brain-computer interfaces for enhancing human-machine interaction. In International Conference on Foundations of Augmented Cognition, 759–771 (Springer, 2009).
6. Herrmann, C. S. Human EEG responses to 1–100 Hz flicker: resonance phenomena in visual cortex and their potential correlation to cognitive phenomena. Experimental Brain Research 137, 346–353 (2001).
7. Müller-Putz, G. R., Scherer, R., Brauneis, C. & Pfurtscheller, G. Steady-state visual evoked potential (SSVEP)-based communication: impact of harmonic frequency components. Journal of Neural Engineering 2, 123 (2005).
8. Volosyak, I., Cecotti, H. & Graser, A. Optimal visual stimuli on LCD screens for SSVEP based brain-computer interfaces. In 2009 4th International IEEE/EMBS Conference on Neural Engineering, 447–450 (IEEE, 2009).
9. Beverina, F., Palmas, G., Silvoni, S., Piccione, F. & Giove, S. User adaptive BCIs: SSVEP and P300 based interfaces. PsychNology Journal 1, 331–354 (2003).
10. Kuś, R. et al. On the quantification of SSVEP frequency responses in human EEG in realistic BCI conditions. PLoS One 8, e77536 (2013).
11. Shyu, K.-K., Lee, P.-L., Liu, Y.-J. & Sie, J.-J. Dual-frequency steady-state visual evoked potential for brain computer interface. Neuroscience Letters 483, 28–31, https://www.sciencedirect.com/science/article/pii/S0304394010009547 (2010).
12. Hwang, H.-J., Kim, D. H., Han, C.-H. & Im, C.-H. A new dual-frequency stimulation method to increase the number of visual stimuli for multi-class SSVEP-based brain–computer interface (BCI). Brain Research 1515, 66–77, https://www.sciencedirect.com/science/article/pii/S0006899313004903 (2013).
13. Chen, X., Chen, Z., Gao, S. & Gao, X. Brain–computer interface based on intermodulation frequency. Journal of Neural Engineering 10, 066009, https://doi.org/10.1088/1741-2560/10/6/066009 (2013).
14. Chang, M. H., Baek, H. J., Lee, S. M. & Park, K. S. An amplitude-modulated visual stimulation for reducing eye fatigue in SSVEP-based brain–computer interfaces. Clinical Neurophysiology 125, 1380–1391, https://www.sciencedirect.com/science/article/pii/S1388245713012005 (2014).
15. Mu, J., Grayden, D. B., Tan, Y. & Oetomo, D. Frequency superposition – a multi-frequency stimulation method in SSVEP-based BCIs. In 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), 5924–5927 (IEEE, 2021).
16. Siribunyaphat, N. & Punsawad, Y. Steady-state visual evoked potential-based brain–computer interface using a novel visual stimulus with quick response (QR) code pattern. Sensors 22, 1439 (2022).
17. Mu, J., Grayden, D. B., Tan, Y. & Oetomo, D. Experimental validation on dual-frequency outperforms single-frequency ssvep with large numbers of targets within a given frequency range. In 2023 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC) (IEEE, 2023).
18. Liang, L. et al. Optimizing a dual-frequency and phase modulation method for SSVEP-based BCIs. Journal of Neural Engineering 17, 046026, https://doi.org/10.1088/1741-2552/abaa9b (2020).
19. Mu, J., Grayden, D. B., Tan, Y. & Oetomo, D. Frequency set selection for multi-frequency steady-state visual evoked potential-based brain-computer interfaces. Frontiers in Neuroscience 16 (2022).
20. Mu, J., Tan, Y., Grayden, D. B. & Oetomo, D. Multi-frequency canonical correlation analysis (MFCCA): a generalised decoding algorithm for multi-frequency SSVEP. In 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), 6151–6154 (IEEE, 2021).
21. Wang, Y., Chen, X., Gao, X. & Gao, S. A benchmark dataset for SSVEP-based brain–computer interfaces. IEEE Transactions on Neural Systems and Rehabilitation Engineering 25, 1746–1752 (2017).
22. Choi, G.-Y., Han, C.-H., Jung, Y.-J. & Hwang, H.-J. A multi-day and multi-band dataset for a steady-state visual-evoked potential–based brain-computer interface. GigaScience 8, giz133 (2019).
23. Liu, B., Huang, X., Wang, Y., Chen, X. & Gao, X. BETA: A large benchmark database toward SSVEP-BCI application. Frontiers in Neuroscience 14, 627 (2020).
24. Zhu, F., Jiang, L., Dong, G., Gao, X. & Wang, Y. An open dataset for wearable SSVEP-based brain-computer interfaces. Sensors 21, 1256 (2021).
25. Liu, B., Wang, Y., Gao, X. & Chen, X. eldBETA: a large eldercare-oriented benchmark database of SSVEP-BCI for the aging population. Scientific Data 9, 1–12 (2022).
26. Renton, A. I., Painter, D. R. & Mattingley, J. B. Optimising the classification of feature-based attention in frequency-tagged electroencephalography data. Scientific Data 9, 1–17 (2022).
27. Lee, M.-H. et al. EEG dataset and OpenBMI toolbox for three BCI paradigms: an investigation into BCI illiteracy. GigaScience 8, giz002 (2019).
28. Lee, Y.-E., Shin, G.-H., Lee, M. & Lee, S.-W. Mobile BCI dataset of scalp- and ear-EEGs with ERP and SSVEP paradigms while standing, walking, and running. Scientific Data 8, 1–12 (2021).
29. Sadeghi, S. & Maleki, A. A comprehensive benchmark dataset for SSVEP-based hybrid BCI. Expert Systems with Applications 200, 117180 (2022).
30. Mu, J., Liu, S., Burkitt, A. N. & Grayden, D. B. Multi-frequency steady-state visual evoked potential dataset. figshare https://doi.org/10.26188/22015694 (2023).
31. Lin, Z., Zhang, C., Wu, W. & Gao, X. Frequency recognition based on canonical correlation analysis for SSVEP-based BCIs. IEEE Transactions on Biomedical Engineering 53, 2610–2614 (2006).
32. Mu, J., Tan, Y., Grayden, D. B. & Oetomo, D. Linear diophantine equation (LDE) decoder: a training-free decoding algorithm for multifrequency SSVEP with reduced computation cost. Asian Journal of Control 25, 3292–3304 (2023).
33. Nakanishi, M. et al. Enhancing detection of SSVEPs for a high-speed brain speller using task-related component analysis. IEEE Transactions on Biomedical Engineering 65, 104–112 (2017).
34. Liu, B. et al. Improving the performance of individually calibrated SSVEP-BCI by task-discriminant component analysis. IEEE Transactions on Neural Systems and Rehabilitation Engineering 29, 1998–2007 (2021).

Acknowledgements

This research was funded by the Australian Government through the Australian Research Council Training Centre in Cognitive Computing for Medical Technologies (project number IC170100030).

Author information

Authors and Affiliations

Department of Biomedical Engineering, The University of Melbourne, Parkville, Victoria, 3010, Australia
Jing Mu, Shuo Liu, Anthony N. Burkitt & David B. Grayden
Graeme Clark Institute, The University of Melbourne, Parkville, Victoria, 3010, Australia
Jing Mu & David B. Grayden

Contributions

J.M., A.N.B. and D.B.G. designed the study. J.M. and S.L. conducted the experiment. J.M., A.N.B. and D.B.G. analysed the results. J.M. drafted the manuscript. S.L., A.N.B. and D.B.G. reviewed the manuscript.

Corresponding author

Correspondence to Jing Mu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Mu, J., Liu, S., Burkitt, A.N. et al. Multi-frequency steady-state visual evoked potential dataset. Sci Data 11, 26 (2024). https://doi.org/10.1038/s41597-023-02841-5

Received: 06 April 2023
Accepted: 11 December 2023
Published: 04 January 2024
DOI: https://doi.org/10.1038/s41597-023-02841-5
