Introduction

Many disciplines in the natural sciences are increasingly dealing with densely sampled multidimensional datasets. The scientific workflows to obtain and process them are becoming increasingly complex due to the provenance and structure of the data and the information needed to be extracted and analyzed1,2. In materials science and condensed matter physics, various spectroscopic and structural characterization techniques produce experimental data of distinct formats and characteristics. Their creation and understanding require customized processing and analysis pipelines designed by specialists in the respective fields. The growing incentive for building experimental materials science databases3 that complement established theoretical counterparts4 calls for open-source and reusable workflows for data processing5,6 that transform raw data to shareable formats for downstream query, analysis and comparison by non-specialists of the experimental techniques7,8. Among the various properties associated with materials, the electronic band structure (EBS) of condensed matter systems is of vital importance to the understanding of their electronic properties in and out of equilibrium. Multidimensional photoemission spectroscopy (MPES)9,10,11 is an emerging technique that bears the potential of high-throughput EBS characterization through band mapping experiments and holds promise as an enabling technology for building experimental EBS databases, where data integration requires traceable knowledge of the processing steps between the archived and the raw format. Here we present an open-source workflow that focuses on band mapping data from MPES. In the following, we briefly introduce the technology of MPES and the associated data processing, before providing details on the workflow from raw data to database integration.

MPES, also called momentum microscopy (MM), is born out of the recent integration of time-of-flight (TOF) electron spectrometers with delay-line detectors (DLDs) and improved electron-optic lens designs12,13,14,15. Compared with the earlier generations of angle-resolved photoemission spectroscopy (ARPES)16,17 using hemispherical analyzers to measure the 2D energy-momentum distribution of the photoemitted electrons18, MPES is capable of recording single-electron events simultaneously sorted into the (kx, ky, E) coordinates (E: electron energy, kx, ky: parallel momentum components) in band mapping experiments, obviating the need for scanning across sample orientations and subsequent data merging as is the case for similar experiments using a single hemispherical analyzer. Operation of the TOF DLD in MPES requires a pulsed photon source and is directly compatible with 3rd and 4th generation light sources19 as well as laboratory-based table-top setups20,21,22,23, harnessing their high repetition rates in the range of multi-kilohertz to megahertz to drastically improve the detection speed and efficiency. Mapping of the 3D band structure with sufficient signal-to-noise ratio (SNR) may be achieved on the timescale of minutes. The technological convergence opens up the possibilities to record 3D datasets in dependence of one or more additional parameters, such as spatial location I(x, y, kx, ky, E), probe photon energy, I(kx, ky, E, kz)10, spin-polarization, I(kx, ky, E, S)9, or pump-probe time in time-resolved MM, I(kx, ky, E, t)24 within a reasonable time frame.

From the data perspective, the pulsed sources with high repetition rates generate densely sampled data at rates of multiple megabytes per second (MB/s), which has brought about challenges in data processing and management compared with conventional ARPES experiments. The raw data in MPES are single photoelectron events registered by the DLD and the physical quantities related to the detected events are streamed in parallel to the storage files in a hierarchical file format (e.g. HDF525). A typical dataset involves \(10^{7}\)–\(10^{10}\) detected events with a total size of up to a few hundred gigabytes (GBs), depending on the number of coordinates measured (3D or 4D) and the required SNR. Unlike the large 2D or 3D image-based datasets, such as those obtained in various forms of optical26,27 and electron microscopy techniques28,29, processing and conversion of tabulated single-event data requires additional steps of statistical computing for conversion into standard images. This motivates the current workflow development for efficient data processing and analysis. In data processing and calibration, experiments performed at different facilities share similar procedures going from the raw events to the multidimensional hypervolume with calibrated axes, which is the basis for archiving and downstream analysis. To maintain reproducibility for the particular data source characteristics, data structure and processing procedure, we have summarized the workflow (see Fig. 1) into two open-source software packages (hextof-processor30 and mpes31), with similar design principles for coping with large-scale facility and table-top experiments, respectively. The core of our approach includes distributed statistical processing at the single-event level using parameters calibrated and determined from preprocessed volumetric datasets, which enables effective instrument diagnostics, artifact correction, and sample condition monitoring.
The algorithms involved balance physical knowledge and existing methods in image processing and computer vision. The workflow is illustrated next with data obtained at some of the electron momentum microscopes currently in operation, such as the HEXTOF (high energy X-ray time-of-flight) measurement system24 at the free-electron laser source FLASH32 at DESY, and the table-top high harmonic generation-based setup at the Fritz Haber Institute (FHI)21 involving a commercial TOF and DLD (METIS 1000, SPECS GmbH). We use the material example of tungsten diselenide (WSe2) measured at both experimental setups to demonstrate the workflow execution, because in momentum space, the patent features of WSe2 band structure and the nonequilibrium dynamics initiated by optical excitation of WSe2 have been thoroughly studied in the past (see Methods)24,33,34,35,36. We expect the workflow described here to serve as a blueprint for upcoming software platforms in similar setups to be installed in other facilities or laboratories worldwide.

Fig. 1

Schematic of the workflow in MPES. The data acquisition in MPES starts from (1) photoelectrons liberated by the extreme UV (XUV) or X-ray photons travelling through the lens systems and the TOF tube to trigger detection events on the delay-line detector (DLD). (2) Single-event data acquisition is monitored and controlled by the measurement controller computer. The raw data are first streamed and stored onto a hard drive in HDF5 format (.h5) and subsequently processed in the workflow through (3) file reduction (optional), (4) and (6) distributed binning, (5) artifact correction and axis calibrations, carried out at the single-event or the binned data levels. At the end of the workflow, other data formats are generated (such as HDF5, MAT or TIFF) for (7) storage, visualization or downstream analysis for extracting relevant physical parameters. Critical parameters within the workflow may be exported (as workflow parameters files), shared and reused for processing other datasets.

Results

Workflow description

The workflow schematic shown in Fig. 1 starts with raw single-event data from measurements. The data are (i) binned in a distributed fashion in the measurement coordinates, including each of the photoelectrons’ position on the detector (X, Y), its TOF, a digital encoder (ENC) axis, and others, if more than four dimensions are acquired in parallel. The binned histogram is (ii) used to estimate the numerical transforms for distortion correction and axis calibration. Next, these transforms are (iii) applied to the raw single-event data to convert the measurement coordinates to the physical axes, (kx, ky, E, tpp) and others for higher dimensions (see also Fig. 2). Finally, the single-event data are (iv) binned in the transformed grid to yield 3D, 3D + t or other higher-dimensional data with the correct axis values. The outcome may be exported in different formats for storage, visualization and downstream analysis.
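The four steps above can be illustrated with a minimal NumPy sketch on synthetic events. It is not taken from hextof-processor or mpes: the column names, the centroid-based "calibration", the scale `pixel_to_k` and the linear TOF-to-energy map are all placeholders assumed for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic single-event table: detector position (X, Y) and time of flight (TOF).
n_events = 100_000
events = {
    "X": rng.normal(256.0, 60.0, n_events),
    "Y": rng.normal(256.0, 60.0, n_events),
    "TOF": rng.normal(600.0, 20.0, n_events),
}


def bin_events(columns, edges):
    """Histogram single events on a regular grid (used in steps i and iv)."""
    sample = np.column_stack(columns)
    hist, _ = np.histogramdd(sample, bins=edges)
    return hist


# (i) Bin in the measurement coordinates.
raw_edges = [
    np.linspace(0.0, 512.0, 65),
    np.linspace(0.0, 512.0, 65),
    np.linspace(500.0, 700.0, 51),
]
raw_volume = bin_events([events["X"], events["Y"], events["TOF"]], raw_edges)

# (ii) Estimate transform parameters from the binned data.
# The intensity-weighted image center stands in for a real calibration here.
x_centers = 0.5 * (raw_edges[0][:-1] + raw_edges[0][1:])
y_centers = 0.5 * (raw_edges[1][:-1] + raw_edges[1][1:])
image = raw_volume.sum(axis=2)
x0 = (image.sum(axis=1) * x_centers).sum() / image.sum()
y0 = (image.sum(axis=0) * y_centers).sum() / image.sum()
pixel_to_k = 0.01  # hypothetical scale, inverse Angstrom per pixel

# (iii) Apply the transforms to every single event.
kx = pixel_to_k * (events["X"] - x0)
ky = pixel_to_k * (events["Y"] - y0)
energy = -0.05 * (events["TOF"] - 600.0)  # hypothetical linear TOF-to-energy map

# (iv) Bin again on the physical axes.
k_edges = np.linspace(-2.5, 2.5, 65)
e_edges = np.linspace(-5.0, 5.0, 51)
volume = bin_events([kx, ky, energy], [k_edges, k_edges, e_edges])
```

The point of the sketch is the ordering: parameters are estimated on the cheap binned volume, but applied to the events themselves, so that the final histogram is built directly on the calibrated grid.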

Fig. 2

Examples of workflow components. Illustrations are given for artifact correction and axis calibration. Characteristic 1D distributions of the measured X, Y, TOF, ENC and an arbitrary axis are shown on the far left. U(0,1) represents uniformly distributed random noise added to suppress digitization artifacts (jittering or dithering). The transforms (g’s) are calibration functions that convert the values in the measurement axes to the physical ones. The transform \({\mathcal{L}}\) (X,Y) corrects the symmetry distortion, while the spherical timing aberration and space charge are compensated for by ΔTOFsph and ΔTOFsc, respectively. Binning of the corrected single-event data over the calibrated physical axes yields a multidimensional hypervolume (right) of photoemission intensity data along with the physical axes values.

Tasks and software infrastructure

Processing billion-count single-event data requires user interaction for data checking and distributed processing to reduce the processing time. The general tasks in the workflow include the transformation of data streams to multidimensional histograms, artifact correction and axis calibration. These operations can be efficiently decomposed into column-wise operations of the distributed dataframe format offered by the dask package37 in Python. While dask dataframes provide the common foundation for interactivity with single events in hextof-processor and mpes, the two packages differ in the experimental requirements they address.
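The reason binning distributes well is that histograms are additive over partitions of the event stream. The sketch below shows this property with plain NumPy chunks standing in for dataframe partitions; it does not use dask itself, and the function name `bin_partition` is a placeholder rather than part of either package.

```python
import numpy as np

rng = np.random.default_rng(1)

# One synthetic single-event column and a fixed set of bin edges.
tof = rng.uniform(0.0, 100.0, 50_000)
edges = np.linspace(0.0, 100.0, 101)


def bin_partition(partition, edges):
    """Histogram one partition of the event stream."""
    counts, _ = np.histogram(partition, bins=edges)
    return counts


# Split the stream, bin every partition independently, then sum the results.
partitions = np.array_split(tof, 8)
partial_counts = [bin_partition(p, edges) for p in partitions]
combined = np.sum(partial_counts, axis=0)

# Binning the whole column at once gives the same histogram.
reference = bin_partition(tof, edges)
```

Because each partition only needs the shared bin edges, the per-partition step can run on separate workers and only the small count arrays have to be collected.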

At large-scale facilities, experiments often record a large number of machine parameters that need to be stored, though only a small number of relevant parameters are needed for downstream processing. Therefore, the hextof-processor package includes a parameter sampling step to retrieve intermediate tabulated data in the Apache Parquet format (https://parquet.apache.org/), a column-based data structure optimized for computational efficiency. This approach reduces the processing overhead of searching through the raw data files every time data are queried during the subsequent processing. As an open-source project, other beamtime-specific functionalities are added by users to the existing framework at every new experimental run. The mpes package adapts to the much simpler file structure produced at table-top experimental setups and makes direct use of the HDF5 raw data. It comes with added functionalities motivated by the existing issues encountered in data acquisition and downstream processing such as axis calibration, masking, alignment and different forms of artifact correction. Both software packages come with detailed documentation and examples online for users to gain familiarity (see Code availability).

Artifact correction

Artifacts in MPES data come from mechanical imperfections, stray fields (electric and magnetic), uncertainties in the alignment of the sample, light beams and the multistage electron-optic lens systems as well as the data digitization process. Minimizing and correcting instrumental imperfections play an important role in the validity of downstream analysis. We carry out artifact correction sequentially at the level of single photoelectron events or the data hypervolume obtained from multidimensional histogramming (see Fig. 2). The outcomes are illustrated using the correction of (1) digitization artifact (see Fig. 3) and (2) spherical timing aberration artifact (see Fig. 4), with technical details in Methods.

Fig. 3

Digitization artifact correction by histogram jittering. Removal of the digitization artifact is illustrated with a 2D k-E slice across the Brillouin zone center (at 0 \({{\rm{\AA }}}^{-1}\) momentum) of the band mapping dataset measured at FHI on WSe2. The images before and after histogram jittering and their difference are shown in (a–c), respectively. A zoomed-in section of the data is shown in the insets in (a–c). The effective removal of the digitization artifact is further demonstrated in the momentum-integrated energy distribution curves in (d). The traces in (d) are computed by averaging horizontally over their corresponding 2D images in (a–c).

Fig. 4

Spherical timing aberration correction. The correction is demonstrated using W4f core-level data measured at FLASH. The energy spacing between the W4f7/2 and W4f5/2 levels is about 2.1 eV34. (a) Illustration of the geometric origin of the spherical timing aberration in the time-of-flight (TOF) drift tube. (b) Comparison of the W4f spectra at the center and on the edge of the detector plane. The energy spectra are extracted from the corresponding regions, marked by the dots in the same blue and red colors, respectively, in (c). The white stripes crossing at the detector center block the exposed edges of the quadrants of the four-quadrant detector. (d) The uncorrected and corrected radial-averaged peak TOF positions for the W4f7/2 core level.

Axis calibration

To transform the measurement axes of the DLD into physically relevant axes for electronic band mapping, calibrations are required, as shown in Fig. 2. The calibration functions are constructed with parameters derived from comparing physical knowledge of the materials (e.g. Brillouin zone size, Fermi level position) with the corresponding scales in data. They are applied either to the binned data hypervolume, or to the raw single-electron events individually in a distributed fashion before binning. Details on the calibration data transforms are provided in Methods.

Data storage and format

The simplest form of the output data hypervolume derived from single-electron events includes non-negative scalar values of the photoemission intensity and the calibrated real-valued axes coordinates, including kx, ky, E, and other parameter dependencies such as the pump-probe time delay tpp. These values are exported as HDF5, MAT or TIFF, with the metadata included as attributes of the files.

Workflow archiving and reuse

Computational workflows are valued by their reproducibility38. Archiving and sharing the workflow parameters among users of the beamlines or facilities allow comparison between experimental runs and reuse for the simultaneous benefits of machine diagnostics and experimental efficiency. To achieve this, we store critical parameters generated within the workflow in a separate file as workflow parameters (see Fig. 1) during each step, including the numerical values used in binning, the intermediate parameters and coefficients of the correction and calibration functions, etc. They can be loaded and reused when processing other datasets.
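As a rough illustration of such a parameter file, the sketch below writes and reloads a small set of workflow parameters. The JSON format, the key names and the numerical values are assumptions made for this example; the text does not specify the file format used by the packages.

```python
import json
import os
import tempfile

# Hypothetical workflow parameters collected during processing.
workflow_params = {
    "binning": {"axes": ["X", "Y", "TOF"], "nbins": [256, 256, 512]},
    "momentum_calibration": {"fX": 0.01, "fY": 0.01},
    "energy_calibration": {"coefficients": [1.5, -0.02, 1.0e-5]},
}


def save_params(params, path):
    """Write the parameters to a human-readable JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(params, fh, indent=2, sort_keys=True)


def load_params(path):
    """Read the parameters back for reuse on another dataset."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


path = os.path.join(tempfile.mkdtemp(), "workflow_params.json")
save_params(workflow_params, path)
reloaded = load_params(path)
```

Keeping the parameters in a plain-text file separate from the data makes them easy to diff between experimental runs and to share among users.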

Data visualization

The adaptation of established scientific visualization methods in the physical sciences39,40 to band mapping data should incorporate the requirements and knowledge of the data characteristics in this field of research. The band mapping data in 3D (multi-megavoxel) and 3D + t (multi-gigavoxel) include the inherent symmetries from the electronic band structure of the material, but the intensity modulations in the photoemission process41, dynamics and sample condition disrupt the original symmetry. The overall goal is to emphasize the features of interest while exploiting the symmetry to simplify the visualization (see Methods). The output files from the processing pipeline are compatible with open-source visualization software such as matplotlib42, ParaView39 and Blender43.

Downstream analysis integration

Typical photoemission data analysis involves extracting electronic band structure parameters, physical coupling constants and lifetimes via fitting of lineshapes16 or dynamical models44, often carried out specific to the material under study. At the end of our distributed workflow, the data size is on the order of a few to tens of gigabytes, which can be directly loaded into memory on users’ local machines for downstream data analysis with custom routines.

Experimental metadata

The metadata of the data files have a tree structure and contain information on the experimental setting, parameters of the pulsed light source, the detector and the sample under study. A list of top-level metadata parameters is presented in Table 1. A full and current list of all metadata parameters, including the top-level parameters and their constituent lower-level parameters, along with their definitions, units and values, is provided in Supplementary Tables 1–4. For database integration, an accompanying data parser (parser-mpes, see Code availability) for MPES data has been written in accordance with existing standards45 for computational materials science in NOMAD8, featuring an electronic version of the metadata parameter list in the file mpes.nomadmetainfo.json online. The metadata parameter list and the data parser are versioned and are updated based on the corresponding changes in the data structure for photoemission spectroscopy experiments. The existing WSe2 photoemission data have been integrated into the experimental section of the materials science database NOMAD (see Data availability).

Table 1 Top-level metadata parameters.

Discussion

We have designed and implemented an open-source, end-to-end workflow for processing single-event data produced in multidimensional photoemission spectroscopy, linking to downstream tasks, providing guidelines and software for integrating processed data into the NOMAD experimental materials science database. The distributed processing takes full advantage of the single-event data streams directly accessible from the TOF delay-line detector for event-wise correction and calibration and converts the raw events to the calibrated data hypervolume for project-specific downstream analysis. The functionalities within the workflow are publicly accessible through the software packages we have developed (hextof-processor30 and mpes31). The processing workflow is archived at each step of operation and the processed data may be integrated into an experimental database with user-specified metadata. The methods described here are applicable to all existing types of multidimensional photoemission band mapping measurements beyond the static and time-dependent settings described here.

Our end-to-end workflow from raw data to processed data to database integration provides a fast-track and all-in-one solution to the demands for open experimental data and reproducible research in the materials science community7,8. The public repositories for the software packages are the foundations for phased future extension and integration with existing analytical tools in the photoemission spectroscopy community. The modular structure of the packages introduced here allows targeted upgrades by both temporary and dedicated maintainers and users. Casting the workflow in the Python programming environment provides the foundation for convenient incorporation of existing image processing and machine-learning resources46 for further exploration and understanding of the band mapping datasets, which contain rich information owing to the complex nature of the photoemission process16,18. This is especially beneficial for broader adoption of photoemission since the interpretation of photoemission data is often linked to the observed or extracted outstanding features such as local intensity extrema, dispersion kinks and satellites, lineshape parameters and pattern symmetry16. Access to experimental data and the potential integration with existing electronic structure-related software5,47,48,49 will therefore facilitate method development and the direct comparison between experimental results and theoretical band structure calculations within the same programming platform.

Methods

Sample preparation

Single-crystalline samples of 2D bulk WSe2 (2H stacking) were purchased from HQ Graphene. Crystals of size around 5 mm × 5 mm × 1 mm were used directly for the measurements. To prepare a clean surface by cleaving, we attached a cleaving pin upright to the sample surface using conducting epoxy (EPOTEC H20) outside the vacuum chamber and removed the pin by mechanical force in ultrahigh vacuum.

Photoemission experiments

The measurements were conducted using the HEXTOF instrument24 at the DESY FLASH PG-2 beamline50 with the free-electron laser (FEL) as well as a laboratory source21 with a METIS electron momentum microscope (SPECS METIS 1000) installed at the FHI. In the measurements at FLASH, the FEL was tuned to 36.5 eV (or 34.0 nm) and 109 eV (or 11.4 nm), and the optical pump pulse had a center wavelength of 775 nm. The measurements at the FHI used a 21.7 eV home-built extreme UV source based on high harmonic generation in Ar gas driven by an optical parametric chirped-pulse amplifier operating at 500 kHz repetition rate51. The optical pump pulse was centered at 800 nm. In both FEL and laboratory experiments, the near-infrared light pulses promote the electronic population at the K and K′ high-symmetry points (corresponding to \(\bar{{\rm{K}}}\) and \(\bar{{\rm{K}}{\prime} }\) points, respectively, in the projected Brillouin zone obtained from photoemission, as shown in Fig. 5) in momentum space to the excited states via direct optical transitions. The nonequilibrium electronic dynamics are probed via valence and conduction band photoemission35 as well as core-level photoemission36, using s-polarized extreme UV and soft X-ray probe pulses, respectively.

Fig. 5

Typical visual representations of the volumetric band mapping data. The examples are illustrated using band mapping data of the layered semiconductor WSe2, measured with the HEXTOF instrument at FLASH (a,c) and the METIS detector at the FHI (b). The visualizations are (a) the orthoslice representation, (b) the band-path diagram (right) with the momentum path labelled in dashed blue line in the momentum kx-ky plane (left), and (c) the cut-out view. All color scales represent photoemission intensity. The letters label the high symmetry points of the hexagonal Brillouin zone of WSe233. The intensity features on the upper side near the \(\bar{{\rm{K}}}\) and \(\bar{{\rm{K}}{\prime} }\) points result from nonequilibrium electronic populations following photoexcitation of electrons to the conduction band, while the intensities below are from valence band photoemission (see Methods for experimental conditions).

Digitization artifact

The time-to-digital converter (TDC) outputs digitized data according to the binning width of the on-board electronics. Data conversion from one digitized format to another in a rebinning process often creates a picket fence-like effect (see Fig. 3). This phenomenon originates from the incommensurate bin size in the two rounds of sampling processes (binning and rebinning). To solve the problem, one introduces a slight amount of uniformly distributed noise, with an amplitude equal to half of the original bin size, to the single-event values when carrying out the bin counts. This is similar to the histogram jittering (or dithering) technique52,53 used in statistical visualization and computer graphics. Mathematically, the uniformly distributed noise U(0,1) bounded in the range [0,1] is added before binning to a univariate data stream, S = {Si} via,

$$S{{\prime} }_{i}={S}_{i}+\frac{{w}_{b}}{2}\times U(0,1).$$
(1)

Here, wb is the bin width. For binning of multivariate data streams, such as the detector X position (or kx), Y position (or ky), and the photoelectron TOF (or E), we adopt the same approach individually for each dimension. The effect of jittering in reducing the digitization artifact is demonstrated in Fig. 3.
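The effect of Eq. (1) can be reproduced on synthetic data. The sketch below follows the equation as written, applied to integer-digitized values (bin width 1) that are rebinned with an incommensurate width of 0.7; the data, the rebinning width and the function name `jitter` are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)


def jitter(values, bin_width, rng):
    """Eq. (1): S'_i = S_i + (w_b / 2) * U(0, 1)."""
    noise = rng.uniform(0.0, 1.0, size=values.shape)
    return values + 0.5 * bin_width * noise


# Digitized single-event values on an integer grid (original bin width w_b = 1).
digitized = rng.integers(0, 100, size=200_000).astype(float)
w_b = 1.0

# Rebinning with an incommensurate bin width produces the picket-fence pattern.
edges = np.arange(0.0, 100.0 + 0.7, 0.7)
before, _ = np.histogram(digitized, bins=edges)
after, _ = np.histogram(jitter(digitized, w_b, rng), bins=edges)

# Empty bins disappear and the bin-to-bin scatter shrinks after jittering.
n_empty_before = int((before == 0).sum())
n_empty_after = int((after == 0).sum())
```

For multivariate streams the same function would be called once per column with that column's own bin width, in line with the per-dimension treatment described above.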

Spherical timing aberration

Electrons entering the TOF tube at different lateral positions travel through different path lengths to reach the detector, which is the origin of the spherical timing aberration as illustrated in Fig. 4. The lateral position-dependent time delay may be expressed as,

$$\Delta {{\rm{TOF}}}_{{\rm{sph}}}(r)=(\sqrt{1+{r}^{2}/{d}^{2}}-1){{\rm{TOF}}}_{0},$$
(2)

where r is the radial distance from the center of the DLD, d is the length of the field-free region in the TOF tube and TOF0 is the TOF normalization constant. For a typical field-free region length of d ≈ 1 m and a DLD screen radius of r = 50 mm, \(\Delta {\rm{TOF}}/{{\rm{TOF}}}_{0}\approx 1.25\,\times \,1{0}^{-3}\). Assuming TOF0 = 0.5 μs, the spherical timing aberration in TOF scale is \(\Delta {{\rm{TOF}}}_{{\rm{sph}}}\approx 0.62\) ns, which is larger than the DLD’s temporal resolution of 0.15 ns. The effect of the spherical timing aberration is visible over an energy range of a few eV with fine bins but is quite small over a large energy range. To illustrate this effect, we use the W 4f core-level data presented in Fig. 4b. For every (X, Y) position on the detector the peak of W 4f7/2 was fitted with a Voigt profile and the peak positions are shown in Fig. 4c. As the spectra from deep core levels typically do not show dispersion, the deviation of the fitted peak positions corresponds to the spherical timing aberration of the electron optics. In order to compensate for the spherical timing aberration, we first transform the data from Cartesian to the polar coordinates (see Fig. 4c), and then fit the radial-averaged peak position to a polynomial function of the radius,

$$\Delta {{\rm{TOF}}}_{{\rm{sph}}}(r)=\frac{{r}^{2}{{\rm{TOF}}}_{0}}{2{d}^{2}}-\frac{{r}^{4}{{\rm{TOF}}}_{{\rm{0}}}}{8{d}^{4}}+{\rm{O}}({r}^{6}).$$
(3)

The fitting results together with the corrected radial distribution are presented in Fig. 4d.
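The numbers quoted above can be checked directly from Eqs. (2) and (3). The short sketch below evaluates both expressions for d = 1 m, r = 50 mm and TOF0 = 0.5 μs, and adds a hypothetical helper, `correct_tof`, that subtracts the aberration from a single event; the helper and its argument names are illustrative assumptions, not the packages' interface.

```python
import math


def delta_tof_exact(r, d, tof0):
    """Eq. (2): path-length induced delay at radial distance r."""
    return (math.sqrt(1.0 + (r / d) ** 2) - 1.0) * tof0


def delta_tof_series(r, d, tof0):
    """Eq. (3), truncated after the r^4 term."""
    return tof0 * (r**2 / (2.0 * d**2) - r**4 / (8.0 * d**4))


def correct_tof(tof, x, y, center, d, tof0):
    """Subtract the aberration for one event detected at position (x, y)."""
    r = math.hypot(x - center[0], y - center[1])
    return tof - delta_tof_exact(r, d, tof0)


d = 1.0        # field-free length in m
r = 0.05       # detector radius in m
tof0 = 0.5e-6  # TOF normalization constant in s

exact = delta_tof_exact(r, d, tof0)    # in s
series = delta_tof_series(r, d, tof0)  # in s
ratio = exact / tof0                   # dimensionless, close to 1.25e-3
```

At the detector edge the two expressions differ by less than a femtosecond, which is why a low-order polynomial in r is an adequate fitting function for the radial-averaged peak positions.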

Symmetry distortion

Photoemission patterns in the (kx, ky) plane (i.e. an energy slice) may exhibit distorted symmetry due to the influence of various factors from the instrument, the sample and the experimental geometry on the trajectory of low-energy photoelectrons. Correcting the symmetry distortion while preserving the intensity features requires the use of symmetry-related landmarks to solve for the symmetrization coordinate transform in the framework of nonrigid image registration54. In typical situations with an excellent electron lens alignment, the energy dependence of the momentum distortion within the focused phase space volume covering an energy range of several eV is negligible, so the same coordinate transform can be applied to all energy slices in the volumetric data (including both valence and conduction bands) or simultaneously to all single events.

Other single-experiment artifacts

(1) Momentum center shift: The momentum center of the emergent photoelectrons travelling through the electron-optic system may experience an energy-dependent shift owing to the slight misalignment in the system or the influence of stray fields. Correction of the center shift requires an energy-dependent center alignment of energy slices. The shift along the energy (or TOF) axis may be estimated using phase correlation55 or mutual information-based56 sequential image registration methods, in which the series of energy slices are treated as an image sequence. In a well-shielded and well-aligned electron-optic lens system, generally, the momentum center shift is negligible in the focused photoelectron energy range. (2) Space-charge effect (SCE): The secondary photoelectron clouds originating from the probe and pump pulses cause a “doming effect” of the photoemission intensity distribution around the momentum center of the band structure. This is especially visible in systems with a clear Fermi edge9,11 or non-dispersing shallow core levels, which may be used as references for calibrating the parameters used for the flattening transform by applying a momentum-dependent shift \(\Delta {{\rm{T}}{\rm{O}}{\rm{F}}}_{{\rm{s}}{\rm{c}}}({k}_{x},{k}_{y})\) in the TOF (or the calibrated energy) coordinate of the single-event data.
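To make the phase-correlation approach to the momentum center shift concrete, the sketch below estimates an integer-pixel shift between two synthetic energy slices. It is a textbook implementation under simplifying assumptions (periodic boundaries, whole-pixel shifts, synthetic images) and is not the routine used in the packages; subpixel refinement and the mutual-information variant are omitted.

```python
import numpy as np


def phase_correlation_shift(reference, moving):
    """Return the integer (row, col) shift s with moving = roll(reference, s)."""
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moving)
    cross = f_mov * np.conj(f_ref)
    cross /= np.maximum(np.abs(cross), 1e-12)  # keep only the phase
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Map peak positions beyond half the image size to negative shifts.
    return tuple(
        int(p) if p <= size // 2 else int(p - size)
        for p, size in zip(peak, corr.shape)
    )


# Synthetic energy slice: a smooth blob plus weak noise.
rng = np.random.default_rng(3)
yy, xx = np.mgrid[0:64, 0:64]
reference = np.exp(-((xx - 32) ** 2 + (yy - 32) ** 2) / 50.0)
reference = reference + 0.05 * rng.random((64, 64))

# A second slice whose center is displaced by a known amount.
moving = np.roll(reference, shift=(3, -5), axis=(0, 1))

shift = phase_correlation_shift(reference, moving)
realigned = np.roll(moving, shift=(-shift[0], -shift[1]), axis=(0, 1))
```

Applied slice by slice along the energy (or TOF) axis, the estimated shifts give the energy-dependent center trajectory that the alignment has to undo.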

Momentum calibration

The scaling factors for momentum calibration are computed by comparing the positions of known high symmetry points in the band structure with their corresponding locations in an energy slice. Suppose A and B are two high symmetry points identifiable (e.g. as local extrema) from the experimental data with pixel positions (XA, YA) and (XB, YB), and momentum positions, (\({k}_{x}^{A}\), \({k}_{y}^{A}\)) and (\({k}_{x}^{B}\), \({k}_{y}^{B}\)), respectively. We calculate the pixel-to-momentum scaling ratios, fX and fY, along the X (column) and Y (row) directions of a 2D k-space image, respectively, using Eq. (4). The momentum coordinate (kx, ky) at each pixel position (X, Y), measured relative to point A, then follows from Eq. (5).

$${f}_{D}=\left({k}_{d}^{A}-{k}_{d}^{B}\right)/\left({D}_{A}-{D}_{B}\right)$$
(4)
$${k}_{d}={f}_{D}\times (D-{D}_{A})\quad (D,d=X,x\,{\rm{or}}\,Y,y)$$
(5)
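A small numerical example of Eqs. (4) and (5) is sketched below. The pixel and momentum positions of A and B are made-up values, with A placed at the zone center so that the momenta returned by Eq. (5) are absolute; both function names are placeholders.

```python
import numpy as np


def scaling_ratio(k_a, k_b, pix_a, pix_b):
    """Eq. (4): pixel-to-momentum ratio along one detector direction."""
    return (k_a - k_b) / (pix_a - pix_b)


def pixel_to_momentum(pix, pix_a, f):
    """Eq. (5): momentum relative to point A along one direction."""
    return f * (pix - pix_a)


# Hypothetical landmarks: A at the zone center, B at another high symmetry point.
X_A, Y_A = 256.0, 256.0
X_B, Y_B = 356.0, 306.0
kx_A, ky_A = 0.0, 0.0  # inverse Angstrom
kx_B, ky_B = 1.0, 0.5  # inverse Angstrom

f_X = scaling_ratio(kx_A, kx_B, X_A, X_B)
f_Y = scaling_ratio(ky_A, ky_B, Y_A, Y_B)

# Calibrated momentum axes for a 512 x 512 pixel image.
kx_axis = pixel_to_momentum(np.arange(512), X_A, f_X)
ky_axis = pixel_to_momentum(np.arange(512), Y_A, f_Y)
```

The two landmarks must differ in both pixel coordinates, otherwise one of the ratios in Eq. (4) is undefined; in that case a third landmark or a symmetry-equivalent point would be needed.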

Energy calibration

The calibration requires a set of band mapping data measured at different bias voltages (applied between the material sample and the ground), usually sampled with a spacing of 0.5 V in a range of ± 3–5 V around the normally applied bias voltage for a particular sample. The calibration proceeds by finding the TOF feature (e.g. local extrema) correspondences in the 1D energy distribution curves (EDCs) at different biases using the dynamic time warping algorithm57. The transformation from the TOF to the photoelectron energy E is approximated as a polynomial function,

$$E({\rm{TOF}})=\mathop{\sum }\limits_{i=0}^{n}{a}_{i}{{\rm{TOF}}}^{i}$$
(6)

The approximation is sufficiently accurate within a range of 20 eV, which covers the entire valence band and some low-lying conduction bands of typical materials. The polynomial coefficients are determined using nonlinear least squares by solving \(\Delta T\cdot {\boldsymbol{a}}=\Delta E\), in which \({\boldsymbol{a}}={({a}_{1},{a}_{2},...)}^{T}\) is the coefficient vector while the constant offset a0 is determined by manual alignment to an energy reference, such as the Fermi level or valence band maximum. The vector ΔE and the matrix ΔT contain, respectively, the pairwise differences of the bias voltages and the polynomial terms of differential TOF values. To calibrate a large energy range including multiple core levels, a piecewise polynomial may be used11.
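One way to read the system ΔT · a = ΔE is sketched below on synthetic data. The sketch assumes that each row of ΔT holds the differences of the polynomial terms, TOF_j^i − TOF_ref^i, between one bias setting and a reference setting, and it uses an ordinary linear least-squares solver as a stand-in, since the system is linear in a. The TOF values, coefficients and reference energy are invented for the example.

```python
import numpy as np


def poly_no_offset(t, a):
    """Evaluate sum_{i>=1} a_i * t**i, i.e. Eq. (6) without a_0."""
    return sum(a_i * t ** (i + 1) for i, a_i in enumerate(a))


# Synthetic ground truth (a_1, a_2) in arbitrary TOF units.
true_a = np.array([-0.8, 0.004])

# TOF position of one EDC feature tracked across nine bias settings.
tof_feature = np.linspace(60.0, 70.0, 9)
ref = 4  # index of the reference (normal bias) measurement

# Energy shifts relative to the reference; in an experiment these are
# given by the differences of the applied bias voltages.
delta_e = poly_no_offset(tof_feature, true_a) - poly_no_offset(tof_feature[ref], true_a)

# Matrix of differential polynomial terms, one column per order.
order = 2
delta_t = np.column_stack(
    [tof_feature**i - tof_feature[ref] ** i for i in range(1, order + 1)]
)

# Solve Delta_T . a = Delta_E in the least-squares sense.
a_fit, *_ = np.linalg.lstsq(delta_t, delta_e, rcond=None)

# Fix the offset a_0 by aligning the reference feature to a known energy.
e_ref = 0.0  # e.g. the Fermi level, chosen here as the energy zero
a0 = e_ref - poly_no_offset(tof_feature[ref], a_fit)


def energy(t):
    """Calibrated TOF-to-energy map, Eq. (6)."""
    return a0 + poly_no_offset(t, a_fit)
```

Working with differences removes a0 from the fit, which is why the offset has to be supplied separately from an energy reference.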

Pump-probe delay calibration

The time origin (“time zero”) in time-resolved photoemission spectroscopy, i.e. the temporal overlap of pump and probe pulses, is determined by fitting of a characteristic trace extracted from the data. Since the readings of the digital encoder (see Fig. 2) are sampled linearly, equally-spaced pump-probe delays are directly convertible from the readings using linear interpolation, given the boundary values of the translation stage positions and the corresponding delay times. For unequally-spaced delays, a delay marker is first added to each data point as a separate column after data acquisition to group together the encoder reading ranges that correspond to the specific time delays. The data binning is carried out over the delay marker column instead of the equally-sampled encoder readings.
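Both cases can be sketched in a few lines. The encoder bounds, delay values and range boundaries below are invented for illustration, and the function names are placeholders rather than the packages' interface.

```python
import numpy as np


def encoder_to_delay(encoder, enc_bounds, delay_bounds):
    """Linear interpolation from encoder readings to pump-probe delays."""
    return np.interp(encoder, enc_bounds, delay_bounds)


def delay_markers(encoder, ranges):
    """Label each reading with the index of its encoder range (-1 if none)."""
    markers = np.full(encoder.shape, -1, dtype=int)
    for idx, (lo, hi) in enumerate(ranges):
        markers[(encoder >= lo) & (encoder < hi)] = idx
    return markers


# Equally spaced delays: boundary encoder readings and their delays in ps.
enc_bounds = (1000.0, 5000.0)
delay_bounds = (-1.0, 3.0)
delays = encoder_to_delay(np.array([1000.0, 2000.0, 5000.0]), enc_bounds, delay_bounds)

# Unequally spaced delays: group encoder ranges under one marker each.
ranges = [(1000.0, 1100.0), (2000.0, 2100.0), (4000.0, 4500.0)]
readings = np.array([1050.0, 1500.0, 2050.0, 4200.0])
markers = delay_markers(readings, ranges)

# Binning over the marker column instead of the raw encoder readings.
counts = np.bincount(markers[markers >= 0], minlength=len(ranges))
```

Readings that fall between the defined ranges, for example while the stage is moving, receive the marker -1 and are left out of the binning.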

Visualization strategies

We discuss here three methods for the display of volumetric band mapping data, which are, at the same time, the basis for visualizing 3D + t data with time as an animated axis. (1) The orthoslice representation includes orthogonal 2D planes selected in specific regions in the 3D volume39, which highlights specific slices deep within the data less visible in a volumetrically rendered view (see Fig. 5a). Along this line, we have developed a software package, 4Dview58, to explore 4D data using simultaneously linked orthoslices, which also features contrast adjustment and data integration within a hypervolume of interest. (2) The band-path plot (see Fig. 5b) is a 2D representation of the 3D band mapping volume generated by combining a series of 2D cuts along selected momentum paths (or k-paths) traversing a list of so-called high-symmetry points59,60. This representation captures the largest dispersion within the band structure. For volumetric data, the same path may be sampled over the full energy range to produce the plot shown in Fig. 5b. The analysis and visualization modules in the mpes package include functionalities to compose customized band-path plots. (3) The cut-out view (see Fig. 5c) exposes a specific part of interest in the volumetric data, while not losing the rest39. The analysis module in the mpes package provides ways to generate precise cut-outs using position landmarks (e.g. high-symmetry points labelled in Fig. 5) and inequalities.
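The construction of a band-path plot can be sketched with nearest-neighbor sampling along straight segments between pixel positions of the high-symmetry points. The function below is a simplified illustration, not the mpes implementation; the axis ordering (kx, ky, E), the node coordinates and the test volume are assumptions.

```python
import numpy as np


def band_path(volume, nodes, points_per_segment=50):
    """Sample volume[kx, ky, E] along straight segments joining the nodes.

    nodes is a list of (row, col) pixel positions of high-symmetry points.
    Returns an array of shape (n_path_points, n_energy).
    """
    rows, cols = [], []
    for (r0, c0), (r1, c1) in zip(nodes[:-1], nodes[1:]):
        rows.append(np.linspace(r0, r1, points_per_segment, endpoint=False))
        cols.append(np.linspace(c0, c1, points_per_segment, endpoint=False))
    # Close the path with the final node.
    rows.append(np.array([nodes[-1][0]], dtype=float))
    cols.append(np.array([nodes[-1][1]], dtype=float))
    row_idx = np.rint(np.concatenate(rows)).astype(int)
    col_idx = np.rint(np.concatenate(cols)).astype(int)
    # Nearest-neighbor lookup keeps the full energy axis for every path point.
    return volume[row_idx, col_idx, :]


# Test volume whose value encodes the (kx, ky) pixel position.
ii, jj = np.meshgrid(np.arange(32), np.arange(32), indexing="ij")
volume = np.repeat((ii + 100 * jj)[:, :, None], 4, axis=2)

# Hypothetical path through three nodes given in pixel coordinates.
nodes = [(0, 0), (10, 0), (10, 20)]
cut = band_path(volume, nodes, points_per_segment=10)
```

Nearest-neighbor sampling keeps the example short; interpolating between pixels would give smoother cuts along paths that are not aligned with the grid.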