An optimization case study for solving a transport robot scheduling problem on quantum-hybrid and quantum-inspired hardware

We present a comprehensive case study comparing the performance of D-Waves’ quantum-classical hybrid framework, Fujitsu’s quantum-inspired digital annealer, and Gurobi’s state-of-the-art classical solver in solving a transport robot scheduling problem. This problem originates from an industrially relevant real-world scenario. We provide three different models for our problem following different design philosophies. In our benchmark, we focus on the solution quality and end-to-end runtime of different model and solver combinations. We find promising results for the digital annealer and some opportunities for the hybrid quantum annealer in direct comparison with Gurobi. Our study provides insights into the workflow for solving an application-oriented optimization problem with different strategies, and can be useful for evaluating the strengths and weaknesses of different approaches.


Introduction
Quantum computing (QC) is a field that has witnessed a rapid increase in interest and development over the past few decades since it was theoretically shown that quantum computers can provide an exponential speedup for certain tasks (Deutsch, Jozsa 1992;Grover 1996;Shor 1994).Translating this potential into a practically relevant quantum advantage, however, has proven to be a very challenging endeavor.Nevertheless, the emerging field is considered to have a highly disruptive potential for many domains, for example in machine learning (Schuld, Sinayskiy, Petruccione 2015), chemical simulations (Cao et al. 2019) and optimization (Li et al. 2020), the domain of this work.Due to the fact that optimization problems are of utmost importance also for industrial applications, we investigated a potential advantage of quantum and quantum-inspired technology for the so-called transport robot scheduling problem (TRSP), a real-world use-case in optimization that is derived from an industrial application of an automatized robot in a high-throughput laboratory.The optimization task is to plan a time-efficient schedule for the robot's movements as it transports chemical samples between a rack and multiple machines to conduct experiments.This is an NP-hard problem which for certain instances can be challenging to solve using classical computing techniques, and hence is an attractive candidate to search for an advantage with non-classical techniques.
In our study, we compared the solution quality and runtime of different solvers on a large set of instances of the problem.As solvers, we considered D-Wave's hybrid Leap framework (LBQM) that makes use of the D-Wave quantum annealer (D-Wave Systems Inc. 2020), Fujitsu's digital annealer (FDA) (Nakayama et al. 2021), Fujitsu's digital annealer hybrid framework (FDAh), as well as the industry-grade Gurobi solver (Gurobi Optimization, LLC 2023).As a key element of this work, we provide three different models for the TRSP that follow different design philosophies.This is justified by the different ways in which the problem task can be modelled and the inherent differences in the problem formulations that the solvers addressed can accept.LBQM, FDA and FDAh are restricted to a formulation as a quadratic unconstrained binary optimization (QUBO), whereas a mixed integer program (MIP) with integer and float variables can be used by Gurobi, which makes a comparison of multiple formulations meaningful.
The TRSP considered in this paper is a special combination of different scheduling problems that, to our knowledge, has not been considered before.Scheduling problems have been studied intensively for several decades and classical algorithms exist for numerous variants (Brucker 2007;Pinedo 2016).Since most of the industry-relevant scheduling problems are NP-hard, these classical algorithms mainly consist of meta-heuristics or use general-purpose MIP solvers, which basically solve the problem using a branch and bound approach with several additional improvements like cutting planes.In addition to classical algorithmic developments, a considerable amount of research has also been done in hardware-based parallel computing, especially in general purpose computation on graphics processing unit (GPGPU) parallelization (Chakroun et al. 2013;Awasthi et al. 2016).The problem discussed in this work is an extension of the typical job shop scheduling problem (JSSP), where the inclusion of a robot adds additional restrictions.More specifically, the studied scheduling problem falls into the category of robotic cell scheduling and automated guided vehicles (AGV) scheduling problems.Most work on robotic cell scheduling deals with infinite cyclic schedules (Dawande et al. 2007).This comprises polynomial-time algorithms and hardness results (Steiner, Xue 2005), MIP techniques (Phillips, Unger 1976;Brucker, Burke, Groenemeyer 2012;Feng, Che 2013) and heuristic approaches (Liu, Kozan 2017).Many efficiently solvable and hard special cases have been identified (Shabtay, Arviv 2016) and heuristics have been proposed for some of the hard cases (Stern, Vitner 1990).Those problems differ from our use case in one way or another.The problems considered by the above-cited papers allow, unlike our use case, that the jobs can wait at a machine after their completion before being picked up by the robot.Robotic cell scheduling problems without this possibility have been studied by (Agnetis 2000;Agnetis, Pacciarelli 2000), whose problems differ from our, among others, in the considered objective function.Our objective function, the total job completion time, has been extensively studied for flow shop scheduling problems without a robot (Pinedo 2016;Hall, Sriskandarajah 1996;Allahverdi 2016;Röck 1984), the latter of which shows that the no-wait variant is strongly NP-hard on two machines.Apart from the no-wait constraint, the problem considered in our work is characterized by the fact that jobs have to go to the last machine several times.Such settings are known as a re-entrant flow shops, for which (Jing, Huang, Tang 2011) developed a heuristic algorithm.
We are mainly interested in the performance of non-standard solution approaches using quantum or quantum-inspired solvers in this study.Because these solvers rely on heuristics, benchmarks for real-world applications are a highly relevant research topic.Most quantum optimization approaches fall into two major groups, one for gate-based hardware and one for annealing-based hardware (Alexeev et al. 2021).The majority of gate-based approaches to optimization use parameterized gates to find the ground state of a Hamiltonian related to the cost function of the optimization problem in a quantumclassical hybrid fashion, for example via the quantum approximate optimization algorithm (QAOA) (Farhi, Goldstone, Gutmann 2014;Blekos et al. 2023).
Approaches based on quantum annealing also seek to find the ground state of a Hamiltonian, but by aiming for an adiabatic change from an initial state that can be easily prepared.In contrast to actual quantum computing devices, other classical software and hardware components are merely inspired by quantum computing, for example FDA (Aramon et al. 2019) and Toshiba's Simulated Bifurcation Machine (TSB) (Tatsumura, Dixon, Goto 2019).Typically, optimization tasks for quantum solvers and the aforementioned quantum-inspired technologies are modeled as QUBO problems (Kochenberger et al. 2014).An in-depth analysis of pure QUBO comparison on four quantum and quantum-inspired solvers can be found in (Oshiyama, Ohzeki 2022).In their work, the authors compare the solutions of a library of quadratic benchmark problems on the D-Wave quantum annealer, FDA, and TSB against each other.
QC has already been successfully used for optimization in various fields.For example, in (Mizuno, Komatsuzaki 2023), chemical reaction networks are optimized with quantum computing.In (Streif et al. 2021), it is shown that using the QAOA, it is possible to beat some classical heuristic algorithms on the binary paint shop problem.However, some work has shown that the current circuit model algorithms are not always adequate enough to reach significant convergence required for a good solution (Awasthi et al. 2023).Quantum annealing has proven to offer some advantage against the classical simulated annealing algorithm for a spin-glass problem, using D-Wave hardware (Raymond et al. 2023), but this is no conclusive evidence.In one of the more recent works on quantum annealing (Schuetz et al. 2022), the authors suggest a nature inspired hybrid quantum algorithm for robot trajectory optimization for PVC sealing in a real industrial setting.In (Ebadi et al. 2022), the authors present a solution to the maximum independent set (MIS) problem using a Rydberg atom device, along with a claim of a possible super-linear quantum speed-up against classical simulated annealing.Other classical algorithms might still be superior to a quantum approach on current devices (Albash, Lidar 2018).Several works consider scheduling problems (Yarkoni et al. 2021;Carugno, Ferrari Dacrema, Cremonesi 2022;Tomasiewicz et al. 2020).In (Geitz et al. 2022), an AGV transportation problem using different classical and quantum approaches is studied and (Ikeda, Nakamura, Humble 2019) investigates a nurse scheduling problem with the usage of a quantum annealer.
The remaining manuscript is structured as follows.We provide a detailed description of the TRSP and its mathematical modeling in Section 2. In Section 3, we describe the design of our numerical study and list the problem instances and solvers that we use.The results of this study are presented in Section 4. Finally, we conclude our study in Section 5. Detailed model descriptions, solver information, further information on the benchmark setup and instance lists are contained in the supplementary material (referenced by a preceding "S" to the label it is referring to).

Transport Robot Scheduling Problem
In this section, we present a detailed explanation of the TRSP, which is a real-world use case derived from one of BASF's high-throughput laboratories.This optimization problem is about finding the most timeefficient route of a transport robot tasked with moving chemical samples from one processing machine to another.In the following, we first provide a general description of the problem setup and then present different modeling approaches.These models build the foundation of the subsequent benchmarks.

Problem Description
The laboratory we are modeling consists of a sample rack and three different processing machines: a water mixer, a sample shaker and a photo booth.And, finally, the robot itself that is tasked with carrying chemical samples from one place to another with the goal to conduct chemical experiments.Only the experimental plan (i.e., how each sample has to be processed in the laboratory) is predefined in advance, but not the specific order of the experiments.Initially, a certain number of samples is stored on the rack.Each of these samples needs to be first taken to the water mixer, then to the sample shaker.Once the sample shaking is completed, one or more photos have to be taken of each sample at the photo booth.Consecutive photos need to be taken after specific (i.e., predefined) time intervals, where the first photo of each sample has to be taken immediately after the shaking process.Finally, each sample has to be brought back to the rack.The processing times for different samples on the same machine can be different as specified by the experimental plan.We assume that each machine can only hold (and process) one single sample at any given time, or remain idle, and the processing steps cannot be interrupted before their completion.It is required that a machine starts processing a sample as soon as the sample is brought by the robot.Moreover, we assume that a sample has to be moved by the robot in-between two processing steps.Hence, a sample has to be lifted from a machine (and the machine is made available) as soon as it finishes processing.
By definition, the robot requires exactly one time unit to move from any place to any other, with or without a sample, and picking up or dropping a sample does not require extra time.Like the machines, the robot can transport only a single sample at any given time or drive empty or remain idle.In particular, it is not possible that the robot places a sample at a machine and picks up another at the same time.
The objective of this scheduling task is to minimize the sum of sample completion times, i. e., the sum of the times when the samples arrive at the the rack after their last photo has been taken.The solution of this optimization problem is a sequence of tasks for the robot that yields an efficient laboratory operation.

Mathematical Modeling
In our benchmark, we test three modeling approaches against each other.On the quantum and quantuminspired side we consider a QUBO formulation, whereas on the classical side we use two MIP formulations.First, a so-called sequence model and second, a so-called time-indexed model.In the following, we first introduce the common terminology for all modeling approaches.Next, we shortly sketch the main features of each model.For a more detailed description, we refer to Section S1.The motivation for the development of multiple models is to carry out a comparison between the solutions obtained by the most suitable problem encoding for quantum and classical solvers.This ensures that we are comparing the best of both worlds (classical and quantum), and do not restrict ourselves to a model which is more suitable for quantum over classical computing.

Common Terminology
The processing machines are addressed by M1 for the water mixer, M2 for the sample shaker and M3 for the photo booth.The scheduling time is discretized into time slots which all have length of one time unit.The transport robot takes one time unit for each operation that is either transportation or empty traversal between the machines and the rack.In this way, each transport robot scheduling problem is uniquely determined by the number of samples to be scheduled N ≥ 1, the number of photos K ≥ 1, which agrees for each sample j ∈ {1, . . ., N }, the processing times pj,1, pj,2, pj,3 ∈ N>0 for machines M1, M2 and M3, which can vary for each sample j ∈ {1, . . ., N } and the time gaps g j,k ∈ N ≥2 to be kept between consecutive photos k and k + 1 for k ∈ {1, . . ., K − 1}, which also can vary for each sample j ∈ {1, . . ., N }.As an example, Fig. 1 provides a feasible schedule in form of a Gantt chart to visualize these parameters.

QUBO Model
A general QUBO reads min for some matrix Q ∈ R n×n , where x represents a vector of n binary optimization variables.Two challenging properties of QUBOs must be taken into account in the modeling.Since only binary variables are allowed, this implies that other types of variables must be avoided, i. e. a reformulation into a binary form is necessary.Second, the problem is unconstrained.This restriction can be overcome by using penalty terms, which are quadratic functions in the model variables that evaluate to a positive value when the current assignment of values to the variables leads to an infeasible solution.Typically, the penalty terms are designed to yield 0 if the corresponding solution is feasible, so that they do not contribute to the objective values of feasible solutions.More general information about QUBOs and their properties can be found, e. g., in (Kochenberger et al. 2014;Lucas 2014;Glover, Kochenberger, Du 2018).When a sample is processed on one of the machines or carried by the robot in the time-frame [t, t ′ ], a bar is drawn from t to t ′ in the respective row in a corresponding color.Empty movements of the robot are not drawn explicitly.For example, at time t = 13 the robot is at the rack as sample 1 has been brought to the sample rack from t = 12 to 13.It takes one unit of time for the robot to travel from the rack to the water mixer to pick up sample 2 at t = 14.From t = 22 to t = 23, the sample is brought from the photo booth to the rack and back, which is a consequence of the assumption that a sample has to be moved by the robot in-between two processing steps.The objective value of the depicted schedule is 19 + 26 = 45.
Our proposed QUBO model for the TRSP is based on the well-known starting time formulation (see e. g. (Carugno, Ferrari Dacrema, Cremonesi 2022)) and can be written as where F is the objective function and P1, . . ., P7 denote the penalty functions and ρ0, . . ., ρ7 ∈ R>0 are tunable parameters that have to be chosen such that the objective and penalty terms are suitably balanced.As in equation (1), n represents the total number of binary optimization variables.These have a distinct meaning that can be identified with three indices.Specifically, xj,m,t := 1, if sample j starts processing on machine Mm at time t, 0, otherwise for all j ∈ {1, . . ., N }, m ∈ {1, 2, 3} and t ∈ {1, . . ., T − 1}.Here, T denotes the time horizon, which is chosen in such a way that there is enough time to schedule all samples sequentially, implying that there is at least one feasible solution.It can be explicitly computed for each instance as described in Section S1.1.In terms of Fig. 1, one has, for example, x1,1,1 = 1 and x1,2,8 = 1.The penalty terms for the QUBO model have to be formulated using the binary optimization variables.This section only provides an example for such a term, a complete description can be found in Section S1.1.Specifically, we consider here the constraint that each sample must access the machines M1 and M2 exactly once, which can be achieved by This term evaluates to zero if and only if for each pair of sample j and machine Mm, the variable xj,m,t is 1 for precisely one time slot t.Since P1 is bounded below by 0 due to its quadratic nature, each local minimum of P1 is a feasible solution w.r.t. the rule of machine access to M1 and M2.The other penalty terms can be formulated similarly.
Finally, the objective function F sums up for each sample the time when the sample arrives at the rack after the entire scheduling process ("sum of sample completion times").For example, the objective function in the case of Fig. 1 evaluates to 45 time units.

MIP Models
MIPs have been used since the late 1950s as a tool for solving scheduling problems.It is not possible to model the disjunctive constraints resulting from the discrete ordering decisions only by means of starting time variables.Different types of binary variables have been proposed to achieve this.The main types are position variables x ijk indicating if job j is the kth job on machine i (Wagner 1959), linear ordering variables δ ijk deciding if job j is processed before job k on machine i (Manne 1960) and time-indexed variables xijt specifying that job j is started (or processed or completed) on machine i at time t (Bowman 1959;Pritsker, Waiters, Wolfe 1969).(Ku, Beck 2016) compared these three approaches experimentally for a job shop scheduling problem.
Due to the powerful nature of (mixed) integer programming in contrast to the restrictive nature of the QUBO models, we provide two MIP models to be solved using Gurobi, where we follow two stateof-the-art approaches for formulating scheduling problems as MIPs (Pinedo 2016).The first one, in the following named sequence model, makes use of continuous start time and binary linear ordering variables.The second model, called the time-indexed model, is restricted to a binary formulation comparable to the QUBO model, where we make use of time-indexed variables.The latter provides a model with a natural vicinity to the QUBO formulation whereas the sequence model exploits the features of MIP formulations.In this sense we provide a baseline from two different angles, one for each solution approach.

MIP: Sequence Model
In the sequence model, we model sequences of events that affect the behavior of the transport robot with respect to the machines and the photos of a sample.We define the set of events as An event e = (j, i, 0) represents either that a sample j is placed on machine Mi for i ∈ {1, 2} or to the (i − 2)th photo shoot for i > 2, an event (j, i, 1) corresponds to picking it up again.For each event e ∈ E we define an optimization variable τe ∈ R ≥0 to model the time for event e to happen.In terms of Fig. 1, we have, for example, τ (1,1,0) = 1 and τ (1,1,1) = 4.A simple formulation can be achieved by additionally introducing a binary variable for each pair e, f ∈ E, e ̸ = f of events that indicates if e occurs before f .We reduce the size of the model by exploiting the fact that the ordering of some events is fixed or coupled.For example, we do not need a variable that specifies the order in which a given sample is brought to the water mixer and to the sample shaker.This leads to three sets of linear ordering variables that can be found in Section S1.2, as well as the various constraints to ensure feasibility.The objective function (i.e., the sum of the sample completion times) can be easily expressed using the variables τe corresponding to events when a sample is picked up from the last photo.

MIP: Time-Indexed Model
The second constrained model makes use of discrete time-indexed variables similar to the QUBO model from Section S1.1.In this formulation, we model the behavior of the transport robot by defining certain routes a sample can be transported along, which include those from the rack to all machines and back or movements between subsequent machines.The numbering of the moves is shown in Fig. S1.
The constraints of the model are similar to the penalty terms of the QUBO Model and are listed in Section S1.3.The objective function (i.e., the sum of the sample completion times) is defined in terms of the ancilla optimization variables zj for j ∈ {1, . . ., N }, that are bounded below by the arrival time of sample j at the rack after the schedule has finished.

Benchmark Setup
In the present section, we describe the design of the benchmark.We start with an outline of the considered problem instances that are listed in more detail in Section S2.Subsequently, we describe the three different commercial technologies that we use.

Instances
To set the stage for our benchmark, we specify 260 test instances of our optimization problem of interest, each defined by a different set of parameters.Specifically, each instance is uniquely determined by the number of samples N , the number of photos K, the gaps g j,k between subsequent photos k and k + 1 for k ∈ {1, . . ., K − 1} and j ∈ {1, . . ., N }, and, finally, the processing times pj,1, pj,2, pj,3 of the water mixer, sample shaker and photo booth, respectively, as explained in Section 2.2.For the sake of simplicity, the processing time of the photo booth agrees for all samples of the same instance, that is pj,3 := p3 for all j ∈ {1, . . ., N }.
In Section S2, we describe the algorithm that was used to generate parameter sets for the benchmark instances.Since the resulting instances span a wide range of complexity, we divide the resulting benchmark library into two parts, where each part is defined by the number of binary variables in the corresponding QUBO formulation from Section 2.2.2 as explained in Section S1.1 in more detail.The first part, which we call library of minor instances, contains all 161 instances that have at least 2071 and at most 8080 binary variables.The second part, which we call library of major instances, contains the remaining 99 instances with at least 10 822 and at most 22 692 binary variables.The reason for that specific division is that 8192 is the maximal amount of variables that can be solved directly on Fujitsu's digital annealer.
We collect groups of instances (N, K) that have the same number of samples and photos as shown in Fig. S2, i. e., within those groups the leftover parameters pj,m and g j,k for j ∈ {1, . . ., N }, m ∈ {1, 2, 3} and k ∈ {1, . . ., K − 1} may vary.These groups can be understood as a collection of "similar" TRSPs in the sense that the complexity of the tasks to be solved is comparable.However, some instances may still be easier or more difficult to solve than others in practice.This grouping approach allows us to consider statistical metrics over several instances when we compare models and solvers.Moreover, it allows us to estimate the scaling behavior of different solution approaches.In Section S2, we list how many instances each group contains.

Quantum and classical solvers
In our benchmark, we solve the generated instances with a selection of model and solver combinations with the main goal to assess the performance of quantum and quantum-inspired technology.Specifically, we consider three solver candidates: 1. Gurobi: As a baseline, we use the branch and bound algorithm of Gurobi, which is a state-of-theart mathematical programming solver running on classical hardware (Gurobi Optimization, LLC 2023).In summary, it relies on an implicit enumeration that allows the original problem to be split into smaller sub-problems using a decision tree.The use of lower bounds derived from linear programming (LP) relaxations allows for a reduction of the search space.Gurobi is an all-purpose solver that can in principle solve the proposed optimization problems to a guaranteed optimality in a deterministic fashion (given sufficient time).In this work we utilized the cloud based service of Gurobi solver, which ran on a Intel(R) Xeon(R) Platinum 8275CL CPU (3.00 GHz with 8 physical cores).
2. D-Wave's hybrid Leap framework (LBQM): D-Wave provides cloud-based access to their adiabatic quantum computers with over 5000 qubits (D-Wave Systems Inc. 2020).By design, their hardware is specifically tailored to solve QUBOs.To this end, the QUBO is encoded in an Hamiltonian such that each optimization variable is represented by one qubit (Zbinden et al. 2020) and the ground state corresponds to the optimal solution.The quantum annealing mechanism aims to find the ground state by performing a suitable time evolution of the quantum system with a subsequent measurement of all qubits to reveal the optimal solution.The D-Wave hardware has only limited connectivity, which means that each qubit can only interact with a certain number of other qubits.This limitation restricts the correlations between optimization variables that can be represented by the Hamiltonian.Finding a suitable representation with these constraints is an NP-hard problem (Lobe, Lutz 2022) that has to be solved classically to configure the quantum annealer for a certain problem.In practice, the quantum annealer can typically only be used for QUBOs with much less than 5000 optimization variables.
For this reason, D-Wave also provides a hybrid software framework LBQM, which is a black-box algorithm for binary quadratic models (BQMs) that runs on both classical and quantum annealing hardware.It allows larger optimization problems that are too big for the quantum hardware to be handled by presenting only parts of the original problem to the quantum annealer.However, the exact mode of operation of LBQM is not publicly available.In this study, we use only the quantum annealer in a hybrid fashion via LBQM.The quantum machine used in the hybrid framework is the D-Wave Advantage System 4.1 and the region na-west-1.We choose to use a constant number of 1000 samples (or readouts) for all evaluations and use default settings for all parameters.
3. Fujitsu's digital annealer (FDA) and Fujitsu's digital annealer hybrid framework (FDAh): The digital annealer from Fujitsu can be considered as a quantum-inspired algorithm that runs on dedicated (classical) hardware (Aramon et al. 2019) and can be accessed using a cloud service.It is based on simulated annealing (Kirkpatrick, Gelatt, Vecchi 1983;Černý 1985) with two major differences.Firstly, the utilization of an efficient parallel-trial scheme to exploit the parallelization capabilities of the hardware and, secondly, a dynamic escape mechanism to avoid locally optimal solutions.The detailed hardware specifications are confidential.The solver supports QUBOs with up to 8192 variables.
In addition, the hybrid solver FDAh is provided to solver bigger problem instances by utilizing both dedicated and classical hardware (Nakayama et al. 2021) similar to D-Wave's LBQM.In this study, we use both FDA and FDAh.Both solvers require a set of parameters that specify how the annealing is done, which also include the number of repetitions and parallel runs on the chip.
The specific parameters we used for FDA and FDAh are provided in Section S3.
In a small pre-study, we excluded a few other solvers; see Section S4.The main scope of the paper is to benchmark the performance of quantum-hybrid and quantum-inspired technologies on the TRSP on a high level against an all-purpose solver with an out-of-the-box performance.In this sense, we also exclude meta-heuristics that are tailor-made to the problem as well.
Each instance can be modelled with each of the three modeling approaches from Section 2. However, not all solvers are applicable to all problem formulations and all instances.The MIP sequence model is solved with Gurobi for all instances.The time-indexed model is solved with Gurobi only for the minor instances.The QUBO model is solved with LBQM and FDA for minor instances.For major instances, the QUBO model is only solved with FDAh.
We call each valid model and solver combination an approach and use a unique name to refer to it.Summarized, we consider Gurobi with the sequence model (SE-GU), Gurobi with the time-indexed model (TI-GU), LBQM with the QUBO model (QU-LBQM), FDA with the QUBO model (QU-FDA) and FDAh with the QUBO model (QU-FDAh).An overview over all approaches is shown in Fig. 2.
For all problems, we prescribe a runtime limit of 3600 seconds for Gurobi.This limit was determined on a heuristic basis, since initial experiments have shown that Gurobi can solve the considered problem instances on this time scale with a practically relevant quality.This time limit exceeds the runtimes of LBQM, FDA and FDAh by far to provide Gurobi enough time to return solutions that are suitable for a relative comparison (see Fig. 4).
Both LBQM and FDA also require a time limit for each run, which scales with the problem size in the QUBO formulation as follows.The time limit for LBQM is set to be min{100, 1.5 • n 100 } seconds, where n is the number of variables in the QUBO formulation for the minor instances.The runtime of the digital annealer is implicitly set with the steps parameter, where each step taken in the annealing process takes a constant amount of time.We set the number of steps to be 10 7 for the instances with 2071 ≤ n ≤ 4096, 5 • 10 7 for the ones with 4096 < n ≤ 6000 and 10 8 for the instances with 6000 < n ≤ 8080 variables in the QUBO formulation.Lastly the major instances computed with the hybrid framework FDAh based on the digital annealer require a time limit as well.For this we distributed the available time of 5 hours to the instances, correspondingly to their number of variables.This computes approximatively as n • 0.0117 seconds where n is the number of variables in the QUBO formulation.
The benchmark setup is summarized in Table 1, where we recall the approaches from Fig. 2. The table also contains the values of the QUBO parameters ρ0, . . ., ρ7 from Eq. (S16) that were chosen for LBQM, FDA and FDAh, respectively.The choice was made according to previous experiments with smaller problem instances.For this purpose, a typical strategy is to iteratively increase the parameter ρi if the corresponding penalty term Pi is non-vanishing.Additionally, one needs to make sure that the parameter ρ0 for the target function is set such that it is not in favor to violate penalty terms and a good optimization is achieved.Some solutions of the library of minor instances have not been solved to feasibility by LBQM, i. e., the solution vector returned does not translate to a feasible schedule of the TRSP.Those instances can be identified by having an objective value of at least 10 4 , which is the minimum of the penalty parameters chosen for the QUBO model according to Table 1.This can be seen as follows: the parameters of the library of minor instances are bounded as N ≤ 9, K ≤ 4, p3,j ≤ 3, p1,j ≤ 8, p2,j ≤ 4, g1,j ≤ 5, g2,j ≤ 12 and g3,j ≤ 24 for j = 1, . . ., N .Using those upper bounds we compute a maximal time horizon of T = 648 time units for those instances.It follows that the sum of sample completion times is bounded above by 9 • 648 = 5832 < 10 4 , i. e., a solution to an instance of the library of minor instances is feasible if and only if it has an objective value below 10 4 .Of course this does neither apply to the library of major instances nor to the solutions of FDA or FDAh as they have lower penalty parameters due to prestudies with the smallest instances.In a general setup a way to identify infeasible solutions is to store the penalty term 7 i=1 Pi(x) and evaluate the solution with it.The solution is feasible in this case if and only if the penalty term evaluates to 0 on it.
Table 1: Benchmark setup: Summary of problem instances from Section 3.1 and solvers from Section 3.2 for the optimization problems (or models) from Section 2.

Property
Minor Approach (cf.Fig. 2) Used for minor instances Used for major instances Approach (cf.Fig. 2) Minor instance limit Major instance limit SE-GU 3600 s 3600 s TI-GU 3600 s -QU-LBQM min{100, 1.5 4 Benchmark Results In the current section, we present the results of our previously described benchmark, which is summarized in Table 1.For this purpose, we first show the results for the minor instances and subsequently the results for the major instances.

Results for Minor Instances
In Fig. 3, we show the objective values and runtimes of several approaches as scatter plots.All runtimes are end-to-end runtimes, that is, we consider the entire evaluation pipeline, beginning with the submission of the problem to the solver and ending with the return of a solution, including potential network delays.The programmatic construction of the optimization problem for the application programming interface (API) of the solver based on the instance data is not part of the runtime.From Fig. 3a, we can observe that both the SE-GU and TI-GU solutions reach a better objective value than the solutions from QU-LBQM and QU-FDA.When comparing objective values, it has to be taken run and look for the earliest computation time at which Gurobi has reached an objective value that is less than or equal to the corresponding objective value returned by the competing solvers for the same instance.We call this earliest computation time the relative runtime.Specifically, we consider the relative runtime of TI-GU w.r.t.QU-LBQM (TI-GU@QU-LBQM), the relative runtime of SE-GU w.r.t.QU-LBQM (SE-GU@QU-LBQM), the relative runtime of TI-GU w.r.t.QU-FDA (TI-GU@QU-FDA) and the relative runtime of SE-GU w.r.t.QU-FDA (SE-GU@QU-FDA).In the special case that Gurobi is not able to find an objective value of the desired quality within its limit of 3600 seconds (which only occurs for some major instances), this time limit is used in place of the earliest computation time.Exemplarily, we consider a specific instance to visualize TI-GU@QU-LBQM and TI-GU@QU-FDA in Fig. 4.

Objective values
Runtimes / s Figure 4: Visualization of the relative runtime of TI-GU w.r.t.QU-LBQM and QU-FDA, denoted by TI-GU@QU-LBQM and TI-GU@QU-FDA, respectively.Here, consider the example instance (7, 4, 3)(3); see supplementary material.The orange dots (connected by lines for better visualization) mark the resulting objective values of TI-GU at the corresponding time steps.The horizontal upper, blue and lower, green line mark the final objective value of QU-FDA and QU-LBQM, respectively, on the same instance.The blue and green lines intersect with the orange lines at some point.The time coordinate of the next lower TI-GU objective value after this intersection represents the relative runtime of TI-GU w.r.t. the solver, which is marked as a vertical line in the corresponding color.In other words, the relative runtime represents how long TI-GU has to run until it reaches an objective value that is at least as good as the result from QU-LBQM or QU-FDA, respectively.
The results of this analysis are shown in Fig. 5.This plot shows that QU-LBQM is not able to compete with SE-GU.All problems from the first 4 out of 9 instance groups have been solved with SE-GU in under 1 second while the remaining instances in less than 10 seconds, whereas the QU-LBQM runtimes range between 50 and 100 seconds.However, LBQM finds a comparable solution faster than TI-GU for most problems with 6 or more samples and remains competitive for smaller problems.A Welch-t test confirms that the mean of TI-GU runtime is larger than the one of QU-LBQM runtime with a significance over 99%.
Furthermore, Fig. 5b shows that QU-FDA is outperformed by SE-GU as well.Analogous to Fig. 5a, the instances in groups (4, 4), (5, 3), (5, 4) and (6, 3) have been solved by SE-GU in 1 second or less.But in contrast to Fig. 5a, the other groups have their median between 1 second and 10 seconds, i. e., which reflects that the target objectives from QU-FDA are lower than those from QU-LBQM (see Fig. 3a).Nonetheless, the time taken for SE-GU to reach the solution quality of QU-FDA is 10 to 100 times smaller.Regarding TI-GU, QU-FDA finds a comparable solution almost always faster with a few exceptions.

Results for Major Instances
The results for major instances are presented in analogy to the results for minor instances from the previous section.In Fig. 6, we show the runtime and the target value of the solvers on the corresponding models as scatter plots.The objective values of QU-FDAh are worse than the ones of SE-GU with a significance of over 97%, but Fig. 6b shows that the runtime of SE-GU increases strictly until it reaches the upper bound for the computation time of 3600 seconds, which happens for ca.15 samples.On the other hand, the computation time of QU-FDAh ranges between 120 and 300 seconds, where only a slight increase can be seen.
Analogously to Fig. 5b, we evaluate the earliest computation times of SE-GU model to reach objective values equal to or lower than the objective values obtained from QU-FDAh, denoted by the relative runtime of SE-GU w.r.t.QU-FDAh (SE-GU@QU-FDAh).The results are shown in Fig. 7.
In Fig. 7, a strictly increasing computation time can be seen for SE-GU, whereas the QU-FDAh runtime remains almost constant.For the biggest instances with N = 20 samples, QU-FDAh has a clear advantage with respect to the computation time, whereas it is competitive to SE-GU for the instances with 15 samples.In this sense QU-FDAh finds a solution of comparable quality much faster for problems with 20 samples than SE-GU and the latter was not able to prove optimality for some of the instances with 20 samples.A Welch t-test confirms with a significance of over 99% that the QU-FDAh mean is lower than the SE-GU@QU-FDAh mean.

Conclusion and Outlook
This paper presents a thorough benchmarking of an industrially relevant use case of combinatorial optimization, the transport robot scheduling problem (TRSP) with the goal to achieve a time-optimal robot schedule, as motivated by a BASF high-throughput laboratory.We solve a large set of instances for this optimization problem with varying difficulty using three commercially available solvers: (i) the D-Wave's hybrid Leap framework, (ii) the quantum-inspired Fujitsu digital annealer and (iii) the classical state-of-the-art solver Gurobi.To this end, we develop several mathematical models: a quadratic unconstrained binary optimization (QUBO) model for the quantum and digital annealer and two different mixed integer program (MIP) models for Gurobi, which we call time-indexed and sequence model, respectively.Modeling the same problem in different, solver-specific forms helps us to optimally assess the capabilities of each solver.In total, we compare five different approaches (i.e., model and solver combinations as sketched in Fig. 2): (i) Gurobi with the time-indexed model (TI-GU), (ii) Gurobi with the sequence model (SE-GU), (iii) D-Wave's hybrid Leap framework (LBQM) with the QUBO model (QU-LBQM), (iv) Fujitsu's digital annealer (FDA) with the QUBO model (QU-FDA) and (v) Fujitsu's digital annealer hybrid framework (FDAh) with the QUBO model (QU-FDAh).For our performance study, we separated all problem instances into two groups.First, the minor instances with problems less than 10 000 binary variables in the QUBO formulation and, second, the major instances with problems with more than 10 000 and up to 22 000 variables.For practical reasons, we only solve the minor instances with SE-GU, TI-GU, QU-LBQM and QU-FDA, whereas the major instances are only solved with SE-GU and QU-FDAh, respectively.Our benchmark reveals insights both regarding the objective values of the optimization problem (i.e., the sum of sample completion times) as well as the end-to-end runtimes for the considered approaches.Regarding the objective values, we observe for minor instances that SE-GU and TI-GU give similar results, outperforming QU-FDA, which in turn outperforms QU-LBQM (cf.Fig. 3a).For major instances, SE-GU outperforms QU-FDAh (cf.Fig. 6a).Regarding the runtime, we find that for smaller instances TI-GU takes the highest time and SE-GU takes mostly the lowest.Between these two extremes, QU-FDA and QU-LBQM take about the same amount of time (cf.Fig. 3b).However, the runtime of SE-GU significantly increases with increasing instance complexity.This same observation continues for the large instances, for which the runtime of SE-GU is mostly larger than that of QU-FDAh (cf.Fig. 6b).
To get further insights into the relationship between objective value and runtime, we also studied the relative runtime of Gurobi, that is the time that Gurobi took to find an objective value that is at least as good as the final result from another approach.For minor instances, we find that the relative runtimes of SE-GU w.r.t.QU-LBQM and QU-FDA, respectively, are strictly lower than the runtimes of QU-LBQM and QU-FDA, i. e., Gurobi found solutions of comparable quality faster than the quantum and quantum-inspired approaches (cf.Figs.5a and 5b).This is not surprising since SE-GU tended to find better objectives in shorter time.For major instances, the relative runtimes of SE-GU w.r.t.QU-FDAh increase significantly with increasing instance complexity and clearly exceed the runtime of QU-FDAh for the biggest instances (cf.Fig. 7).Thus, QU-FDAh shows an advantage on some bigger instances.Although the resulting objective values of QU-FDAh were not optimal, the approach shows a clear advantage on some bigger instances when compared to SE-GU on a similar time scale.
Our benchmark spans instances of different scales and therefore allows qualitative estimation of the scaling behavior of different approaches.Specifically, we observe that TI-GU and SE-GU show a runtime that scales exponentially with the instance complexity (as estimated by the number of samples and photos), whereas the runtime of QU-LBQM, QU-FDA and QU-FDAh remains almost constant.The quality of the solutions is not significantly determined by the instance complexity.Further research is needed to investigate and quantify these observations in more detail.
Summarized, no general advantage of the quantum and quantum-inspired solvers was found.However, for certain instances the quantum-inspired hybrid usage of the Fujitsu digital annealer turned out to be a very promising alternative to Gurobi and was clearly superior to the usage of D-Wave's hybrid Leap framework.Our study is not a conclusive result but rather an application-oriented case study that provides a snapshot of the current technology and leaves room for performance improvements on the modeling as well as the solver side.For example, an improvement of the quantum annealer inside the hybrid framework might be possible with additional problem-specific fine-tuning of the annealing schedule or other hardware-related parameters.Moreover, the recently released constrained quadratic model (CQM) solver from D-Wave also promises to provide much better performance compared to the solver used in this work.Especially in an agile field such as quantum computing, a technology snapshot such as ours can hardly provide any forecasts about future developments.Therefore, in order to preserve an up-to-date assessment, further practical evaluations for real-world use cases will be necessary.The methods and results from this project can serve as a blueprint or at least point of reference for this kind of ongoing research.

Figure 1 :
Figure 1: An example Gantt chart of a robot transport scheduling problem with N = 2 samples and K = 2 photos.Tasks associated with sample one (two) are colored blue (red).When a sample is processed on one of the machines or carried by the robot in the time-frame [t, t ′ ], a bar is drawn from t to t ′ in the respective row in a corresponding color.Empty movements of the robot are not drawn explicitly.For example, at time t = 13 the robot is at the rack as sample 1 has been brought to the sample rack from t = 12 to 13.It takes one unit of time for the robot to travel from the rack to the water mixer to pick up sample 2 at t = 14.From t = 22 to t = 23, the sample is brought from the photo booth to the rack and back, which is a consequence of the assumption that a sample has to be moved by the robot in-between two processing steps.The objective value of the depicted schedule is 19 + 26 = 45.

Figure 2 :
Figure 2: Summary of model (see Section 2.2) and solver (see Section 3.2) combinations for the benchmarks.

Figure 6 :
Figure 6: Benchmark results for major instances as scatter plots.The results are grouped into sets of instances (N, K) as for previous the plots.Abbreviations according to Fig. 2.

Figure 7 :
Figure7: Benchmark results for major instances as scatter plots.We show the relative runtime of SE-GU w.r.t.QU-FDAh, denoted by SE-GU@QU-FDAh, in analogy to Fig.5.We also show the runtime of QU-FDAh from Fig.6b.The results are grouped into sets of instances (N, K) as for previous plots.Abbreviations according to Fig.2.