Results: 19

(searched for: doi:10.1145/3352460.3358313)
Lingling Lao, Dan E. Browne
Proceedings of the 49th Annual International Symposium on Computer Architecture; https://doi.org/10.1145/3470496.3527394

Abstract:
Simulating quantum systems is one of the most important potential applications of quantum computers. The high-level circuit defining the simulation needs to be compiled into one that complies with hardware limitations such as qubit architecture (connectivity) and instruction (gate) set. General-purpose quantum compilers work at the gate level and have little knowledge of the mathematical properties of quantum applications, missing further optimization opportunities. Existing application-specific compilers only apply advanced optimizations in the scheduling procedure and are restricted to the CNOT or CZ gate set. In this work, we develop a compiler, named 2QAN, to optimize quantum circuits for 2-local qubit Hamiltonian simulation problems, a framework which includes the important quantum approximate optimization algorithm (QAOA). In particular, we exploit the flexibility of permuting different operators in the Hamiltonian (whether or not they commute) and propose permutation-aware techniques for qubit routing, gate optimization and scheduling to minimize compilation overhead. 2QAN can target different architectures and different instruction sets. Compilation results on four applications (up to 50 qubits) and three quantum computers (namely, Google Sycamore, IBMQ Montreal and Rigetti Aspen) show that 2QAN outperforms state-of-the-art general-purpose compilers and application-specific compilers. Specifically, 2QAN can reduce the number of inserted SWAP gates by 11.5X, reduce overhead in hardware gate count by 68.5X, and reduce overhead in circuit depth by 21X. Experimental results on the Montreal device demonstrate that benchmarks compiled by 2QAN achieve the highest fidelity.
Gokul Subramanian Ravi, Kaitlin N. Smith, Pranav Gokhale, Andrea Mari, Nathan Earnest, Ali Javadi-Abhari, Frederic T. Chong
2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA) pp 288-303; https://doi.org/10.1109/hpca53966.2022.00029

Abstract:
Variational Quantum Algorithms (VQA) are one of the most promising candidates for near-term quantum advantage. Traditionally, these algorithms are parameterized by rotational gate angles whose values are tuned over iterative execution on quantum machines. The iterative tuning of these gate angle parameters makes VQAs more robust to a quantum machine's noise profile. However, the effect of noise is still a significant detriment to VQA's target estimations on real quantum machines; results are far from ideal. Thus, it is imperative to employ effective error mitigation strategies to improve the fidelity of these quantum algorithms on near-term machines. While existing error mitigation techniques built from theory do provide substantial gains, the disconnect between theory and real machine execution characteristics limits the scope of these improvements. Thus, it is critical to optimize mitigation techniques to explicitly suit the target application as well as the noise characteristics of the target machine. We propose VAQEM, which dynamically tailors existing error mitigation techniques to the actual, dynamic noisy execution characteristics of VQAs on a target quantum machine. We do so by tuning specific features of these mitigation techniques, similar to the traditional rotation angle parameters, targeting improvements towards a specific objective function which represents the VQA problem at hand. In this paper, we target two types of error mitigation techniques that are suited to idle times in quantum circuits: single-qubit gate scheduling and the insertion of dynamical decoupling sequences. We gain substantial improvements to VQA objective measurements: a mean of over 3x across a variety of VQA applications, run on IBM Quantum machines. More importantly, while we study two specific error mitigation techniques, the proposed variational approach is general and can be extended to many other error mitigation techniques whose specific configurations are hard to select a priori. Integrating more mitigation techniques into the VAQEM framework in the future can lead to further formidable gains, potentially realizing practically useful VQA benefits on today's noisy quantum machines.
Poulami Das, Christopher A. Pattison, Srilatha Manne, Douglas M. Carmean, Krysta M. Svore, Moinuddin Qureshi, Nicolas Delfosse
2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA) pp 259-273; https://doi.org/10.1109/hpca53966.2022.00027

Abstract:
Quantum computers promise computational advantages for many important problems across various application domains. Unfortunately, physical quantum devices are highly susceptible to errors that limit us from running most of these quantum applications. Quantum Error Correction (QEC) codes are required to implement Fault-Tolerant Quantum Computers (FTQC) on which computations can be performed without encountering errors. Error decoding is a critical component of quantum error correction and is responsible for transforming a set of qubit measurements generated by the QEC code, called the syndrome, into error locations and error types. For the feasibility of implementation, error decoders must not only identify errors with high accuracy, but also be fast and scalable to a large number of qubits. Unfortunately, most prior work on error decoding has focused primarily on accuracy and has relied on software implementations that are too slow to be of practical use. Furthermore, these studies only look at designing a single decoder and do not analyze the challenges involved in scaling the storage and bandwidth requirements when performing error correction in large systems with thousands of qubits. In this paper, we present AFS, an accurate, fast, and scalable decoder architecture that is designed to operate in the context of systems with hundreds of logical qubits. We present the hardware implementation of AFS, which is based on the Union-Find decoding algorithm and employs a three-stage pipelined design. AFS provides orders of magnitude higher accuracy compared to recent SFQ-based hardware decoders (a logical error rate of 6×10⁻¹⁰ at a physical error rate of 10⁻³) and low decoding latency (42ns on average), while being robust to measurement errors introduced while extracting syndromes during the QEC cycles. We also reduce the amount of decoding hardware required to perform QEC simultaneously on all the logical qubits by co-designing the micro-architecture across multiple decoding units. Our proposed Conjoined-Decoder Architecture (CDA) reduces the storage overhead by 70% (10MB to 2.8MB). Finally, we reduce the bandwidth overhead required to transmit syndromes from the qubits to the decoders by exploiting the sparsity in the syndromes and compressing the data. Our proposed Syndrome Compression reduces the bandwidth requirement by 30x on average.
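The syndrome-sparsity idea behind the compression scheme can be illustrated with a toy encoder. The abstract does not give AFS's exact wire format; the index-based encoding and the 16-detector example below are illustrative assumptions only:

```python
def compress_syndrome(syndrome):
    """Encode a mostly-zero syndrome bit-vector as (length, fired indices)."""
    return len(syndrome), [i for i, bit in enumerate(syndrome) if bit]

def decompress_syndrome(n, indices):
    """Reconstruct the full bit-vector on the decoder side."""
    syndrome = [0] * n
    for i in indices:
        syndrome[i] = 1
    return syndrome

# A hypothetical 16-detector QEC round in which only two detectors fired:
# 16 bits shrink to just two indices on the wire.
round_bits = [0] * 16
round_bits[3] = round_bits[9] = 1
n, idx = compress_syndrome(round_bits)
```

Because syndromes at realistic physical error rates are overwhelmingly zeros, encoding only the fired positions is what makes the large bandwidth reduction plausible.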
Poulami Das, Aditya Locharla, Cody Jones
Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems; https://doi.org/10.1145/3503222.3507707

Abstract:
The error rates of quantum devices are orders of magnitude higher than what is needed to run most quantum applications. To close this gap, Quantum Error Correction (QEC) encodes logical qubits and distributes information using several physical qubits. By periodically executing a syndrome extraction circuit on the logical qubits, information about errors (called the syndrome) is extracted while running programs. A decoder uses these syndromes to identify and correct errors in real time, which is necessary to prevent accumulation of errors. Unfortunately, software decoders are slow, and hardware decoders are fast but less accurate. Thus, almost all QEC studies so far have relied on offline decoding. To enable real-time decoding in near-term QEC, we propose LILLIPUT, a Lightweight Low-Latency Look-Up Table decoder. LILLIPUT consists of two parts. First, it translates syndromes into error detection events that index into a Look-Up Table (LUT) whose entry provides the error information in real time. Second, it programs the LUTs with error assignments for all possible error events by running a software decoder offline. LILLIPUT tolerates an error on any operation in the quantum hardware, including gates and measurements, and the number of tolerated errors grows with the size of the code. LILLIPUT utilizes less than 7% of the logic on off-the-shelf FPGAs, enabling practical adoption, as FPGAs are already used to design the control and readout circuits in existing systems. LILLIPUT incurs a latency of a few nanoseconds and enables real-time decoding. We also propose Compressed LUTs (CLUTs) to reduce the memory required by LILLIPUT. By exploiting the fact that not all error events are equally likely and only storing data for the most probable error events, CLUTs reduce the memory needed by up to 107x (from 148 MB to 1.38 MB) without degrading the accuracy.
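The offline-program/online-lookup split described above can be sketched in a few lines. The `toy_decoder` and the three-event code are invented for illustration; a real LUT would be filled by a proper surface-code decoder:

```python
from itertools import product

def build_lut(num_events, decode_offline):
    """Offline step: precompute a correction for every possible
    detection-event pattern by calling a (slow) software decoder."""
    return {bits: decode_offline(bits)
            for bits in product([0, 1], repeat=num_events)}

def decode_online(lut, events):
    """Online step: real-time decoding is a single table lookup."""
    return lut[tuple(events)]

# Toy 'software decoder' (an assumption, not a real code):
# flag qubit i as erroneous for each fired event i.
toy_decoder = lambda bits: tuple(i for i, b in enumerate(bits) if b)
lut = build_lut(3, toy_decoder)
```

The table has 2^num_events entries, which is exactly why the paper's CLUT compression (storing only the most probable events) matters as codes grow.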
Swamit Tannu, Poulami Das, Ramin Ayanzadeh, Moinuddin Qureshi
Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems; https://doi.org/10.1145/3503222.3507703

Abstract:
Quantum computers with hundreds of qubits will be available soon. Unfortunately, high device error-rates pose a significant challenge in using these near-term quantum systems to power real-world applications. Executing a program on existing quantum systems generates both correct and incorrect outcomes, but often, the output distribution is too noisy to distinguish between them. In this paper, we show that erroneous outcomes are not arbitrary but exhibit a well-defined structure when represented in the Hamming space. Our experiments on IBM and Google quantum computers show that the most frequent erroneous outcomes are more likely to be close in the Hamming space to the correct outcome. We exploit this behavior to improve the ability to infer the correct outcome. We propose Hamming Reconstruction (HAMMER), a post-processing technique that leverages the observation of Hamming behavior to reconstruct the noisy output distribution, such that the resulting distribution has higher fidelity. We evaluate HAMMER using experimental data from Google and IBM quantum computers with more than 500 unique quantum circuits and obtain an average improvement of 1.37x in the quality of solution. On Google’s publicly available QAOA datasets, we show that HAMMER sharpens the gradients on the cost function landscape.
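A minimal sketch of the Hamming-neighborhood idea: outcomes whose distance-1 neighbors carry a lot of probability mass get boosted, since errors cluster near the correct answer in Hamming space. This is not the paper's exact reconstruction rule; the boost factor used here is an illustrative assumption:

```python
def hamming_neighbors(s):
    """All bitstrings at Hamming distance 1 from s."""
    return [s[:i] + ('1' if c == '0' else '0') + s[i + 1:]
            for i, c in enumerate(s)]

def reweight(counts):
    """Boost each outcome by the fraction of shots found in its
    distance-1 neighborhood, then renormalize to a distribution."""
    shots = sum(counts.values())
    boosted = {s: c * (1 + sum(counts.get(t, 0) for t in hamming_neighbors(s)) / shots)
               for s, c in counts.items()}
    total = sum(boosted.values())
    return {s: v / total for s, v in boosted.items()}

# Example: four noisy outcomes; '00' is both most frequent and best
# supported by its Hamming neighborhood.
dist = reweight({'00': 10, '01': 5, '10': 4, '11': 1})
```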
Xin-Chuan Wu, Marc Grau Davis, Frederic T. Chong, Costin Iancu
2021 International Conference on Rebooting Computing (ICRC) pp 35-46; https://doi.org/10.1109/icrc53822.2021.00016

Abstract:
The current phase of quantum computing is in the Noisy Intermediate-Scale Quantum (NISQ) era. On NISQ devices, two-qubit gates such as CNOTs are much noisier than single-qubit gates, so it is essential to minimize their count. Quantum circuit synthesis is a process of decomposing an arbitrary unitary into a sequence of quantum gates, and can be used as an optimization tool to produce shorter circuits to improve overall circuit fidelity. However, the time-to-solution of synthesis grows exponentially with the number of qubits. As a result, synthesis is intractable for circuits on a large qubit scale. In this paper, we propose a hierarchical, block-by-block optimization framework, QGo, for quantum circuit optimization. Our approach allows an exponential-cost optimization to scale to large circuits. QGo uses a combination of partitioning and synthesis: 1) partition the circuit into a sequence of independent circuit blocks; 2) re-generate and optimize each block using quantum synthesis; and 3) re-compose the final circuit by stitching all the blocks together. We perform our analysis and show the fidelity improvements in three different regimes: small-size circuits on real devices, medium-size circuits on noisy simulations, and large-size circuits on analytical models. Our technique can be applied after existing optimizations to achieve higher circuit fidelity. Using a set of NISQ benchmarks, we show that QGo can reduce the number of CNOT gates by 29.9% on average and up to 50% when compared with industrial compiler optimizations such as t|ket⟩. When executed on the IBM Athens system, shorter depth leads to higher circuit fidelity. We also demonstrate the scalability of QGo by optimizing circuits of 60+ qubits. Our technique is the first demonstration of successfully employing and scaling synthesis in the compilation toolchain for large circuits. Overall, our approach is robust for direct incorporation in production compiler toolchains to further improve circuit fidelity.
Published: 27 October 2021
npj Quantum Information, Volume 7, pp 1-11; https://doi.org/10.1038/s41534-021-00493-0

Abstract:
The variational quantum eigensolver is currently the flagship algorithm for solving electronic structure problems on near-term quantum computers. The algorithm involves implementing a sequence of parameterized gates on quantum hardware to generate a target quantum state, and then measuring the molecular energy. Due to finite coherence times and gate errors, the number of gates that can be implemented remains limited. In this work, we propose an alternative algorithm where device-level pulse shapes are variationally optimized for the state preparation rather than using an abstract-level quantum circuit. In doing so, the coherence time required for the state preparation is drastically reduced. We numerically demonstrate this by directly optimizing pulse shapes which accurately model the dissociation of H₂ and HeH⁺, and we compute the ground state energy for LiH with four transmons, where we see reductions in state preparation times of roughly three orders of magnitude compared to gate-based strategies.
Poulami Das, Swamit Tannu, Moinuddin Qureshi
MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture; https://doi.org/10.1145/3466752.3480044

Abstract:
Near-term quantum computers contain noisy devices, which makes it difficult to infer the correct answer even if a program is run for thousands of trials. On current machines, qubit measurements tend to be the most error-prone operations (with an average error-rate of 4%) and often limit the size of quantum programs that can be run reliably on these systems. As quantum programs create and manipulate correlated states, all the program qubits are measured in each trial and thus, the severity of measurement errors increases with the program size. The fidelity of quantum programs can be improved by reducing the number of measurement operations. We present JigSaw, a framework that reduces the impact of measurement errors by running a program in two modes. First, running the entire program and measuring all the qubits for half of the trials to produce a global (albeit noisy) histogram. Second, running additional copies of the program and measuring only a subset of qubits in each copy, for the remaining trials, to produce localized (higher fidelity) histograms over the measured qubits. JigSaw then employs a Bayesian post-processing step, whereby the histograms produced by the subset measurements are used to update the global histogram. Our evaluations using three different IBM quantum computers with 27 and 65 qubits show that JigSaw improves the success rate on average by 3.6x and up to 8.4x. Our analysis shows that the storage and time complexity of JigSaw scales linearly with the number of qubits and trials, making JigSaw applicable to programs with hundreds of qubits.
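The post-processing step can be sketched as rescaling each global outcome by how strongly the higher-fidelity local histogram favors its projection onto the measured subset. This is a simplified form; JigSaw's exact Bayesian update may differ, and the example numbers are assumptions:

```python
def marginal(hist, qubits):
    """Marginalize a bitstring histogram onto a subset of qubit positions."""
    m = {}
    for s, p in hist.items():
        key = ''.join(s[q] for q in qubits)
        m[key] = m.get(key, 0.0) + p
    return m

def jigsaw_update(global_hist, local_hist, qubits):
    """Rescale each global outcome by local_prob / global_marginal for its
    projection onto the measured subset, then renormalize."""
    g_marg = marginal(global_hist, qubits)
    updated = {}
    for s, p in global_hist.items():
        key = ''.join(s[q] for q in qubits)
        updated[s] = p * local_hist.get(key, 0.0) / g_marg[key]
    total = sum(updated.values())
    return {s: p / total for s, p in updated.items()}

# Hypothetical example: a cleaner subset measurement of qubit 0 reports
# '0' with probability 0.9; pull the noisy global histogram towards it.
global_hist = {'00': 0.4, '01': 0.3, '10': 0.2, '11': 0.1}
fixed = jigsaw_update(global_hist, {'0': 0.9, '1': 0.1}, [0])
```

After the update, the global histogram's marginal over the measured subset matches the trusted local histogram exactly, which is the intended fixed point of this kind of update.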
Tirthak Patel, Devesh Tiwari
Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems; https://doi.org/10.1145/3445814.3446743

Abstract:
Current Noisy Intermediate-Scale Quantum (NISQ) computers are useful for developing the quantum computing stack, testing quantum algorithms, and establishing the feasibility of quantum computing. However, various statistically significant errors permeate NISQ computers. To reduce the effect of these errors, recent research has focused on effectively mapping a quantum algorithm to a quantum computer in an error- and constraints-aware manner. We propose QRAFT, the first work to leverage the reversibility property of quantum algorithms to considerably reduce the error beyond the reduction achieved by effective circuit mapping.
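The core reversibility observation can be illustrated with a toy estimator: append a circuit's inverse so the ideal output is all zeros, then read the probability mass that lands elsewhere as an error proxy. How QRAFT actually exploits this signal is more involved; the counts below are hypothetical:

```python
def echo_error_estimate(counts, n_qubits):
    """Fraction of shots that did not return to |0...0> after running a
    circuit followed by its inverse; an estimate of accumulated error."""
    shots = sum(counts.values())
    return 1 - counts.get('0' * n_qubits, 0) / shots

# Hypothetical echo experiment: 90 of 100 shots return to '00'.
err = echo_error_estimate({'00': 90, '01': 6, '11': 4}, 2)
```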
Olivia Angiuli, Yongshan Ding, Kaiwen Gui, Teague Tomesh, Martin Suchara
IEEE Transactions on Quantum Engineering, Volume 1, pp 1-24; https://doi.org/10.1109/tqe.2020.3035814

Abstract:
Variational quantum eigensolver (VQE) is a promising algorithm for near-term quantum machines. It can be used to estimate the ground state energy of a molecule by performing separate measurements of O(N⁴) terms. This quartic scaling appears to be a significant obstacle to practical applications. However, we note that it empirically reduces to O(N³) when we partition the terms into linear-sized commuting families that can be measured simultaneously. We confirm these empirical observations by studying the MIN-COMMUTING-PARTITION problem at the level of the fermionic Hamiltonian and its encoding into qubits. Moreover, we provide a fast, precomputable procedure for creating linearly sized commuting partitions by solving a round-robin scheduling problem via flow networks. In addition, we demonstrate how to construct the quantum circuits necessary for simultaneous measurement, and we discuss the statistical implications of simultaneous measurement. Our results are experimentally validated by a ground state estimation of deuteron with a low shot budget on a 20-qubit IBM machine.
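A baseline version of the partitioning step groups Pauli strings into qubit-wise commuting families with a greedy first-fit heuristic. The paper's flow-network construction is more sophisticated; this sketch only illustrates the grouping criterion, and the four example terms are assumptions:

```python
def qubit_wise_commute(p, q):
    """Pauli strings qubit-wise commute if, on every qubit, the letters
    are equal or at least one of them is the identity 'I'."""
    return all(a == b or 'I' in (a, b) for a, b in zip(p, q))

def greedy_commuting_partition(paulis):
    """First-fit grouping of Pauli terms into qubit-wise commuting
    families; every family can be measured in one circuit."""
    families = []
    for p in paulis:
        for fam in families:
            if all(qubit_wise_commute(p, q) for q in fam):
                fam.append(p)
                break
        else:
            families.append([p])
    return families

# Four two-qubit Hamiltonian terms collapse into two measurement settings.
families = greedy_commuting_partition(['ZZ', 'ZI', 'IX', 'XX'])
```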
Tirthak Patel, Devesh Tiwari
SC20: International Conference for High Performance Computing, Networking, Storage and Analysis pp 1-16; https://doi.org/10.1109/sc41405.2020.00019

Abstract:
Noisy Intermediate-Scale Quantum (NISQ) machines are being increasingly used to develop quantum algorithms and establish use cases for quantum computing. However, these devices are highly error-prone and produce output that can be far from the correct output of the quantum algorithm. In this paper, we propose VERITAS, an end-to-end approach toward designing quantum experiments, executing experiments, and correcting the outputs produced by quantum circuits after execution, such that the correct output of the quantum algorithm can be accurately estimated.
Tirthak Patel, Abhay Potharaju, Baolin Li, Rohan Basu Roy, Devesh Tiwari
SC20: International Conference for High Performance Computing, Networking, Storage and Analysis pp 1-15; https://doi.org/10.1109/sc41405.2020.00050

Abstract:
Noisy Intermediate-Scale Quantum (NISQ) computers are being increasingly used for executing early-stage quantum programs to establish the practical realizability of existing quantum algorithms. These quantum programs have use cases in the realm of high-performance computing, ranging from molecular chemistry and physics simulations to addressing NP-complete optimization problems. However, NISQ devices are prone to multiple types of errors, which affect the fidelity and reproducibility of program execution. As the technology is still primitive, our understanding of these quantum machines and their error characteristics is limited. To bridge that understanding gap, this is the first work to provide a systematic and rich experimental evaluation of IBM Quantum Experience (QX) quantum computers of different scales and topologies. Our experimental evaluation uncovers multiple important and interesting aspects of benchmarking and evaluating quantum programs on NISQ machines. We have open-sourced our experimental framework and dataset to help accelerate the evaluation of quantum computing systems.
Ellis Wilson, Sudhakar Singh, Frank Mueller
2020 IEEE International Conference on Quantum Computing and Engineering (QCE) pp 345-355; https://doi.org/10.1109/qce49297.2020.00050

Abstract:
Running quantum programs is fraught with challenges on today's Noisy Intermediate-Scale Quantum (NISQ) devices. Many of these challenges originate from the error characteristics that stem from rapid decoherence and noise during measurement, qubit connections, crosstalk, the qubits themselves, and transformations of qubit state via gates. Not only are qubits not "created equal", but their noise level also changes over time. IBM is said to calibrate their quantum systems once per day and reports noise levels (errors) at the time of such calibration. This information is subsequently used to map circuits to higher-quality qubits and connections up to the next calibration point. This work provides evidence that there is room for improvement over this daily calibration cycle. It contributes a technique to measure noise levels (errors) related to qubits immediately before executing one or more sensitive circuits and shows that just-in-time noise measurements can benefit late physical qubit mappings. With this just-in-time recalibrated transpilation, the fidelity of results is improved over IBM's default mappings, which rely only on the daily calibrations. The framework assesses two major sources of noise, namely readout errors (measurement errors) and two-qubit gate/connection errors. Experiments indicate that the accuracy of circuit results improves by 3–304% on average and up to 400% with on-the-fly circuit mappings based on error measurements taken just prior to application execution.
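The just-in-time mapping decision can be caricatured as picking the qubit pair that minimizes a combined, freshly measured error score. The additive score and the example numbers are assumptions; the paper's transpiler integration is richer:

```python
def pick_best_pair(readout_err, cx_err):
    """Choose the connected qubit pair minimizing a simple additive score:
    two-qubit gate error plus both endpoints' readout errors."""
    return min(cx_err,
               key=lambda e: cx_err[e] + readout_err[e[0]] + readout_err[e[1]])

# Hypothetical just-in-time measurements on a 3-qubit line: qubit 0 has
# drifted to a high readout error since the daily calibration.
readout = {0: 0.05, 1: 0.01, 2: 0.02}
cx = {(0, 1): 0.03, (1, 2): 0.02}
best = pick_best_pair(readout, cx)
```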
Mahabubul Alam, Abdullah Ash-Saki, Swaroop Ghosh
2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) pp 215-228; https://doi.org/10.1109/micro50266.2020.00029

Abstract:
The quantum approximate optimization algorithm (QAOA) is a promising quantum-classical hybrid algorithm for solving hard combinatorial optimization problems. The multi-qubit CPHASE gates used in the quantum circuit for QAOA are commutative, i.e., the order of the gates can be altered without changing the output state. This re-ordering allows more gates to execute in parallel and reduces the number of additional SWAP gates needed to compile the QAOA circuit. Consequently, the circuit depth and cumulative gate count become lower, which is beneficial for circuit execution time and noise resilience. Fewer gates mean less accumulation of gate errors, and a reduced circuit depth means less decoherence time for the qubits. However, finding the best-ordered circuit is a difficult problem and does not scale well with circuit size. This paper presents four generic methodologies to optimize QAOA circuits by exploiting gate re-ordering. We demonstrate a reduction in gate count by ≈23.0% and circuit depth by ≈53.0% on average over a conventional approach without incurring any compilation-time penalty. We also present a variation-aware compilation which enhances the compiled circuit's success probability by ≈62.7% for the target hardware over the variation-unaware approach. A new metric, Approximation Ratio Gap (ARG), is proposed to validate the quality of the compiled QAOA circuit instances on actual devices. Hardware implementation of a number of QAOA instances shows ≈25.8% improvement in the proposed metric on average over the conventional approach on ibmq_16_melbourne.
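The parallelism gained from commuting CPHASE gates can be sketched with a greedy layering pass. This is a minimal illustration of the re-ordering freedom, not the paper's four methodologies; the 4-node ring example is an assumption:

```python
def layer_commuting_gates(gates):
    """Greedily pack commuting two-qubit gates into layers of
    qubit-disjoint gates that can execute in parallel."""
    remaining = list(gates)
    layers = []
    while remaining:
        busy, layer, leftover = set(), [], []
        for a, b in remaining:
            if a in busy or b in busy:
                leftover.append((a, b))  # qubit already used this layer
            else:
                layer.append((a, b))
                busy.update((a, b))
        layers.append(layer)
        remaining = leftover
    return layers

# The four ZZ interactions of a 4-node ring fit in two parallel layers
# instead of four sequential gates.
layers = layer_commuting_gates([(0, 1), (1, 2), (2, 3), (3, 0)])
```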
Pranav Gokhale, Ali Javadi-Abhari, Nathan Earnest, Yunong Shi, Frederic T. Chong
2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) pp 186-200; https://doi.org/10.1109/micro50266.2020.00027

Abstract:
Quantum computers are traditionally operated by programmers at the granularity of a gate-based instruction set. However, the actual device-level control of a quantum computer is performed via analog pulses. We introduce a compiler that exploits direct control at this microarchitectural level to achieve significant improvements for quantum programs. Unlike quantum optimal control, our approach is bootstrapped from existing gate calibrations and the resulting pulses are simple. Our techniques are applicable to any quantum computer and realizable on current devices. We validate our techniques with millions of experimental shots on IBM quantum computers, controlled via the OpenPulse control interface. For representative benchmarks, our pulse control techniques achieve both 1.6x lower error rates and 2x faster execution time, relative to standard gate-based compilation. These improvements are critical in the near-term era of quantum computing, which is bottlenecked by error rates and qubit lifetimes.
Yongshan Ding, Pranav Gokhale, Sophia Fuhui Lin, Richard Rines, Frederic T. Chong
2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) pp 201-214; https://doi.org/10.1109/micro50266.2020.00028

Abstract:
One of the key challenges in current Noisy Intermediate-Scale Quantum (NISQ) computers is to control a quantum system with high-fidelity quantum gates. There are many reasons a quantum gate can go wrong – for superconducting transmon qubits in particular, one major source of gate error is the unwanted crosstalk between neighboring qubits due to a phenomenon called frequency crowding. We motivate a systematic approach for understanding and mitigating the crosstalk noise when executing near-term quantum programs on superconducting NISQ computers. We present a general software solution to alleviate frequency crowding by systematically tuning qubit frequencies according to input programs, trading parallelism for higher gate fidelity when necessary. The net result is that our work dramatically improves the crosstalk resilience of tunable-qubit, fixed-coupler hardware, matching or surpassing other more complex architectural designs such as tunable-coupler systems. On NISQ benchmarks, we improve worst-case program success rate by 13.3x on average, compared to existing traditional serialization strategies.
Yung-Ching Fan, Devina Singh, Margaret R. Martonosi, Frederic T. Chong
Published: 22 April 2020
Quantum Science and Technology, Volume 5; https://doi.org/10.1088/2058-9565/ab8c2c

Abstract:
Quantum computing is a rapidly growing field with the potential to change how we solve previously intractable problems. Emerging hardware is approaching a complexity that requires increasingly sophisticated programming and control. Scaffold is an older quantum programming language that was originally designed for resource estimation for far-future, large quantum machines, and ScaffCC is the corresponding LLVM-based compiler. For the first time, we provide a full and complete overview of the language itself, the compiler, and its pass structure. While previous works (Abhari et al 2015 Parallel Comput. 45 2–17; Abhari et al 2012 Scaffold: quantum programming language https://cs.princeton.edu/research/techreps/TR-934-12) have piecemeal descriptions of different portions of this toolchain, this paper provides a more complete description. We also introduce updates to ScaffCC, including conditional measurement and multidimensional qubit arrays designed to keep in step with modern quantum assembly languages, as well as an alternate toolchain targeted at maintaining correctness and low resource counts for Noisy Intermediate-Scale Quantum (NISQ) machines, and compatibility with current versions of LLVM and Clang. Our goal is to provide the research community with a functional LLVM framework for quantum program analysis, optimization, and generation of executable code.
Junde Li, Mahabubul Alam, Abdullah Ash Saki, Swaroop Ghosh
2020 21st International Symposium on Quality Electronic Design (ISQED) pp 335-340; https://doi.org/10.1109/isqed48828.2020.9136973

Abstract:
Quantum Approximate Optimization Algorithm (QAOA) provides approximate solutions to combinatorial optimization problems. It encodes the cost function using a p-level quantum circuit where each level consists of a problem Hamiltonian followed by a mixing Hamiltonian. Despite the promises, few real-world applications (besides the pedagogical MaxCut problem) have exploited QAOA. The success of QAOA relies on the classical optimizer, the variational parameter setting, and quantum circuit design and compilation. In this study, we implement QAOA and analyze its performance for a broader Quadratic Unconstrained Binary Optimization (QUBO) formulation to solve real-world applications such as the partially occluded object detection problem. Furthermore, we analyze the effects of the above factors on QAOA performance. We propose a 3-level improvement of hybrid quantum-classical optimization for object detection. We achieve more than 13X execution speedup by choosing L-BFGS-B as the classical optimizer at the first level, 5.50X additional speedup by exploiting parameter symmetry, and more than 1.23X acceleration using parameter regression at the second level. We empirically show that the circuit achieves better fidelity by optimally rescheduling gate operations (especially for deeper circuits) at the third level.