Results: 6

(searched for: doi:10.1109/isca52012.2021.00036)
Andreas Abel, Jan Reineke
Proceedings of the 36th ACM International Conference on Supercomputing; https://doi.org/10.1145/3524059.3532396

Abstract:
Performance models that statically predict the steady-state throughput of basic blocks on particular microarchitectures, such as IACA, Ithemal, llvm-mca, OSACA, or CQA, can guide optimizing compilers and aid manual software optimization. However, their utility heavily depends on the accuracy of their predictions. The average error of existing models compared to measurements on the actual hardware has been shown to lie between 9% and 36%. But how good is this? To answer this question, we propose an extremely simple analytical throughput model that may serve as a baseline. Surprisingly, this model is already competitive with the state of the art, indicating that there is significant potential for improvement. To explore this potential, we develop a simulation-based throughput predictor. To this end, we propose a detailed parametric pipeline model that supports all Intel Core microarchitecture generations released between 2011 and 2021. We evaluate our predictor on an improved version of the BHive benchmark suite and show that its predictions are usually within 1% of measurement results, improving upon prior models by roughly an order of magnitude. The experimental evaluation also demonstrates that several microarchitectural details considered to be rather insignificant in previous work are in fact essential for accurate prediction. Our throughput predictor is available as open source.
Sunjay Cauligi, Craig Disselkoen, Daniel Moghimi, Gilles Barthe, Deian Stefan
2022 IEEE Symposium on Security and Privacy (SP) pp 666-680; https://doi.org/10.1109/sp46214.2022.9833707

Abstract:
Spectre vulnerabilities violate our fundamental assumptions about architectural abstractions, allowing attackers to steal sensitive data despite previously state-of-the-art countermeasures. To defend against Spectre, developers of verification tools and compiler-based mitigations are forced to reason about microarchitectural details such as speculative execution. In order to aid developers with these attacks in a principled way, the research community has sought formal foundations for speculative execution upon which to rebuild provable security guarantees. This paper systematizes the community’s current knowledge about software verification and mitigation for Spectre. We study state-of-the-art software defenses, both with and without associated formal models, and use a cohesive framework to compare the security properties each defense provides. We explore a wide variety of tradeoffs in the expressiveness of formal frameworks, the complexity of defense tools, and the resulting security guarantees. As a result of our analysis, we suggest practical choices for developers of analysis and mitigation tools, and we identify several open problems in this area to guide future work on grounded software defenses.
Shuwen Deng, Bowen Huang, Jakub Szefer
2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA) pp 53-66; https://doi.org/10.1109/hpca53966.2022.00013

Abstract:
This paper evaluates new security threats due to the processor frontend in modern Intel processors. The root causes of the security threats are the multiple paths in the processor frontend that the micro-operations can take: through the Micro-Instruction Translation Engine (MITE), through the Decode Stream Buffer (DSB), also called the Micro-operation Cache, or through the Loop Stream Detector (LSD). Each path has its own unique timing and power signatures, which lead to the side- and covert-channel attacks presented in this work. In particular, the switching between the different paths leads to observable timing or power differences which, as this work demonstrates, could be exploited by attackers. Because of the different paths, the switching, and the way the components are shared in the frontend between hardware threads, two separate threads are able to be mutually influenced and timing or power can reveal activity on the other thread. The security threats are not limited to multi-threading, and this work further demonstrates new ways for leaking execution information about SGX enclaves or a new in-domain Spectre variant in a single-thread setting. Finally, this work demonstrates a new method for fingerprinting the microcode patches of the processor by analyzing the behavior of different paths in the frontend. The findings of this work highlight the security threats associated with the processor frontend and the need for deployment of defenses for the modern processor frontend.
Mengming Li, Chenlu Miao, Yilong Yang, Kai Bu
2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA) pp 98-112; https://doi.org/10.1109/hpca53966.2022.00016

Abstract:
Speculative execution attacks exploiting speculative execution to leak secrets have aroused significant concerns in both industry and academia. They mainly exploit covert or side channels over microarchitectural states left by mis-speculated and squashed instructions (i.e., transient instructions). Most such attacks target cache states. Existing cache-based defenses against speculative execution attacks fall into two categories, Invisible and Undo. Most Invisible defenses buffer execution metadata of speculative instructions and place them into the cache only if the speculatively executed instructions become determined. Motivated by the fact that mis-speculations are rare cases, Undo defenses allow speculative instructions to modify cache states. Upon a mis-speculation, they roll back cache states to the ones prior to the execution of transient instructions. However, Invisible defenses have been recently found insecure by the speculative interference attack. This calls for a deep security inspection of Undo defenses against speculative execution attacks. In this paper, we present unXpec as the first attack against Undo-based safe speculation. It exploits the secret-dependent timing channel exhibited through the rollback operations of Undo defenses. Specifically, the rollback process requires both invalidating cache lines brought into the cache by transient instructions and restoring evicted cache lines from the cache by transiently loaded data. This opens up a channel that encodes secret via the timing difference between when rollback involves much invalidation and restoration or not. We further leverage eviction sets to enforce more restoration operations. This yields a longer rollback time and thus a larger secret-dependent timing difference. We demonstrate the timing channel over the open-source CleanupSpec, a representative Undo solution. A single transient load can trigger a secret-dependent timing difference of 22 cycles (without eviction sets) or 32 cycles (with eviction sets), which is sufficiently exploitable for constructing a covert channel for speculative execution attacks. We run unXpec on the gem5 simulator with CleanupSpec enabled. The results show that unXpec can leak secrets at a high rate of 140 Kbps with an accuracy over 90%. Simply enforcing constant-time rollback to mitigate unXpec may induce an over 70% performance overhead.
Joonsung Kim, Hamin Jang, Hunjun Lee, Seungho Lee, Jangwoo Kim
MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture; https://doi.org/10.1145/3466752.3480079

Abstract:
The modern x86 processor (e.g., Intel, AMD) translates CISC-style x86 instructions to RISC-style micro operations (uops) as RISC pipelines are more efficient than CISC pipelines. However, this x86 decoding process requires complex hardware logic (i.e., x86 decoder) to identify variable-length x86 instructions, which incurs high translation overhead. To avoid this overhead, the x86 processors adopt a micro-operation cache (uop cache) to bypass the expensive x86 decoder by caching the decoded uops. In this paper, we find that modern uop caches suffer from (1) security vulnerability and (2) severe cache contention between co-located SMT cores. To understand these security and performance implications of the uop cache, we propose UC-Check to extract various undisclosed features by using carefully designed microbenchmarks. With the extracted features, (1) we present two attack scenarios exploiting the uop cache as a new timing side-channel and propose a secure architecture to mitigate these attacks with negligible overhead. In addition, (2) we propose a logical uop cache allocation technique to alleviate the cache contention problem. For the evaluation, we extract many undocumented features on a wide spectrum of modern x86 processors and show that our proposed schemes (e.g., security attack/defense, performance optimization) are directly applicable to commodity x86 processors. For example, our logical uop cache allocation improves uop cache hit ratios by up to 1.33× and achieves up to 1.04× throughput improvement. We release all software artifacts (e.g., microbenchmarks used for feature extraction, attack proof-of-concept codes, logical uop cache allocation) to the community so that the users can easily reproduce our results and gain insights for further research.