Loïc Prieur-Drevon, Raphaël Beamonte, Naser Ezzati-Jivan and Michel Dagenais
Paper (2016)
Open Access document in PolyPublie
Accepted Version. Terms of Use: All rights reserved.
Abstract
Behaviors of distributed systems with many cores and/or many threads are difficult to understand. This is why dynamic analysis tools such as tracers are useful to collect run-time data and help programmers debug and optimize complex programs. However, manual analysis of very large traces with billions of events can be a difficult problem, which automated trace visualizers and analyzers aim to solve. Trace analysis and visualization software needs fast access to data, which it cannot achieve by searching through the entire trace for every query. A number of solutions have adopted stateful analysis, which rearranges events into more query-friendly structures after a single pass through the trace. In this paper, we look into current implementations and model the behavior of previous work, the State History Tree (SHT), on traces with many thread creations and deletions. This allows us to identify which properties of the SHT are responsible for inefficient disk usage and high memory consumption. We then propose a more efficient data structure, the enhanced State History Tree (eSHT), to store and query computed states, in order to limit disk usage and reduce the query time for any state. Next, we compare the use of SHT and eSHT on traces with many attributes. Finally, we verify the scalability of our new data structure according to trace size. As shown by our results, the proposed solution makes near-optimal use of disk space, reduces the algorithm's memory usage logarithmically for previously problematic cases, and speeds up queries on traces with many attributes by an order of magnitude. The proposed solution builds upon our previous work, enabling it to easily scale up to traces containing a million threads.
Uncontrolled Keywords
state system, tracing, parallel, distributed, data structure, stateful analysis, history tree, multi-dimensional indexes
Subjects: | 2700 Information technology > 2705 Software and development; 2700 Information technology > 2713 Algorithms; 2700 Information technology > 2720 Computer systems software |
---|---|
Department: | Department of Computer Engineering and Software Engineering |
Funders: | CRSNG/NSERC |
Grant number: | CRDPJ468687-14 |
PolyPublie URL: | https://publications.polymtl.ca/2994/ |
Conference Title: | IEEE International Congress on Big Data (BigData Congress 2016) |
Conference Location: | San Francisco, CA, USA |
Conference Date(s): | 2016-06-27 - 2016-07-02 |
Publisher: | IEEE |
DOI: | 10.1109/bigdatacongress.2016.19 |
Official URL: | https://doi.org/10.1109/bigdatacongress.2016.19 |
Date Deposited: | 13 Feb 2018 12:51 |
Last Modified: | 27 Sep 2024 00:38 |
Cite in APA 7: | Prieur-Drevon, L., Beamonte, R., Ezzati-Jivan, N., & Dagenais, M. (2016, June). Enhanced state history tree (eSHT): a stateful data structure for analysis of highly parallel system traces [Paper]. IEEE International Congress on Big Data (BigData Congress 2016), San Francisco, CA, USA (8 pages). https://doi.org/10.1109/bigdatacongress.2016.19 |