Traces synchronization in distributed networks

Éric Clément and Michel Dagenais

Article (2009)

Open Access document in PolyPublie and at official publisher
Published Version
Terms of Use: Creative Commons Attribution

Abstract

This article proposes a novel approach to synchronize, a posteriori, the detailed execution traces from several networked computers. It can be used to debug and investigate complex performance problems in systems where several computers exchange information. While the distributed system is under study, detailed execution traces are generated locally on each system using an efficient and accurate system-level tracer, LTTng. When tracing is finished, the individual traces are collected and analysed together. The messaging events in all the traces are then identified and correlated in order to estimate how the time offset between nodes evolves over time. The imprecision in the time offset computation, associated with asymmetric network delays and operating system latency in sending and receiving messages, is amortized through a linear least-squares fit over several messages covering a large time span. The resulting accuracy is such that clock offsets in a distributed system can be estimated to within the order of a microsecond, even with a relatively low volume of messages exchanged, while having a very low impact on system execution. This is sufficient to properly order the events traced on the individual computers in the distributed system.
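
The least-squares step described in the abstract can be illustrated with a minimal sketch. It is not the article's implementation: the function names (`pairwise_offset`, `fit_offset`), the round-trip offset formula, and the tuple layout are assumptions chosen for illustration. The sketch assumes each sample pairs a timestamp with an offset estimate derived from one message sent in each direction between two nodes A and B.

```python
def pairwise_offset(send_a, recv_b, send_b, recv_a):
    """Estimate the offset of clock B relative to clock A from one
    A->B message and one B->A message (hypothetical helper).

    send_a, recv_a are read on clock A; recv_b, send_b on clock B.
    With symmetric delays the network delay cancels out exactly.
    """
    forward = recv_b - send_a   # network delay + offset
    backward = recv_a - send_b  # network delay - offset
    return (forward - backward) / 2.0


def fit_offset(samples):
    """Fit offset(t) = drift * t + base by ordinary least squares.

    samples: sequence of (t, offset) pairs spread over the trace.
    Returns (drift, base).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_o = sum(o for _, o in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    cov = sum((t - mean_t) * (o - mean_o) for t, o in samples)
    drift = cov / var_t
    base = mean_o - drift * mean_t
    return drift, base


# Usage: synthetic samples lying on a known line (made-up values).
samples = [(t, 2e-6 * t + 0.5) for t in (0.0, 10.0, 20.0, 30.0)]
drift, base = fit_offset(samples)
```

In this sketch, an asymmetric delay shifts a single `pairwise_offset` estimate by half the asymmetry. Fitting one line through many such estimates spread over a long time span is what lets the individual errors average out.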

Subjects: 2700 Information technology > 2700 Information technology
2700 Information technology > 2702 Computer systems organization
Department: Department of Computer Engineering and Software Engineering
PolyPublie URL: https://publications.polymtl.ca/3654/
Journal Title: Journal of Computer Systems, Networks, and Communications (vol. 2009)
Publisher: Hindawi
DOI: 10.1155/2009/190579
Official URL: https://doi.org/10.1155/2009/190579
Date Deposited: 18 Jul 2019 14:41
Last Modified: 28 Sep 2024 05:56
Cite in APA 7: Clément, É., & Dagenais, M. (2009). Traces synchronization in distributed networks. Journal of Computer Systems, Networks, and Communications, 2009, 1-11. https://doi.org/10.1155/2009/190579
