Cyclebite: Extracting Task Graphs From Unstructured Compute-Programs
Name:
Cyclebite_Extracting.pdf
Size:
2.324Mb
Format:
PDF
Description:
Final Published Version
Affiliation
Electrical and Computer Engineering Department, University of Arizona
Issue Date
2023-10-30
Keywords
dynamic control flow graph
epoch
memory dependency analysis
Produce-consume task graph
task partitioning
Publisher
IEEE Computer Society
Citation
B. R. Willis, A. Shrivastava, J. Mack, S. Dave, C. Chakrabarti and J. Brunhaver, "Cyclebite: Extracting Task Graphs From Unstructured Compute-Programs," in IEEE Transactions on Computers, vol. 73, no. 1, pp. 221-234, Jan. 2024, doi: 10.1109/TC.2023.3327504.
Journal
IEEE Transactions on Computers
Rights
© 2023 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License.
Collection Information
This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu.
Abstract
Extracting portable performance in an application requires structuring that program into a data-flow graph of coarse-grained tasks (CGTs). Structuring applications that interconnect multiple external libraries and custom code (i.e., “Code From The Wild” (CFTW)) is challenging. When experts manually restructure a program, they trivialize the extraction of structure; however, this expertise is not broadly available. Automatic structuring approaches focus on the intersection of hot code and static loops, ignoring the data dependencies between tasks and significantly reducing the scope of analyzable programs. This work addresses the problem of extracting the data-flow graph of CGTs from CFTW. To that end, we present Cyclebite. Our approach extracts CGTs from unstructured compute-programs by detecting CGT candidates in the simplified Markov Control Graph (MCG), and localizing CGTs in an epoch profile. Additionally, the epoch profile extracts the data dependence between CGTs required to build the data-flow graph of CGTs. Cyclebite demonstrates a robust selectivity for critical CGTs relative to the state-of-the-art (SoA), leading to a potential speedup of 12x on average and thread-scaling of 24x on average compared to modern compiler optimizers. We validate the results of Cyclebite and compare them to two SoA techniques using an input corpus of 25 open-source C/C++ libraries with 2,019 unique execution profiles. © 2023 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Note
Open access article
ISSN
0018-9340
Version
Final Published Version
DOI
10.1109/TC.2023.3327504