dc.contributor.author: Neth, B.
dc.contributor.author: Scogland, T.R.W.
dc.contributor.author: de Supinski, B.R.
dc.contributor.author: Strout, M.M.
dc.date.accessioned: 2021-06-24T23:50:56Z
dc.date.available: 2021-06-24T23:50:56Z
dc.date.issued: 2021-06
dc.identifier.citation: Neth, B., Scogland, T. R. W., de Supinski, B. R., & Strout, M. M. (2021). Inter-loop optimizations in RAJA using loop chains. Proceedings of the International Conference on Supercomputing, 1–12. [en_US]
dc.identifier.uri: http://hdl.handle.net/10150/660339
dc.description.abstract: Typical parallelization approaches such as OpenMP and CUDA provide constructs for parallelizing and blocking for data locality for individual loops. By focusing on each loop separately, these approaches fail to leverage sources of data locality possible due to inter-loop data reuse. The loop chain abstraction provides a framework for reasoning about and applying inter-loop optimizations. In this work, we incorporate the loop chain abstraction into RAJA, a performance portability layer for high-performance computing applications. Using the loop-chain-extended RAJA, or RAJALC, developers can have the RAJA library apply loop transformations like loop fusion and overlapped tiling while maintaining the original structure of their programs. By introducing targeted symbolic evaluation capabilities, we can collect and cache data access information required to verify loop transformations. We evaluate the performance improvement and refactoring costs of our extension. Overall, our results demonstrate 85-98% of the performance improvements of hand-optimized kernels with dramatically fewer code changes. © 2021 Association for Computing Machinery. [en_US]
dc.language.iso: en [en_US]
dc.publisher: Association for Computing Machinery [en_US]
dc.rights: © 2021 Association for Computing Machinery. [en_US]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en_US]
dc.subject: C++ [en_US]
dc.subject: Data locality [en_US]
dc.subject: Loop chains [en_US]
dc.subject: Performance portability [en_US]
dc.subject: Polyhedral analysis [en_US]
dc.subject: RAJA [en_US]
dc.subject: Symbolic execution [en_US]
dc.title: Inter-loop optimizations in RAJA using loop chains [en_US]
dc.type: Article [en_US]
dc.contributor.department: University of Arizona [en_US]
dc.identifier.journal: Proceedings of the International Conference on Supercomputing [en_US]
dc.description.note: Immediate access [en_US]
dc.description.collectioninformation: This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu. [en_US]
dc.eprint.version: Final accepted manuscript [en_US]
refterms.dateFOA: 2021-06-24T23:50:57Z


Files in this item

Name: RAJALC.pdf
Size: 3.383Mb
Format: PDF
Description: Final Accepted Manuscript
