Automating Wavefront Parallelization for Sparse Matrix Computations
| dc.contributor.author | Venkat, Anand | |
| dc.contributor.author | Mohammadi, Mahdi Soltan | |
| dc.contributor.author | Park, Jongsoo | |
| dc.contributor.author | Rong, Hongbo | |
| dc.contributor.author | Barik, Rajkishore | |
| dc.contributor.author | Strout, Michelle Mills | |
| dc.contributor.author | Hall, Mary | |
| dc.date.accessioned | 2018-12-07T22:17:52Z | |
| dc.date.available | 2018-12-07T22:17:52Z | |
| dc.date.issued | 2016 | |
| dc.identifier.citation | A. Venkat et al., "Automating Wavefront Parallelization for Sparse Matrix Computations," SC '16: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Salt Lake City, UT, 2016, pp. 480-491. doi: 10.1109/SC.2016.40 | en_US |
| dc.identifier.issn | 978-1-4673-8815-3 | |
| dc.identifier.doi | 10.1109/SC.2016.40 | |
| dc.identifier.uri | http://hdl.handle.net/10150/631128 | |
| dc.description.abstract | This paper presents a compiler and runtime framework for parallelizing sparse matrix computations that have loop-carried dependences. Our approach automatically generates a runtime inspector to collect data dependence information and achieves wavefront parallelization of the computation, where iterations within a wavefront execute in parallel, and synchronization is required across wavefronts. A key contribution of this paper involves dependence simplification, which reduces the time and space overhead of the inspector. This is implemented within a polyhedral compiler framework, extended for sparse matrix codes. Results demonstrate the feasibility of using automatically-generated inspectors and executors to optimize ILU factorization and symmetric Gauss-Seidel relaxations, which are part of the Preconditioned Conjugate Gradient (PCG) computation. Our implementation achieves a median speedup of 2.97x on 12 cores over the reference sequential PCG implementation, significantly outperforms PCG parallelized using Intel's Math Kernel Library (MKL), and is within 6% of the median performance of manually-parallelized PCG. | en_US |
| dc.description.sponsorship | Scientific Discovery through Advanced Computing (SciDAC) program - U.S. Department of Energy Office of Advanced Scientific Computing Research [DE-SC0006947]; NSF [CNS-1302663, CCF-1564074] | en_US |
| dc.language.iso | en | en_US |
| dc.publisher | IEEE | en_US |
| dc.relation.url | http://ieeexplore.ieee.org/document/7877119/ | en_US |
| dc.rights | Copyright © 2016, IEEE. | en_US |
| dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | |
| dc.title | Automating Wavefront Parallelization for Sparse Matrix Computations | en_US |
| dc.type | Article | en_US |
| dc.contributor.department | Univ Arizona, Dept Comp Sci | en_US |
| dc.identifier.journal | SC '16: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS | en_US |
| dc.description.collectioninformation | This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu. | en_US |
| dc.eprint.version | Final accepted manuscript | en_US |
| dc.source.beginpage | 480 | |
| dc.source.endpage | 491 | |
| refterms.dateFOA | 2018-12-07T22:17:53Z | |
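
The inspector/executor idea described in the abstract can be illustrated with a small sketch. This is not the paper's generated code: it uses a sparse lower-triangular solve in CSR form as a simpler stand-in for ILU factorization and Gauss-Seidel, and the names `inspect_wavefronts`, `execute_wavefronts`, and the example matrix are hypothetical. The inspector walks the sparsity pattern at runtime to group rows into wavefronts; the executor then processes one wavefront at a time, and rows inside a wavefront have no dependences on each other, so that inner loop is the one a parallel runtime could distribute across cores.

```python
def inspect_wavefronts(row_ptr, col_idx):
    """Inspector (illustrative): level of row i = 1 + max level of rows it reads."""
    n = len(row_ptr) - 1
    level = [0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            if j < i:  # row i reads x[j], which row j produces
                level[i] = max(level[i], level[j] + 1)
    # Group rows by level; each group is one wavefront.
    wavefronts = [[] for _ in range(max(level, default=-1) + 1)]
    for i, lv in enumerate(level):
        wavefronts[lv].append(i)
    return wavefronts


def execute_wavefronts(row_ptr, col_idx, vals, b, wavefronts):
    """Executor (illustrative): solve L x = b one wavefront at a time."""
    x = [0.0] * len(b)
    for wf in wavefronts:      # wavefronts run in order (synchronization point)
        for i in wf:           # rows in a wavefront are mutually independent
            s = b[i]
            diag = 1.0         # assumption: a missing diagonal means unit diagonal
            for k in range(row_ptr[i], row_ptr[i + 1]):
                j = col_idx[k]
                if j < i:
                    s -= vals[k] * x[j]
                elif j == i:
                    diag = vals[k]
            x[i] = s / diag
    return x


# Hypothetical 4x4 lower-triangular matrix in CSR form:
# [2 0 0 0]
# [0 4 0 0]
# [1 0 2 0]
# [0 1 1 1]
row_ptr = [0, 1, 2, 4, 7]
col_idx = [0, 1, 0, 2, 1, 2, 3]
vals = [2.0, 4.0, 1.0, 2.0, 1.0, 1.0, 1.0]
b = [2.0, 4.0, 5.0, 4.0]

wavefronts = inspect_wavefronts(row_ptr, col_idx)
x = execute_wavefronts(row_ptr, col_idx, vals, b, wavefronts)
```

In this sketch the inspector visits every nonzero once, so its cost grows with the number of stored entries. The abstract's dependence simplification targets that inspector overhead, which this example does not attempt to reproduce.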
