GPU-based and streaming-enabled implementation of pre-processing flow towards enhancing optical character recognition accuracy and efficiency
dc.contributor.author | Gener, Serhan | |
dc.contributor.author | Dattilo, Parker | |
dc.contributor.author | Gajaria, Dhruv | |
dc.contributor.author | Fusco, Alexander | |
dc.contributor.author | Akoglu, Ali | |
dc.date.accessioned | 2023-11-28T02:34:23Z | |
dc.date.available | 2023-11-28T02:34:23Z | |
dc.date.issued | 2023-09-20 | |
dc.identifier.citation | Gener, S., Dattilo, P., Gajaria, D., Fusco, A., & Akoglu, A. (2023). GPU-based and streaming-enabled implementation of pre-processing flow towards enhancing optical character recognition accuracy and efficiency. Cluster Computing, 26(6), 3407-3419. | en_US |
dc.identifier.issn | 1386-7857 | |
dc.identifier.doi | 10.1007/s10586-023-04137-0 | |
dc.identifier.uri | http://hdl.handle.net/10150/670151 | |
dc.description.abstract | Research has demonstrated that digital images can be pre-processed through operations such as scaling, rotation, and blurring to enhance the accuracy of optical character recognition (OCR) by emphasizing important features within the image. Our study employed the open-source Tesseract OCR and found that accuracy can be improved through pre-processing techniques including thresholding, rotation, rescaling, erosion, dilation, and noise removal, based on a dataset of 560 phone screen images. However, our CPU-based implementation of this process resulted in an average latency of 48.32 ms per image, which can hinder the processing of millions of images using OCR. To address this challenge, we parallelized the pre-processing flow on the Nvidia P100 GPU and executed it through a streaming approach, which reduced the latency to 0.825 ms and achieved a speedup factor of 58.6x compared to the serial execution. This implementation enables the use of a GPU-based OCR engine to handle multiple sources of data streams with large-scale workloads. | en_US |
dc.description.sponsorship | National Science Foundation | en_US |
dc.language.iso | en | en_US |
dc.publisher | Springer Science and Business Media LLC | en_US |
dc.rights | © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature. | en_US |
dc.rights.uri | https://rightsstatements.org/vocab/InC/1.0/ | en_US |
dc.subject | CUDA | en_US |
dc.subject | GPU | en_US |
dc.subject | Image processing | en_US |
dc.subject | Leptonica | en_US |
dc.subject | Optical Character Recognition (OCR) | en_US |
dc.subject | Tesseract | en_US |
dc.title | GPU-based and streaming-enabled implementation of pre-processing flow towards enhancing optical character recognition accuracy and efficiency | en_US |
dc.type | Article | en_US |
dc.identifier.eissn | 1573-7543 | |
dc.contributor.department | Department of Bioethics and Medical Humanism, College of Medicine-Phoenix, University of Arizona | en_US |
dc.identifier.journal | Cluster Computing | en_US |
dc.description.note | 12 month embargo; first published 20 September 2023 | en_US |
dc.description.collectioninformation | This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu. | en_US |
dc.eprint.version | Final accepted manuscript | en_US |
dc.identifier.pii | 4137 | |
dc.source.journaltitle | Cluster Computing | |
dc.source.volume | 26 | |
dc.source.issue | 6 | |
dc.source.beginpage | 3407 | |
dc.source.endpage | 3419 |
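
Illustrative note (not part of the published record): the abstract names thresholding, erosion, and dilation among the pre-processing steps and reports a 58.6x speedup. The sketch below shows, in plain Python, what those three operations do on a tiny grayscale image and checks the speedup arithmetic from the two reported latencies. It is a minimal sketch under stated assumptions, not the authors' Leptonica/CUDA implementation: the function names, the threshold of 128, and the 3x3 window are all hypothetical choices made for illustration.

```python
def threshold(img, t=128):
    """Binarize a grayscale image: pixels >= t become 1, all others 0.
    The cutoff t=128 is an assumed value, not taken from the paper."""
    return [[1 if p >= t else 0 for p in row] for row in img]


def _morph(img, combine):
    """Apply combine (min or max) over each pixel's 3x3 neighbourhood.
    The window is clipped at the image borders (an assumed border policy)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(combine(window))
        out.append(row)
    return out


def erode(img):
    """A pixel stays 1 only if every pixel in its window is 1."""
    return _morph(img, min)


def dilate(img):
    """A pixel becomes 1 if any pixel in its window is 1."""
    return _morph(img, max)


def opening(img):
    """Erosion followed by dilation: removes specks smaller than the window."""
    return dilate(erode(img))


# 6x6 test image: a 3x3 bright block plus one isolated bright "noise" pixel.
gray = [[10] * 6 for _ in range(6)]
for y in range(1, 4):
    for x in range(1, 4):
        gray[y][x] = 200
gray[5][5] = 255

binary = threshold(gray)
cleaned = opening(binary)

# Speedup implied by the latencies reported in the abstract (ms per image).
speedup = 48.32 / 0.825
```

Each output pixel depends only on a small fixed neighbourhood of the input, which is why operations of this kind lend themselves to the per-pixel GPU parallelization the abstract describes.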