The Craft and Coordination of Data Curation: Complicating Workflow Views of Data Science
Author
Thomer, Andrea K.
Akmon, Dharma
York, Jeremy J.
Tyler, Allison R. B.
Polasek, Faye
Lafia, Sara
Hemphill, Libby
Yakel, Elizabeth
Affiliation
University of Arizona
Issue Date
2022-11-11
Citation
Thomer, A. K., Akmon, D., York, J. J., Tyler, A. R. B., Polasek, F., Lafia, S., Hemphill, L., & Yakel, E. (2022). The Craft and Coordination of Data Curation: Complicating Workflow Views of Data Science. Proceedings of the ACM on Human-Computer Interaction, 6(CSCW2).
Rights
© 2022 Copyright held by the owner/author(s). This work is licensed under a Creative Commons Attribution-ShareAlike International 4.0 License.
Collection Information
This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu.
Abstract
Data curation is the process of making a dataset fit-for-use and archivable. It is critical to data-intensive science because it makes complex data pipelines possible, studies reproducible, and data reusable. Yet the complexities of the hands-on, technical, and intellectual work of data curation are frequently overlooked or downplayed. Obscuring the work of data curation not only renders the labor and contributions of data curators invisible but also hides the impact that curators' work has on the later usability, reliability, and reproducibility of data. To better understand the work and impact of data curation, we conducted a close examination of data curation at a large social science data repository, the Inter-university Consortium for Political and Social Research (ICPSR). We asked: What does curatorial work entail at ICPSR, and what work is more or less visible to different stakeholders and in different contexts? And how is that curatorial work coordinated across the organization? We triangulated accounts of data curation from interviews and records of curation in Jira tickets to develop a rich and detailed account of curatorial work. While we identified numerous curatorial actions performed by ICPSR curators, we also found that curators rely on a number of craft practices to perform their jobs. The reality of their work practices defies the rote sequence of events implied by many life cycle or workflow models. Further, we show that craft practices are needed to enact data curation best practices and standards. The craft that goes into data curation is often invisible to end users, but it is well recognized by ICPSR curators and their supervisors. Explicitly acknowledging and supporting data curators as craftspeople is important in creating sustainable and successful curatorial infrastructures.
Note
Open access article
EISSN
2573-0142
DOI
10.1145/3555139
Version
Final published version
Sponsors
Institute of Museum and Library Services