Developing a vocabulary and ontology for modeling insect natural history data: example data, use cases, and competency questions
Author
Stucky, Brian
Balhoff, James
Barve, Narayani
Barve, Vijay
Brenskelle, Laura
Brush, Matthew
Dahlem, Gregory
Gilbert, James
Kawahara, Akito
Keller, Oliver
Lucky, Andrea
Mayhew, Peter
Plotkin, David
Seltmann, Katja
Talamas, Elijah
Vaidya, Gaurav
Walls, Ramona
Yoder, Matt
Zhang, Guanyang
Guralnick, Rob
Affiliation
Univ Arizona, Bio5
Univ Arizona, CyVerse
Issue Date
2019-03-13
Publisher
PENSOFT PUBL
Citation
Stucky B, Balhoff J, Barve N, Barve V, Brenskelle L, Brush M, Dahlem G, Gilbert J, Kawahara A, Keller O, Lucky A, Mayhew P, Plotkin D, Seltmann K, Talamas E, Vaidya G, Walls R, Yoder M, Zhang G, Guralnick R (2019) Developing a vocabulary and ontology for modeling insect natural history data: example data, use cases, and competency questions. Biodiversity Data Journal 7: e33303. https://doi.org/10.3897/BDJ.7.e33303
Journal
BIODIVERSITY DATA JOURNAL
Rights
© Stucky B et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0).
Collection Information
This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu.
Abstract
Insects are possibly the most taxonomically and ecologically diverse class of multicellular organisms on Earth. Consequently, they provide nearly unlimited opportunities to develop and test ecological and evolutionary hypotheses. Currently, however, large-scale studies of insect ecology, behavior, and trait evolution are impeded by the difficulty in obtaining and analyzing data derived from natural history observations of insects. These data are typically highly heterogeneous and widely scattered among many sources, which makes developing robust information systems to aggregate and disseminate them a significant challenge. As a step towards this goal, we report initial results of a new effort to develop a standardized vocabulary and ontology for insect natural history data. In particular, we describe a new database of representative insect natural history data derived from multiple sources (but focused on data from specimens in biological collections), an analysis of the abstract conceptual areas required for a comprehensive ontology of insect natural history data, and a database of use cases and competency questions to guide the development of data systems for insect natural history data. We also discuss data modeling and technology-related challenges that must be overcome to implement robust integration of insect natural history data.
Note
Open access journal
ISSN
1314-2828
1314-2836
DOI
10.3897/BDJ.7.e33303
10.3897/BDJ.7.e33303.figure1
10.3897/BDJ.7.e33303.suppl1
10.3897/BDJ.7.e33303.suppl2
10.3897/BDJ.7.e33303.suppl3
Version
Final published version
Sponsors
National Science Foundation Postdoctoral Research Fellowship in Biology [1612335]; University of Florida Informatics Institute fellowship; iDigBio workshop grant
Additional Links
https://bdj.pensoft.net/article/33303/
https://bdj.pensoft.net/article/33303/element/2/4993777/
https://bdj.pensoft.net/article/33303/element/5/4994016/
https://bdj.pensoft.net/article/33303/element/5/4994017/
https://bdj.pensoft.net/article/33303/element/5/4994468/