Easy Semantification of Bioassays

Biological data and knowledge bases increasingly rely on Semantic Web technologies and knowledge graphs for data integration, retrieval, and federated queries. We propose a solution for automatically semantifying biological assays. Our solution contrasts two formulations of automated semantification, labeling versus clustering, whose methods sit at opposite ends of the method complexity spectrum. Modeling the problem according to its characteristics, we find that the clustering solution significantly outperforms a state-of-the-art deep neural network labeling approach. This novel contribution rests on two findings: 1) a learning objective closely modeled after the data outperforms an alternative approach with sophisticated semantic modeling; 2) automatically semantifying biological assays achieves a high F1 score of nearly 83%, which to our knowledge is the first reported standardized evaluation of the task, offering a strong benchmark model.

Bibliographic Details
Main Authors: Anteghini, Marco; D’Souza, Jennifer; Martins dos Santos, Vitor A.P.; Auer, Sören
Format: Article in monograph or in proceedings
Language: English
Published: Springer
Subjects: Automatic semantification, Bioassays, Clustering, Labeling, Open Research Knowledge Graph, Open science graphs, Supervised learning, Unsupervised learning
Online Access: https://research.wur.nl/en/publications/easy-semantification-ofbioassays