The curation process involved several key steps:

  1. Data Selection and Collection: The BIOMATDB Consortium conducted an internal survey to identify which content types (1) are currently used and consulted by the biomaterials research community and (2) contain information important for characterising relevant biomaterial features, characteristics, and aspects related to biocompatibility evidence and clinical application scenarios. Based on the survey results, PubMed (relevant for 70% of respondents) was identified as the most important source for the initial stages of the Biomaterial Database. The initial aim was therefore to use and process biomaterials-relevant information, including both abstracts and citation information, from the PubMed database of biomedical scientific research.
  2. Metadata Curation and Classification: The aim of this process was to select relevant PubMed abstracts related to the materials and biomaterials domains and classify them according to predefined labels, so that Natural Language Processing (NLP) tools could then be applied to extract valuable information (e.g., named entities, relationships). This required an extensive analysis of the available materials and existing resources, as well as their relevance and usefulness to the biomaterials field. Three types of classification are performed as part of this data curation process: (A) Metadata Curation & Classification of data attached to PubMed records (i.e., MeSH and Substances), (B) Content Classification of PubMed Abstracts, and (C) Relation Classification of Text Semantic Annotation to Biomaterial Named Entity.
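
As a rough illustration of classification type (A), the sketch below selects records whose attached MeSH or Substances metadata matches a small vocabulary. It is a minimal sketch under stated assumptions, not the Consortium's pipeline: the `PubMedRecord` structure, the `BIOMATERIAL_TERMS` vocabulary, the function names, and the sample records are all hypothetical, and the predefined labels actually used are not listed in this article.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Hypothetical seed vocabulary (lower-cased MeSH-style headings).
# The real label set used for curation is not given in the article.
BIOMATERIAL_TERMS: Set[str] = {
    "biocompatible materials",
    "hydrogels",
    "tissue scaffolds",
}


@dataclass
class PubMedRecord:
    """Assumed minimal view of a PubMed record: abstract plus attached metadata."""
    pmid: str
    abstract: str
    mesh_terms: List[str] = field(default_factory=list)
    substances: List[str] = field(default_factory=list)


def classify_by_metadata(record: PubMedRecord,
                         vocabulary: Set[str] = BIOMATERIAL_TERMS) -> List[str]:
    """Return the vocabulary terms found in the record's MeSH or Substances fields."""
    attached = {term.lower() for term in record.mesh_terms + record.substances}
    return sorted(attached & vocabulary)


def select_relevant(records: List[PubMedRecord]) -> List[PubMedRecord]:
    """Keep only records with at least one matching metadata term."""
    return [r for r in records if classify_by_metadata(r)]


# Made-up sample records; the identifiers are placeholders, not real PMIDs.
records = [
    PubMedRecord(
        pmid="example-1",
        abstract="Chitosan hydrogel scaffolds for cartilage repair.",
        mesh_terms=["Hydrogels", "Chitosan"],
        substances=["Biocompatible Materials"],
    ),
    PubMedRecord(
        pmid="example-2",
        abstract="Blood pressure outcomes in a cohort study.",
        mesh_terms=["Hypertension"],
    ),
]

relevant = select_relevant(records)
```

Records that pass a metadata filter of this kind could then be handed to the abstract-level classification (B) and relation classification (C) steps, which would typically rely on trained NLP models rather than a fixed term list.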