Abstract
242165
Introduction: There is a pressing need to standardize nuclear medicine terminology. The field is growing rapidly in nuclear oncology and theranostics, while artificial intelligence and large language models demand large multi-institutional datasets. Inconsistent terminology makes it difficult to pool datasets and can hinder efforts to reproduce or build upon research studies. Existing medical ontologies, such as RadLex, SNOMED-CT, and LOINC, aim to provide standardized terminologies, but their coverage of nuclear medicine concepts is incomplete. This work introduces NucLex, an initiative to create a controlled, publicly available ontology for standardizing nuclear medicine terms.
Methods: NucLex, an extension of the radiology ontology RadLex, was initiated by the SNMMI Artificial Intelligence Task Force. NucLex was developed in the Web Ontology Language (OWL) using Protégé software and the owlready2 Python library. To identify nuclear medicine terms to be added to NucLex, a data-driven approach based on natural language processing (NLP) was used. Noun phrases (i.e., "chunks") were mined from articles published in medical imaging journals over the last 5 years. Noun phrases that were mentioned with significantly higher frequencies in nuclear medicine-focused journals than in general radiology journals were automatically identified. These terms were then manually curated into a list, which was organized hierarchically using "is-a" associations to indicate semantic relationships between terms. NucLex was formatted so that it can be easily integrated into the RadLex ontology.
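The frequency comparison described in the Methods can be illustrated with a minimal sketch. The abstract does not name the statistical test that was used, so the log-likelihood (G2) keyness statistic below is an assumption chosen for illustration, as is the threshold of 10.83 (the chi-square critical value for p < 0.001 with one degree of freedom). The function names and the toy phrase counts are hypothetical and are not NucLex data.

```python
import math
from collections import Counter


def log_likelihood_ratio(k1, n1, k2, n2):
    """G2 keyness statistic for a phrase seen k1 times among n1 phrases in
    corpus 1 and k2 times among n2 phrases in corpus 2 (assumed test)."""
    pooled_rate = (k1 + k2) / (n1 + n2)
    expected1 = n1 * pooled_rate
    expected2 = n2 * pooled_rate

    def term(observed, expected):
        # A zero count contributes nothing (limit of x * log(x) as x -> 0).
        return observed * math.log(observed / expected) if observed > 0 else 0.0

    return 2.0 * (term(k1, expected1) + term(k2, expected2))


def enriched_phrases(nm_counts, rad_counts, threshold=10.83):
    """Return (phrase, G2) pairs over-represented in the nuclear medicine
    corpus, sorted by decreasing G2. The threshold is an assumption."""
    n1 = sum(nm_counts.values())
    n2 = sum(rad_counts.values())
    results = []
    for phrase, k1 in nm_counts.items():
        k2 = rad_counts.get(phrase, 0)
        # Keep only phrases with a higher relative frequency in corpus 1.
        if k1 / n1 <= k2 / n2:
            continue
        g2 = log_likelihood_ratio(k1, n1, k2, n2)
        if g2 >= threshold:
            results.append((phrase, g2))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Toy noun-phrase counts, invented for illustration only.
nm_counts = Counter({
    "standardized uptake value": 120,
    "radiotracer": 90,
    "contrast agent": 10,
    "patient": 300,
})
rad_counts = Counter({
    "standardized uptake value": 5,
    "radiotracer": 3,
    "contrast agent": 110,
    "patient": 310,
})

enriched = enriched_phrases(nm_counts, rad_counts)
for phrase, g2 in enriched:
    print(f"{phrase}: G2 = {g2:.1f}")
```

In this toy example, generic phrases such as "patient" are filtered out because they are not relatively more frequent in the nuclear medicine corpus. Candidates flagged this way would still require the manual curation step described above.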
Results: Over 1700 terms were initially identified using our NLP-based approach. Following manual review, the list was reduced to approximately 500 terms after removing duplicates (e.g., "SUV" and "standardized uptake value") and irrelevant terms (e.g., "animal studies"). Of these, approximately 300 terms have so far been organized hierarchically in OWL. All NucLex terms have been cross-searched in the RadLex and Unified Medical Language System (UMLS) ontologies, and corresponding identifiers are provided when available. NucLex is accessible at.
Conclusions: Using a data-driven approach, we have identified nuclear medicine terms for inclusion in NucLex. The ongoing development and expansion of NucLex hold significant potential for improving data sharing, enhancing research reproducibility, and advancing nuclear medicine by establishing consistent and well-defined terminology.