Biomedical Information Integration via Adaptive Large Language Model Construction
Created by W.Langdon from
gp-bibliography.bib Revision:1.8344
@Article{Xue:JBHI,
author = "Xingsi Xue and Mu-En Wu and Fazlullah Khan",
title = "Biomedical Information Integration via Adaptive Large
Language Model Construction",
journal = "IEEE Journal of Biomedical and Health Informatics",
note = "Early Access",
keywords = "genetic algorithms, genetic programming,
Bioinformatics, Semantics, Biological system modelling,
Accuracy, Complexity theory, Encoding, Bidirectional
control, Optimisation, Large language models, ANN,
Terminology, Biomedical Information Integration",
ISSN = "2168-2208",
DOI = "doi:10.1109/JBHI.2024.3496495",
abstract = "Integrating diverse biomedical knowledge information
is essential to enhance the accuracy and efficiency of
medical diagnoses, facilitate personalized treatment
plans, and ultimately improve patient outcomes.
However, Biomedical Information Integration (BII) faces
significant challenges due to variations in terminology
and the complex structure of entity descriptions across
different datasets. A critical step in BII is
biomedical entity alignment, which involves accurately
identifying and matching equivalent entities across
diverse datasets to ensure seamless data integration.
In recent years, Large Language Models (LLMs), such as
Bidirectional Encoder Representations from Transformers
(BERTs), have emerged as valuable tools for discerning
heterogeneous biomedical data due to their deep
contextual embeddings and bidirectionality. However,
different LLMs capture various nuances and complexity
levels within the biomedical data, and none of them can
ensure their effectiveness in all heterogeneous entity
matching tasks. To address this issue, we propose a
novel Two-Stage LLM construction (TSLLM) framework to
adaptively select and combine LLMs for Biomedical
Information Integration (BII). First, a Multi-Objective
Genetic Programming (MOGP) algorithm is proposed for
generating versatile high-level LLMs, and then, a
Single-Objective Genetic Algorithm (SOGA) employing a
confidence-based strategy is presented to combine the
built LLMs, which can further improve the
discriminative power of distinguishing heterogeneous
entities. The experiment uses OAEI's entity matching
datasets, i.e., Benchmark and Conference, along with
LargeBio, Disease and Phenotype datasets to test the
performance of TSLLM. The experimental findings
validate the efficiency of TSLLM in adaptively
differentiating heterogeneous biomedical entities,
which significantly outperforms the leading entity
matching techniques.",
notes = "Also known as \cite{10750399}",
}