Developed by a team from the Instituto de Investigación en Inteligencia Artificial de Valencia (VRAIN) at the Universidad Politécnica de Valencia (UPV) and ValgrAI, the new methodology is a crucial advance towards the full use of Electronic Health Records (EHRs): it supports personalized medicine and enables more extensive and diverse epidemiological studies.
Valencia, January 24, 2025. A team from the Instituto de Investigación en Inteligencia Artificial de Valencia (VRAIN) of the Universidad Politécnica de Valencia (UPV) and the Escuela Valenciana de Postgrado y Red de Investigación en Inteligencia Artificial (ValgrAI) has developed an innovative methodology that converts the text of medical records and clinical annotations into data that computers can understand, making it usable for medical research. This information has the potential to transform medical research and enable new treatments, a better understanding of side effects, and more effective prevention strategies.
The team led by VRAIN and ValgrAI includes Lluís F. Hurtado, María José Castro-Bleda, and Encarna Segarra, alongside Luis Marco Ruiz from the University Hospital North Norway, Aurelia Bustos Moreno from MedBravo, and Juan Francisco Vallalta from the company Lãberit.
The methodology they have developed combines two branches of artificial intelligence: natural language processing (NLP) and linked data. Together, these techniques transform clinical text into valuable, queryable data sets, thus facilitating clinical research.
Radiology Reports
The team processed Spanish radiology reports using pre-trained language models (RoBERTa), tagging the free text with biomedical terms linked to Concept Unique Identifiers (CUIs) of the Unified Medical Language System (UMLS). These concepts were then mapped to logical expressions in a graph database, allowing computers to process complex queries and locate clinical findings of interest. Left as free text, this information could not have been analyzed in this way.
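The two steps described above, tagging text with concepts and then querying the resulting graph, can be illustrated with a small sketch. This is not the team's implementation: a plain dictionary lookup stands in for the RoBERTa-based tagger, an in-memory set of triples stands in for the graph database, the concept identifiers are placeholders rather than real UMLS CUIs, and the sample reports are invented English sentences. All function and variable names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical lexicon: surface term -> placeholder concept ID.
# These are NOT real UMLS CUIs; a trained tagger would replace this lookup.
LEXICON = {
    "pneumonia": "CUI_PNEUMONIA",
    "pleural effusion": "CUI_PLEURAL_EFFUSION",
    "cardiomegaly": "CUI_CARDIOMEGALY",
}


@dataclass(frozen=True)
class Triple:
    """One edge of the toy graph: subject --predicate--> obj."""
    subject: str
    predicate: str
    obj: str


def tag_report(text):
    """Return the sorted concept IDs whose surface term occurs in the text.

    Simple substring matching: negation ("no pneumonia") is not handled.
    """
    lowered = text.lower()
    return sorted(cui for term, cui in LEXICON.items() if term in lowered)


def build_graph(reports):
    """Turn {report_id: text} into a set of (report, "mentions", concept) triples."""
    graph = set()
    for report_id, text in reports.items():
        for cui in tag_report(text):
            graph.add(Triple(report_id, "mentions", cui))
    return graph


def reports_mentioning_all(graph, concepts):
    """Return sorted report IDs that mention every concept in `concepts`."""
    wanted = set(concepts)
    found = {}
    for triple in graph:
        if triple.predicate == "mentions" and triple.obj in wanted:
            found.setdefault(triple.subject, set()).add(triple.obj)
    return sorted(rid for rid, cuis in found.items() if cuis == wanted)


# Invented sample reports, for illustration only.
REPORTS = {
    "r1": "Findings consistent with pneumonia and a small pleural effusion.",
    "r2": "Mild cardiomegaly with pleural effusion.",
    "r3": "Right lower lobe pneumonia.",
}

if __name__ == "__main__":
    graph = build_graph(REPORTS)
    print(reports_mentioning_all(graph, ["CUI_PNEUMONIA", "CUI_PLEURAL_EFFUSION"]))
```

Once the free text has been reduced to concept triples, a question such as "which reports mention both findings?" becomes a structured lookup rather than a text search, which is the property the methodology relies on.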
Clinical Information for Research
This development has significant implications for health systems that manage large distributed databases. It is especially useful for regional and national health systems that seek to reuse data from diverse registries and cohorts, advancing one of the key goals of the European Data Space: the secondary use of clinical information for biomedical research.
As explained by the VRAIN researcher from the UPV who leads this study, Lluís F. Hurtado, “the approach we have developed bridges the gap between unstructured data and structured query mechanisms. This way, it allows researchers to create scalable and interoperable knowledge bases directly from clinical text.”
He adds that “it represents a crucial advance towards the full utilization of free text data from Electronic Health Records (EHRs), propels personalized medicine, and enables more extensive epidemiological studies.”
About VRAIN
The Instituto de Investigación en Inteligencia Artificial de Valencia (VRAIN) at the UPV comprises eight research groups with more than 30 years of experience in various lines of AI research.
The creation of VRAIN began in 2019 with the merger of six research groups. In 2020, it merged with the Centro de Investigación en Métodos de Producción de Software PROS, and in 2021 it was formally established as a University Research Institute with the approval of the Government of Valencia.
Currently, it has more than 178 researchers working in nine research areas. Its developments are applied in a large number of strategic sectors, including health, mobility, earth sciences, smart cities, education, social networks, agriculture, industry, privacy/security, autonomous robots, services, energy, and environmental sustainability.
These activities have been funded by more than 135 projects obtained through competitive funding, mainly from the European Union, but also from the Plan Nacional de Investigación, the Plan Valenciano de Investigación, and Technology Transfer Projects.
Reference
Lluís-F. Hurtado, Luis Marco-Ruiz, Encarna Segarra, Maria Jose Castro-Bleda, Aurelia Bustos-Moreno, Maria de la Iglesia-Vayá, Juan Francisco Vallalta-Rueda. Leveraging Transformers-based models and linked data for deep phenotyping in radiology. Computer Methods and Programs in Biomedicine, Volume 260, 2025, 108567, ISSN 0169-2607. https://doi.org/10.1016/j.cmpb.2024.108567