Enhancing biomedical search interfaces with images

July 17th, 2023

Categories: Applications, Data Mining, Software, User Groups, Visualization, Visual Analytics, Deep Learning, Machine Learning, Data Science

Search interface displaying the results of a boolean query.

Authors

Trelles Trabucco, J., Arighi, C., Shatkay, H., Marai, G. E.

About

Motivation: Figures in biomedical papers communicate essential information with the potential to identify relevant documents in biomedical and clinical settings. However, academic search interfaces mainly search over text fields.

Results: We describe a search system for biomedical documents that leverages image modalities and an existing index server. We integrate a problem-specific taxonomy of image modalities and image-based data into a custom search system. Our solution features a front-end interface to enhance classical document search results with image-related data, including page thumbnails, figures, captions, and image-modality information. We demonstrate the system on a subset of the CORD-19 document collection. A quantitative evaluation demonstrates higher precision and recall for biomedical document retrieval. A qualitative evaluation with domain experts further highlights our solution’s benefits to biomedical search.
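For readers unfamiliar with the two metrics named in the quantitative evaluation, the following is a minimal sketch of how precision and recall are conventionally computed for a single query. The function name and the document identifiers are hypothetical, and the sketch does not reproduce the paper's evaluation protocol or its numbers.

```python
def precision_recall(retrieved, relevant):
    """Compute set-based precision and recall for one query.

    retrieved: iterable of document ids returned by the search system
    relevant:  iterable of document ids judged relevant for the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were returned
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ids, for illustration only (not data from the paper).
retrieved_ids = ["d1", "d2", "d3", "d4"]
relevant_ids = ["d1", "d3", "d5"]
precision, recall = precision_recall(retrieved_ids, relevant_ids)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision is the fraction of returned documents that are relevant, and recall is the fraction of relevant documents that were returned.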

Availability and implementation: A demonstration is available at https://runachay.evl.uic.edu/scholar. Our code and image models can be accessed via https://github.com/uic-evl/bio-search. The dataset is continuously expanded.

Image caption: Search interface displaying the results of a boolean query. (A) The color-coded legend displays the image modalities in the first level of the taxonomy. (B) The search bar, with a keyword box showing a boolean query, a time filter, and an image modalities filter limiting results to documents containing radiology or fluorescence microscopy figures. (C) The metadata of a result includes the document title, year, venue, authors, and the hits on the abstract and full text. Below is the count of figures and modalities in the document. (D) The enhanced document surrogate shows the image content, including the page thumbnail, extracted figure, subfigures and modalities, and caption.
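The combination of filters described in the caption (a boolean keyword query, a time filter, and an image-modality filter) can be illustrated with a small in-memory sketch. This is not the system's implementation, which relies on an index server. The `Document` class, the `matches` function, and the sample records are all hypothetical, and keyword matching is simplified to substring tests.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    # Hypothetical record; the real system stores richer metadata.
    title: str
    year: int
    text: str
    modalities: set = field(default_factory=set)


def matches(doc, all_terms, any_terms=(), years=None, modalities=None):
    """Return True if doc passes the keyword, year, and modality filters.

    all_terms:  every term must occur in the text (boolean AND)
    any_terms:  at least one term must occur, if any are given (boolean OR)
    years:      optional (start, end) inclusive range
    modalities: optional collection; doc must contain at least one of them
    """
    text = doc.text.lower()
    if not all(term.lower() in text for term in all_terms):
        return False
    if any_terms and not any(term.lower() in text for term in any_terms):
        return False
    if years is not None and not (years[0] <= doc.year <= years[1]):
        return False
    if modalities and not (doc.modalities & set(modalities)):
        return False
    return True


# Illustrative records only, not entries from CORD-19.
docs = [
    Document("Paper A", 2020, "Chest imaging findings in COVID-19 pneumonia",
             {"radiology"}),
    Document("Paper B", 2021, "COVID-19 spike protein localization in cells",
             {"fluorescence microscopy", "graphics"}),
    Document("Paper C", 2020, "COVID-19 case counts by region", {"graphics"}),
]

results = [
    d.title
    for d in docs
    if matches(d, all_terms=["covid-19"], years=(2020, 2021),
               modalities={"radiology", "fluorescence microscopy"})
]
print(results)
```

Here the modality filter acts as an OR across the selected modalities, mirroring the caption's "radiology or fluorescence microscopy", while the keyword and year conditions narrow the result set further.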

https://doi.org/10.1093/bioadv/vbad095

Resources

PDF

URL

Citation

Trelles Trabucco, J., Arighi, C., Shatkay, H., Marai, G. E., Enhancing biomedical search interfaces with images, Bioinformatics Advances, vol 3, no 1, pp. vbad095, July 17th, 2023. https://academic.oup.com/bioinformaticsadvances/article-abstract/3/1/vbad095/7225231