Multimodal Retrieval of Genomics Data Visualizations

Top: Overview of the database system for retrieving and authoring genomics data visualizations. Bottom: Search interface (left) and authoring interface (right).

Abstract

To address the challenges of efficiently retrieving information from the vast landscape of genomics data visualizations, we introduce a database system designed for retrieving interactive genomics visualizations. The system supports multimodal retrieval to cater to diverse user needs, offering flexibility in search methods. Through a user interface, users can choose their preferred query approach: example images, natural language queries, or grammar-based queries. For each visualization, we construct a set of multimodal representations from the three corresponding modalities: a declarative specification written in the Gosling visualization grammar to define the structural framework, a pixel-based rendering generated from that specification, and a text description that details the visualization in natural language. To leverage both specialized knowledge from the grammar and general knowledge from a multimodal biomedical foundation model and a large language model, our approach incorporates three types of embedding methods: context-free grammar embeddings, multimodal embeddings, and textual embeddings. We designed the context-free grammar embeddings specifically for grammar-based genomics visualizations, addressing previously underexplored aspects such as genomic tracks, views, and interactivity. The multimodal embeddings are derived from a state-of-the-art biomedical vision-language foundation model, while the textual embeddings are generated by our fine-tuned specification-to-text large language model; both capture generalized insights from large-scale training data. We experimented with different embedding methods across different variations of each modality to identify the strategies that maximize the top-k retrieval accuracy.
The current collection comprises 3,200 visualization examples across about 50 categories, from single-view to coordinated multi-view visualizations, and covering a wide range of applications, such as single-cell epigenomics and structural variation analysis.
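
As a rough illustration of embedding-based retrieval and the top-k accuracy metric mentioned in the abstract, the following minimal sketch ranks stored items by cosine similarity to a query vector. It is not the authors' implementation: the function names (`cosine`, `top_k`, `top_k_accuracy`), the item ids, and the three-dimensional toy vectors are all hypothetical, and cosine similarity is assumed here only because it is a common choice for comparing embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k(query, index, k):
    """Return the ids of the k entries in `index` most similar to `query`."""
    ranked = sorted(index, key=lambda item_id: cosine(query, index[item_id]),
                    reverse=True)
    return ranked[:k]

def top_k_accuracy(queries, index, k):
    """Fraction of (query_vector, expected_id) pairs whose expected id
    appears among the k nearest entries."""
    hits = sum(1 for vec, expected in queries if expected in top_k(vec, index, k))
    return hits / len(queries)

# Hypothetical toy embeddings; real ones would come from a grammar,
# image, or text embedding model.
index = {
    "ideogram": [1.0, 0.0, 0.0],
    "circular_overview": [0.0, 1.0, 0.0],
    "multi_view_sv": [0.7, 0.7, 0.0],
}

# Each query pairs an embedding with the id it is expected to retrieve.
queries = [
    ([0.9, 0.1, 0.0], "ideogram"),
    ([0.1, 0.9, 0.0], "circular_overview"),
    ([0.9, 0.3, 0.0], "multi_view_sv"),
]

for k in (1, 2):
    print(k, top_k_accuracy(queries, index, k))
```

In this toy setup the third query ranks its expected item second, so it counts as a miss at k = 1 and a hit at k = 2, which is why the metric is typically reported for several values of k.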

Citation

HN Nguyen, S L’Yi, TC Smits, S Gao, M Zitnik, N Gehlenborg. “Multimodal Retrieval of Genomics Data Visualizations”, OSF Preprints (2025). doi:10.31219/osf.io/zatw9_v1