Preserving Multilingual Quality While Tuning Query Encoder on English Only
Abstract
Experiments show that tuning the query part of a dual encoder on an English-only dataset can preserve or improve the quality of multilingual dense passage retrieval.
A dense passage retrieval system can serve as the initial stage of information retrieval, selecting the most relevant text passages for downstream tasks. In this work, we conduct experiments to determine how much the quality of multilingual retrieval degrades when the query part of a dual encoder is tuned on an English-only dataset (assuming cross-lingual samples are scarce for the targeted domain or task). Specifically, starting with a high-quality multilingual embedding model, we observe that English-only tuning may not only preserve the original quality of multilingual retrieval, but even improve it.
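The setup described above can be illustrated with a toy sketch. This is not the authors' code or training recipe: the linear query-side map `W`, the softmax loss over a small fixed passage set, the example vectors, and every function name are assumptions made for illustration. It shows only the structural idea that passage embeddings stay fixed while the query side alone is updated.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, x):
    return [dot(row, x) for row in W]

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def loss_and_grad(W, x, passages, pos):
    """Softmax cross-entropy over passage scores, with gradient w.r.t. W only."""
    u = matvec(W, x)  # tuned query embedding (hypothetical linear query head)
    probs = softmax([dot(p, u) for p in passages])
    loss = -math.log(probs[pos])
    grad_u = [0.0] * len(u)
    for i, p in enumerate(passages):
        coeff = probs[i] - (1.0 if i == pos else 0.0)
        for k in range(len(u)):
            grad_u[k] += coeff * p[k]
    # u = W x, so dLoss/dW is the outer product of grad_u and x
    grad_W = [[g * xj for xj in x] for g in grad_u]
    return loss, grad_W

def tune_query_side(W, x, passages, pos, lr=0.5, steps=20):
    """Gradient descent on the query-side map; passage vectors are never modified."""
    W = [row[:] for row in W]  # copy so the caller's matrix stays untouched
    for _ in range(steps):
        _, grad_W = loss_and_grad(W, x, passages, pos)
        W = [[w - lr * g for w, g in zip(wr, gr)] for wr, gr in zip(W, grad_W)]
    return W

def top1(W, x, passages):
    """Index of the highest-scoring passage for the query."""
    u = matvec(W, x)
    scores = [dot(p, u) for p in passages]
    return max(range(len(passages)), key=scores.__getitem__)

# Toy data (assumed): three frozen passage embeddings and one query embedding.
PASSAGES = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
QUERY = [0.6, 0.8]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]  # untuned query side

loss_before, _ = loss_and_grad(IDENTITY, QUERY, PASSAGES, 0)
W_TUNED = tune_query_side(IDENTITY, QUERY, PASSAGES, 0)
loss_after, _ = loss_and_grad(W_TUNED, QUERY, PASSAGES, 0)
```

In this sketch the passage vectors are identical before and after tuning, so a passage index built with the original model would remain usable. Whether retrieval quality in other languages is preserved is an empirical question that the toy example does not address; that is what the paper's experiments measure.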