Ideas for research projects for the practical assignment of the IR course 2023/2024:

Project 1 - Build your own IR system

Build your own information retrieval system and carry out an experiment with it.

Background reading: the UMass CIIR book is excellent.

Features: there is no better way to learn how things work than by building one from scratch. Considerable programming experience is needed.
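
To get a feel for the moving parts, here is a minimal sketch of a possible starting point. The toy corpus, whitespace tokenization, and plain TF-IDF weighting are illustrative simplifications; a real project would index a proper test collection and likely use a stronger ranking function such as BM25.

```python
import math
from collections import Counter, defaultdict

# Toy corpus; in a real project you would index a standard test collection.
docs = {
    "d1": "the cat sat on the mat",
    "d2": "the dog chased the cat",
    "d3": "dogs and cats make good pets",
}

# Inverted index: term -> {doc_id: term frequency}.
index = defaultdict(dict)
doc_len = {}
for doc_id, text in docs.items():
    tokens = text.lower().split()  # a real system needs tokenization/stemming
    doc_len[doc_id] = len(tokens)
    for term, tf in Counter(tokens).items():
        index[term][doc_id] = tf

N = len(docs)

def search(query, k=10):
    """Rank documents by a simple TF-IDF score over the query terms."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(N / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += (tf / doc_len[doc_id]) * idf
    return sorted(scores.items(), key=lambda s: s[1], reverse=True)[:k]

print(search("cat on a mat"))
```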

Project 2 - Build a pandemic retrieval model

The Text Retrieval Conference (TREC) ran a challenge on COVID-related topics. The TREC-COVID challenge consisted of five rounds, and in each round different IR teams submitted their search results over a collection of biomedical literature articles.

One striking finding of this challenge was that “bare-bones bag-of-words” retrieval models could perform better than some advanced (e.g., neural) models. Build your own search system, see how far you can push its performance, and investigate why it works (a BM25 scoring sketch follows below).

Background reading: Research reports of participants

Features: Creates an understanding of classic retrieval models. Tuning and programming are needed.
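
As a concrete reference, here is a sketch of BM25 term scoring, one canonical bag-of-words model; the smoothed idf variant shown is the Lucene-style one, an implementation choice. A document's score for a query is the sum of this function over the query's terms.

```python
import math

def bm25_score(tf, df, doc_len, avg_doc_len, N, k1=1.2, b=0.75):
    """BM25 contribution of a single query term to a document's score.

    tf: term frequency in the document; df: number of documents containing
    the term; N: collection size. k1 and b are the free parameters you
    would tune in your experiments.
    """
    idf = math.log((N - df + 0.5) / (df + 0.5) + 1)  # Lucene-style smoothed idf
    norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
    return idf * norm
```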

Project 3 - Entity retrieval

Develop an entity retrieval model. You can choose learning-to-rank (LTR), neural, or BERT-based models, or a traditional technique such as pseudo-relevance feedback (sketched below).

Background reading: The topic of entity ranking will be introduced in the course. Other useful background is given in the RUSSIR lecture slides, the recent paper by Gerritse et al. (2022) on BERT-based entity retrieval models, and an LTR approach for entity retrieval by Chen et al. (2016).

Dataset: the DBpedia-Entity v2 collection. The dataset is described in detail in the SIGIR 2017 resource paper and presented at SIGIR 2017 with this poster.

Features: Prior programming experience is needed.
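
If you go the pseudo-relevance-feedback route, here is a minimal sketch of the idea; `search` and `get_terms` are hypothetical placeholders for your own ranker and tokenizer, and the heuristic (most frequent terms in the top results) is a deliberately naive baseline.

```python
from collections import Counter

def prf_expand(query, search, get_terms, k_docs=10, k_terms=5):
    """Naive pseudo-relevance feedback: assume the top-ranked results are
    relevant, take the most frequent terms in them, and append those terms
    to the query. `search(query) -> [(doc_id, score), ...]` and
    `get_terms(doc_id) -> [term, ...]` are placeholders for your own
    ranker and tokenizer.
    """
    top_docs = search(query)[:k_docs]
    term_counts = Counter()
    for doc_id, _score in top_docs:
        term_counts.update(get_terms(doc_id))
    seen = set(query.lower().split())
    expansion = [t for t, _ in term_counts.most_common() if t not in seen]
    return query + " " + " ".join(expansion[:k_terms])
```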

Project 4 - Domain adaptation in Information Retrieval

Recent retrieval models make use of Large Language Models (LLMs) to rank documents and passages. These models, however, are often trained and tested on homogeneous datasets and domains. A recent study showed that sparse and dense LLM-based retrieval models are less robust in out-of-domain settings, and some are even beaten by BM25 in a zero-shot setup. In this project, you can make use of Parameter-Efficient Fine-Tuning (PEFT) methods to build LLM-based IR models that can be adapted to unseen domains in a few- or zero-shot setup.

Background reading: Sparse and dense LLM-based models will be introduced in the Neural IR lecture of the course. There are a number of methods for Parameter-Efficient Fine-Tuning (PEFT) of language models, such as LoRA. You can consult the Hugging Face PEFT library for more information on the methods and their usage.

Dataset: The BEIR Benchmark contains 18 publicly available datasets. You can experiment with some of them for your project.

Features: Prior programming experience is needed.
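
A minimal sketch of wrapping a model with a LoRA adapter via the Hugging Face PEFT library; the base model, the cross-encoder-style re-ranking setup, and the hyperparameter values are illustrative assumptions, not recommendations.

```python
# pip install transformers peft
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

# A cross-encoder-style re-ranker head on top of BERT (illustrative choice).
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=1
)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                                # rank of the LoRA update matrices
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["query", "value"],  # BERT attention projections
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small LoRA adapters are trained
```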

Project 5 - NLP for Information Retrieval

Incorporate your favorite NLP technique into retrieval, for example by using entity linking (e.g., REL), contextual embeddings (e.g., BERT and GPT), query expansion, etc. Compare the results against a standard system, for example Anserini.

Note that this is a rather broad topic, and you need to specify the research question, dataset, and evaluation measures in your proposal. Talk to the TAs to discuss your specific ideas.

Background reading: Jimmy Lin’s tutorial “NLP makes IR interesting and IR makes NLP useful!”. You will also get lots of inspiration from the neural IR lecture.

Dataset: Potentially, you can use the MS MARCO passage and document retrieval collections, TREC document retrieval collections, or other datasets, depending on the project.

Features: Combine/enrich your knowledge of NLP with Information Retrieval. Prior programming experience is needed.
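
As one possible direction, here is a sketch of re-ranking a candidate list with contextual embeddings using the sentence-transformers library; the model name and the toy query and documents are illustrative assumptions.

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

# Re-rank a candidate list (e.g., a BM25 top-100 from Anserini) by cosine
# similarity between contextual embeddings.
model = SentenceTransformer("all-MiniLM-L6-v2")

query = "effects of remote work on productivity"
candidates = [
    "Working from home and its impact on employee output.",
    "A history of office architecture in the 20th century.",
]

q_emb = model.encode(query, convert_to_tensor=True)
d_embs = model.encode(candidates, convert_to_tensor=True)
scores = util.cos_sim(q_emb, d_embs)[0].tolist()

for score, doc in sorted(zip(scores, candidates), reverse=True):
    print(f"{score:.3f}  {doc}")
```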

Project 6 - Bias in conversational search

Extreme personalization in conversational search may lead to undesired biases. There are multiple strategies for handling bias in a conversational setting. To choose the best strategy, you can run Wizard of Oz experiments, in which a person pretends to be the system and communicates with a user.

You can also use the TaskMAD or Macaw platforms to run the experiments and then evaluate the results.

Background reading: Diane Kelly’s FnTIR survey on interactive IR evaluation, and the position paper by Gerritse et al.

Features: Design a user study to test the hypothesis. Requires no or light programming.

Project 7 - User behaviour in voice-only conversational systems

Study search behaviour in voice-only search systems (e.g., Siri and Google Now), and see how and when users use these systems for their complex information needs. Do people use these systems only for simple tasks (e.g., “give me the weather forecast for the weekend”), or do they use them for complex tasks as well (e.g., “which candidate should I choose in the presidential election?”)?

Background reading: the Trippas paper and Diane Kelly’s FnTIR survey on interactive IR evaluation.

Features: Design a user study to test the hypothesis. Requires no or light programming.

Project 8 - Evaluating personalization in search engines

Evaluate the personalization performed by well-known existing information retrieval systems. For this project you could, for instance, measure the impact of personalization on search results (for different queries) or compare the personalization of different search engines. Additionally, the project could focus on evaluating or quantifying the differences (e.g., how do you weigh the relative position of the search results when calculating the differences? One option is sketched below) or on visualizing personalization or filter bubbles.

Background reading: a study by Hannák et al. (2017) comparing personalization across different search engines; for their experimental design, they use constructed user accounts. A study by Salehi, Du, and Ashman (2015) that also uses constructed user accounts to measure the difference between personalized and non-personalized search results on Google; this study specifically addresses search within the educational context, but other contexts might also be interesting (think about how the context might affect the design of your experiment). A study by Dillahunt, Brooks, and Gulati (2015) that uses actual users to measure the impact of personalization; this last study also focuses on visualizing the impact of personalization on search results.

Features: Not too much programming, collecting your own search results data, constructing fake user accounts or doing a study with users, evaluation/quantification of differences between search results, techniques for the visualization of personalization in search.
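
One established way to weigh the relative position of results when quantifying the difference between two rankings is rank-biased overlap (Webber et al., 2010). Here is a sketch of its truncated, finite-prefix form; see the paper for the extrapolated variant.

```python
def rbo(list_a, list_b, p=0.9):
    """Truncated rank-biased overlap between two ranked result lists.

    Higher p puts more weight on deeper ranks; 1.0 means identical
    rankings, values near 0 mean largely disjoint ones.
    """
    k = min(len(list_a), len(list_b))
    seen_a, seen_b = set(), set()
    score = 0.0
    for d in range(1, k + 1):
        seen_a.add(list_a[d - 1])
        seen_b.add(list_b[d - 1])
        agreement = len(seen_a & seen_b) / d
        score += (p ** (d - 1)) * agreement
    return (1 - p) * score

# Compare, e.g., personalized vs. non-personalized results for one query:
print(rbo(["a", "b", "c", "d"], ["a", "c", "b", "e"]))
```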

Project 9 - Re-ranking for diversity of search results

With all the talk about filter bubbles and bias, and with Google recently announcing that it aims to show results from more diverse domains, it might be interesting to devote your project to re-ranking the search results of existing information retrieval systems to prioritise diversity more. Multiple types of diversity are possible here (e.g., content diversity, domain diversity, etc.).

Background reading: A seminal paper by Carbonell and Goldstein (1998) about promoting diversity/novelty as a way to combat redundancy (their criterion is sketched below). A probabilistic re-ranking method proposed by Huang and Hu that promotes diversity for biomedical information retrieval. A survey article by Kunaver and Požrl (2017) about diversity methods for recommender systems; although it focuses on recommender systems, it can still provide inspiration for techniques that apply to information retrieval.

Features: Some tuning/configuring of an IR system, no or light programming.
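
The core idea of the Carbonell and Goldstein paper is Maximal Marginal Relevance (MMR). Here is a sketch of the greedy re-ranking loop; `sim_to_query` and `sim_between` are placeholders for your own similarity functions (e.g., cosine similarity over TF-IDF or embedding vectors).

```python
def mmr_rerank(candidates, sim_to_query, sim_between, lam=0.7, k=10):
    """Maximal Marginal Relevance (Carbonell & Goldstein, 1998).

    Greedily picks the next document that balances relevance to the
    query against redundancy with the documents already selected.
    lam=1.0 reduces to plain relevance ranking; lower values favour
    diversity more.
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr_score(doc):
            redundancy = max((sim_between(doc, s) for s in selected), default=0.0)
            return lam * sim_to_query(doc) - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected
```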

Get help during the assignment sessions or on Discord, or go back to the main assignment page.