Understanding Hit Rate, MRR, and MMR Metrics

badrinarayan6645541 12 Jul, 2024

Introduction

Imagine you’re at a bookstore looking for the perfect book. You want recommendations that are not only in your favorite genre but also varied enough to introduce you to new authors. Retrieval-Augmented Generation (RAG) systems work similarly by combining the strengths of finding relevant information and generating creative responses. To measure how well these systems perform, we use metrics like Hit Rate, which checks how often the right recommendations show up, and Mean Reciprocal Rank (MRR), which looks at the order of those recommendations. Maximum Marginal Relevance (MMR), a re-ranking technique, helps ensure that the suggestions are both relevant and diverse. Used together, they help make sure that the recommendations are not just accurate but also varied and interesting.

Overview

  • Gain insight into Hit Rate, MRR, and MMR, and their roles in evaluating Retrieval-Augmented Generation (RAG) systems.
  • Learn to use Maximum Marginal Relevance to balance relevance and diversity in retrieved results.
  • Master the computation of Hit Rate and Mean Reciprocal Rank (MRR) for assessing information retrieval effectiveness.
  • Develop skills to analyze and improve RAG systems using various performance metrics.

What is the Hit Rate?

Hit Rate is one of the measures used to assess how well recommendation systems work. It measures how often the desired item appears in the top-N recommendations. Within the framework of RAG, Hit Rate denotes how often the pertinent data is successfully retrieved and made available for the output that is produced.

How to Calculate Hit Rate?

Hit Rate is calculated by dividing the number of queries for which the pertinent item appears in the top-N recommendations by the total number of queries. Mathematically:

$$\text{Hit Rate} = \frac{\text{Number of queries with the relevant item in the top-}N}{\text{Total number of queries}}$$

Let’s get a better understanding with an example. We have three queries: Q1, Q2, and Q3. We also know the exact node that should be picked for each of them: the actual nodes are N1, N2, and N3, respectively. On sending those queries, we receive nodes from our retriever. The retrieved nodes for those queries are shown below:

[Figure: nodes retrieved for Q1, Q2, and Q3]

We can see that our retriever has retrieved the correct node for Q1 and Q2, but not for Q3. Hence the hit is 1 for Q1 and Q2 and 0 for Q3. Using our formula, we can calculate the Hit Rate:

$$\text{Hit Rate} = \frac{1 + 1 + 0}{3} = \frac{2}{3} \approx 66.67\%$$
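The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not a library API: the function name `hit_rate` and the `top_n` parameter are assumptions, and the retrieved lists are hypothetical stand-ins chosen only to match the outcome described in the text (hits for Q1 and Q2, a miss for Q3), since the original figure is not reproduced here.

```python
from typing import Dict, List


def hit_rate(retrieved: Dict[str, List[str]],
             expected: Dict[str, str],
             top_n: int = 3) -> float:
    """Fraction of queries whose expected node appears in the top-N retrieved nodes."""
    hits = sum(
        1
        for query, node in expected.items()
        if node in retrieved.get(query, [])[:top_n]
    )
    return hits / len(expected)


# Expected (actual) node per query, as stated in the example.
expected = {"Q1": "N1", "Q2": "N2", "Q3": "N3"}

# Hypothetical retrieved lists (NOT copied from the figure): chosen so that
# Q1 and Q2 are hits and Q3 is a miss, matching the text.
retrieved = {
    "Q1": ["N2", "N3", "N1"],
    "Q2": ["N2", "N5", "N1"],
    "Q3": ["N1", "N2", "N4"],
}

print(f"Hit Rate: {hit_rate(retrieved, expected):.2%}")
```

Note that the result depends on the cutoff: with these illustrative lists, shrinking `top_n` to 2 would turn Q1 into a miss, because its correct node sits at position three.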

Now that we understand how the Hit Rate metric evaluates our model, let’s look at the challenges of using Hit Rate as our evaluation metric.

Challenge with Hit Rate

The major challenge with using Hit Rate as our evaluation metric is that it does not take into account the position of the retrieved node. To understand this better, let’s look at an example. Say we have two retrievers, retriever 1 and retriever 2. The image below shows the nodes retrieved by both retrievers.

[Figure: nodes retrieved by retriever 1 and retriever 2 for Q1, Q2, and Q3]

From the above image we can see that both the retrievers have retrieved the correct node for Q1 and Q2 but not Q3. Hence they both get the same hit rate percentage.

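$$\text{Hit Rate}_{\text{retriever 1}} = \text{Hit Rate}_{\text{retriever 2}} = \frac{2}{3} \approx 66.67\%$$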

But on closer inspection, we can see that retriever 1 retrieved the correct node for Q1 at position three, while retriever 2 retrieved it at position one. Retriever 2 should therefore get a higher score than retriever 1, but Hit Rate does not take the position of the retrieved nodes into account. This is where Mean Reciprocal Rank (MRR) comes into the picture.

Mean Reciprocal Rank (MRR)

Mean Reciprocal Rank (MRR) is a statistical metric used to assess the effectiveness of an information retrieval system. It is especially helpful when the system answers a query by returning a ranked list of items (such as documents or answers). In the context of Retrieval-Augmented Generation (RAG), MRR is used to evaluate how well the retrieval component surfaces the pertinent documents that support the generation of accurate and relevant responses.

How to Calculate MRR?

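$$\text{MRR} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{\text{rank}_i}$$

Here, N is the number of queries and rank_i is the rank position of the first relevant document for the i-th query. If no relevant document is retrieved for a query, its reciprocal rank is taken as 0.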

Let’s see an example for MRR.

[Figure: retrieved nodes for Q1, Q2, and Q3 with their rank positions]

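In the above image, we can see that the reciprocal rank for Q1 is ⅓, as the correct node is retrieved at the 3rd position. The correct node for Q2 is at the 1st position (reciprocal rank 1), and the correct node for Q3 is not retrieved (reciprocal rank 0). Hence the MRR is calculated as

$$\text{MRR} = \frac{1}{3}\left(\frac{1}{3} + 1 + 0\right) = \frac{4}{9} \approx 44.4\%$$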

We can see that although the Hit Rate is 66.67%, the MRR is only 44.4%, because MRR gives more weight to retrievers that place the correct node near the top of the list.
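To make the contrast with Hit Rate concrete, here is a small Python sketch of MRR. The function name `mean_reciprocal_rank` and both retrieved lists are assumptions for illustration: `retriever_1` mirrors the positions described in the text (Q1 correct at rank 3, Q2 at rank 1, Q3 missed), while `retriever_2` is a hypothetical variant that differs only in placing the correct node for Q1 at rank 1.

```python
from typing import Dict, List


def mean_reciprocal_rank(retrieved: Dict[str, List[str]],
                         expected: Dict[str, str]) -> float:
    """Average of 1/rank of the expected node; a miss contributes 0."""
    total = 0.0
    for query, node in expected.items():
        ranked = retrieved.get(query, [])
        if node in ranked:
            total += 1.0 / (ranked.index(node) + 1)  # ranks are 1-based
    return total / len(expected)


expected = {"Q1": "N1", "Q2": "N2", "Q3": "N3"}

# Hypothetical lists (NOT copied from the figures), matching the positions
# described in the text: Q1 at rank 3, Q2 at rank 1, Q3 not retrieved.
retriever_1 = {
    "Q1": ["N2", "N3", "N1"],
    "Q2": ["N2", "N5", "N1"],
    "Q3": ["N1", "N2", "N4"],
}

# Hypothetical variant: same hits, but Q1's correct node is at rank 1.
retriever_2 = {
    "Q1": ["N1", "N2", "N3"],
    "Q2": ["N2", "N5", "N1"],
    "Q3": ["N1", "N2", "N4"],
}

print(f"MRR retriever 1: {mean_reciprocal_rank(retriever_1, expected):.3f}")
print(f"MRR retriever 2: {mean_reciprocal_rank(retriever_2, expected):.3f}")
```

Both illustrative retrievers have the same Hit Rate (two hits out of three queries), but the second gets the higher MRR because it ranks the correct node for Q1 first.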

Maximum Marginal Relevance (MMR)

Maximum Marginal Relevance (MMR) re-ranks results to enhance both their relevance and diversity. In order to guarantee that the items returned are both relevant and sufficiently varied to address all facets of the query, MMR attempts to strike a balance between novelty and relevance.

How to Calculate MMR?

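$$\text{MMR} = \arg\max_{d_i \in D \setminus R} \left[ \lambda \cdot \text{Sim}_1(d_i, q) - (1 - \lambda) \cdot \max_{d_j \in R} \text{Sim}_2(d_i, d_j) \right]$$

Here, D is the set of all candidate documents, R is the set of already selected documents, q is the query, Sim1 is the similarity function between a document and the query, and Sim2 is the similarity function between two documents. d_i is a candidate document in D that has not yet been selected, and d_j is a document in R.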

The parameter λ (mmr_threshold) controls the trade-off between relevance (the first term) and diversity (the second term). When the mmr_threshold is close to 1, the system prioritizes relevance; when it is close to 0, it prioritizes diversity.

Let’s look into a simple example that illustrates MMR. We will use the same example as Hit Rate to demonstrate how MMR re-ranks the retrieved nodes.

[Figure: retrieved nodes for Q1, Q2, and Q3 before MMR re-ranking]

To proceed with MMR, let’s assume the following relevance scores between each retrieved node and its query:

  • Rel(N2,Q1)=0.7
  • Rel(N3,Q1)=0.6
  • Rel(N1,Q1)=0.9
  • Rel(N3,Q2)=0.9
  • Rel(N5,Q2)=0.3
  • Rel(N1,Q2)=0.6
  • Rel(N1,Q3)=0.8
  • Rel(N2,Q3)=0.5
  • Rel(N4,Q3)=0.4

Similarity Score:

  • Sim(N2,N3)=0.2
  • Sim(N2,N1)=0.5
  • Sim(N3,N1)=0.3
  • Sim(N3,N5)=0.4
  • Sim(N5,N1)=0.6
  • Sim(N1,N2)=0.3
  • Sim(N1,N4)=0.4
  • Sim(N2,N4)=0.5

For simplicity, let’s set λ = 0.5 to give equal weight to relevance and diversity.

Calculation of MMR

The Maximum Marginal Relevance (MMR) is calculated by re-ranking retrieved documents to balance relevance and diversity, ensuring a relevant and varied list of results.

For Q1:

  • Initial retrieved nodes: [N2,N3,N1]
  • First selection based on highest relevance: N1 (Rel = 0.9)
  • Next, we calculate MMR for the remaining nodes (N2 and N3), penalizing each by its similarity to the already selected node N1:
    • MMR(N2)=0.5×0.7−0.5×Sim(N2,N1)=0.35−0.5×0.5=0.1
    • MMR(N3)=0.5×0.6−0.5×Sim(N3,N1)=0.3−0.5×0.3=0.15
  • Select N3 next, since it has the higher MMR score.
  • Only N2 remains.

Final order for Q1: [N1,N3,N2]

For Q2:

  • Initial retrieved nodes: [N3,N5,N1]
  • First selection based on highest relevance: N3 (Rel = 0.9)
  • Next, we calculate MMR for the remaining nodes (N5 and N1), penalizing each by its similarity to the already selected node N3:
    • MMR(N5)=0.5×0.3−0.5×Sim(N3,N5)=0.15−0.5×0.4=−0.05
    • MMR(N1)=0.5×0.6−0.5×Sim(N3,N1)=0.3−0.5×0.3=0.15
  • Select N1 next, since it has the higher MMR score.
  • Only N5 remains.

Final order for Q2: [N3,N1,N5]

For Q3:

  • Initial retrieved nodes: [N1,N2,N4]
  • First selection based on highest relevance: N1 (Rel = 0.8)
  • Next, we calculate MMR for the remaining nodes (N2 and N4), penalizing each by its similarity to the already selected node N1:
    • MMR(N2)=0.5×0.5−0.5×Sim(N1,N2)=0.25−0.5×0.3=0.1
    • MMR(N4)=0.5×0.4−0.5×Sim(N1,N4)=0.2−0.5×0.4=0
  • Select N2 next, since it has the higher MMR score.
  • Only N4 remains.

Final order for Q3: [N1,N2,N4]

Using MMR, we re-rank the nodes to ensure a balance between relevance and diversity. The final re-ranked nodes are:

  • Q1: [N1,N3,N2]
  • Q2: [N3,N1,N5]
  • Q3: [N1,N2,N4]
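The greedy selection walked through above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of any particular library: the names `mmr_score` and `mmr_rerank` are hypothetical, similarity is treated as symmetric (pairs are stored as `frozenset` keys), and the redundancy penalty is taken over already selected nodes only, as in the definition of R. With nothing selected yet the penalty is 0, so the first pick is simply the most relevant node. The data are the Q1 and Q2 scores assumed earlier.

```python
from typing import Dict, FrozenSet, List


def mmr_score(node: str,
              selected: List[str],
              relevance: Dict[str, float],
              similarity: Dict[FrozenSet[str], float],
              lam: float) -> float:
    """lam * relevance minus (1 - lam) * max similarity to any selected node."""
    redundancy = max(
        (similarity[frozenset((node, chosen))] for chosen in selected),
        default=0.0,  # nothing selected yet, so no redundancy penalty
    )
    return lam * relevance[node] - (1 - lam) * redundancy


def mmr_rerank(candidates: List[str],
               relevance: Dict[str, float],
               similarity: Dict[FrozenSet[str], float],
               lam: float = 0.5) -> List[str]:
    """Greedily pick the candidate with the highest MMR score until none remain."""
    selected: List[str] = []
    remaining = list(candidates)
    while remaining:
        best = max(
            remaining,
            key=lambda n: mmr_score(n, selected, relevance, similarity, lam),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Q1 scores from the example (Sim is assumed symmetric here).
q1_rel = {"N2": 0.7, "N3": 0.6, "N1": 0.9}
q1_sim = {
    frozenset(("N2", "N3")): 0.2,
    frozenset(("N2", "N1")): 0.5,
    frozenset(("N3", "N1")): 0.3,
}

# Q2 scores from the example.
q2_rel = {"N3": 0.9, "N5": 0.3, "N1": 0.6}
q2_sim = {
    frozenset(("N3", "N5")): 0.4,
    frozenset(("N5", "N1")): 0.6,
    frozenset(("N3", "N1")): 0.3,
}

q1_balanced = mmr_rerank(["N2", "N3", "N1"], q1_rel, q1_sim, lam=0.5)
q1_relevance_only = mmr_rerank(["N2", "N3", "N1"], q1_rel, q1_sim, lam=1.0)
q2_balanced = mmr_rerank(["N3", "N5", "N1"], q2_rel, q2_sim, lam=0.5)

print("Q1, lambda=0.5:", q1_balanced)
print("Q1, lambda=1.0:", q1_relevance_only)
print("Q2, lambda=0.5:", q2_balanced)
```

Running Q1 with λ = 1.0 falls back to a pure relevance ordering, which shows the trade-off: at λ = 0.5, N3 moves ahead of the more relevant N2 because it is less similar to the already selected N1.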

Conclusion

Metrics like Hit Rate and Mean Reciprocal Rank (MRR), together with Maximal Marginal Relevance (MMR), are essential for assessing and improving the effectiveness of RAG systems. Hit Rate measures how often pertinent information is retrieved, MRR accounts for the position at which it is retrieved, and MMR maintains a balance between relevance and diversity in the retrieved results. By optimizing for these, RAG systems can greatly increase the quality and applicability of the responses they create, which in turn increases user satisfaction and confidence.

Frequently Asked Questions

Q1. What’s the Hit Rate?

A. Hit Rate is the fraction of queries for which a relevant item appears in the top-N results. We determine it by dividing the number of hits, that is, queries with a relevant item in the top-N, by the total number of queries.

Q2. What is MMR?

A. A re-ranking technique called Maximum Marginal Relevance (MMR) strikes a balance between the relevance and diversity of items obtained. By taking into account a document’s relevance to the query and how similar it is to previously selected items, it seeks to decrease redundancy.

Q3. What makes hit rate crucial for RAG systems?

A. In RAG systems, the Hit Rate, a measure of how often pertinent information is retrieved, is essential for producing precise and contextually relevant replies. A higher hit rate indicates better success in retrieving relevant information.

Q4. What makes MMR crucial for RAG systems?

A. MMR minimises redundancy by ensuring that the set of retrieved documents is both diverse and pertinent. This facilitates thorough answers that address all facets of the query.

badrinarayan6645541 12 Jul, 2024

Data science intern at Analytics Vidhya, specializing in ML, DL, and AI. Dedicated to sharing insights through articles on these subjects. Eager to learn and contribute to the field's advancements. Passionate about leveraging data to solve complex problems and drive innovation.
