Automationscribe.com

Enhancing RAG: Beyond Vanilla Approaches

by admin
February 25, 2025
in Artificial Intelligence

Retrieval-Augmented Generation (RAG) is a powerful technique that enhances language models by incorporating external information retrieval mechanisms. While standard RAG implementations improve response relevance, they often struggle in complex retrieval scenarios. This article explores the limitations of a vanilla RAG setup and introduces advanced techniques to enhance its accuracy and efficiency.

The Challenge with Vanilla RAG

To illustrate RAG's limitations, consider a simple experiment in which we attempt to retrieve relevant information from a set of documents. Our dataset includes:

  • A primary document discussing best practices for staying healthy, productive, and in good shape.
  • Two additional documents on unrelated topics that contain some similar words used in different contexts.

main_document_text = """
Morning Routine (5:30 AM - 9:00 AM)
✅ Wake Up Early - Aim for 6-8 hours of sleep to feel well-rested.
✅ Hydrate First - Drink a glass of water to rehydrate your body.
✅ Morning Stretch or Light Exercise - Do 5-10 minutes of stretching or a short workout to activate your body.
✅ Mindfulness or Meditation - Spend 5-10 minutes practicing mindfulness or deep breathing.
✅ Healthy Breakfast - Eat a balanced meal with protein, healthy fats, and fiber.
✅ Plan Your Day - Set goals, review your schedule, and prioritize tasks.
...
"""

Using a standard RAG setup, we query the system with:

  1. What should I do to stay healthy and productive?
  2. What are the best practices to stay healthy and productive?

Helper Functions

To keep the experiments short and repeatable, we define a small set of helper functions. They cover querying the ChatGPT API, computing document embeddings, scoring similarity, and retrieving the documents that best match a user query:

# **Imports**
import os
import json
import openai
import numpy as np
from scipy.spatial.distance import cosine
from google.colab import userdata

# Set up the OpenAI API key (stored as a Colab secret) and create the client
os.environ["OPENAI_API_KEY"] = userdata.get('AiTeam')
client = openai.OpenAI()  # reads OPENAI_API_KEY from the environment

def query_chatgpt(prompt, model="gpt-4o", response_format=openai.NOT_GIVEN):
    """Sends a single-turn prompt to the chat model and returns the reply text."""
    try:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0.0,  # Adjust for more or less creativity
            response_format=response_format
        )
        return response.choices[0].message.content.strip()
    except Exception as e:
        # Errors come back as a plain string, so callers that expect JSON should check for it
        return f"Error: {e}"

def get_embedding(text, model="text-embedding-3-large"):  # or "text-embedding-ada-002"
    """Fetches the embedding for a given text using OpenAI's API."""
    response = client.embeddings.create(
        input=[text],
        model=model
    )
    return response.data[0].embedding

def compute_similarity_metrics(embed1, embed2):
    """Computes the cosine similarity between two embeddings."""
    cosine_sim = 1 - cosine(embed1, embed2)  # SciPy returns cosine distance, so subtract from 1

    return cosine_sim

def fetch_similar_docs(query, docs, threshold=.55, top=1):
  """Returns up to `top` documents whose similarity to the query is at least `threshold`."""
  query_em = get_embedding(query)
  data = []
  for d in docs:
    # Compute the similarity between this document and the query
    similarity_results = compute_similarity_metrics(d["embedding"], query_em)
    if similarity_results >= threshold:
      data.append({"id": d["id"], "ref_doc": d.get("ref_doc", ""), "score": similarity_results})

  # Sort by score, highest first
  sorted_data = sorted(data, key=lambda x: x["score"], reverse=True)
  sorted_data = sorted_data[:min(top, len(sorted_data))]
  return sorted_data
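
The filter-then-rank logic of the retrieval helper can be tried without any API calls by substituting small hand-made vectors for real embeddings. The following is a minimal sketch under that assumption: the three-dimensional vectors, the document ids, and the function names `cosine_similarity` and `rank_by_cosine` are illustrative and not part of the original pipeline.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two vectors given as sequences of numbers."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_by_cosine(query_vec, docs, threshold=0.55, top=1):
    """Keeps documents scoring at least `threshold`, sorted best-first, cut to `top`."""
    hits = [
        {"id": d["id"], "score": cosine_similarity(d["embedding"], query_vec)}
        for d in docs
    ]
    hits = [h for h in hits if h["score"] >= threshold]
    hits.sort(key=lambda h: h["score"], reverse=True)
    return hits[:top]

# Toy "embeddings" (hypothetical values chosen only to make the example readable)
toy_docs = [
    {"id": "health_routine", "embedding": [0.9, 0.1, 0.0]},
    {"id": "unrelated_a", "embedding": [0.0, 1.0, 0.0]},
    {"id": "unrelated_b", "embedding": [0.1, 0.2, 0.9]},
]
toy_query = [1.0, 0.2, 0.0]

results = rank_by_cosine(toy_query, toy_docs, threshold=0.55, top=2)
print(results)
```

Only the first toy document clears the threshold here, so the result has one entry even though `top=2` was requested. With real embeddings the scores are much closer together, which is why the threshold value matters so much.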

Evaluating the Vanilla RAG

To evaluate the effectiveness of a vanilla RAG setup, we run a simple test using predefined queries. Our goal is to determine whether the system retrieves the most relevant document based on semantic similarity. We then analyze the limitations and explore possible improvements.

# **Testing Vanilla RAG**
# `docs` is the list of embedded documents; each entry is a dict with "id" and "embedding" keys

query = "what should I do to stay healthy and productive?"
r = fetch_similar_docs(query, docs)
print("query = ", query)
print("documents = ", r)

query = "what are the best practices to stay healthy and productive?"
r = fetch_similar_docs(query, docs)
print("query = ", query)
print("documents = ", r)

Advanced Techniques for Improved RAG

To further refine the retrieval process, we introduce advanced functions that enhance the capabilities of our RAG system. These functions generate structured information that aids document retrieval and query processing, making our system more robust and context-aware.

To address these challenges, we implement three key enhancements:

1. Generating FAQs

By automatically creating a list of frequently asked questions related to a document, we expand the range of potential queries the model can match. These FAQs are generated once and stored alongside the document, providing a richer search space without incurring ongoing costs.

def generate_faq(text):
  """Asks the model for a JSON list of simple questions covering the text."""
  prompt = f'''
  given the following text: """{text}"""
  Ask relevant simple atomic questions ONLY (don't answer them) to cover all subjects covered by the text. Return the result as a json list example [q1, q2, q3...]
  '''
  # JSON mode returns an object; the prompt does not fix the name of the key that holds the list
  return query_chatgpt(prompt, response_format={ "type": "json_object" })
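
Because the prompt asks for a list while JSON mode returns an object, the key that wraps the list is chosen by the model, and `query_chatgpt` returns a plain `"Error: ..."` string on failure. A defensive parser can cope with both cases. The sketch below is one possible approach, not the article's own code; the helper name `extract_string_list` and the sample reply strings are hypothetical.

```python
import json

def extract_string_list(raw_reply):
    """Returns the first list of strings found in a JSON reply, or [] if none is found."""
    try:
        parsed = json.loads(raw_reply)
    except json.JSONDecodeError:
        # Covers non-JSON replies such as the "Error: ..." string from query_chatgpt
        return []

    if isinstance(parsed, list):
        candidates = [parsed]
    elif isinstance(parsed, dict):
        # Accept whichever key the model used to wrap the list
        candidates = [value for value in parsed.values() if isinstance(value, list)]
    else:
        candidates = []

    if not candidates:
        return []
    return [item for item in candidates[0] if isinstance(item, str)]

# Hypothetical replies, for illustration only
print(extract_string_list('{"questions": ["What should a healthy breakfast include?"]}'))
print(extract_string_list('Error: request timed out'))
```

Taking the first list-valued key is a simplification; a stricter alternative is to name the expected key explicitly in the prompt and fail loudly when it is missing.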

2. Creating an Overview

A high-level summary of the document helps capture its core ideas, making retrieval more effective. By embedding the overview alongside the document, we provide additional entry points for relevant queries, improving match rates.

def generate_overview(text):
  """Asks the model for a short, plain-language abstract of the text."""
  prompt = f'''
  given the following text: """{text}"""
  Generate an abstract for it that tells in maximum 3 lines what it is about and use high level terms that will capture the main points,
  Use terms and phrases that will be most likely used by an average person.
  '''
  return query_chatgpt(prompt)

3. Query Decomposition

Instead of searching with broad user queries, we break them down into smaller, more precise sub-queries. Each sub-query is then compared against our enhanced document collection, which now includes:

  • The unique doc
  • The generated FAQs
  • The generated overview

By merging the retrieval results from these multiple sources, we significantly improve the likelihood of finding relevant information.

def decompose_query(query):
  """Asks the model to split a broad query into ranked, more specific subqueries."""
  prompt = f'''
  Given the user query: """{query}"""
break it down into smaller, relevant subqueries
that can retrieve the best information for answering the original query.
Return them as a ranked json list example [q1, q2, q3...].
'''
  return query_chatgpt(prompt, response_format={ "type": "json_object" })
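
The text does not spell out how results from several sub-queries are merged. One simple option, sketched here as an assumption rather than as the author's method, is to group hits by the source document they point to (`ref_doc`, falling back to `id`) and keep the best score for each source. The function name `merge_hits` and the example scores are made up for illustration; the hit dictionaries have the same shape as those returned by `fetch_similar_docs`.

```python
def merge_hits(hits_per_subquery):
    """Collapses hits from several sub-queries into one best-first ranking of source documents."""
    best = {}
    for hits in hits_per_subquery:
        for hit in hits:
            # FAQ and overview entries point back to their source via "ref_doc"
            source = hit.get("ref_doc") or hit["id"]
            if source not in best or hit["score"] > best[source]["score"]:
                best[source] = {
                    "source": source,
                    "matched_via": hit["id"],
                    "score": hit["score"],
                }
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)

# Hypothetical retrieval results for three sub-queries (the last one matched nothing)
example_hits = [
    [{"id": "main_doc_faq_3", "ref_doc": "main_document_text", "score": 0.71}],
    [
        {"id": "overview_text", "ref_doc": "main_document_text", "score": 0.64},
        {"id": "other_doc", "ref_doc": "", "score": 0.58},
    ],
    [],
]

merged = merge_hits(example_hits)
print(merged)
```

Keeping the maximum score per source favors a single strong match; summing or averaging scores across sub-queries would instead reward documents that match many sub-queries moderately well.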

Evaluating the Improved RAG

With these techniques in place, we re-run our initial queries. This time, query decomposition generates several sub-queries, each focusing on a different aspect of the original question. As a result, our system successfully retrieves relevant information from both the FAQs and the original document, demonstrating a substantial improvement over the vanilla RAG approach.

# **Testing Advanced Functions**

## Generate an overview of the document
overview_text = generate_overview(main_document_text)
print(overview_text)
# Embed the overview and index it as another entry point to the main document
docs.append({"id": "overview_text", "ref_doc": "main_document_text", "embedding": get_embedding(overview_text)})


## Generate FAQs for the document
main_doc_faq_arr = generate_faq(main_document_text)
print(main_doc_faq_arr)
# Assumes the model wrapped its list under a "questions" key
faq = json.loads(main_doc_faq_arr)["questions"]

# Index every generated question as an entry that points back to the main document
for i, f in enumerate(faq):
  docs.append({"id": f"main_doc_faq_{i}", "ref_doc": "main_document_text", "embedding": get_embedding(f)})


## Decompose the 1st query
query = "what should I do to stay healthy and productive?"
subqueries = decompose_query(query)
print(subqueries)

# Assumes the model wrapped its list under a "subqueries" key
subqueries_list = json.loads(subqueries)['subqueries']


## Compute the similarities between the subqueries and the documents, including the FAQs
for subq in subqueries_list:
  print("query = ", subq)
  r = fetch_similar_docs(subq, docs, threshold=.55, top=2)
  print(r)
  print('=================================\n')


## Decompose the 2nd query
query = "what are the best practices to stay healthy and productive?"
subqueries = decompose_query(query)
print(subqueries)

subqueries_list = json.loads(subqueries)['subqueries']


## Compute the similarities between the subqueries and the documents, including the FAQs
for subq in subqueries_list:
  print("query = ", subq)
  r = fetch_similar_docs(subq, docs, threshold=.55, top=2)
  print(r)
  print('=================================\n')

Here are some of the FAQs that were generated:

{
  "questions": [
    "How many hours of sleep are recommended to feel well-rested?",
    "How long should you spend on morning stretching or light exercise?",
    "What is the recommended duration for mindfulness or meditation in the morning?",
    "What should a healthy breakfast include?",
    "What should you do to plan your day effectively?",
    "How can you minimize distractions during work?",
    "How often should you take breaks during work/study productivity time?",
    "What should a healthy lunch consist of?",
    "What activities are recommended for afternoon productivity?",
    "Why is it important to move around every hour in the afternoon?",
    "What types of physical activities are suggested for the evening routine?",
    "What should a nutritious dinner include?",
    "What activities can help you reflect and unwind in the evening?",
    "What should you do to prepare for sleep?",
    …
  ]
}

Cost-Benefit Analysis

While these enhancements introduce an upfront processing cost (generating FAQs, overviews, and embeddings), this is a one-time cost per document. In contrast, a poorly optimized RAG system would lead to two major inefficiencies:

  1. Frustrated users due to low-quality retrieval.
  2. Increased query costs from retrieving excessive, loosely related documents.

For systems handling high query volumes, these inefficiencies compound quickly, making preprocessing a worthwhile investment.
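
The trade-off can be made concrete with a back-of-the-envelope break-even estimate: divide the one-time preprocessing cost by the amount saved on each query. The sketch below only illustrates that arithmetic; the function name `break_even_queries` is hypothetical, and the numbers are arbitrary placeholders in arbitrary units, not measurements from this experiment.

```python
import math

def break_even_queries(one_time_cost, saving_per_query):
    """Number of queries after which a one-time preprocessing cost has paid for itself.

    Returns None when there is no per-query saving, because the cost is never recovered.
    """
    if saving_per_query <= 0:
        return None
    return math.ceil(one_time_cost / saving_per_query)

# Placeholder values in arbitrary units (assumptions for illustration, not real prices)
preprocessing_cost = 50.0   # FAQ + overview generation and embedding for one document
saving_per_query = 0.25     # cost avoided per query by retrieving fewer irrelevant chunks

print(break_even_queries(preprocessing_cost, saving_per_query))
```

The estimate ignores the user-experience benefit, which is harder to quantify, so it understates the value of preprocessing rather than overstating it.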

Conclusion

By integrating document preprocessing (FAQs and overviews) with query decomposition, we create a more intelligent RAG system that balances accuracy and cost-effectiveness. This approach enhances retrieval quality, reduces irrelevant results, and ensures a better user experience.

As RAG continues to evolve, these techniques will be instrumental in refining AI-driven retrieval systems. Future research may explore further optimizations, including dynamic thresholding and reinforcement learning for query refinement.

