Hey there! Are you wondering whether your enterprise should use RAG or fine-tuning? In today's edition of Sector 6, your AI Memelord, Mohit Pandey, will take you through the importance of RAG for enterprises, and why fine-tuning can also be helpful sometimes.

RAG, or retrieval-augmented generation, which some dismiss as "fancier prompt engineering", often comes up in discussions about hallucinations in current LLMs. Some teams fine-tune existing LLMs on their own data to make them more useful, while others simply connect the model to an external data source, which is essentially what RAG is.

The most important reason for enterprises to use RAG is to reduce hallucinations and produce more accurate, relevant, and trustworthy outputs while retaining control over the information sources. Fine-tuning on additional data is a viable option too, but it carries the risk of the model "forgetting" some of its original training. Moreover, it is mostly useful for changing the style of the generated text, rather than for surfacing up-to-date information.

"99% of use cases need RAG, not fine-tuning"

That is what ML engineer and teacher Santiago said when discussing GPT-3.5 pricing, and indeed, for some companies fine-tuning the model is more expensive than using RAG. Weighing the two, Armand Ruiz, VP of product at IBM, called fine-tuning and RAG complementary LLM enhancement techniques: "The answer to RAG vs fine-tuning is not an either/or choice."

Fine-tuning adapts the model's core knowledge to specific domains, improving performance and cost-efficiency, while RAG injects up-to-date information during inference. Considerations for choosing between the two include dynamic vs static data, architecture, training data, model customisation, hallucinations, accuracy, transparency, cost, and complexity.
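The RAG loop described above — retrieve relevant documents, then inject them into the prompt at inference time — can be sketched in a few lines. This is a minimal, illustrative sketch, not any vendor's API: the keyword-overlap `search` function, the prompt template, and the sample `knowledge_base` are all invented for demonstration (real systems typically use embedding-based vector search instead).

```python
# Minimal RAG sketch: retrieve the most relevant documents, then inject
# them into the prompt at inference time instead of retraining the model.
# The keyword-overlap scoring and prompt template are illustrative only.

def search(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Inject the retrieved context into the prompt sent to the LLM."""
    context = "\n".join(f"- {d}" for d in search(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

knowledge_base = [
    "Q3 revenue grew 12% year over year.",
    "The new office opens in Pune in August.",
    "Support hours are 9am to 6pm IST.",
]

print(build_prompt("When does the Pune office open?", knowledge_base))
```

Note that editing `knowledge_base` changes the model's effective knowledge immediately, with no retraining — which is exactly the dynamic-knowledge advantage RAG holds over fine-tuning.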
Is data annotation dying?

Speaking of data, Jason Corso, a professor of robotics at the University of Michigan, recently claimed that data annotation is a dying field. One might assume that the wave of generative AI would make data annotation jobs more abundant, but that is exactly why these jobs are slowly becoming obsolete.

Though there are companies offering data annotation services in India, such as Karya, NextWealth, Appen, Scale AI, and Labelbox, AI can now do 99% of the data labelling by itself, and do it accurately. As Thomas Wolf of Hugging Face said, "It's much easier to quickly spin and iterate on a pay-by-usage API than to hire and manage annotators. With model performance strongly improving and the privacy guarantee of open models, it will be harder and harder to justify making complex annotation contracts."

Back to RAG: though everyone claims RAG is the future (just as data annotation was once the new job), it is also extremely prone to prompt injections and data leaks.

When GPT-4 Turbo was launched along with the Retrieval API, OpenAI tried to fix the hallucination problem. But with slightly fancier prompt engineering, a user was able to download the original knowledge files from someone else's GPT, an app built with GPT Builder that essentially uses RAG. This is a big security issue: if you give an AI model access to your documents, someone can "convince" it to let them download the original files. RAG does make more sense for retrieving information and keyword search, as most believe, and while it does not eliminate the need for heavy compute, it remains a cheaper alternative to pre-training.

Neither RAG nor fine-tuning is dead

A lot of people said that RAG would make fine-tuning obsolete.
But it was the same set of people who proclaimed that the launch of LLMs with larger context windows, such as Claude 3, would make RAG obsolete. Both techniques are still alive and well. If you can live with these flaws, RAG gives you dynamic knowledge control: you can tweak and expand the model's available knowledge without the hassle of retraining the entire model, and building one from the ground up is a costly, time-consuming endeavour. RAG is advancing day by day and will only become more useful for enterprises. Does your company RAG?

- ManageEngine told AIM that the company plans to invest another $10 million in GPUs and infrastructure over the next year.
- Google has announced the general availability of Gemini in the Gmail side panel, extending its capabilities beyond Google Docs, Sheets, Slides, and Drive.
- Tata Electronics has signed an MoU with Synopsys to collaborate on process technology bring-up and a foundry design platform, accelerating the ramp of customer products in India's first fab, being built by Tata Electronics in Dholera, Gujarat.
- Pixxel has signed the 350th contract under the iDEX programme, to manufacture miniaturised multi-payload satellites for the Indian Air Force.
- Motorola and Google Cloud recently announced a new multi-year relationship to bring Google's generative AI models to Motorola phones, including the brand-new series of Razr smartphones.

Intuit's WiDS 2024: Celebrating Women's Achievements in Data Science

WiDS 2024 by Intuit is all set to empower and elevate women working in data science. Attend this one-day, in-person conference to learn from and network with peers.

Cypher 2024 marks a significant expansion: celebrating its 8th edition, it is branching out to the USA in addition to its established presence in India. Browse the links below to learn more about the different editions of Cypher 2024.
These links will guide you to comprehensive event information, including agendas, speakers, registration details, and more.

Enjoying Sector 6 (formerly AIM Daily XO)? Share it with colleagues or friends – they can sign up here. We love hearing from our readers! Have thoughts on our new format? Questions, comments, or ideas are always welcome. If there's a specific topic in AI or analytics that you're curious about, tell us! Reach out to us at info@analyticsindiamag.com. Stay tuned for more insights in our next edition!
Curated with ♥️