Generative AI Application — RAG vs Fine tuning Decision Tree and Process

Balamurugan Balakreshnan
5 min read · Apr 8, 2024

Introduction

  • Introduction to creating generative AI applications with RAG or fine-tuning.
  • The video was created by AI using the Azure AI Text to Speech Avatar, with a script generated by Azure OpenAI GPT-4 Vision.
  • YouTube video link — https://youtu.be/JgQBgPpxAsI
  • Start with RAG on your own data, and only move to fine-tuning if RAG is not enough.
  • Why should we fine-tune?
  • What is the fine-tuning process?
  • How would I decide whether to fine-tune?
  • Is there business value in fine-tuning for a given use case?
  • What are the steps involved in fine-tuning?
  • Only high-level steps are discussed here.
  • These are just my thoughts and can change based on an organization's requirements.
  • Use this as guidance, not as a rule book.
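The RAG-first guidance above can be sketched as a small decision helper. This is a minimal, illustrative sketch: the function name and the criteria (quality bar, labeled data, GPU budget) are my own stand-ins for the questions the article raises, not a formal rule book.

```python
def choose_approach(rag_meets_quality_bar: bool,
                    needs_domain_behavior: bool,
                    has_labeled_data: bool,
                    has_gpu_budget: bool) -> str:
    """Illustrative RAG-first decision helper.

    Mirrors the guidance above: start with RAG on your own data, and
    consider fine-tuning only when RAG alone cannot reach the outcome
    AND the prerequisites (labeled data, compute budget) are in place.
    """
    if rag_meets_quality_bar:
        return "RAG"            # grounding on your own data is enough
    if needs_domain_behavior and has_labeled_data and has_gpu_budget:
        return "fine-tune"      # worth the data/compute investment
    return "improve RAG"        # iterate on retrieval/prompting first


# Example: RAG falls short, but there is no labeled data yet,
# so the cheaper path is to keep improving retrieval.
print(choose_approach(False, True, False, True))  # improve RAG
```

The point of the sketch is that "fine-tune" is the last branch reached, only after the cheaper options are exhausted.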

Fine tuning Decision Making and Process

High Level Process and decision making

  • First figure out whether fine-tuning is needed.
  • If RAG can achieve the desired outcome, there is no need to fine-tune.
  • Use fine-tuning only when you want to add more of your own context into the model itself.
  • Fine-tuning can make the model respond with organization-specific data and context.
  • Fine-tuning is a cumbersome process and needs a lot of data and time.
  • Collecting and labeling data is time consuming.
  • A human in the loop is needed to validate the results.
  • Humans also have to guard against bias in the data.
  • Training infrastructure such as GPU compute is hard to get and expensive.
  • Optimizing fine-tuning for scale is another challenge.
  • Data collection and creation can be automated, but they need substantial human validation.
  • Fine-tuning is done per use case and the tasks within it.
  • We can also fine-tune small or large language models that have multiple tasks built in.
  • Use a data cache to save the training, validation, and test data sets.
  • Make sure a human evaluates the data set creation results before they go to training.
  • Select the model for fine-tuning based on the tasks and how effectively it can be used.
  • Try a small language model first, then move to a large language model.
  • After fine-tuning, evaluate the model with the test data set.
  • Also evaluate the model for responsible AI and bias.
  • For training there are multiple techniques to try: LoRA, QLoRA, DoRA.
  • Also consider NVIDIA GPU libraries such as NCCL to speed up training.
  • This speeds up PyTorch and TensorFlow training.
  • Once the results are good, create a leaderboard and publish the information.
  • If the results are acceptable, use LLMOps to deploy to the target environment.
  • Save the model to a registry for use in production.
  • Saving to a registry also allows sharing it with others in the same organization.
  • The model consumers might be users or applications built on top of it.
  • Managing the fine-tuning life cycle is absolutely necessary.
  • Security and privacy are also important in the fine-tuning process.
  • Be transparent about the process and results.
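As a rough illustration of the LoRA idea mentioned above (a sketch of the math, not a production implementation): a low-rank adapter freezes the pretrained weight matrix W and learns two small matrices A and B, whose scaled product forms the update, W_effective = W + (alpha / r) · B A. The shapes and values here are toy choices.

```python
import numpy as np

def lora_effective_weight(W, A, B, alpha=16, r=8):
    """Combine a frozen weight W with a LoRA update (alpha/r) * B @ A.

    W: (d_out, d_in) frozen pretrained weight
    A: (r, d_in)     trainable down-projection
    B: (d_out, r)    trainable up-projection (commonly zero-initialized)
    """
    return W + (alpha / r) * (B @ A)

# Toy shapes: the adapter adds only r*(d_in + d_out) trainable parameters
# instead of the d_out*d_in a full fine-tune would update.
d_out, d_in, r = 6, 4, 2
W = np.ones((d_out, d_in))
A = np.zeros((r, d_in))
B = np.zeros((d_out, r))

# With B zero-initialized, the effective weight starts equal to W,
# so training begins from the pretrained behavior.
assert np.allclose(lora_effective_weight(W, A, B, r=r), W)
```

QLoRA and DoRA build on the same low-rank idea (adding quantization of the frozen weights and a magnitude/direction decomposition, respectively); in practice one would use a library such as Hugging Face PEFT rather than hand-rolling this.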

Process

The image outlines a flowchart for the fine-tuning process in machine learning model development. Here’s a step-by-step explanation:

  • Start: Begin the process.
  • Use Case: Identify the specific use case for which the model is being developed. This defines the purpose and scope of the model.
  • Task: Determine the specific task the model needs to perform. This could be text generation, image/video creation, image/audio/video summarization, etc.
  • RAG (Retrieval-Augmented Generation) Decision:
  • If “Yes”, the model uses vector embeddings from a document store to enhance the generation process.
  • If “No”, skip to the next step without using vector embeddings.
  • Model Selection: Choose a machine learning model. The choice can depend on the size of the model required (small or large).
  • Fine Tune Train: Fine-tune the selected model on a specific dataset. This involves adjusting the model’s parameters so it can better perform the task at hand.
  • Validation/Test/Evaluation: Validate the fine-tuned model through testing and evaluation to ensure it meets the performance criteria.
  • Responsible AI Eval: Evaluate the model to ensure it aligns with responsible AI principles, which could include fairness, privacy, security, and robustness.
  • Human Eval: Have human evaluators assess the model’s performance to ensure it is making sense from a human perspective.
  • Model Test Results/Leaderboard: Record the model’s test results and potentially compare it with other models on a leaderboard to see how well it performs relative to others.
  • Deployment/LLMops or Model Registry:
  • If the model is “Accepted” (i.e., it performs well and meets all criteria), it moves to the deployment phase where it is made available for use, or it is added to a model registry for future reference.
  • If the model is not accepted, it may require revisiting earlier steps for further refinement.

Throughout this process, there may be additional steps and considerations, such as using GPU compute for training/testing and employing various frameworks. It’s also important to have a dataset that the model can be trained and validated on, which involves finding data, creating datasets, and validating them, potentially with human evaluation to ensure quality. Additionally, once the model is ready, it should be made accessible to the intended consumer.
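The "RAG" branch of the flowchart retrieves context via vector embeddings from a document store before generation. Here is a minimal sketch of that retrieval step, assuming a toy in-memory store with precomputed embeddings; in a real system the embeddings would come from an embedding model and live in a vector database.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Toy document store: (text, precomputed embedding).
doc_store = [
    ("Expense reports are due by the 5th.", [0.9, 0.1, 0.0]),
    ("VPN access requires MFA enrollment.", [0.1, 0.9, 0.1]),
    ("Quarterly reviews happen in March.",  [0.2, 0.2, 0.9]),
]

def retrieve(query_embedding, k=2):
    """Return the top-k documents most similar to the query embedding."""
    ranked = sorted(doc_store,
                    key=lambda d: cosine(query_embedding, d[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

# A query embedding close to the VPN document retrieves it first; the
# retrieved text is then prepended to the prompt before generation.
print(retrieve([0.0, 1.0, 0.2], k=1))
```

This is the step the flowchart's "Yes" branch adds: ground the prompt in retrieved organizational documents, which is often enough to avoid fine-tuning entirely.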

A deeper look at the team and the skills needed.

  • Here is my view of what is needed in the team.
  • It might be missing a few roles as well.
  • This is only meant to define the functionality.
  • The skills needed can be defined based on how the organization is structured.
  • The team should have a good mix of skills.
  • The team should have a good mix of experience.
  • The team should have a good mix of domain knowledge.
  • Domain experience can be subject matter expertise or industry expertise, as needed.
  • Testing — evaluating the model and its relevance — is also important.

Original article — Samples2024/finetuning/finetuneprocess0424.md at main · balakreshnan/Samples2024 (github.com)
