When you think you’re doing the right moves…
It’s interesting how the pendulum of life can swing far and wide, making things chaotic at times: what you did yesterday might not be what you need today. It doesn’t mean it was a waste of time; it’s more fuel for the brain. You have to learn from this data, though.
I spent the last few weeks learning more about RAG (retrieval-augmented generation) and how I could reduce hallucinations. I cleaned a document so the LLM I’m using can retrieve details without bias, hopefully.
This works so far, but it might not be ideal.
That’s where I realized I didn’t fully understand what a RAG system really is: it’s memory, not necessarily the best method to “force” retrieval. As mentioned, it works, but it’s not ideal and could be refined further.
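To make the “memory” idea concrete, here is a minimal sketch of the retrieval step. The chunks and query are made up for illustration, and the similarity function is a crude word-overlap cosine; a real RAG setup would use learned embeddings from a model, but the shape of the idea is the same: find the most relevant chunk, then hand it to the LLM as context.

```python
import math

# Toy "knowledge base" of document chunks (hypothetical content)
chunks = [
    "The warranty covers parts for two years.",
    "Returns are accepted within 30 days of purchase.",
    "Shipping is free for orders over fifty dollars.",
]

def words(text):
    """Lowercase word set, punctuation stripped."""
    cleaned = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return set(cleaned.split())

def similarity(a, b):
    """Cosine similarity over word sets; a stand-in for real embeddings."""
    overlap = len(words(a) & words(b))
    return overlap / math.sqrt(len(words(a)) * len(words(b)))

query = "How long is the warranty?"
best = max(chunks, key=lambda c: similarity(query, c))

# The retrieved chunk becomes the "memory" prepended to the prompt
prompt = f"Context: {best}\n\nQuestion: {query}"
print(best)
```

The model never “knows” the document; it just gets the best-matching chunk stuffed into its prompt at query time, which is why retrieval quality matters so much.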
Here comes LoRA!
What I’m trying to do now is dig deeper into LoRA and how it could help me even more. LoRA stands for Low-Rank Adaptation. In simpler terms, it’s a method to fine-tune a model and make it “learn” new things or even “relearn” them. It can be very powerful, but it can also make the model less capable if you push it too far. From what I understand so far, it’s a form of re-education we can use to add more specific details to the LLM so it reflects a bit more of the system you’re building.
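The “low-rank” part is the whole trick, and it fits in a few lines of numpy. Instead of retraining a big weight matrix W, you train two small matrices A and B whose product approximates the change you want. The sizes and scaling factor below are illustrative, not anything I’ve tuned:

```python
import numpy as np

# Hypothetical sizes: a d x d weight matrix, adapted with rank r
d, r = 512, 8
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))         # frozen pretrained weight
A = rng.standard_normal((r, d)) * 0.01  # trainable, small random init
B = np.zeros((d, r))                    # trainable, zero init
alpha = 16                              # scaling hyperparameter

# The LoRA update: only A and B get trained; W stays frozen
delta = (alpha / r) * (B @ A)
W_adapted = W + delta

# Because B starts at zero, the adapted model begins identical to the base
assert np.allclose(W_adapted, W)

# Far fewer trainable parameters than full fine-tuning
full = W.size           # 262,144
lora = A.size + B.size  # 8,192
print(f"trainable params: {lora} vs {full} ({lora / full:.1%})")
```

Here the adapter is about 3% of the size of the full matrix, which is why LoRA is so much cheaper than full fine-tuning, and also why pushing it too far has limits: a rank-8 update can only change the model so much.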
Merged vs Unmerged
Another important point I learned was the difference between the merged and unmerged methods: baking the changes into the model (merged) or keeping them as a layer over the LLM (unmerged).
By using the merged method, we can integrate our changes within the model and distribute it as one with the LLM. You will want to validate the licensing of the model you’re using regarding distribution though. Some might not allow you to modify and then distribute the final product.
On the other hand, the unmerged method can offer more flexibility when it comes to licensing, since your changes are not directly included within the weights of the LLM. It also means the same base model can have multiple “personalities” or “job descriptions,” each very specific to certain tasks. One usage that comes to mind: two tasks that require different points of view, or very specialized knowledge in a single domain each. By keeping each adapter specific, we avoid bloating the model with the additional fine-tuning we layer on top, and we can prevent hallucinations caused by confusing the model with too much data or conflicting info.
Taking a step back
This whole journey is making me sway in multiple directions, and I’m very grateful for it. Having used RAG at the beginning is not a failure, it’s part of the process, and I couldn’t fully appreciate what it can offer for this project if I just dismissed it. LoRA might not be exactly what I need here, but I’ll only know if I dig deeper and do more testing to REALLY see what it can offer and how much I can tweak and twist it to make things happen.
I also want to be mindful of not using these like many tools we “repurpose” for tasks they were not meant to do, but I also can’t ignore that sometimes we have to stretch things a bit to make them fit so we can discover new things we didn’t know could happen.
The discovery is part of the journey, alongside the knowledge we gather through it.

