RE: LeoThread 2025-10-18 14-48

in LeoFinance, 2 months ago

Part 6/12:

  • RAG System (LLM 3 + Database): A Llama 3 model with 8 billion parameters, paired with a vector database of medication information, user reviews, and drug data sourced from reputable platforms such as WebMD. The database lets the system retrieve specific drug details and base its generated responses on them.
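To make the retrieval step concrete, here is a minimal sketch of how a RAG lookup might work. It is not the system described above: the `DOCUMENTS` list, the bag-of-words `embed` function, and the prompt layout are all assumptions chosen so the example runs with only the standard library. A real setup would use a neural embedding model and an actual vector database.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for the vector database: a few placeholder entries,
# not taken from the system described in the thread.
DOCUMENTS = [
    "Ibuprofen is a nonsteroidal anti-inflammatory drug used to relieve pain and fever.",
    "Metformin is commonly prescribed to manage blood sugar in type 2 diabetes.",
    "Amoxicillin is a penicillin-type antibiotic used to treat bacterial infections.",
]


def embed(text):
    """Toy embedding (assumption): a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, k=1):
    """Prepend the retrieved context to the question before it goes to the LLM."""
    context = "\n".join(retrieve(query, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    print(build_prompt("What is metformin prescribed for?"))
```

The prompt returned by `build_prompt` is what would be handed to the language model, so its answer can draw on the retrieved entry rather than on its training data alone.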

Technical Approach:

  • Fine-tuning runs on accessible hardware, such as a MacBook with an M3 Max chip and 48GB of memory, which keeps training costs low without sacrificing effectiveness.

  • After training, the models are compressed through quantization (Q5_K_M, using llama.cpp) so they can run inference in real time without a noticeable loss in quality.

  • A dynamic routing system directs each user query to the appropriate LLM, or to a combination of LLMs, so that responses stay contextually relevant.
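The routing idea in the last bullet can be illustrated with a small sketch. The thread does not say how routing is actually implemented, so everything here is an assumption: the backend names in `ROUTES`, the keyword sets, and the keyword-matching rule itself (a production router might use a classifier or an LLM instead).

```python
import re

# Hypothetical routing table: backend name -> trigger keywords.
# Both the names and the keywords are illustrative assumptions.
ROUTES = {
    "rag_drug_info": {"drug", "medication", "dosage", "dose"},
    "fine_tuned_llm": {"symptom", "symptoms", "diagnosis"},
}
DEFAULT_ROUTE = "general"


def route(query):
    """Return every backend whose keywords appear in the query.

    More than one match means the query goes to a combination of models;
    no match falls back to the default backend.
    """
    tokens = set(re.findall(r"[a-z]+", query.lower()))
    matched = [name for name, keywords in ROUTES.items() if tokens & keywords]
    return matched or [DEFAULT_ROUTE]


if __name__ == "__main__":
    for q in (
        "What is the usual dosage of this medication?",
        "Which drug helps with these symptoms?",
        "Hello there",
    ):
        print(q, "->", route(q))
```

Returning a list rather than a single name is what allows a query to be sent to several models at once, matching the "LLM or combination" behavior described above.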

Achieving High Accuracy and Trustworthiness