So, I asked #ChatGPT how to feed an LLM valuable information. Where should I start, and how? ChatGPT's response is in the comments below.
Prompt: "I have an LLM program that continuously finetunes itself on the data on my server. I want to feed it as much valuable information in my own language as possible. I can only put text files on my server, not actual datasets. The LLM is based on Llama3.x. If I want everything I provide to it to be valuable and accurate, where should I start, and how?"
ChatGPT's response:
To ensure that the information provided to your LLM is valuable and accurate, consider the following steps:
1. Curate the Text Sources
2. Preprocessing the Texts
3. Diversity and Variety
4. Language-Specific Texts
5. Regular Updates
6. Monitor the Finetuning
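To make steps 1 and 2 a bit more concrete, here is a minimal Python sketch of cleaning text files before they go onto the server. It is not part of ChatGPT's response. The function names (`clean_text`, `prepare_corpus`) and the `min_chars` threshold are my own assumptions for illustration, and a real pipeline would likely also need language-specific checks.

```python
import hashlib
import re
import unicodedata


def clean_text(raw: str) -> str:
    """Normalize Unicode, drop control characters, and collapse extra whitespace."""
    text = unicodedata.normalize("NFC", raw)
    # Remove control characters (including carriage returns), keeping newlines and tabs.
    text = "".join(
        ch for ch in text if ch in "\n\t" or unicodedata.category(ch) != "Cc"
    )
    # Collapse runs of spaces/tabs into one space, and 3+ newlines into a blank line.
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def prepare_corpus(documents: dict[str, str], min_chars: int = 200) -> dict[str, str]:
    """Return cleaned documents, skipping very short ones and exact duplicates.

    `documents` maps a (hypothetical) file name to its raw text.
    `min_chars` is an assumed threshold, not a recommended value.
    """
    seen = set()
    kept = {}
    for name, raw in documents.items():
        text = clean_text(raw)
        if len(text) < min_chars:
            continue  # too short to carry much useful signal
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # exact duplicate of a document already kept
        seen.add(digest)
        kept[name] = text
    return kept
```

In practice you would read each `.txt` file from disk, pass the contents through `prepare_corpus`, and write only the kept texts to the server folder. This only catches exact duplicates after cleaning. Near-duplicates and factual accuracy still need manual review.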