Part 9/12:
- Leverage existing documents: Use domain knowledge repositories to enrich training and evaluation datasets.
They recommend using LLMs as judges during evaluation: ask the judge the same question several times and take a majority vote (self-consistency) to approximate human judgment while reducing annotation costs.
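The majority-voting idea can be sketched in a few lines. This is a minimal illustration, not the speakers' implementation: `judge` stands in for whatever function calls your LLM and returns a label, and the name `majority_verdict` and the default of five samples are assumptions made for the example.

```python
from collections import Counter
from typing import Callable, Tuple


def majority_verdict(
    judge: Callable[[str], str],  # hypothetical: wraps your LLM call, returns a label
    prompt: str,
    n_samples: int = 5,           # assumed default; an odd number avoids two-way ties
) -> Tuple[str, float]:
    """Ask the judge n_samples times; return the winning label and its vote share."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    # Normalize each answer so "PASS " and "pass" count as the same vote.
    votes = Counter(judge(prompt).strip().lower() for _ in range(n_samples))
    label, count = votes.most_common(1)[0]
    return label, count / n_samples
```

The vote share doubles as a rough confidence signal: items where the judge disagrees with itself are natural candidates for human review.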
Scaling and Deployment Strategies in Practice
API Rate Limits and Infrastructure
Handling increased user loads involves:
- Deploying agents geographically closer to users.
- Using load balancers to distribute traffic.
- Negotiating higher quotas with cloud providers or reserving dedicated tokens-per-minute (TPM) capacity.
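Alongside the measures above, a client can also pace its own requests so it stays under a tokens-per-minute quota. The sketch below uses a token bucket, which is a technique swapped in here for illustration rather than something described in the talk; the class name `TokenBudget`, its methods, and the quota of 600 are all hypothetical.

```python
import time


class TokenBudget:
    """Client-side tokens-per-minute budget implemented as a token bucket."""

    def __init__(self, tokens_per_minute: int, clock=time.monotonic):
        self.capacity = float(tokens_per_minute)
        self.refill_rate = tokens_per_minute / 60.0  # tokens restored per second
        self.available = self.capacity               # start with a full bucket
        self.clock = clock                           # injectable for testing
        self.last = clock()

    def _refill(self) -> None:
        # Restore tokens for the time elapsed since the last check, capped at capacity.
        now = self.clock()
        self.available = min(
            self.capacity, self.available + (now - self.last) * self.refill_rate
        )
        self.last = now

    def wait_time(self, cost: int) -> float:
        """Seconds to wait before a request costing `cost` tokens fits (0.0 if it fits now)."""
        if cost > self.capacity:
            raise ValueError("request exceeds the per-minute quota")
        self._refill()
        if cost <= self.available:
            return 0.0
        return (cost - self.available) / self.refill_rate

    def consume(self, cost: int) -> bool:
        """Deduct `cost` tokens if available; return whether the request may proceed."""
        self._refill()
        if cost <= self.available:
            self.available -= cost
            return True
        return False
```

A caller would typically run `time.sleep(budget.wait_time(cost))` and then `budget.consume(cost)` before each request. The provider still enforces the real limit, so this only reduces how often requests are rejected.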
Containerization and Microservices
Encapsulating agents in Docker containers and deploying them as microservices enables:
- Dynamic scaling based on demand.