Part 8/11:
- For instance, rather than experimenting with all 10 models, only 3 experiments were needed to identify the most efficient model (e.g., a 3.7B-parameter model) that achieved the desired accuracy.
The proposed approach cut computational cost from approximately $232 per experiment to about $81.2, a saving of around 65%. It also reduced fine-tuning time from weeks to days, a critical advantage for industry deployment.
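The savings figure can be checked with a few lines of arithmetic. The sketch below is only a worked check of the numbers quoted above; the function name `savings_fraction` is a hypothetical helper, not something from the original work.

```python
def savings_fraction(baseline_cost, reduced_cost):
    """Return the fraction of the baseline cost that was saved."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return 1 - reduced_cost / baseline_cost

# Figures quoted in the text: about $232 before, about $81.2 after.
cost_saving = savings_fraction(232.0, 81.2)

# Running 3 experiments instead of 10 avoids 7 of the 10 runs.
experiments_avoided = savings_fraction(10, 3)

print(f"Cost saving: {cost_saving:.0%}")
print(f"Experiments avoided: {experiments_avoided:.0%}")
```

The two ratios are close (65% cost saving against 70% fewer experiments), which is consistent with most of the saving coming from running fewer fine-tuning experiments.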
Key optimization points included:
- Data sampling: Certified random sampling to minimize dataset size without sacrificing representativeness.
- Model pool narrowing: Fine-tuning only models available on leaderboards.
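The two optimization points above can be sketched roughly as follows. This is a minimal illustration that assumes plain uniform random sampling with a fixed seed; it is not the certified sampling procedure the source refers to, whose details are not given here. The function names, model names, and data are hypothetical.

```python
import random

def sample_dataset(records, fraction, seed=0):
    """Return a uniform random subset holding about `fraction` of the records."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not records:
        return []
    k = max(1, round(len(records) * fraction))
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    return rng.sample(records, k)

def narrow_model_pool(candidates, leaderboard):
    """Keep only candidate models that also appear on the leaderboard."""
    listed = set(leaderboard)
    return [model for model in candidates if model in listed]

# Hypothetical data for illustration only.
records = list(range(1000))
subset = sample_dataset(records, fraction=0.1)
pool = narrow_model_pool(
    ["model-a", "model-b", "model-c"],  # assumed candidate models
    ["model-c", "model-a"],             # assumed leaderboard entries
)
print(len(subset), pool)
```

Uniform sampling only preserves representativeness on average; a skewed dataset may call for stratified sampling instead, which is a design choice outside what the source describes.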