NVIDIA
Sovereign AI gets boost from new NVIDIA microservices
To ensure AI systems reflect local values and regulations, nations are increasingly pursuing sovereign AI strategies; developing AI utilising their own infrastructure, data, and expertise. NVIDIA is lending its support to this movement with the launch of four new NVIDIA NIMs.
These microservices are designed to simplify the creation and deployment of generative AI applications, supporting regionally-tailored community models. They promise deeper user engagement through an enhanced understanding of local languages and cultural nuances, leading to more accurate and relevant responses.
This move comes amidst an anticipated boom in the Asia-Pacific generative AI software market. ABI Research forecasts a surge in revenue from $5 billion this year to a staggering $48 billion by 2030.
Among the new offerings are two regional language models: Llama-3-Swallow-70B, trained on Japanese data, and Llama-3-Taiwan-70B, optimised for Mandarin. These models are designed to possess a more thorough grasp of local laws, regulations, and cultural intricacies.
Further bolstering the Japanese language offering is the RakutenAI 7B model family. Built upon Mistral-7B and trained on both English and Japanese datasets, they are available as two distinct NIM microservices for Chat and Instruct functions. Notably, Rakuten’s models have achieved impressive results in the LM Evaluation Harness benchmark, securing the highest average score among open Japanese large language models between January and March 2024.
Training LLMs on regional languages is crucial for enhancing output efficacy. By accurately reflecting cultural and linguistic subtleties, these models facilitate more precise and nuanced communication. Compared to base models like Llama 3, these regional variants demonstrate superior performance in understanding Japanese and Mandarin, handling regional legal tasks, answering questions, and translating and summarising text.
This global push for sovereign AI infrastructure is evident in significant investments from nations like Singapore, UAE, South Korea, Sweden, France, Italy, and India.
“LLMs are not mechanical tools that provide the same benefit for everyone. They are rather intellectual tools that interact with human culture and creativity. The influence is mutual where not only are the models affected by the data we train on, but also our culture and the data we generate will be influenced by LLMs,” said Rio Yokota, professor at the Global Scientific Information and Computing Center at the Tokyo Institute of Technology.
“Therefore, it is of paramount importance to develop sovereign AI models that adhere to our cultural norms. The availability of Llama-3-Swallow as an NVIDIA NIM microservice will allow developers to easily access and deploy the model for Japanese applications across various industries.”
NVIDIA’s NIM microservices enable businesses, government bodies, and universities to host native LLMs within their own environments. Developers benefit from the ability to create sophisticated copilots, chatbots, and AI assistants. Available with NVIDIA AI Enterprise, these microservices are optimised for inference using the open-source NVIDIA TensorRT-LLM library, promising enhanced performance and deployment speed.
Performance gains are evident with the Llama 3 70B microservices, (the base for the new Llama–3-Swallow-70B and Llama-3-Taiwan-70B offerings), which boast up to 5x higher throughput. This translates into reduced operational costs and improved user experiences through minimised latency.