RE: LeoThread 2024-11-16 03:13

Daisy was built by the team at O2 using a custom-trained large language model with a 'character personality layer' to produce personalized responses.

It listens to the caller, transcribes the speech into text, and sends that text to the LLM, which generates a response that is played back to the caller using text-to-speech. This is similar to the way Google Gemini Live works or the earlier version of ChatGPT Voice.
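
For anyone curious what that loop looks like in code, here is a rough sketch of one conversational turn (speech-to-text, then LLM, then text-to-speech). This is not O2's implementation: the `VoiceAgent` class and all of its methods are hypothetical, and each stage is a placeholder standing in for a real speech or language model.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceAgent:
    """Hypothetical speech -> text -> LLM -> speech loop (illustration only)."""

    persona: str
    history: list = field(default_factory=list)

    def transcribe(self, audio: bytes) -> str:
        # Placeholder speech-to-text: treats the "audio" bytes as UTF-8 text.
        return audio.decode("utf-8")

    def generate_reply(self, text: str) -> str:
        # Placeholder for the LLM call; a real system would send the
        # conversation history plus a persona prompt to a language model.
        self.history.append(("caller", text))
        reply = f"[{self.persona}] Could you tell me a bit more about: {text}"
        self.history.append(("agent", reply))
        return reply

    def synthesize(self, text: str) -> bytes:
        # Placeholder text-to-speech: encodes the reply text back to bytes.
        return text.encode("utf-8")

    def handle_turn(self, audio: bytes) -> bytes:
        # One turn: listen, transcribe, generate a reply, speak it back.
        return self.synthesize(self.generate_reply(self.transcribe(audio)))


agent = VoiceAgent(persona="Daisy")
spoken = agent.handle_turn(b"I need your bank details")
```

Keeping the history on the agent is one simple way to let the reply stage see earlier turns, which a character that stays consistent over a long call would need.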

If you've ever had a conversation with Gemini Live or Meta AI Voice, the experience will be fairly similar. It happens in real time with no noticeable delay.