Undergrads Develop Open AI Speech Model Dia to Challenge NotebookLM
Two undergraduate students have created Dia, an AI speech model rivaling Google’s NotebookLM, offering customizable voice generation and easy voice cloning, now available on Hugging Face and GitHub.

Talk about hitting the ground running: Toby Kim and his co-founder at Nari Labs, a pair of undergrads from Korea with no formal AI training, managed to build Dia in just three months. This open-source AI speech model isn't just a neat school project; it's giving Google's NotebookLM a genuine run for its money, letting users script entire dialogues and shape how the voices deliver them.
Dia's no lightweight, either: it packs 1.6 billion parameters, trained on compute from Google's TPU Research Cloud program (talk about a sweet deal). Its party trick is generating full dialogue from a written script, complete with controllable tone and even nonverbal cues like coughs and laughs. And get this: it's up for grabs on GitHub and Hugging Face, and it runs on a single consumer GPU, provided you've got at least 10GB of VRAM.
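To make the script-to-dialogue idea concrete, here's a minimal sketch in Python. The speaker-tag conventions ([S1]/[S2] turns with parenthesized nonverbal cues) follow the format described in the nari-labs/dia README; the `format_script` helper is hypothetical, and the commented-out model-loading lines mirror the pattern shown in that README, which may change as the project evolves:

```python
# Sketch: assembling a Dia-style dialogue script with speaker tags and
# nonverbal cues. format_script is a hypothetical helper; the tag format
# ([S1]/[S2], cues in parentheses) follows the nari-labs/dia README.

def format_script(turns):
    """Join (speaker_number, text) pairs into one tagged script string."""
    return " ".join(f"[S{num}] {text}" for num, text in turns)

script = format_script([
    (1, "Did two undergrads really build this in three months?"),
    (2, "They did. (laughs) With Google's TPU Research Cloud behind them."),
])
print(script)
# [S1] Did two undergrads really build this in three months? [S2] They did. (laughs) With Google's TPU Research Cloud behind them.

# Feeding the script to the model requires the dia package from GitHub
# and roughly 10GB of VRAM; API shape per the project README, may change:
#
# from dia.model import Dia
# model = Dia.from_pretrained("nari-labs/Dia-1.6B")
# audio = model.generate(script)
```

The interesting design choice here is that everything, who speaks, in what order, even when they laugh, lives in one plain text string, which is what makes the model so easy to drive from ordinary scripts.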
But here's the kicker: Dia is like that friend who's fun but maybe a bit too wild. It ships with next to no guardrails against misuse, which is genuinely worrying when you think about voice-cloning scams or disinformation. Nari Labs' stance amounts to "don't be evil," but the company isn't enforcing anything either. On top of that, the provenance of Dia's training data is undisclosed, which opens a whole separate can of worms about copyright and who owns what.
What’s next? Nari Labs is dreaming big—more languages, a voice platform with a social twist. As voice AI blows up, Dia’s sitting right at the crossroads of ‘Wow, this is amazing’ and ‘Uh-oh, what did we just unleash?’