Gemini, are two brains better than one?
A podcast about multimodal large language models, twins, and whether two brains are better than one.
In this episode of Good Question, we explore one of the most exciting developments in the world of large language models: multimodal LLMs, which combine different types of data — like text and images — to create a more nuanced understanding of information.
We talked about how the relationship between those models is akin to being twins, where two brains can offer different perspectives while working toward the same goal. Just as twins, like our co-founders Ronan and Conor Burke, can leverage their unique viewpoints to tackle challenges more effectively, multimodal LLMs use their ability to process multiple data types to outperform single-modal models.
Our discussion also covers why integrating multimodal data into LLMs is crucial for advancing AI toward behaviors that mimic human reasoning, how multimodal LLMs improve performance on tasks such as document parsing, and where these models are headed (particularly in consumer applications).
Interesting moments …
Short on time? Check out these interesting moments from the conversation …
- 00:47 — "I received a text message the other day from none other than Gemini, Google's multimodal large language model."
- 01:50 — "Multimodal is really important with AGI or trying to build systems that can have similar behaviors and capabilities as people and humans."
- 06:30 — "Like multimodal, having a twin is like viewing a situation from two different perspectives, but still being aligned. And startups are a roller coaster, so it’s really important to have that support system."
- 17:39 — "Of all the big tech companies—Facebook, Apple, Microsoft, Amazon—it's likely they're all going to need to have best-in-class LLMs."
- 28:36 — “At the moment, do our agents talk to each other? It depends on what you define as an agent. For our customers, we deploy one agent at a time, so there's really no opportunity for those agents to talk to another agent because they don't have any other agents to talk to yet.”
- 31:25 — “Enterprises are excited and optimistic. They're already accepting the impending positive impact of AI agents, which is why so many are planning integration within the next few years.”
We hope you enjoy this conversation about the similarities between multimodal LLMs and twins, exploring why two brains—whether human or artificial—are indeed better than one.
P.S. Need a little levity in your day? Listen to the end of the podcast to hear each episode’s bloopers!
Sources cited
- Apple researchers build multimodal LLM as AI strategy takes shape
- Autonomous AI workers that talk to each other will arrive in 2025, Capgemini predicts
About the guests
- Ronan Burke is the co-founder and CEO of Inscribe. He founded Inscribe with his twin after they experienced the challenges of manual review operations and over-burdened risk teams at national banks and fast-growing fintechs. So they set out to alleviate those challenges by deploying safe, scalable, and reliable AI.
- Conor Burke is the co-founder and CTO of Inscribe. He founded Inscribe with his twin after they experienced the challenges of manual review operations and over-burdened risk teams at national banks and fast-growing fintechs. So they set out to alleviate those challenges by deploying safe, scalable, and reliable AI.
- Brianna Valleskey is the head of marketing at Inscribe AI. While her career started in journalism, she has spent more than a decade working on SaaS revenue teams. She is passionate about enabling fraud fighters and risk leaders to unlock the enormous potential of AI.