This week in the world of AI, Google has made a notable stride with its latest model, Gemini. Yet, despite the buzz, it still seems to be playing catch-up with OpenAI’s already established prowess.
The timing of Gemini’s launch, nestled between Thanksgiving and Christmas, breaks from the norm, possibly highlighting Google’s urgency in responding to ChatGPT’s emergence more than a year prior. Google’s eagerness is palpable, as it tries to recast its image from a slow-moving giant back to a forward-thinking innovator. Remember THAT Google? The scrappy little search startup with nothing on its home page but a search bar. And we all laughed. Well, after some anticipation and delay, Google has introduced Gemini, showcasing some impressive capabilities, such as native multimodality, detecting magic tricks, and excelling in accountancy exams, through a social media-stirring demo video. However, to me at least, this comes across more as masterful spin than a groundbreaking technological leap.
Google’s data puts its Gemini Ultra model marginally ahead of OpenAI’s GPT-4 model in nearly all standard benchmarks, which test AI’s prowess in areas like physics, law, and ethical reasoning. These benchmarks are the battlegrounds of the current AI race. Nevertheless, Gemini’s edge over GPT-4 is slight, suggesting that Google’s latest model is more of a step forward than a leap ahead, basically refining a model that OpenAI had already developed over a year ago. Sigh.
Moreover, Google’s Gemini Ultra remains under wraps. If its planned early January 2024 release actually materializes, it might not hold the crown for long. OpenAI, known for its agility, has had more than a year since GPT-4’s completion to innovate further, potentially with GPT-5.
I do want to stress that the demo video of Gemini, hailed as “jaw-dropping” by tech enthusiasts on social media platforms, does showcase Google’s advances in AI reasoning, a hallmark of its DeepMind lab. The AI’s ability to track objects and make inferences from incomplete images hints at sophisticated cognitive skills. However, many of these features aren’t exclusive to Gemini. In fact, OpenAI’s ChatGPT Plus has demonstrated similar capabilities, as Wharton professor Ethan Mollick has shown.
So while Google’s Gemini marks a noteworthy advancement in the AI domain, it primarily represents an incremental improvement over existing technology from OpenAI. The real test for Gemini will be in its ability to maintain its relevance and superiority in the face of rapid advancements and upcoming innovations in the AI field.
In addition, Google’s Gemini, while remarkable in its presentation, seems to be more of a well-crafted demonstration than a reflection of real-time capabilities. The company admits to editing the demo video for succinctness, reducing latency and shortening responses for clarity. This editing means the actual response time in real-world scenarios would likely be much longer than what the demonstration portrays.
Interestingly, the demo wasn’t conducted in real time or through voice interaction. Google clarified that it was produced using still image frames and text-based prompting, a significant deviation from the impression of a seamless voice interaction with real-time visual processing. This revelation points to a more controlled and less dynamic interaction than one might expect from the video’s portrayal.
And finally, the demo doesn’t explicitly mention that it features Gemini Ultra, the yet-to-be-released version of the model. This omission is part of a broader marketing strategy, aiming to cement Google’s reputation as a leader in AI research, backed by extensive data access and a vast deployment network. Google’s decision to integrate less advanced versions of Gemini into products like Chrome, Android, and Pixel phones emphasizes its strategy to leverage this widespread presence.
However, ubiquity in technology doesn’t always translate into market dominance, as history shows with the decline of Nokia and BlackBerry following the introduction of Apple’s iPhone. In the realm of software, the key to success often lies in performance superiority, not just widespread availability.
Google’s push with Gemini appears to be a tactical response to recent upheavals at OpenAI, including a leadership shake-up. This is evidenced by its attempts to sway OpenAI’s corporate clients during this period of uncertainty. The launch of Gemini, therefore, seems to ride this wave, aiming to position Google as a fast-advancing competitor in AI. But I’m not so sure.
While Google’s demonstration of Gemini is impressive, it’s essential to remember that the company has previously showcased groundbreaking technologies that didn’t achieve widespread adoption or impact (like Google Duplex). Google’s complex bureaucracy and layered management have historically slowed its product releases compared to more agile entities like OpenAI.
As the world continues to adapt to the profound implications of AI, it’s prudent to view Google’s apparent leap forward with a measure of skepticism. Despite the impressive showcase, Google is arguably still playing catch-up in the AI race.
