Google has just introduced Gemini, its new frontier in artificial intelligence, with a demonstration that stunned the world. A video posted on YouTube shows Gemini's extraordinary ability to interpret and respond to visual and verbal stimuli.
The seemingly simple test quickly turns into an incredible demonstration of the "almost human" abilities of this AI in understanding and interacting with the surrounding world.
Google Gemini: a quantum leap in artificial intelligence
The emergence of Google Gemini (which we covered last September, when it was first announced) marks a turning point in the artificial intelligence landscape. Gemini's ability to interpret and respond to different visual and verbal signals surpasses anything we have seen from AI technologies so far.
This is not simply a breakthrough in visual recognition or natural language understanding. What you see in the demonstration is a remarkably seamless integration of both capabilities, something that brings AI closer to a true understanding of human context.
The Google Gemini demo: a window into the future
First of all, if you missed it, you MUST see it. Here it is:
The demo begins with a human participant asking Gemini to describe what it sees. The simple act of placing a Post-it and drawing an improvised line on it is readily interpreted by Gemini. But it is the continuation of the test that reveals Gemini's true power.
As the drawing evolves into a recognizable figure, a duck, Gemini not only identifies the object correctly but also provides details about the surrounding environment, demonstrating a firm grasp of the visual context.
Beyond recognition: interaction and translation
Google Gemini's intelligence is not limited to mere visual interpretation. When the participant introduces games and translation requests, Gemini responds precisely. Its ability to translate “duck” into various languages, and to understand and participate in simple games, highlights a level of interactivity and versatility that previously seemed the exclusive preserve of humans.
The practical applications of a technology like Google Gemini? Honestly, it is impossible to define their limits. From surgery to education, from home applications to creative industries, the possibilities seem endless. Gemini could revolutionize the way we interact with technology, making the human-machine interface more intuitive, natural and efficient.
Yes, but when will we be able to use it?
After expressing sincere admiration for what we saw in the demo, and in the same spirit of sincerity, I must also point out that so far Google has delivered little substance. Bard, dragged into the arena against OpenAI's ChatGPT and Anthropic's Claude, was presented with too many expectations. The technology actually in the field is inferior to that of its competitors (this applies to the language model: other AI efforts, such as DeepMind's, are delivering great results). And the fact that Google Gemini still has no official launch date produces some frustration.
Maybe it seems "too advanced to be true," or maybe it's just that we can't wait to get our hands on it, but the time for demonstrations is over. The Google Gemini demo promises to overcome the current limitations of AI technologies: so let's see it in action.
Don't make me suspect that this is just another way to stall for time.
Edit 8/12/2023: There you have it, almost as if on cue. After pressure from many users, Google has admitted that the Gemini demo was actually created using still frames from the footage and text prompts, rather than having Gemini respond in real time to a drawing or to changes in the objects on the table, let alone anticipate them. This is much less impressive than the video would have you believe, and worse, the failure to disclose the actual input method makes Gemini's readiness rather questionable, as does Google's behavior.