In the beginning there was GPT-3, with its 175 billion parameters. Then came PaLM, Megatron-Turing, Chinchilla and the rest, and parameter counts broke through the trillion threshold. These are LLMs, Large Language Models: models trained on immense quantities of text to imitate human language. And they have become the undisputed protagonists of the race toward artificial intelligence. But there's a problem.
The larger and more sophisticated these models become, the more energy they devour. And the improvements they achieve seem to grow only linearly, while the costs grow exponentially. It is a signal that perhaps, to reach the coveted AGI, artificial general intelligence capable of thinking like (or better than) human beings, a paradigm shift is needed. That the answer lies not only in big data, but in more efficient and more "rational" architectures.
The LLM paradox
Let's be clear: Large Language Models are a marvel of human ingenuity. They have learned to master the subtleties of language with incredible skill. They can write articles, answer questions, summarize complex concepts, translate between dozens of languages, and now they even know how to generate code. Sometimes, and I don't think I'm exaggerating, they seem to grasp the deeper meaning of the texts they produce, to have an almost human understanding of the world.
Right. Almost.
But the more we put them under stress, the more their limitations become apparent: they are very good at recognizing patterns and imitating styles, but not at thinking independently. Behind the glittering facade of their linguistic performances there is no real intelligence, no capacity for inference and problem-solving beyond the data on which they were trained. Nothing.
And to obtain even these partial results, LLMs require staggering amounts of energy. Suffice it to say that training GPT-3 "cost" the energy equivalent of flying a plane 700 times between New York and San Francisco. An enormous environmental and economic burden, which grows exponentially with each leap in model scale. It's as if, to make a worker slightly more productive, we had to double their salary every month. An unsustainable dynamic in the long term.
The spark wasn't there
But it's not just a question of cost-benefit. There is a deeper problem plaguing LLMs, which undermines their ambitions to become the basis for general artificial intelligence. And it is the lack of abstract reasoning, of true "thought" beyond superficial analogies.
Some researchers hoped that this kind of capability would "emerge" spontaneously from LLMs once their parameters and datasets grew large enough. The idea was that the more information and computing power you threw at the model, the more it would develop an intelligence of its own, emulating not only human language but also the underlying cognitive processes.
So far, however, there are no signs of this "emergence". Even the most advanced LLMs, when faced with tasks that require logical reasoning, planning, or out-of-the-box creativity, get lost in meaningless conjectures and hallucinations. It seems that intelligence, real intelligence, is not just a matter of brute-force statistics at monstrous scale, but requires different architectures and learning processes, still largely to be discovered.
The new paths for general artificial intelligence
The difficulties shown by LLMs are why many researchers are exploring alternative paths toward the ultimate goal of AGI. One of these is category theory, a branch of abstract mathematics that studies relationships between algebraic structures. Some startups, like Symbolica, believe it can provide the theoretical framework for building artificial intelligence systems capable of developing symbolic representations of the world, not just statistical associations between words.
Another promising trend is "goal-oriented" AI, designed to achieve specific objectives in complex three-dimensional environments, interacting with objects and agents physically as well as linguistically. The idea is that intelligence is not born in a vacuum, but develops through embodiment, action situated in the world, exactly as it happens in children. Not surprisingly, it is estimated that a 4-year-old child has already processed, through multisensory exploration of the environment, roughly 50 times the data of the largest current LLM.
These are just two of the new frontiers that are opening up in the field of artificial intelligence, in an attempt to overcome the limitations of LLMs and really get closer to AGI. Frontiers that require not only technological advances, but also and above all a profound rethinking of what intelligence is and how it can emerge in artificial systems.
LLM, (artificial) intelligence no longer lives here
I'll get to the point. For decades, artificial intelligence has been "trapped" in a paradigm of pure symbolic manipulation, based on the idea that thinking is essentially processing strings of abstract symbols according to syntactic rules. It is the paradigm that gave birth to expert systems and semantic search engines, and which ultimately underlies current LLMs as well, albeit supercharged by massive datasets and neural architectures.
But perhaps it is precisely this "disembodied" and reductionist paradigm that represents the real bottleneck towards AGI. Perhaps intelligence is not just an algorithm to be run on a computer, but an emergent property of complex systems that dynamically interact with an environment, modifying it and allowing itself to be modified in a continuous cycle of perception, action and learning.
Perhaps, to create truly general artificial intelligence, we must draw more inspiration from the only general intelligence we know, that is, biological intelligence, with its distributed architecture, its neural plasticity, its sensorimotor anchoring in the world. And perhaps we must also recognize that intelligence is not a goal to be achieved, but a continuously evolving process, which does not have a predefined final form.
This doesn't mean that LLMs are useless or to be thrown away
Far from it: they represent an important stage in the evolution of AI, and still have many practical applications to be explored. But perhaps it is time to scale back the messianic expectations many have placed on them, and to recognize their intrinsic limits as candidates for artificial general intelligence.
If it ever arrives, AGI will probably not be a disembodied superbrain babbling in 1000 languages, but an integrated and embodied agent that learns from the world and transforms it, a bit like we humans do. And to get there, it will take not only much more energy, but above all much more imagination.
The frontier of the possible
I think the point isn't even to get there, to AGI. The point is to continually expand the frontier of what intelligence, human today and artificial tomorrow, can do. It means pushing the boundaries of what is thinkable and possible, through the hybrid collaboration between our biological and synthetic minds.
After all, this is what we have always done, ever since we carved the first symbols into stone or pressed the first keys on a computer: use technology to strengthen our intellect, to multiply our cognitive and creative abilities, to tackle increasingly vast and complex problems.
LLMs, with all their limitations, represent a step forward on this path. They show us how flexible and powerful language is, a technology in itself that permeates every aspect of our lives. And they challenge us to invent new languages, new grammars of thought, to express the inexpressible and imagine the unimaginable.
The real goal is not to create an artificial intelligence that replaces us, but to co-evolve with it in symbiosis, bringing out forms of intelligence that we don't even know how to conceive yet.
LLM and the future of intelligence
LLMs are here to stay. Like bicycles in the world of transport, they are destined to lend us a big hand, but more will be needed.
Perhaps the future of intelligence is not a technological singularity, but a plurality of interconnected intelligences, human and non-human, biological and synthetic. An explosion of cognitive diversity that will take us beyond the current limits of thought, towards new frontiers of meaning and possibility.
But to get there, we must first free ourselves from the preconceptions and narrow visions that still imprison us. We must stop chasing computational illusions that reproduce the external manifestations of our intelligence in a clumsy and partial way, without grasping its profound essence.
We must have the courage to radically rethink what it means to be intelligent in an ever-changing universe. And we must do it with curiosity, openness, enthusiasm. With the knowledge that intelligence is not an algorithm to be discovered, but a process to be created and expanded, day after day, error after error, intuition after intuition.
The road to AGI, or whatever the intelligence of the future will be, does not pass (only) through LLMs. It passes through the unexpected connections that we will be able to imagine, through the unexplored spaces that we will be able to inhabit, through the impertinent questions that we will be able to ask.
It passes through our ability to be amazed and to dream, to make mistakes and to learn, to deconstruct and reconstruct ourselves and the world around us. Because intelligence is nothing other than this: the courage to always venture a little further, a little higher, a little deeper. Towards the next limit to break, the next frontier to explore. Towards the unknown that awaits us, and which perhaps, thanks to AI, will no longer scare us.