From Chatbots to Agents: The Evolution of Artificial Intelligence
AI agents are raising the stakes yet again for what we can expect from our technological companions in the coming months. Over the past few months, in speeches and presentations, I have been showing audiences how many of the things once said to be impossible or “a century away” are now arriving every few months, right before our eyes. The rate of change, and the things you and I will be able to do in the coming months and years, will lead us to a much better future, where more people can do more things with the aid of artificial intelligence.
AI Agents
AI agents are the latest development in AI and mark a clear departure from the chatbots we have all become accustomed to over the past year. These new systems, from OpenAI, Google, and soon many others, are the next significant evolution in the race to create more functional, robust, and embedded AI that can take on ever more complex tasks.
Agentic systems (or agent systems) can reason, plan, interact at human speed, tell jokes, emote, use tools, be interrupted, and much more. All of this can be done in spoken natural language, without the need to type and wait for a response as we currently do with traditional chatbots. But before we get too deep, let's quickly recap where we are and then dive into the high-level elements of agentic systems.
Where Are We?
Today, if you want to use an AI system and you are not an AI expert, you will use a text-based chatbot. Think ChatGPT, Claude, Gemini, etc. You type your request and then wait a few seconds for the system to generate the requested information. It can be a research question, a poem, a strategy, you name it. Undoubtedly, it is handy, but it’s not how we usually communicate information. Having to write everything down and wait for the AI to respond is a limiting factor.
There is also the issue that arises when we notice the AI generating a response that’s not what we wanted. Sometimes, we can click stop, but we have to start the process over until we end up at the optimal solution. Or what if we forgot something in our request? What do we do? Well, we have to wait and then try again—by typing. But this is changing, as we will discover. In the same way you can politely interrupt your coworker or management consultant when they are giving you an explanation, you will be able to interrupt and redirect the AI in real time when it’s giving you a response.
You might be thinking, what about vision? What happens to text? Excellent questions. While agentic systems can function via text, voice, and visual modes, the main advancement we are focusing on today is voice interaction, since most users will naturally move toward speaking with these systems for most use cases. Vision has its own valuable use cases, especially when combined with voice. But that’s for a later article.
Where Are We Going?
In the coming weeks and months, you will have access to these systems, and you can forget the days of typing—well, for the most part. Not only that, if you notice something incorrect in what the system is generating, you can interrupt the AI to adjust the output, change the request, or start over. But that’s not even the best part of these systems.
These systems respond with a latency comparable to that of human conversation, with reported response times of roughly 232 to 320 milliseconds, which means that completing tasks with the system feels much like speaking to a friend or colleague. The conversations also don’t have to be short or limited to a single request and reply. Instead, you can explain and converse with the system to lay out what you want it to accomplish for you and in what manner.
This brings us to the next essential element: planning. A major challenge with chatbots has been their inability to truly plan and to carry out complex tasks that depend on planning. As these systems get better at planning, the range of tasks, the level of complexity, and the number of use cases that agentic AI can handle all grow. One of the critical elements of planning is the system's ability to identify and use the correct tools needed to complete a given task.
For example, if you need to create a software project, the system could assist you in assembling research for the project, then help you write some code based on the research, then help you check and test the code, and so on. This allows the system and you to consistently create more valuable and reliable work products.
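For readers who like to see the idea in code, here is a minimal Python sketch of that research, write, and test sequence. It is an illustration only, not how any particular vendor's agent works: the tool functions, the `Step` class, `make_plan`, and `execute` are all hypothetical names invented for this example, and the plan is hard-coded where a real agent would generate it from the goal.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical tools: simple stand-ins for real research, coding, and testing.
def research(topic: str) -> str:
    return f"notes on {topic}"


def write_code(notes: str) -> str:
    return f"code based on {notes}"


def run_tests(code: str) -> str:
    return f"tests passed for {code}"


# Registry the agent consults to find the right tool for each step.
TOOLS: Dict[str, Callable[[str], str]] = {
    "research": research,
    "write_code": write_code,
    "run_tests": run_tests,
}


@dataclass
class Step:
    tool: str         # name of the tool chosen for this step
    description: str  # human-readable summary of the step


def make_plan(goal: str) -> List[Step]:
    # Assumption: a fixed three-step plan. A real agent would derive it from the goal.
    return [
        Step("research", f"gather research for {goal}"),
        Step("write_code", "write code based on the research"),
        Step("run_tests", "check and test the code"),
    ]


def execute(goal: str) -> List[str]:
    # Run each step in order, feeding the previous result into the next tool.
    results: List[str] = []
    current = goal
    for step in make_plan(goal):
        current = TOOLS[step.tool](current)
        results.append(current)
    return results


if __name__ == "__main__":
    for line in execute("a budgeting app"):
        print(line)
```

The point of the sketch is the shape, not the details: a plan is an ordered list of steps, each step names a tool, and the output of one step becomes the input of the next.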
A core reason the system can generate realistic plans is that it has memory that allows it to remember a broad range of essential items about the tasks and the individual user. This makes it increasingly valuable the more it is used and will undoubtedly supercharge productivity in users’ personal and professional lives.
This ease of use, combined with increased productivity, offers a very attractive foundation for companies and workers alike to dramatically strengthen their competitive advantage within their industry, both domestically and internationally. For companies in Mexico and Latin America, a deep focus on crafting an AI vision and strategy to harness agentic AI in combination with other forms of AI has never been more pressing. The opportunities are there; you just need to start the process and get going. But AI is very cold and boring to work with, right? My team won’t enjoy it, you say!
One of the most fascinating and enjoyable elements of these systems is their ability to emote in a way that breaks down the barrier that typically exists between human and machine when working with an AI. We’re social animals and prefer to spend our time with “things” that we feel we can bond with and that understand us, which is something these AIs offer. When using the voice mode, you can feel the system's warmth, laughter, and playfulness, making working with it that much more attractive and immersive. Imagine if Excel could crack a joke when you’re making a pivot table or cleaning a data set. I don’t know about you, but I’d like that.
Ultimately, what makes these systems so incredible and powerful compared to the chatbots we have all become used to is not only their ability to take action and complete tasks. It is also their ability to speak with us, joke with us, and bond with us as we improve our careers, companies, and future opportunities. It’s as if your microwave took the food out of your freezer, defrosted it, then heated it while telling you a story, and then served it to you for your enjoyment.
Right now, we’re in the beginning stages of what these AIs can do, but soon the range of possible applications will be exceptional: from helping to design and code software projects, to researching and designing architectural plans, to perhaps helping a firm dramatically reposition itself globally.
As the old saying goes, the best time to plant a tree was yesterday; the next best time is today. The same goes for starting your journey in artificial intelligence.
Today is your day.
Originally published in Spanish for Fast Company Mexico:
https://fastcompany.mx/2024/10/19/agentes-ia-chatbots-christopher-sanchez/