Autonomous Agents and Large Language Models


Autonomous Agents

With the rapid rise of ChatGPT and many similarly capable large language models (LLMs) and other generative models (e.g. Google Bard, LLaMA, and models from Stability AI), we’re living in an exciting time for singular tasks like writing or generating images. These creative tasks, previously thought to be difficult for any AI, are now done routinely by hundreds of millions of users in just a few seconds.

An exciting trend that was unlocked by these LLMs is the rise of autonomous agents. While large language models are able to handle individual tasks, autonomous agents are able to take on an objective and do the following:

  1. Create tasks
  2. Collect data
  3. Prioritize tasks
  4. Execute tasks
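As a rough illustration of how these four steps can fit together, here is a minimal Python sketch of a task loop. It is not the code of any particular project: `run_agent` and the `demo_*` functions are hypothetical names, and the demo functions are simple stand-ins for what would normally be LLM calls.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def run_agent(
    objective: str,
    first_task: str,
    create_tasks: Callable[[str, str, str], List[str]],
    prioritize: Callable[[str, List[str]], List[str]],
    execute: Callable[[str, str, List[str]], str],
    max_steps: int = 10,
) -> List[Tuple[str, str]]:
    """Run an execute / create / prioritize loop until no tasks remain."""
    queue: Deque[str] = deque([first_task])
    context: List[str] = []              # data collected so far (step 2)
    history: List[Tuple[str, str]] = []  # (task, result) pairs, in order

    for _ in range(max_steps):           # hard cap so the loop always ends
        if not queue:
            break
        task = queue.popleft()
        result = execute(objective, task, context)           # step 4
        context.append(result)                               # step 2
        history.append((task, result))
        new_tasks = create_tasks(objective, task, result)    # step 1
        queue = deque(prioritize(objective, list(queue) + new_tasks))  # step 3
    return history


# Hypothetical stand-ins for LLM-backed functions (assumptions for the demo).
def demo_execute(objective: str, task: str, context: List[str]) -> str:
    return f"done: {task}"


def demo_create_tasks(objective: str, task: str, result: str) -> List[str]:
    follow_ups = {"plan trip": ["book lodging", "book flight"]}
    return follow_ups.get(task, [])


def demo_prioritize(objective: str, tasks: List[str]) -> List[str]:
    return sorted(tasks)  # toy rule: alphabetical order


history = run_agent(
    "Visit Tokyo", "plan trip", demo_create_tasks, demo_prioritize, demo_execute
)
for task, result in history:
    print(task, "->", result)
```

The `max_steps` cap is a deliberate choice in this sketch: an agent that keeps creating new tasks could otherwise loop forever.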

Here’s an example architecture from Yohei Nakajima:

Travel Agent Example

To better illustrate this, think of one of these autonomous agents acting as a travel agent.

An example prompt you might give this autonomous agent could be: “I want to go to Tokyo, Japan, leaving LAX on May 5th and returning on May 18th. Book my flights and lodging. For lodging, please optimize for user rating, cost, and distance to transit. I must have a window seat on the plane. Make sure everything is refundable.”

In the above, an agent may do something like:

  1. Search Google Flights and the JAL/ANA websites for the best flights to Japan
  2. Pick the lowest-cost flight with a window seat available
  3. Book the flight, filling in all your passenger information and payment information
  4. Search Airbnb, TripAdvisor, Booking.com, and Hotels.com for lodging options, filtering by cost and user reviews. For each hotel, compare the distances to transit stations
  5. Book the lodging using one of the above portals
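The selection logic behind these steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Flight` and `Lodging` records, the scoring weights, and all sample data are made up here, and a real agent would fetch results from the sites above rather than from hard-coded lists.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Flight:
    carrier: str
    price_usd: float
    window_seat_available: bool
    refundable: bool


@dataclass
class Lodging:
    name: str
    rating: float         # user rating on a 0-5 scale
    nightly_usd: float
    km_to_transit: float
    refundable: bool


def pick_flight(flights: List[Flight]) -> Optional[Flight]:
    """Cheapest flight that meets the hard constraints, or None."""
    eligible = [f for f in flights if f.window_seat_available and f.refundable]
    return min(eligible, key=lambda f: f.price_usd, default=None)


def rank_lodging(options: List[Lodging], w_rating: float = 1.0,
                 w_cost: float = 1.0, w_distance: float = 1.0) -> List[Lodging]:
    """Rank refundable options: higher rating, lower cost, shorter distance."""
    eligible = [o for o in options if o.refundable]
    if not eligible:
        return []
    max_cost = max(o.nightly_usd for o in eligible)
    max_dist = max(o.km_to_transit for o in eligible)

    def score(o: Lodging) -> float:
        # Normalize each term to roughly 0-1 so the weights are comparable.
        cost_term = o.nightly_usd / max_cost if max_cost else 0.0
        dist_term = o.km_to_transit / max_dist if max_dist else 0.0
        return w_rating * o.rating / 5 - w_cost * cost_term - w_distance * dist_term

    return sorted(eligible, key=score, reverse=True)


# Made-up sample data, for illustration only.
flights = [
    Flight("JAL", 1200.0, True, True),
    Flight("ANA", 1100.0, False, True),   # no window seat
    Flight("ANA", 1000.0, True, False),   # not refundable
    Flight("JAL", 1350.0, True, True),
]
lodgings = [
    Lodging("Hotel A", 4.8, 200.0, 0.2, True),
    Lodging("Hotel B", 4.5, 120.0, 0.5, True),
    Lodging("Hotel C", 4.9, 90.0, 0.1, False),  # not refundable
    Lodging("Hotel D", 3.9, 80.0, 2.0, True),
]

best_flight = pick_flight(flights)
ranked = rank_lodging(lodgings)
print(best_flight)
print([o.name for o in ranked])
```

Treating "window seat" and "refundable" as filters and the lodging preferences as a weighted score mirrors the prompt: the first two are stated as requirements, while rating, cost, and distance are things to optimize.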

An agent that performs all of the tasks and optimizations above can save someone a great deal of time and reduce their overall cognitive load and worry.

Autonomous Agents Development

Several active projects are working on autonomous agents today, among them AutoGPT and BabyAGI.

Two user-friendly interfaces include:

As of this writing in April 2023, interest in these projects is huge. For example, here are the star histories for AutoGPT and BabyAGI:

Star History Chart

OpenAI’s recent demo of its external ChatGPT plug-ins also shows that it is working on these autonomous agent capabilities across several applications, specifically the ability to use Google to fact-check initial responses.

Autonomous Agents Applications

Some of the coolest applications I’ve happened upon include:

  1. Ordering pizza - order a pizza on Domino’s using your browser and a plug-in
  2. Creating a podcast outline - research recent news topics and create a podcast script
  3. Market research - research waterproof shoes with pros and cons

These are just a few very early use cases.

I can imagine these agents being used for almost any set of tasks handled by humans today. This can include:

  1. Customer support handling
  2. Scientific research summarization
  3. Knowledge base management
  4. Legal analysis and document generation
  5. Medical diagnosis and prognosis
  6. Coaching
  7. Drug discovery
  8. Investment selection
  9. Sales prospecting

The use cases are endless.

We live in an exciting time of turbocharging human potential with these large language models and autonomous agents. I can’t wait to see what we develop in the coming months!