Microintelligences, or AIs talking to AIs in the cloud
Microintelligences could handle the open-scope problems that good old APIs couldn't
After going to a bar yesterday to lament not owning Nvidia and shitposting on Twitter about it, I decided to think about Nvidia seriously. And if you want to chase Nvidia seriously, you need a view of the addressable market. A straightforward way to frame it is: how many words will each person generate per day? One can go down many paths: recreational generation, blue-collar generation, white-collar generation, breaking white-collar down by the main professions, and so on.
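As a toy back-of-the-envelope, here's what that sizing exercise looks like in code. Every number below is a made-up assumption for illustration, not an estimate I'd defend:

```python
# Toy Fermi estimate of daily token demand. ALL numbers are invented
# placeholders; swap in your own assumptions.
population = 1_000_000_000          # assume 1B people using generative AI
words_per_person_per_day = 2_000    # assume a few pages of generated text each
tokens_per_word = 1.3               # rough English tokens-per-word ratio

tokens_per_day = population * words_per_person_per_day * tokens_per_word
print(f"{tokens_per_day:.2e} tokens/day")  # 2.60e+12
```

The point of the exercise is less the answer than the structure: each segment (recreational, blue-collar, white-collar) would get its own population and words-per-day line.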
But then you think further: there is another significant use case, which is machines talking to machines. And it doesn't require much thought to realize that machines are not bound by time, so they can chat with each other far more than humans can chat with machines.
I remember that some years ago I listened to the ILTB episode "Eric Vishria - The Past, Present, and Future of SaaS and Software." Here's a quote from Eric:
So, the last couple generations of SaaS, or software broadly, have really been about business users using software to get their job done. So, you have a person in an organization, say a sales person, who is using Salesforce to get their job done. That has been the last generation of SaaS and software and that has been very, very effective and it's helped people, individuals become more productive.
In that case, the user of the software is a human being, and therefore, the interaction with the software is through an interface called a graphical user interface, or GUI for short. Which is often in the SaaS world or mostly in the SaaS world, web based, through the browser. So that's what the interaction was.
What's happened is Jay Kreps the founder of Confluent and one of the authors of Kafka, he had this blog post last year, reading it made this come to life for me. What he described as like, "Hey, businesses have historically used software. But what's happening now is businesses are actually becoming encoded in software." That's a mega shift. When businesses become encoded in software, now what you need is not a person necessarily using software but you have software using other software. And when software is using other software, it's obviously not interacting through a user interface, it's interacting with the other software through an API.
API stands for application programming interface. So, the software is talking to other software through this interface. I think a really nice way to think about APIs, without getting into the technical mumbo jumbo, is if a human is interacting with software they're doing it through a GUI and if software is interacting with software, it's going to happen through an API.
That's kind of the analogy. It's a really interesting way to think about things and you can go through kind of example after example inside of businesses where this is really happening, where the business used to use software and now are increasingly actually encoded in software.
(…) So, think of maybe a loan application process. Old days, whatever, forty years ago a person goes and talks to a loan officer, the loan officer asks for a bunch of documents. Those documents come in. The loan officer assembles those and make sure the packets goes back and forth to the customer blah-blah-blah, takes it to kind of the credit committee or whatever to approve it, and presents the case manually and there's a discussion, and the decision comes down, et cetera.
Okay. Then, people started using software for that. Now, we have PDF things. We use DocuSign for things. We start to assemble documents electronically, maybe there's some workflow that helps facilitate everything, okay, and finally we have a complete packet. Now, hit submit. That packet gets submitted digitally and everything else.
Where we're working towards, and where we'll get to, is actually this entire thing being encoded in software. So, literally, you're clicking a button, the customer's supplying some basic information, and then software is going out and talking to your bank and talking to your existing mortgage company and pulling all of this information together automatically. And assembling that, maybe doing some analysis, and then another piece of software, or another service, is actually looking through that quantified data, and actually surfacing automatically a decision based on historicals and everything else and surfacing a score or a decision based on that. And then surfacing that back to the customer immediately and you can imagine a world where all of that, you see this with kind of credit cards and things today.
All of that happens very, very quickly. So, what used to be a multiple weeks manual process that was labor-intensive for the customer, labor-intensive for the loan officer and the credit committee. All of a sudden starts to become very fast, very real time and very automated.
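To make Eric's loan example concrete, here's what "software using software" looks like once the credit decision is behind an API instead of a loan officer. Everything here, the function name, the fields, the approval rule, is hypothetical:

```python
# Sketch of the loan example as a service interface. The 0.4 debt-to-income
# threshold and all field names are invented for illustration.
def credit_decision(income: float, debt: float, requested: float) -> dict:
    """Stand-in for a credit-scoring service that another program calls."""
    ratio = (debt + requested) / max(income, 1.0)
    return {"approved": ratio < 0.4, "debt_to_income": round(ratio, 2)}

# Software calling software: no GUI, no packet of documents, just
# structured input in and a structured decision out.
print(credit_decision(income=120_000, debt=20_000, requested=15_000))
# {'approved': True, 'debt_to_income': 0.29}
```

The whole multi-week loop collapses into one function call because both sides agree on the interface in advance.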
First of all: we were so early! The man had to explain what an API was. How great 2020 was (for investors). Good times 🥲
Second. This simple comment gave me a bit of an epiphany about how great software is; it was the last drop of water that made the bucket overflow. The key thing about software is that it enables abstractions. And what Eric is describing is that you can create increasingly higher levels of abstraction, where the customer doesn't have to worry about what is happening on the other side of the API. Of course, this applies to the mortgage API, but also to a lot of technical stuff. There are lots of virtual machines running on Docker, running on Kubernetes, running on Nutanix, running on AWS.1
In a certain way, what Eric is saying is that the world would adopt Amazon’s API mandate:
All teams will henceforth expose their data and functionality through service interfaces.
Teams must communicate with each other through these interfaces.
There will be no other form of interprocess communication allowed: no direct linking, no direct reads of another team’s data store, no shared-memory model, no back-doors whatsoever. The only communication allowed is via service interface calls over the network.
It doesn’t matter what technology they use. HTTP, Corba, Pubsub, custom protocols — doesn’t matter.
All service interfaces, without exception, must be designed from the ground up to be externalizable. That is to say, the team must plan and design to be able to expose the interface to developers in the outside world. No exceptions.
Anyone who doesn’t do this will be fired.
Thank you; have a nice day!
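What the mandate demands is simple to sketch: a team's data is reachable only through a network interface, never by poking its database directly. Here's a minimal stdlib-only illustration; the endpoint, payload, and numbers are all hypothetical:

```python
# Minimal sketch of the mandate: a team exposes its data only through a
# service interface over the network. Payload and figures are invented.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class RevenueAPI(BaseHTTPRequestHandler):
    def do_GET(self):
        # Other teams read this via the interface, never from our data store.
        body = json.dumps({"revenue_by_product": {"widgets": 1200, "gadgets": 800}})
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body.encode())

# To actually serve it (not run here):
# HTTPServer(("localhost", 8080), RevenueAPI).serve_forever()
```

The "externalizable" clause is the interesting part: designed this way, the same endpoint can be handed to an internal team today and an outside developer tomorrow.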
Can I be honest? A lot of this happened. There are APIs for many things, even if they aren't exactly as elegant as Stripe's. It just takes scrolling through Zapier or Power Automate to see it. But nonetheless, a key vendor of a portCo, a $100B market cap company, can't send us the breakdown of our revenue in any format other than PDF.
On the other hand, a lot of stuff is simply hard and cannot be easily encoded in an API. There are many, many people whose job is to deal with edge cases.
Having worked at a G2K bank, I'd say a fair description of it is that you have lots of clusters of 10-15 people who communicate with each other a lot to get their jobs done.
So let's make it concrete. Think about the Market Risk team at a bank. They run lots of stress tests and Monte Carlo simulations, and worry about questions like "what would happen to our bank if interest rates dropped?". And a lot of the time, they can just make their assessments and record them in the system that aggregates them with other types of risk and sends the Senior Leadership their daily update or whatever. They input the risk they're seeing into some kind of system, and all the other interested parties in the organization (or outside it) can consume this information. And lots of the inputs they need for their risk assessments they will, in turn, consume from other people's systems. Presumably, someone else has the job of listing all the assets on the balance sheet, pricing them, and providing other types of information.
So what do these people do all day? A lot, because there are lots of edge cases. For example, in January 2020 I was present when our COO came in and told the Market Risk team "perhaps you should compare this new coronavirus with SARS", and because they weren't prepared for SARS, I suppose they downloaded some relevant data, ran some Python notebooks, and made some fancy projections that were proved completely wrong within 3 weeks. Likewise, lots of these Market Risk teams must be responding to their bosses' questions like "what will happen if the U.S. government defaults?".
But the point here is that a lot of the communication, even in a Bezos-mandate world, doesn't work through APIs, because valid inputs and outputs can be very arbitrary. That's the reason why email (or Teams/Slack) is so widespread in the corporate world. There are many, many curveballs, open-scope problems that need to be addressed.
Enter generative AI.
Much like software, generative AI can be arbitrarily abstracted. You can make an AI that writes prompts, an AI that evaluates responses, an AI that asks another AI for an input it will use in its own response, and so on. Despite the impressive performance of models like GPT-4 at many tasks, their probabilistic nature makes them unpredictable, and some problems, like summing two numbers, are insurmountable without help.
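That stacking is easy to sketch. In the snippet below, `llm` is just a placeholder for any chat-model call (OpenAI, a local model, whatever); the function names and the division of labor are invented for illustration:

```python
# Sketch of stacking AIs on AIs. `llm` stands in for a real model call.
def llm(prompt: str) -> str:
    return f"<model answer to: {prompt!r}>"  # placeholder, not a real model

def prompt_writer(task: str) -> str:
    # One AI whose only job is writing good prompts for other AIs.
    return llm(f"Write a good prompt for this task: {task}")

def evaluator(answer: str) -> str:
    # Another AI whose only job is grading answers.
    return llm(f"Grade this answer from 1 to 10: {answer}")

# One AI writes the prompt, a second answers it, a third grades the answer.
prompt = prompt_writer("estimate the bank's rate-hike exposure")
answer = llm(prompt)
print(evaluator(answer))
```

Each layer only needs to trust the interface of the layer below it, which is exactly the abstraction property software already had.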
AIs talking with AIs in the cloud will help each other significantly reduce their randomness.
Architectures like Google's ReAct2 enable this self-reflection process. They can break requests into subtasks, ask other AIs that are responsible for other necessary inputs, make API calls, write code, access databases within their scope, create files, create routines. And eventually they'll be able to take control of UIs (hey Automation Anywhere and UiPath, you need to externalize GUIs for LLMs so that AIs can interact with all the legacy Visual Basic 6 systems).
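The core of a ReAct-style loop is tiny: the model alternates a "thought" with an "action", the runtime executes the action (a tool call), and the observation is fed back in. Here's a minimal sketch where the model's output is a scripted transcript and the tools are made up:

```python
# Minimal sketch of a ReAct-style loop. The tools and the scripted
# "model output" below are invented for illustration.
TOOLS = {
    "add": lambda a, b: a + b,         # LLMs are bad at arithmetic; tools aren't
    "lookup_rate": lambda name: 5.25,  # hypothetical data source
}

# Stand-in for model output: (thought, tool, args) triples.
transcript = [
    ("I need the current rate.", "lookup_rate", ("fed_funds",)),
    ("Now add the stress shock.", "add", (5.25, 0.75)),
]

for thought, tool, args in transcript:
    observation = TOOLS[tool](*args)   # act, then feed the observation back
    print(f"Thought: {thought} -> Observation: {observation}")
```

In a real agent the transcript isn't scripted: the model generates the next thought/action pair after seeing each observation, which is where the self-correction comes from.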
And a neat part of this is that you can mechanical-turk your AI into it.
So say you have your process. Maybe you get requests and you create tickets of some sort. But now you abstract it behind an LLM. Perhaps the LLM helps break each request down into many substeps, and perhaps it knows whom to assign them to. The LLM can start by creating a to-do list. And then you can start deciding which kinds of tasks will be done by the AI and which will be left for humans.
Or perhaps the AI tries to come up with a solution for the problem and the human just approves the plan.
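The routing step is the whole trick, and it fits in a few lines. In this sketch the "trusted" task list and the subtasks are invented placeholders; in practice the allow-list grows as the AI earns trust:

```python
# Sketch of mechanical-turking a process behind an LLM: subtasks the AI is
# trusted with go to the AI, everything else goes to a human. The task
# names and the allow-list are made up.
AUTOMATABLE = {"extract fields", "draft reply"}

def route(subtasks: list[str]) -> dict[str, str]:
    return {t: ("ai" if t in AUTOMATABLE else "human") for t in subtasks}

todo = route(["extract fields", "approve refund", "draft reply"])
print(todo)
# {'extract fields': 'ai', 'approve refund': 'human', 'draft reply': 'ai'}
```

Flipping a task from "human" to "ai" (or, in the approval variant, keeping the human as the sign-off on the AI's plan) is how the automation ratchets up over time.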
Until eventually we have better AIs, with longer context windows, and more importantly: cheaper ones.
In the long run, the AI of the bank's public relations team will get an email from the Federal Reserve saying "hey, we're doing some stress tests for fun to see what would happen if we put the Fed Funds Rate at 6%; can you give me an estimate of how your assets, liabilities, tier 1 capital, Basel ratios and so on would look in that scenario? Plz send in 2 weeks". The public relations AI has no idea how to do that, it thinks Basel is a city in Switzerland, but it forwards the request to the Market Risk team. Then the AI creates a task that needs to be resolved in 2 weeks (like a thread), even if all it does is keep asking Market Risk how it's going (or escalating if the deadline is approaching).
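That "thread with a deadline" behavior is mostly bookkeeping, which is exactly what's easy to delegate. A minimal sketch, with invented class and rules (nag inside three days, escalate past the deadline):

```python
# Sketch of the PR AI's task thread: delegate, nag as the deadline nears,
# escalate when it passes. Thresholds and names are invented.
from datetime import date

class TaskThread:
    def __init__(self, request: str, owner: str, due: date):
        self.request, self.owner, self.due = request, owner, due

    def next_action(self, today: date) -> str:
        days_left = (self.due - today).days
        if days_left <= 0:
            return f"escalate: {self.owner} missed the deadline"
        if days_left <= 3:
            return f"nag {self.owner}: {days_left} days left"
        return f"wait; check in with {self.owner} later"

t = TaskThread("Fed stress-test numbers", "Market Risk", date(2030, 1, 15))
print(t.next_action(date(2030, 1, 13)))  # nag Market Risk: 2 days left
```

The PR AI never needs to understand Basel ratios; it only needs to own the thread.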
Another way to frame it: even the sheer process of the AI's inner thought will need to run in the cloud, the AI talking with itself in the cloud.
Anyway, microintelligences can take APIs to the next level. They can automate entire teams, deal with edge cases, define their own priorities, execute their tasks, and collaborate with other AIs, all the while working in limited scopes with significant predictability thanks to many layers of reasoning.
Investment takeaways:
I should look at NOW, TEAM, and other stuff that can replace seat-based workers with AIs, that actually do the task3. Seems like the most trivial way to implement what I am talking about here.
But it is not clear whether certain companies that are currently seat-based can make this transition and make more money. I guess yes? But who knows.
I don't believe this can work as well in the front office. Maybe it's a generational thing and Gen Z will be OK with sales robots, but if I am dropping a big check, I want to be wined and dined by a human.
This can create tons of shadow IT if the man-machine interaction during the deployment phase takes a long time.
This will take time and it is certainly too expensive to do today. But it seems like a good startup idea to create an orchestrator for all of this. 9 out of 10 CIOs will kill you if you suggest letting AIs run code they themselves generated. But they'll have to have access to the internet, right?
Robotic process automation could be an interesting AI play if they find a way for AIs to interact with their bots.
Does the observability crowd have an angle here? I don't know. Perhaps, before calling you to say shit hit the fan, PagerDuty could run a virtual war room and try some approaches. Or at least raise some theories on what is causing the problems.
Actually, 10 out of 10 CIOs won't allow the PagerDuty bot to open PowerShell on the production Windows Server to run some commands to test a hypothesis the AI came up with about why the performance of the legacy .NET system has deteriorated.
This sounds like some LessWrong thing, but trusted AI is the key to enabling AIs talking to AIs. A lot of guardrails will need to be in place before you get access to PowerShell.
This demand seems incremental, doesn't it? Even if you automate the entire purchasing department, removing the human who interacts with SAP through the GUI, you'll still use SAP.
This smells like digital twins.
It’s possible that micro intelligences don’t even happen in this hype cycle. Or never! What do I know?
If you're trying to estimate how many tokens per day the world will generate through generative AI, one question you may need to answer is: how many work-related words does the average white-collar worker think per day?
Anyway, exciting times! Share this post with colleagues if you think it helps to elucidate how to think about the demand for Nvidia.
If you allow me to digress, I have a conspiracy theory that a sizable portion of the world's computing (or at least server-side, but client-side too since the M1 Mac) is spent not actually running stuff but running abstractions. Of course, we could simply not do that and have everyone code in C++ (or even lower-level languages), but it would waste a lot of time. When my coding professor in college was explaining why he picked Python for our course, his main point was that, despite Python being slower, you should measure how slow a program is by the time it takes to code it plus the time it takes to run. And the reality is that it makes a lot of sense to just code badly, but in a way that makes stuff easier. Abstractions are a significant portion of that. In effect, people are exchanging more operations and a higher silicon bill for a lower labor bill. I stated this for abstractions, but it works for flexibility at large. A friend of mine works for a guy who, whenever asked whether they should optimize a certain database, simply says "nah, let's just put it in RDS and allocate more resources".
I find it funny that they gave it a name and that it's Google's. It's basically thinking, and then thinking about what you thought.
Can you add stuff like Zendesk, Freshworks, Intercom and Salesforce Service Cloud here? Yes, but it seems way less fun and everyone has already thought about it. A chatbot that reads a knowledge base is nice, but it isn't a 10x experience.