Are ChatGPT and its family of LLMs alive? If not, will models like these someday (soon) be alive? Do we need to be afraid of them destroying our world as we know it? And regardless of whether these things are alive or not, will they be useful?
These are big questions right now, and for good reason. We finally have computers that can hold a conversation with us, and in those conversations deliver so much capability and insight that they often seem smarter than us. They are not without flaws and mistakes, but the current errors seem fixable enough that the AI future we have so long seen in books and movies is finally on our doorstep.
If you haven't tried it, ChatGPT, which is the 'Kleenex' brand of Large Language Models (LLMs, sometimes informally called 'chat bots'), can give a decent answer to almost any question, and with further conversation can provide changes or more specifics. These answers aren't just text Q&A. It can summarize articles. It can generate functional computer code. It can perform classification and mathematical computation. In short, it can do a lot, although certainly not everything.
I will say that I think ChatGPT is AGI: artificial general intelligence. I don't think there is really any doubt about this. Some people say AGI means being "like a human," but I think that is a silly goal. An LLM is artificial, and it is able to provide useful intelligence and decisions on a broad, general array of topics, hence an AGI. This is why it is so exciting: one model can theoretically do many things given only simple instructions. Until now, our models in data science had to be painstakingly tuned for very specific tasks. That need will continue, but now, especially with multimodal (sound, images, text, and more) versions of these models, we are starting to see the ability to set one tool to many useful tasks, opening it up to much broader adoption.
To me, though, the thing to marvel at here isn't just the generative AI, but rather our human language. These language models are tuned to learn our language and how we use it, not to directly make decisions. The fact that our languages can, with their patterns, capture so much detail about the world and do so many things is what is so impressive. Many of the 'emergent' properties of LLMs are actually the emergent properties of our languages. Language is the great enabler of human civilization (enabled further by its wide, archived use across the internet), and all these language models really do is provide an interpreter that couples the wonders of the electronic world (rapid and large-scale computation, near-infinite storage, and global connection) with the wonders of human language. It is more a bridge than a standalone entity.
Note that I haven't said that ChatGPT or its kin are alive. When using one of these chat bots, you will almost certainly have a sense that you are talking to a human. Inevitably you will start thinking about the AI apocalypse and Terminators wandering the streets, hunting you down. But before you get lost down that track, I think it is important to actually think about the underlying natures of human versus AI, and to really understand how your instinctive assumptions are problematic.
AI models in general consist of three primary components: architecture, optimization/evaluation metric(s), and training data. Architecture is the means by which input data is processed into an output data shape. The architectures we use today are very inefficient compared to the human brain; if data scientists were completely honest, they would admit that most of them are the product of nothing more advanced than lots of "guessing and checking" until something worked. Training data is critical, because a model can only learn based on what it has seen. Modern LLMs are basically fed the entire internet, so there is a lot for them to learn, but also a lot of contradiction and incorrectness. Finally, there are the optimization metrics. These are perhaps the most important piece to understand: they are the "life purpose" of the AI model, what it is trying to do.
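To make those three pieces concrete, here is a minimal sketch of a toy training loop, assuming PyTorch purely for illustration; the data, architecture, and metric are all invented placeholders, not any real LLM's setup:

```python
import torch
import torch.nn as nn

# 1. Training data: a toy tensor dataset standing in for "the entire internet".
inputs = torch.randn(256, 32)           # 256 examples, 32 features each
targets = torch.randint(0, 2, (256,))   # labels the model should learn to predict

# 2. Architecture: how input data is processed into the output shape.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))

# 3. Optimization metric: the "life purpose" the model is pushed toward.
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)  # how far outputs are from the goal
    loss.backward()                         # adjust the weights to chase the metric
    optimizer.step()
```

An LLM is essentially this same loop at enormous scale, with text as the training data and next-word prediction as the metric.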
To understand AI, the architecture can mostly be ignored because all the architecture really does is determine if it all works or not (unless you are actually trying to build or change the model, in which case you can’t ignore architecture and will spend most of your time on it). What really shapes what the AI is doing then is the training data and the optimization metrics.
Armed with this basic understanding, I want to squash the notion that these LLMs are self-aware, or are likely to be so anytime soon. And to do that, we must dive into why humans are the way we are.
Firstly, remember that humans really love to personify things. We make humanoid cartoons out of almost anything. We assume our pets have identical emotional presentations to human ones. If anything shows the slightest bit of intelligence, we assume, we want, it to be like ourselves. Yet just because something is smart, and clearly intelligent, doesn't mean it has a sense of 'self', 'emotions', or 'desires' baked in like we do. Personification should be avoided in thinking about AI. LLMs in particular, and AI in general, have very different architecture, training data, and optimization from humans.
Humans and the other living things in our world are the product of billions of years of evolutionary optimization. Billions of years is a span so vast it is effectively unimaginable to us, and the amount of training and optimization it represents makes ChatGPT's massive training computation look very young and naive.
The story of life begins with something that is basically an enzyme, a relatively simple chemical that can, by its basic nature, help make copies of itself from existing precursors. The copies it makes of itself are not very good, so there is a lot of variation. Most probably don't really work at all, but some are capable of assisting self-creation better than others. The most successful make the most copies, and so on, progressively favoring the most successful, and in doing so complexity builds. Amazing complexity like we have today takes a ridiculously long time, but it happens. It is important to understand that it isn't the "strong" that survive but those that are strong and the most determined to keep existing. Existence becomes our basic optimization metric. Our base architecture is a polymer called DNA. Our training dataset is billions of years of ecology and geology.
There is tons of complexity and nuance to the human story. For example, I am a fan of the 'social brain hypothesis', which basically posits that once human ancestors reached a certain level of intelligence, where they were able to start thinking critically about each other, it created a sort of arms race: ever more intelligence was required to handle ever more complex socialization tasks, like an expanding complexity of language. All of this is quite interesting, but what we have so far should be sufficient to make the point.
You may be wondering why this is important to our discussion of AI. Well, the idea of "self" or "awareness" or "consciousness" is a direct product of our massively-optimized survival and reproduction goal. We need to be aware of our own existence in order to make decisions and take actions pertaining to it. We need to be aware of our self versus others in order to form our highly-optimized, complex societies. Intelligence is a powerful survival tool, but only if it has a sense of what it is trying to analyze and deal with, hence the sense of self.
The point is this: a desire to live is a very specific optimization goal. Being alive refers to this survival and reproduction goal, with some independence and energy generation capability (thereby excluding viruses). The sense of self, of consciousness, that we have is a very powerful tool that took a very long time and a complex set of pressures to develop beyond just being alive, but it ultimately serves that basic purpose.
A photograph isn't the same as the actual landscape it shows, even if they look the same. Just because an LLM can show a likeness that we associate with being alive and conscious doesn't mean it is conscious. Nor is it couscous, a frequent typo I make when trying to type "conscious".
Overall: being intelligent and being alive are two different things.
AI could be programmed to survive and reproduce, but you would end up only with something more like a virus. AI viruses will definitely be created as powerful hacking tools. However, they are not as scary as they may first seem. AI isn't magic; it could be very effective at finding vulnerabilities in a network, but ultimately it would be limited by what vulnerabilities actually exist. And AI will certainly be used as a white-hat hacker to do penetration testing, finding vulnerabilities and patching them before they can be used to compromise a system.
Ultimately, the sense of self is the end product of a ridiculous amount of optimization on the survival and reproduction theme, and while it is definitely possible to build into AI, I think it would take an immense amount of effort with ultimately no real gain besides proving it can be done. We aren't optimizing AI for that path, and even if we did, it would take an immense amount of time, compute, and tinkering to get there, things I don't see happening anytime soon when task-focused AI provides the real return on investment.
If you are really worried about the dangers of AI, probably the simplest measure would be to ban any form of self-replicating AI. But even if someone did go rogue and make a self-replicating AI, it could be controlled. AI isn't magic. It will always need a decent bit of compute and energy, it has clearly identifiable elements, and it faces the same challenge as any virus: diversity. Diverse genomes prevent viruses from destroying all hosts, and our diverse computer systems, ancient and new all mixed together in millions of different configurations, would pose the same challenge. Diversity of the AI systems supporting us would be critical as well, meaning there isn't just one target to take down. Even if we stick AIs in everything, as long as we don't stick the same AI in everything, any rogue actor (AI or human) would find it difficult to subvert a usable chunk of the many thousands or millions of different AIs out there. In general, AI security won't really differ significantly from existing cybersecurity ideas.
On a more practical level, control of AI comes down to training data and optimization metrics. If you are one of the people building these models, really think about what you are giving them and how you are telling them to use it.
LLMs are fundamentally different from humans. Instead of the massive survive-and-reproduce optimization, they usually combine two basic sources of optimization: the ability to predict the next word (a mathematical approximation of 'understand our language'), and reinforcement learning from human feedback on good and bad outputs. And where are they going with this optimization? The answer, really, is that we are trying to make them into puppies.
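For the curious, here is a minimal sketch of what that first optimization source looks like mathematically, assuming PyTorch and using random placeholder tensors in place of a real model and real text:

```python
import torch
import torch.nn.functional as F

vocab_size, seq_len, batch = 1000, 16, 4
# logits: the model's score for every vocabulary word at every position
logits = torch.randn(batch, seq_len, vocab_size)
# tokens: the text actually observed; position t should predict token t+1
tokens = torch.randint(0, vocab_size, (batch, seq_len))

# Shift by one so each position is scored on how well it predicted the *next* token.
pred = logits[:, :-1, :].reshape(-1, vocab_size)
target = tokens[:, 1:].reshape(-1)
loss = F.cross_entropy(pred, target)  # lower loss = better next-word prediction
print(loss.item())
```

The second source, the human feedback, then nudges the model toward responses people rate as helpful rather than merely probable.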
Yes, LLMs are the computer dogs of the world, and it is an effective analogy. We are designing them to understand our commands (optimization 1) and then to react to those commands in pleasing ways (optimization 2). Like training dogs. Domestic animals in general make an excellent comparison for these models, as they show exactly what we have done in the past when given the option to selectively tune intelligent beings to our liking. We make useful tools of them, and with a few exceptions, we prefer house cats to having big tigers around in our homes.
Will LLMs bite the hand that feeds them? Dogs, well raised, do not, and that is even after their prehistory as highly-independent biting wolves. LLMs did not start life as wolves, but as librarians. That’s not to say LLMs won’t occasionally produce results that lead to harm, but rather that poorly designed or poorly raised (trained) models are more likely to have issues than those raised properly.
Rogue humans are a bigger danger than rogue AI. Probably the simplest approach here is to have a list of questions that public-facing AI can't give a too-practical answer to, like "how do I refine uranium at home and build a nuclear bomb". Humans can already do all sorts of dangerous, stupid things, and their using a chat bot to help is inevitable and unavoidable. The goal should be to make sure that these LLMs don't make it significantly easier to perform dangerous actions than existing tools already do.
An interesting idea I had would be completely open language models that require a verified ID backed by a government-accessible database: you can ask whatever you like, just know that if it's suspicious, the government will be able to track it down if they want (it would be too much work to track down most people; just the idea that they could be watching would discourage most abuse). Alongside those would be more-expensive, privacy-sensitive models that have to meet higher standards of regulation to produce fewer dangerous outputs and, in exchange, aren't monitored and can be deployed offline. Of course, open source will make all regulation moot, at least to those with the basic skills to fire up their own private instances.
One set of factors that helps ensure our survival against rogue AI is that large LLMs are relatively expensive in compute, slow, and difficult to maintain. In production systems, cheaper, faster, and simpler are always preferred. These simple pressures ensure that the bare-minimum model for a task will be used, and even if we develop fancy self-aware AI, it wouldn't be practical to use it in most places when something simpler works as well, cheaper, and likely faster.
So far, we have mostly outlined the relative nature of LLM intelligence and the dangers they can or cannot pose. Now, how do we actually use them?
Well, it is pretty simple. You ask them a question, or make some other request in writing, and receive a written response in turn.
'Prompt engineering' is a term coined to refer to the effort put into designing prompts, the questions and requests given to these large language models. I personally dislike the term; it was clearly coined by people who don't admire the study of language. Prompts need to be specific, and they need to be in appropriately styled, quality English. Is writing a form of engineering? I suppose you could argue it is, but I think it is best to think about this not as an engineering challenge but as a communication challenge.
The LLMs are going to try their very best to help you, but they can only give you what they think is the most likely appropriate response based on what they have learned, with absolutely no prior knowledge of you. Models don't get the quick glance of details that humans get from seeing a person, much less the depth that comes from an extended history. Imagine communicating with them as a party game, where you are slipping notes to someone who can't see you, and who has to follow the instructions based only on those notes. Asking "how do I get rich?" could produce all sorts of answers, from spiritual richness to playing the stock market; you need to be more specific to guide the response.
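As a small illustration of that note-passing game, here is a sketch comparing a vague prompt with a specific one. The ask_llm helper is hypothetical (a stand-in for whichever LLM API you actually use), and the prompt details are invented:

```python
def ask_llm(prompt: str) -> str:
    # Hypothetical stand-in: replace with a call to your LLM provider of choice.
    return f"[model reply to: {prompt!r}]"

# The model has no idea who is asking, so a vague note invites a vague answer.
vague = "How do I get rich?"

# Supplying the context the model cannot see steers it toward the answer you wanted.
specific = (
    "I am a 35-year-old software engineer with $20,000 in savings and a moderate "
    "risk tolerance. In plain English, list three realistic ways to grow that "
    "money over ten years, with the main risk of each."
)

print(ask_llm(vague))
print(ask_llm(specific))
```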
Incidentally, this makes me think Google (in particular, but also the other big tech companies) is going to have a big advantage in chat bots, as they can feed the massive profile they have built about you for ads into the LLM as prior knowledge, which will definitely make initial responses more accurate. I don't believe this is currently a feature of these bots, but I imagine you will see it very soon. Monetizing shouldn't be too hard either: my vision is sidebars with product ads or affiliate links directly relevant to the conversation, much like is already done in search. Of course, selling the APIs will also be a revenue stream; smaller companies will want regulator-approved, highly-polished bases for their own products.
Currently there is a lot of what I call "arbitrage" in inputs to models: slight differences that sound the same to us can produce very different outputs. Making models more "deterministic" or "consistent" is a main area of work on them, and I strongly suspect that future versions will have fewer issues with this.
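Until then, a cheap sanity check is simply to ask the same question a few different ways and compare what comes back. A sketch, again using the hypothetical ask_llm stand-in from above:

```python
def ask_llm(prompt: str) -> str:
    # Hypothetical stand-in: replace with a call to your LLM provider of choice.
    return f"[model reply to: {prompt!r}]"

# Phrasings that sound identical to us but may not to the model.
paraphrases = [
    "Summarize this contract in three bullet points.",
    "Give me a three-bullet summary of this contract.",
    "In 3 bullets, what does this contract say?",
]

answers = [ask_llm(p) for p in paraphrases]
for prompt, answer in zip(paraphrases, answers):
    print(f"{prompt}\n  -> {answer}\n")

# If the answers differ wildly, the prompt (or the model) is not yet consistent
# enough to automate that step without human review.
```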
I see two possible developments to assist the use of these models. The first would be a hybrid human-programming language that gives much more specific stylistic requirements. Think of it like "data types", but instead "syntax types". For example: "tell me how to cure this disease in the style of H10 (scientific journal) or E3 (simple but accurate layman's terms) or D23 (British humor)." Since composing an entire catalog of humanity's desired tones and moods, and all the combinations thereof, would be hard work, more immediately we are already seeing models that are customized (by optimization metric/reinforcement learning and by training dataset composition) to specific contexts, for example the medical chat bot that will always speak like a qualified medical professional (and never the British humor). These customized models are no longer the AGI (general intelligence) but can be built as a fine-tuning of the more general AI starting point.
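To make the "syntax type" idea concrete, here is a toy sketch; the style codes, their wording, and the styled_prompt helper are all invented for illustration:

```python
# A hypothetical catalog of style codes, mapped to plain-language instructions.
STYLES = {
    "H10": "Answer in the register of a peer-reviewed scientific journal.",
    "E3": "Answer in simple but accurate layman's terms.",
    "D23": "Answer with dry British humor.",
}

def styled_prompt(style_code: str, request: str) -> str:
    # Prepend the stylistic instruction to the actual request.
    return f"{STYLES[style_code]}\n\n{request}"

print(styled_prompt("E3", "Tell me how to treat this disease."))
```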
Large language models aren't calculators, nor scientists; they are language. There is a ton of information on the internet which they have learned from, and synthesis is a very real property of language. In my mind, AI models are capable of seeing similarity in a way we can't, capable of very interesting connections inherent in our language's stored knowledge base. What LLMs can't do is their own science. They can't gather new data based on a hypothesis and test it. But what language can do above all is communicate instructions.
In general I expect to see us developing more standard languages for defining our world. An LLM doesn't care if the language is math, English, French, or star charts. What it needs is a structure and patterns for things, and then a way to translate between those and the other languages it already knows, in order to exploit bigger connections and enable users, who may only speak English, to receive answers on anything. I suspect startups that develop a "language" for a particular task and manage to interface it with an LLM will be more valuable than the startups building LLMs. But designing a proper language for a task is really difficult work that not many will be able to do.
This is where we are going with the fanciest applied uses of LLMs. What you can do is essentially give a recipe to a skilled chef, where the LLM is the chef. The chef already knows the basics of cooking, so your instructions don’t have to be exhaustive, but it does generally need to use language the chef can understand. If it is something the chef doesn’t know how to use yet, say a custom tool you have built, you need to provide more detailed instructions.
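In code, that "recipe for a chef" often amounts to pasting a short description of your custom tool into the prompt. A sketch, where the forecast_demand tool, its signature, and the ask_llm helper are all hypothetical:

```python
def ask_llm(prompt: str) -> str:
    # Hypothetical stand-in: replace with a call to your LLM provider of choice.
    return f"[model reply to: {prompt[:60]}...]"

# Description of a custom tool the "chef" has never seen before.
custom_tool_doc = """
Tool: forecast_demand(store_id: str, weeks: int) -> list[float]
Returns predicted weekly unit sales for one store. store_id values look like
'S-0042'. weeks must be between 1 and 52.
"""

# The recipe: familiar instructions plus the extra detail the unfamiliar tool needs.
recipe = (
    "Using the tool described below, write Python that forecasts demand for "
    "store S-0042 over the next 12 weeks and prints the peak week.\n"
    + custom_tool_doc
)

print(ask_llm(recipe))
```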
The greatest limitation of LLMs is that they are stuck with language. We haven’t yet made a “language” for exactly how to drive a car, and accordingly expecting an LLM to help us create self-driving cars is foolish. But we have created languages for lots of other things, particularly math and programming languages that can do all sorts of useful things. In general, interaction with the physical world is possible, but will require lots of extra work to interface with an LLM.
In order to really benefit from the automation potential of LLMs, a business needs to be digital and language-driven. That means data. This is why Nousot, my employer, is really excited by this technology. The basic tasks of implementing LLMs are much the same as existing data engineering and data science challenges: taking inputs from the messy, rather abrasive physical world and returning insight and decisions back to it has always been the basic challenge of generating value. The new value is that LLMs can handle a lot more complexity more easily, really opening up the number of problems we can tackle, and/or how quickly we can tackle them.
But I repeat, AI isn’t magic. Automated systems are still going to take time and skill to deploy.
Will all this new automation destroy jobs? I generally follow the arguments of this pay-walled article from the Economist: Your job is (probably) safe from AI. It shows that automation does destroy jobs sometimes, but usually more slowly than predicted, and cases of it totally removing a profession are quite rare because most jobs are multifaceted and will evolve rather than disappear. Many of the most obvious targets for LLMs, like call center operators, are hard jobs to keep staffed anyway, because they can be so soul-destroying and low-paying to boot. And skilled call center operators (like the tire experts I used to overhear at a previous job) will be harder to replace, and they add client-relationship value simply because humans naturally prefer dealing with humans. I personally suspect this will push us toward a more de facto 30-hour work week, as some of the more tedious work is removed, leaving behind the higher-quality, more mentally-draining elements that exhaust our focus most quickly.
Overall, the beauty of LLMs is that they can combine the language we know with a much larger knowledge base and faster computation. You already know exactly how to use them because it is much the same as how you’ve been interacting with people for your entire life. This doesn’t mean they are alive, or even close to being alive. This doesn’t mean they are perfect. But you now have a personal, highly skilled chef on hand. The question is, what do you want to make?
This post may be reposted in original or modified form by Nousot.