I would say that basic intelligence is not merely the sum of its calculatory parts but rather a vast montage of things like genetics, reason, sentience, and experience. I really don't think consciousness is a red herring, despite it being hard to pin down and confirm. I believe it is our very consciousness that sets us apart from AI. And while it may not be a hard requirement for some narrow ways of defining 'intelligence' (like how many calculations per second), I do believe it is a requirement for any broader sense of the term. Intelligence is more than just being smart or clever; I believe it's also about wisdom and understanding.
Suppose AI could develop near light speed travel, or figure out cold fusion, or cure cancer. If it could develop these world changing technologies, and they work, saying, "But it doesn't 'think' like us at all," would be COMPLETELY missing the point. On some level you can measure the intelligence by what it can accomplish.
Again, maybe AI as currently constituted will never get close to doing any of those things; as I said above, I have my doubts. But it's already gotten farther than I would have thought it could, and I'm less and less sure that having some sort of "theory of the world" as Freddie is describing is necessary for superintelligence.
I would say that just because something, anything, could figure out things like near-light-speed travel, cold fusion, or curing cancer, that doesn't automatically make that thing 'intelligent'...at least not how we would describe it. I mean, a dandelion can convert a beam of light into sugar, and that doesn't make it 'smart' either.
When someone uses the word 'intelligence' I really don't think the word is supposed to represent some sort of efficiency of complex calculations, nor by what grand feats of science something can accomplish on its own. The word is much more broad than that, at least that's how I would describe it. Since we humans are really the only 'smart' species known to exist (so far), then it makes perfect sense that a concept like 'intelligence' can only really be understood through a human lens. Saying "it doesn't 'think' like us at all" is not missing the point at all, it is the point. Because we define intelligence by human standards. That's the only way we can.
By your definition, a calculator would be considered to possess god-like intelligence to our hunter-gatherer ancestors. It's just not that simple. To me intelligence is more than raw computational ability, or even complex creativity. Like I said before, there are other ingredients to intelligence that I would include - like wisdom and understanding.
You can define intelligence like that, and that's fine, but to me, "Is AI intelligent by that particular definition?" is a way less interesting question than, say, "Will AI be as revolutionary as the printing press?" or (relatedly) "Should we view AI as a threat to civilization?" If you think that curing cancer or developing cold fusion, etc., doesn't make the answer to the first question "Yes," then the first question doesn't have much to do with the other two.
I really enjoy your takes / perspective on AI, as I agree there are way too many people blown away by the interesting results that LLMs are producing and concluding "that's it, we have done it!" I am not one of the "this is like the invention of fire" guys, at least not yet. When we get some sort of "I" that has a theory of the world etc., then we will be cooking with diesel, so to speak.

That being said, I find there are a lot of people talking past each other about what LLMs are and what they are doing right now. To me it's sort of like blasting the internet because it's not AI. It's not, but it's something, and I think it's going to really scramble all aspects of our lives much like the growth of the internet did from the early 90s to 2000s.

In my world of coding, it's pretty crazy how much more quickly I can do things than just a year ago. And it's not just better existing tools, it's a different way of coding. I forget who said it, but it's not LLMs that will put a lot of people out of work, it's people using LLMs vs. people not using LLMs. I am old enough to remember when people who were illiterate (as in people who could not read or write at all) could get by in life with some sort of job. That seems preposterous now. Even the most manual labor needs some level of literacy and numeracy. The next big jump will be the use of LLMs and AI.
Most experts (even formerly sober ones) are genuinely shocked; they simply thought an approach this dumb could never possibly do what it does -- who cares whether it thinks or not. You shouldn't be able to train a language model to be a commercially-viable coding assistant that can also sometimes pass the Turing Test, but that's what we actually have.
People are maybe overcorrecting given how wrong almost everyone was. I think there is evidence for at least some rudimentary reasoning/syntactic transformations that go beyond memorization, and it's plausible that more of this will emerge if future models grow in size. (For instance you can also take literally the exact same neural networks and algorithms used for ChatGPT, and train them to play Atari games.)
I think it's an urgent question how much these things are just collaging stuff they've memorized, vs. how much actual reasoning goes on internally -- there is nascent research on trying to crack them open and figure that out.
There is no question: LLMs are absolutely just collaging stuff they’ve “memorized”; that’s how they work. Any emergent properties that look like reasoning are just us imposing our limited understanding on black-box output that vaguely looks like the kind of stuff human beings say. That’s Freddie’s point.
I think that's most likely what they're doing most of the time, but given that nobody yet knows how they work, and given that in principle algorithms for reasoning can be encoded in the weights of a transformer network, I don't see how you can say this with such confidence.
Lol what? This is a perfect example of ‘enchanted determinism,’ the AI company propaganda line that their chatbots are essentially magic. Human beings programmed the bots, trained the models, and created the datasets—the underlying processes are complex, but not mysterious.
"...the AI company propaganda line that their chatbots are essentially magic."
And I'm sure that the reasons behind that have everything to do with noble causes like "expanding human understanding" or "making the world a better place" as compared to filthy commerce.
People know how to code up a transformer and do gradient descent, and they know what's in the dataset, but that's a pretty vacuous notion of "knowing how they work".
If we actually knew how they worked, in the sense of real scientific understanding, we'd be able to make reliable scientific predictions about things like:
- Which sections of the network, if any, are responsible for which behaviors?
- Can we zero out or modify certain weights in the network to specifically change behaviors?
- If I add a certain set of elements to the dataset, or remove them, how will this alter the text they generate, and in what circumstances?
Right now, people's ability to answer questions like these is extremely limited, so I think it's fair to say that nobody knows how LLMs work.
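To make the second question above concrete, here is a minimal, purely illustrative PyTorch sketch (a toy network, not a real LLM) of the kind of intervention it describes: zero out one layer's weights and watch the output change. The easy part is performing the ablation; the part nobody can reliably do yet is predicting in advance which weights of a real model govern which behavior.

```python
import torch
import torch.nn as nn

# A toy stand-in for "the network" -- not a real LLM, just an illustration
# of the kind of intervention asked about in the second question above.
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 4),
)

x = torch.randn(1, 16)
before = model(x)

# Ablate ("zero out") the weights of one layer and observe how outputs change.
with torch.no_grad():
    model[2].weight.zero_()

after = model(x)
print("output before ablation:", before)
print("output after ablation: ", after)
```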
Micro-level behavior being unpredictable is not the same thing as not knowing how the mechanism works. Our micro-level understanding of how airflow creates lift was incorrect for nearly the first century of flight, it didn't stop us from knowing how to build airplanes.
Right, but we can look at an airplane and know for instance that the wings generate lift, and that the engine burns fuel to turn a propeller that moves the plane forward. And thereby make predictions like, if the plane runs out of fuel in the air, it can still glide for a while. We aren't at that level of understanding with LLMs yet.
We can model specific neurons in the human brain. Do we therefore know how the brain works?
Of course not. Because knowledge of mechanisms at one level of abstraction does not necessarily translate to an understanding at a higher level of abstraction. And at any level above the lowest, we don't understand how LLMs work.
I see that line often ... in the back of my head, something quietly says "dismiss that." I'd never considered it a carefully crafted sales slogan. What I now see is a salesman stating "you'd better be in awe."
Not even joking: that's pretty much the reason Chomsky is so adamant that LLMs won't become artificial general intelligences, no matter how much data they're fed.
Observation: at first, I thought this was going to be a post about how bad the modern left is at naming things. You know, the "defund the police doesn't actually mean defunding the police", with "almond butter" as an actually-intuitive counter-example.
I don't necessarily agree with the AI hype, but this does not show an actual understanding of how LLMs work.
It is true that LLMs are *trained* on a vast corpus of text, but when an LLM is completing prompts, it does not have direct access to any of that corpus. We don't know the size of GPT-4, but GPT-3 is only about 800GB in size.
GPT-3 is therefore NOT just looking up relevant text samples and performing statistical analysis - it does not have access to all that training data when it is completing prompts. Instead, it has to somehow compress the information contained in a vast training corpus into a relatively tiny neural net, which is then used to respond to prompts. Realistically, the only way to compress information at that level is to build powerful abstractions, i.e. a theory of the world.
Now, the theories that GPT comes up with are not really going to be theories of the physical world, because it has zero exposure to the physical world. They're probably more like theories about how human language works. But whatever is going on in there has to be much richer than simple statistical analysis, because there simply isn't enough space in the neural net to store more than a tiny fraction of the training corpus.
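A rough back-of-the-envelope calculation makes the compression point vivid. Treat the numbers as commonly cited approximations rather than official specs (roughly 175 billion parameters at 4 bytes each, against something like 45 TB of raw, pre-filtering training text, the figure cited elsewhere in this thread):

```python
# Rough back-of-the-envelope for the compression argument above.
# Figures are commonly cited approximations, not official specs.
params = 175e9            # GPT-3 parameter count
bytes_per_param = 4       # 32-bit floats
model_bytes = params * bytes_per_param

raw_corpus_bytes = 45e12  # ~45 TB of raw training text (pre-filtering)

print(f"model size:  ~{model_bytes / 1e9:.0f} GB")
print(f"corpus size: ~{raw_corpus_bytes / 1e12:.0f} TB")
print(f"ratio:       ~{raw_corpus_bytes / model_bytes:.0f}x larger than the model")
```

Whatever the exact figures, the weights are dozens of times smaller than the text they were trained on, so they cannot be a verbatim copy of it.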
This is kind of true but at the same time not all that impressive or meaningful. Your typical chess program "plays" chess because it was trained on billions of moves but it doesn't replay those billions of moves or consult a database of billions of moves during the course of a game.
Chess programs have been around for decades of course yet there hasn't been a lot of hype in the general public about whether they "think".
You'd be wrong about that. Stockfish, the best Chess AI, has had technical improvements within the last few years.
It also isn't the case that Chess programs don't "replay" games during the course of their reasoning. They can do forward rollouts from the current state and all sorts of other search methods because they have an internal model about how the world of chess works. They are "model-based" AI.
LLMs are not like that. They do not perform rollouts or explore counterfactuals based on a world model. So if you wanted to point out the lack of explicit reasoning based on a world model in LLMs, comparing them to Chess programs is actually a useful comparison.
The problem is, of course, that Chess is a deterministic, perfect information game so it is the kind of thing for which it is very easy to have a world model.
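For what it's worth, here is a toy sketch (with a simplified Nim variant standing in for chess) of what "rolling the game forward with an internal model" means. The program knows the exact rules, so it can simulate every continuation from the current state and pick the move with the best outcome; next-token prediction in an LLM involves no such explicit lookahead.

```python
# Minimal sketch of "model-based" search of the kind chess engines do:
# the program has an exact internal model of the game (here, a toy Nim
# variant: take 1-3 stones, the player who takes the last stone wins)
# and rolls the game forward from the current state to pick a move.
from functools import lru_cache

@lru_cache(maxsize=None)
def best_value(stones: int) -> int:
    """Return +1 if the player to move can force a win, else -1."""
    if stones == 0:
        return -1  # the previous player took the last stone and won
    return max(-best_value(stones - take) for take in (1, 2, 3) if take <= stones)

def best_move(stones: int) -> int:
    """Pick the move whose rolled-out outcome is best for the mover."""
    moves = [take for take in (1, 2, 3) if take <= stones]
    return max(moves, key=lambda take: -best_value(stones - take))

print(best_move(10))  # taking 2 leaves the opponent 8 stones, a losing position
```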
I think there's a qualitative difference going on here. In order to play chess, you really just need to understand chess. A handful of pieces and rules about how they move, and then the various implications of those rules.
But unlike chess, human language is an attempt to represent things about the real world. Therefore, in order to make accurate predictions about how a string of text might continue, you basically need some sort of theory of the world, at least if you are going to accomplish the task at a high level.
In terms of internal process, the two AIs might be doing similar-ish things. But I think there is a big, important difference between an entity constructing a theory of chess vs an entity constructing a theory of human linguistic expression.
"...in order to make accurate predictions about how a string of text might continue, you basically need some sort of theory of the world, at least if you are going to accomplish the task at a high level."
This is pure speculation. And since the evidence we have is that correlation is sufficient to the task at hand, it's completely unwarranted speculation, to my mind, as well.
Well, IMO this is the (literally) billion dollar question. As LLMs scale and get better, will they use the window of human language to build an increasingly realistic model of the real, physical world, or will they just become ever more elaborate bullshitters?
I personally think there is reasonable evidence out there that current LLMs are *sometimes* doing more than just correlation - they really do seem to know things sometimes and have abstractions that at least kinda relate to real-world things. But maybe not - LLMs are clearly also magnificent at generating bullshit, so maybe we're just hyping ourselves up and seeing something that isn't there.
To me, the epistemically humble thing seems to be to admit that it's possible LLMs are in some sense reasoning about some approximation of the real world, while also keeping firmly in mind that it is very easy to anthropomorphize something that is fundamentally not a human.
I am personally profoundly uncertain about what's really going on in there. But it's the type of uncertainty where I feel pretty confident nobody else really knows either. It's something I've thought about a lot and I find myself repelled by people who blithely assert things I don't think they could possibly know.
I came here to say the same thing. Whatever one might think about the long-term scalability of LLMs as an approach, it's just fundamentally incorrect to describe them as searching a big text corpus every time, to a degree that makes it hard to follow the rest of Freddie's argument.
From what I've been able to find, GPT as a model is significantly smaller than its training data (about 500 GB compared to about 45 TB in the case of GPT-3), which alone should be convincing that the resulting model is made of simplified representations that are intended to generalize across concrete examples - that is to say, concepts and abstractions. That's exactly why it is able to speculate about how "flghsbbsk butter" might be made if you tell it that "flghsbbsk" is a type of nut.
That predictive models just search their training data is a fairly common misunderstanding (especially in the context of whether DALL-E and Midjourney are just "remixing existing art"; see https://blog.giovanh.com/blog/2023/04/08/so-you-want-to-write-an-ai-art-license/), so I understand where Freddie is coming from, but the particular criticism that GPT doesn't use abstractions just doesn't hold water.
Another common conflation I see people make, including Freddie here, is to mix up the vast resources involved in training a model with the relatively modest resources involved in running a model once it has been trained. Training an AI generally requires some sort of supercomputer, but my guess is Freddie's PC could *run* an AI without too much trouble (and it would be able to run just fine without an internet connection, because it doesn't need access to any external data sources once it has been trained).
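For anyone curious what "running" a trained model locally looks like, here is a minimal sketch assuming a small open model such as gpt2 via the Hugging Face transformers library (an example choice on my part, not anything discussed in the post). Once the weights have been downloaded and cached, generation consults only those weights; no external corpus or database is queried.

```python
# Minimal sketch of inference with a small open model (gpt2 as an example).
# After the weights are cached locally, generation touches no external
# corpus or database -- only the model's own parameters.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Almond butter is made by", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```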
I think this is primarily due to weak points in human biology and cognition that are just ripe for exploitation.
“Does that sound remotely plausible to you?”
Yes. What you’re demonstrating is a distinct lack of understanding of how the brain works and of how neural networks were developed to mimic how the brain works in order to solve problems computers struggled mightily with.
Does anyone have a good tutorial to bring Freddie up to speed?
lol wrong. A child can see a rocking chair one time and recognize the category consistently after just that. The best programs still require a gigantic sample set. It’s not the same. Sorry.
Have you ever met a newborn? They can’t do much until we start loading the training data.
Seriously - do you think a newborn can recognize a chair?
How do you think people learn to talk?
Huh? That’s what eyes, ears, etc are for.
"don’t think they learn by downloading training data"
Oh, yes they do. Why do you think masking children and adults was so harmful to children's development? We spend our formative years learning language, both physical gestures and the spoken word, by observing and listening. Children who are raised locked in their rooms, or locked in a closet, have extremely stunted mental abilities. Even exposure to language and social situations at a later date never recovers the mental development that was missed in the critical developmental stages.
I think that the issue is that neuroscientists and biologists are still figuring that out.
On the other hand the technology behind how chat bots are trained is decades old and well understood.
I’ve “met” many, and if this is your knock-down argument I’m afraid it needs work :) It also doesn’t address Freddie’s basic point about almond butter.
I was addressing the argument. Which, you seem to agree, doesn’t account for humans before their training data is loaded.
You weren't. But if you think the fact that they have to make any observations whatsoever is evidence, then I'm not sure we're going to have a productive discussion :)
Newborns can't see much past their noses shortly after birth; they have very blurry vision. But overall, infants make inferences about qualities of the natural world pretty adeptly. They react with surprise to events that seem impossible given the laws of physics, recognize their caregiver by smell/touch/sound, etc. AI would need to be embodied in some way to accomplish those goals. I think it's possible, but LLMs aren't it.
“They react with surprise to events that seem impossible given the laws of physics,”
Only after several months; that’s why peek-a-boo is so amusing to a baby.
“recognize their caregiver by smell/touch/sound, etc.”
Exactly - based on previous exposure - indeed that’s some of the first training data that’s loaded.
Right. You cannot even begin to compare the total quantity of training data required for an LLM vs a baby. Infants are parsimonious in that way. Not that AI can't get there. It's not there YET.
I assume you mean the baby has vastly more data to process?
The child comes pre-loaded with a training set: millions of years of evolution.
Not about chairs.
They do come pre-loaded with knowledge that there are things you sit on.
But the truth is a bit more complicated, relying on a couple of ideas. The first is "inductive bias," the idea that our brains (or any statistical model) have certain concepts they're designed to learn more easily than others, usually by virtue of how they're structured. Humans have an inductive bias for things like understanding the emotions of others, recognizing faces, and lots of other things we call common sense. We have terrible inductive biases for mathematics, though.
There's also the fact, observed in artificial neural networks but which I (slightly controversially) believe applies to humans as well, that pre-training with high capacity makes you better able to learn new concepts if they're similar to what you're already primed on. If you've seen a lot of animals, then it's easier to remember a new one, even if you've never seen or heard of that animal before.
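A hypothetical sketch of that idea in code, purely for illustration: reuse a (pretend) pre-trained feature extractor, freeze it, and learn only a small new head for the new concept. No real pre-trained weights are involved here; the point is just the shape of the setup.

```python
import torch
import torch.nn as nn

# Toy illustration of the point above: reuse a (pretend) pre-trained
# feature extractor and learn only a small new head for a new concept.
feature_extractor = nn.Sequential(nn.Linear(64, 128), nn.ReLU())  # "already primed"
for p in feature_extractor.parameters():
    p.requires_grad = False  # keep what was already learned fixed

new_head = nn.Linear(128, 1)  # the only part that learns the new concept
optimizer = torch.optim.SGD(new_head.parameters(), lr=0.1)
loss_fn = nn.BCEWithLogitsLoss()

x = torch.randn(32, 64)                    # a handful of examples of the "new animal"
y = torch.randint(0, 2, (32, 1)).float()   # labels for those examples

for _ in range(20):
    optimizer.zero_grad()
    logits = new_head(feature_extractor(x))
    loss = loss_fn(logits, y)
    loss.backward()
    optimizer.step()
```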
That is not how evolution works. Or brains either. Go to grad school before you comment again
I’m sorry you’re so angry. You’re claiming brains don’t evolve?
Look up John Calhoun. Or Goon Park for that matter.
I went to grad school and I endorse the above comment. Care to elaborate why you think it’s wrong?
> What you’re demonstrating is a distinct lack of understanding of how the brain works and how neural networks were developed to mimic how to brains works
You may as well say they were modelled after the neural system of C. elegans. The level of abstraction at which LLMs resemble brains applies just as well to the few hundred neurons of our nematode friend. (And, you know, I don't know how much you do know about brains, but there is tremendous internal structure by comparison. They aren't just "worm nervous system, but make it bigger".)
The technology that makes this stuff possible is literally 50 years old (at least) and from Florida.
The dramatic advances you see now are because of economics and engineering, not better theory.
Here's how I would start my tutorial:
Let's say you run a store that sells cakes. Every time that somebody makes an order you write down a record of that order on a piece of paper and stick it in a filing cabinet. Orders are sorted and stored by customer last name.
You have an employee, named "Joe", whose job it is to look up orders on request. But as your shop becomes successful the number of orders soars to the point where you need to buy multiple filing cabinets and now it is taking Joe a long time to find orders.
One day after numerous complaints you are suddenly struck with a great idea: You already have two filing cabinets. Why not put all the male customers in one cabinet and all the females in another? Now with just one simple bit of information Joe can quickly decide which cabinet he needs to look in and his search and retrieval times are cut in half.
The new system works out great. But then you get another idea. Joe is pretty much spending all of his time looking up orders. Why not get him a co-worker? So you hire Susan. Joe sits at the cabinet that holds male customers all day while Susan handles the female customers. Now things are really humming--previously, while Joe was looking up orders, one cabinet sat idle. Now both cabinets are being accessed simultaneously 100% of the time.
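A minimal sketch of the same idea in code, purely illustrative: records are partitioned by one attribute, each lookup is routed to the right partition, and (in the Joe-and-Susan version) the two partitions can be searched at the same time.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# The filing cabinets: orders partitioned by a single bit of information.
cabinets = defaultdict(dict)  # e.g. {"M": {...}, "F": {...}}

def file_order(last_name: str, sex: str, order: str) -> None:
    cabinets[sex][last_name] = order

def look_up(last_name: str, sex: str):
    # One simple check routes the search to the right cabinet,
    # so only half the records need to be considered.
    return cabinets[sex].get(last_name)

file_order("Smith", "M", "chocolate cake")
file_order("Jones", "F", "carrot cake")

# Joe and Susan: one worker per cabinet, both cabinets searched at once.
with ThreadPoolExecutor(max_workers=2) as pool:
    joe = pool.submit(look_up, "Smith", "M")
    susan = pool.submit(look_up, "Jones", "F")
    print(joe.result(), susan.result())
```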
That’s not in any way correct.
Why not?
Counterpoint: a LLM might know that almonds aren’t legumes :)
That doesn't actually advance the argument though.
Yeah it was just a joke
Not really a "counterpoint".
Wasn’t intended to be an actual counterpoint. Just a little pedantic quip 🙂
"Pedantic" is why I replied.
Doesn't it though? The argument is that humans decide how to think about "new facts" like something called almond butter existing by following a series of logical steps.
Except one of the logical steps we're presented is just completely wrong. Almonds aren't legumes like peanuts.
That seems like an argument in favor of the idea that you didn't follow a series of logical steps, you followed some vague intuition and statistical reasoning.
It isn't just nitpicking; I think this matters. You followed the wrong process and got the right answer. Does that mean you aren't thinking? Because when the AI follows the wrong process and gets the right answer, we're supposed to say it isn't thinking.
AI is trained on a massive corpus of documents. Not being subject to simple factual errors is just a byproduct of methodology here.
My argument isn't that AI being right about almonds is indicative of the fact that the AI is thinking.
My argument is that being wrong about almonds is indicative of the fact that humans aren't really doing a special thing called "reasoning". The same "statistical inference" failures we attribute to the AI are happening to us! You saw a nut, remembered that some nuts are legumes, and decided it's a legume.
You're postulating that two things that kind of look the same have the same root cause. I think that's a terrible idea most of the time anyway but for a scenario where we know the nuts and bolts as to how these platforms were trained and can make a reasonable guess that it does not approximate the human mind? Even more problematic.
Very interesting. As humans we have experienced butter, which is a processed dairy product: the taste and texture of it. As humans we've experienced peanut butter, the taste and texture; different flavor, similar texture. We think of peanuts as a nut, even though it's officially a legume, which is a whole classification of plants including peas, beans, and alfalfa. We put peanuts in the same pot as almonds, which are a nut.
The leap to understanding almond butter is less about understanding the botanical classification and more about understanding the textural feel in our mouth and the classifications of words by English speakers, including understanding that there are inaccuracies in our language classifications.
It reminds me of euphemisms. When Mongolians need to go out of the ger to "use the bathroom" they go "check on the horses". It makes perfect sense out on the steppe because there is a string of horses outside and no bathroom. One goes over the rise of a slight hill and out of sight, which literally could be out towards the horses.
Many euphemisms are based on actual ways of life and knowledge of things around you.
Euphemisms are difficult for non-native speakers. I wonder how LLMs deal with them.
LLMs have shown the ability to make interesting connections, if that's what you mean. Statistical reasoning is really good for that type of thing.
I agree there is a story about how we understand the phrase “almond butter” which is very similar to how LLMs “understand.” But that form of reasoning is *all* that LLMs have, while human beings possess other forms of reasoning too, including those that use explicit symbols and facts.
Part of the problem is that whenever we discuss how humans understand a phrase or a sentence or a text, we are discussing a realm of reasoning that, because it deals with text, is naturally closer to how these LLMs work. But while LLMs can only “think” with texts, humans can engage in other forms of reasoning.
I definitely agree with this counter-argument. The mono-modality of LLMs is absolutely a limiting factor. But I think that's a "scaling problem" not a "fundamental problem". They'll figure out how to smash the different neural nets together, I don't think that's a real limitation.
It's even deeper than that. Someone who never heard the word "legume" would still be able to figure out what almond butter was.
The idea that humans use logical syllogisms, as opposed to implicitly reasoning probabilistically, is very outdated.
Yes! I am not a trained epistemologist, but this is saying what I meant much better than I did.
I think humans really really want to believe the way we think is this type of "Socrates is a man" "all men are mortal" "Socrates is mortal" type stuff. Like you said, "logical syllogisms".
But in the real world we live much closer to "almond butter is like peanut butter" "peanuts are legumes" "almonds are legumes" reasoning. Some kind of Bayesian Coherentism thing where probabilistic reasoning reigns supreme is almost certainly the truth.
The question of "What is an almond?" neatly illustrates Freddie's point.
An LLM can't know that almonds aren't legumes, because the literature isn't settled on what almonds are. Drupe and nut are the leading contenders, but if you look long enough, there's some misinformation published that almonds are legumes.
The LLM will have a probabilistic model regarding what word it will associate with 'almond' for any given response. A human takes the info and decides.
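A toy sketch of what "a probabilistic model over the next word" means; the candidate words and scores below are invented for illustration, not taken from any real model.

```python
import torch

# Toy illustration of "a probabilistic model regarding what word it will
# associate with 'almond'". The candidate words and raw scores are invented.
candidates = ["nut", "drupe", "legume", "butter"]
logits = torch.tensor([2.5, 1.0, -1.0, 2.0])   # raw scores from the model
probs = torch.softmax(logits, dim=0)            # turned into a distribution

for word, p in zip(candidates, probs):
    print(f"{word:7s} {p:.2f}")

# Each response samples from this distribution rather than consulting a fact.
print("sampled:", candidates[torch.multinomial(probs, 1).item()])
```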
I think you’re conflating training time and inference time for LLMs, in two ways:
Training an LLM requires a lot of energy, a lot of compute, and a large dataset. Querying an LLM does not require that much energy and, importantly, does not involve the LLM accessing a “database,” unless by “database” you just mean the knowledge implicitly encoded in the model’s weights.
Also, while peanuts are a legume, and thus not technically a nut, almonds are not legumes.
Agreed on the overall conclusion though.
You're gonna need a new drum to bang on when the AIs pass the tests you're currently setting. Won't take long at the current pace of progress.
> the accuracy of the output of a machine learning system can never prove that it thinks, understands, or is conscious
Other humans cannot prove this either. There is no way to prove something that we have no way of measuring. I don't really get why you refuse to address this weakness in your argument. Well, I guess I do, in a sense. It is unaddressable.
There’s a huge difference between hardcore philosophical skepticism about other minds and reasonable (I would add ‘informed’) skepticism about the mental qualities of deep learning models.
“I am a human being and I am conscious, therefore other human beings are conscious” is a reasonable inductive argument that other people are conscious, but there is no such argument that applies to deep learning models.
If all Freddie is saying is that LLMs as currently implemented are not conscious in the sense that humans are, like, duh? Who besides unhinged people like Blake Lemoine is saying that they are? I think he's making a stronger claim than that but if that's all he's saying then I'd bow out of the discussion because I think it's a settled question.
My point was about mental qualities broadly construed, it was only my example which used consciousness as a placeholder for all such qualities.
"I am human and I have mental qualities broadly construed. That thing is not human therefore it does not have mental qualities broadly construed" is a failed argument imo. I get that that's not the argument you personally are making but it is suspiciously similar to the one Freddie is making.
I think Freddie is making an argument which hinges on a positive description of how these models actually work, not just on the fact that they aren’t human.
There’s a danger, for sure, that we define what is “truly” mental according to which something is only mental if it does “mental stuff” *in just the way humans do*. An airplane does not fly like a bird, but it does fly.
But there is also a danger that we construe “mental” so broadly that it ceases to be a useful term. In a sense I “fly” when I jump off ledge, but no one would say I am flying in useful sense.
As someone who has worked and done research in this field, I find it simply laughable that anyone would seriously ascribe, let us say, “higher order” mental properties to these models. For one thing, there is no continuity between invocations of the underlying model, therefore no possibility for mental continuity or for a persistent “self.”
My model weights don’t have any preference (and no mechanism for having a preference) that I load them up into a PyTorch object and send queries to them.
I agree that the existing models do not have a mind in the same sense that humans or even lower animals do. I don't think anyone serious or knowledgeable is claiming that they do. The interesting questions are all in what might be coming down the pipe in five or twenty years.
So you just literally didn't read the piece, at all. I ban people for that.
It's probably for the best. I don't think this sub is serving my interests any more. Best of luck with the AI punditry. There's certainly an audience for it, I'm just not part of that set.
It's probably for the best that you didn't bother to read the entirety of a piece that you commented on with total confidence?
No, sorry for my miscommunication. It's probably for the best that I stop reading your newsletter. I find the recent focus on AI an undesirable departure from the commentary I joined for. Ban me or don't. Either way, I won't be around here any more.
You realize that you could just have chosen not to read the two (2) AI posts that have appeared here in months, right?
Why don't you ask yourself why limited skepticism towards AI is so emotionally challenging for you? What makes this so personal? What do you hope AI might do for you?
It seems like you take comfort in the idea that people might be emotionally triggered/involved when they disagree with you. Because that means they aren't being reasonable, and thus you can discard what they're saying? It's an interesting habit.
I am not emotionally invested in this topic, except the garden variety "someone is wrong on the internet"-type feelings that it seems you and I and many people all share. I have no money riding on one outcome or another.
> What do you hope AI might do for you?
My fondest hope is that my Google Home could be a little less stupid. Wouldn't that be great! I have often said that the one thing I really covet that the super rich have is personal assistants. I don't care about big houses or jet planes, but someone who would handle the drudgery of managing a household would be a huge boon. I'll never be able to afford that, but if AI can someday step into that role, that would be neat.
I would more say that I'm *curious* what AI might be able to do, and also *cautious* about that same fact. It has advanced far, far more quickly than I would have predicted five years ago. I am really surprised by how far it has come, and that makes me doubt my future predictions all the more.
Isn't it just a wee bit immature to make a post announcing that you're leaving?
Why not just leave without anybody noticing? I can assure you that no one is going to care either way.
I wish we’d just go back to a crypto hype cycle again so the focus could be straightforwardly trivial. So much oxygen in the room sucked out by people thinking they’ve built either God or Skynet or both. There’s some fairly straightforward areas that are going to be impacted by these advancements, and other more debatable ones, but hardly anyone is having a real conversation about any of it.
I think that's one of the problems-- the same people who crazy-hyped crypto are crazy-hyping AI/LLMs. It tends to drown out the more sane analysis. Ezra Klein had a decent exchange about this on the latest episode of Hardfork, near the end, about "skeptical temperament". (https://www.nytimes.com/2023/04/07/podcasts/ai-vibe-check-with-ezra-klein-and-kevin-tries-phone-positivity.html)
Can AI imagine? Can it speculate? Can it hypothesize, and then test, when results turn out wildly different from the way its models predicted? Can it distinguish signal from noise, recognize the smell of a familiar cat even though the smell is somewhat different from the last time it smelled her? Can it pick out the important information from that mix of smells? Is it more important that the big blotchy she-cat is back after several weeks? Is it her estrous cycle or her fear that is more acute right now, and where is the nearest tree?
These are honest questions.
> Can it hypothesize, and then test, when results turn out wildly different from the way its models predicted?
To my limited understanding (I'm more familiar with image generation AIs, and even for those much more familiar with the high-level user interface than the actual mechanics of training), that's roughly how training works. The training program takes inputs from the training data, feeds them to the model, then compares the outputs to the expected output in the training data, then adjusts the weights in the model to try to get a result slightly closer to what was expected. Training is much more computationally expensive than merely running the trained model, so what's accessible to the public is likely not capable of testing a hypothesis and updating based on results in any meaningful sense. But it is hypothesizing, over and over and over again. Hypothesizing is all it can do.
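For a concrete (and deliberately tiny) picture of the loop described above, here is a bare-bones PyTorch sketch: feed training inputs in, compare the model's outputs to the expected outputs, and nudge the weights to reduce the mismatch. A toy regression model stands in for an actual image or language model.

```python
import torch
import torch.nn as nn

# Bare-bones version of the loop described above: feed training inputs in,
# compare the model's outputs to the expected outputs, and adjust the
# weights a little in the direction that reduces the mismatch.
model = nn.Linear(8, 1)                     # tiny stand-in for "the model"
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

inputs = torch.randn(100, 8)                # training data
targets = torch.randn(100, 1)               # expected outputs

for step in range(1000):
    optimizer.zero_grad()
    predictions = model(inputs)             # "hypothesize" an output
    loss = loss_fn(predictions, targets)    # how far off was it?
    loss.backward()                         # work out which way to adjust
    optimizer.step()                        # nudge the weights slightly
```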
Almonds are tree nuts not legumes, but OK...
You realize that that supports my overall point, right
I said OK!
I think two things can be true simultaneously: that most people do not understand what these LLMs are doing at a high level (witness the handwringing over the "evil AI"; if you talk to a predictive LLM like it's Skynet, it's going to start talking to you like it's Skynet) and that the debate over whether they are "thinking" is a distraction from the very real societal changes these models are going to unleash, with no thought by the techbros as to whether or not they should be doing this.
Exactly, the so-called "Singularity" is the least of our concerns, but that's the one that's most flattering to tech companies and their fans to discuss because it doesn't implicate their business models or the broader social structures that allow them to maintain their power and status.
The consciousness question continues to be a red herring. We have no test for consciousness, nor could we ever hope to devise one.
I think the airplane and eagle analogy is very apropos, but I don't think you followed it far enough. There's a difference between "humanlike intelligence" and "intelligence which is as smart (or smarter) than humans". I agree that LLMs reason in a way which is very different from what humans do. The question, though, is whether sufficiently advanced LLMs can be smarter than us, in the sense of better understanding how to create new things and manipulate the physical world. An airplane doesn't fly like an eagle, but it flies better than an eagle, in the sense of being able to fly faster while carrying more. I have no idea whether LLMs will be able to reach that level; frankly, I have my doubts. Given that that's the question, however, the amount of stuff they're getting right does seem to be the relevant metric.
As a side note, I feel like consciousness is a red herring. I don't think we can ever confirm whether anyone or anything outside of our own self experiences consciousness, and I don't think it's a prerequisite for intelligence.
Yes, indeed, consciousness is a less important question. The important question is the noticer: why didn't Dall-E notice that it got the prompt wrong? That is the essential question.
I'm not sure how to disentangle the concept of a "noticer" from understanding the prompt in the first place. Dall-E can only notice it got the prompt wrong insofar as it understands what the prompt is asking for, but if it understood that, it probably would've gotten it right in the first place. At some point the noticer is unimportant if you're not getting stuff wrong anyway, and throwing more computing power at these models seems to be reducing the amount of stuff they get wrong.
My first thought on this point: if we humans weren't around, would not noticing become a problem for these machines? In other words, when could humans leave the scene and let these machines take care of business themselves?
Co-signing the point about consciousness, Freddie's posts on AI tend to take a sharp left into a muddy cornfield when it's raised. It's not just that we can't test for or prove consciousness. There is no need whatever to invoke it when trying to account for humans' higher cognitive capacities, and doing so confuses the issue rather than clarifying it. Birds and dogs and dolphins are all almost certainly conscious but they can't play chess or write books or count to twenty. It's not our consciousness that allows us to do those things. (And if it were...I mean...how? Part of the reason it plays no explanatory role in theories of cognition is that it has no moving parts.) 'Theory of the world' (not awareness of the world, but theory, the kind of thing you could in principle instantiate in software) is a bit simple but it's barking up the right tree.
"The question, though, is whether sufficiently advanced LLMs can be smarter than us, in the sense of better understanding how to create new things and manipulate the physical world."
In that narrow definition of the word 'smart', sure. I would imagine AI could come up with all sorts of working novel ideas about the physical world that we humans never thought of. But I would argue that's not because it's smarter, it's just different. It may be an engineering god, and even a decent artist mimic. But it doesn't actually 'think' like us at all, right? I mean, it's just sorting through infinite data using relevant keywords, and then synthesizing some sort of human-communicable idea using human-based pre-programming.
I would say that basic intelligence is more than the sum of its calculatory parts, but rather a vast montage of all sorts of things like genetics, reason, sentience, experience, etc. I really don't think consciousness is a red herring, despite it being something hard to pin down and confirm. I believe it is our very consciousness that sets us apart from AI. And while it may not be a hard requirement for some narrow ways to define 'intelligence' (like how many calculations per second), I do believe it is a requirement for any broader sense of the term. Intelligence is more than just being smart or clever, I believe it's also about wisdom and understanding.
Suppose AI could develop near light speed travel, or figure out cold fusion, or cure cancer. If it could develop these world changing technologies, and they work, saying, "But it doesn't 'think' like us at all," would be COMPLETELY missing the point. On some level you can measure the intelligence by what it can accomplish.
Again, maybe AI as currently constituted will never get close to doing any of those things; as I said above, I have my doubts. But it's already gotten farther than I would have thought it could, and I'm less and less sure that having some sort of "theory of the world" as Freddie is describing is necessary for superintelligence.
I would say that just because something, anything, could figure out things like near light speed travel, cold fusion, or curing cancer, that doesn't automatically make that thing 'intelligent'...at least not how we would describe it. I mean, a dandelion can convert a beam of light into sugar, and that doesn't make it 'smart' either.
When someone uses the word 'intelligence' I really don't think the word is supposed to represent some sort of efficiency of complex calculations, nor by what grand feats of science something can accomplish on its own. The word is much more broad than that, at least that's how I would describe it. Since we humans are really the only 'smart' species known to exist (so far), then it makes perfect sense that a concept like 'intelligence' can only really be understood through a human lens. Saying "it doesn't 'think' like us at all" is not missing the point at all, it is the point. Because we define intelligence by human standards. That's the only way we can.
By your definition, a calculator would be considered to possess god-like intelligence to our hunter-gatherer ancestors. It's just not that simple. To me intelligence is more than raw computational ability, or even complex creativity. Like I said before, there are other ingredients to intelligence that I would include - like wisdom and understanding.
You can define intelligence like that, and that's fine, but to me, "Is AI intelligent by that particular definition?" is a way less interesting question than, say, "Will AI be as revolutionary as the printing press?" or (relatedly) "Should we view AI as a threat to civilization?" If you think that curing cancer or developing cold fusion, etc., doesn't make the answer to the first question "Yes," then the first question doesn't have much to do with the other two.
I really enjoy your takes/perspective on AI, as I agree there are way too many people blown away by the interesting results LLMs are producing and concluding "that's it, we have done it!" I am not one of the "this is like the invention of fire" guys, at least not yet. When we get some sort of "I" that has a theory of the world etc., then we will be cooking with diesel, so to speak. That being said, I find there are a lot of people talking past each other on what LLMs are and what they are doing right now. To me it's sort of like blasting the internet because it's not AI. It's not, but it's something, and I think it's going to really scramble all aspects of our lives much like the growth of the internet did from the early 90s to the 2000s. In my world of coding, it's pretty crazy how much more quickly I can do things than just a year ago. And it's not just better existing tools, it's a different way of coding. I forget who said it, but it's not LLMs that will put a lot of people out of work, it's people using LLMs vs. people not using LLMs. I am old enough to remember when people who were illiterate (as in people who could not read or write at all) could get by in life with some sort of job. That seems preposterous now. Even the most manual labor needs some level of literacy and numeracy. The next big jump will be the use of LLMs and AI.
Most experts (even formerly sober ones) are genuinely shocked; they simply thought an approach this dumb could never possibly do what it does -- who cares whether it thinks or not. You shouldn't be able to train a language model to be a commercially-viable coding assistant that can also sometimes pass the Turing Test, but that's what we actually have.
People are maybe overcorrecting given how wrong almost everyone was. I think there is evidence for at least some rudimentary reasoning/syntactic transformations that go beyond memorization, and it's plausible that more of this will emerge if future models grow in size. (For instance you can also take literally the exact same neural networks and algorithms used for ChatGPT, and train them to play Atari games.)
I think it's an urgent question how much these things are just collaging stuff they've memorized, vs. how much actual reasoning goes on internally -- there is nascent research on trying to crack them open and figure that out.
There is no question: LLMs are absolutely just collaging stuff they’ve “memorized”, that’s how they work. Any emergent properties that look like reasoning are just us imposing our limited understanding on black-box output that vaguely resembles the kind of stuff human beings say. That’s Freddie’s point.
I think that's most likely what they're doing most of the time, but given that nobody yet knows how they work, and given that in principle algorithms for reasoning can be encoded in the weights of a transformer network, I don't see how you can say this with such confidence.
Who claims they don't know how they work? You get unexpected effects any time you work with large data sets.
“Nobody yet knows how they work”
Lol what? This is a perfect example of ‘enchanted determinism,’ the AI company propaganda line that their chatbots are essentially magic. Human beings programmed the bots, trained the models, and created the datasets—the underlying processes are complex, but not mysterious.
",,,the AI company propaganda line that their chatbots are essentially magic."
And I'm sure that the reasons behind that have everything to do with noble causes like "expanding human understanding" or "making the world a better place" as compared to filthy commerce.
People know how to code up a transformer and do gradient descent, and they know what's in the dataset, but that's a pretty vacuous notion of "knowing how they work".
If we actually knew how they worked, in the sense of real scientific understanding, we'd be able to give reliable, scientifically grounded answers to questions like:
- Which sections of the network, if any, are responsible for which behaviors?
- Can we zero out or modify certain weights in the network to specifically change behaviors?
- If I add a certain set of examples to the dataset, or remove them, how will this alter the text the model generates, and in what circumstances?
Right now, people's ability to answer questions like these is extremely limited, so I think it's fair to say that nobody knows how LLMs work.
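For what it's worth, the second question is at least testable in miniature. Here's a hedged sketch, on a toy PyTorch model I made up rather than a real LLM, of the kind of ablation experiment interpretability researchers run: zero out part of the network and measure what changes.

```python
import torch
import torch.nn as nn

# Toy model standing in for a network whose internals we want to probe.
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
probe_input = torch.randn(1, 16)

before = model(probe_input)

# "Ablate" part of the network: zero out half the rows of the first layer's weights.
with torch.no_grad():
    model[0].weight[:16, :] = 0.0

after = model(probe_input)
print("change in output:", (after - before).abs().mean().item())
```

The hard part with a real LLM isn't performing this kind of edit; it's predicting in advance which weights correspond to which behavior, which is exactly the gap being described.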
Micro-level behavior being unpredictable is not the same thing as not knowing how the mechanism works. Our micro-level understanding of how airflow creates lift was incorrect for nearly the first century of flight, it didn't stop us from knowing how to build airplanes.
Right, but we can look at an airplane and know for instance that the wings generate lift, and that the engine burns fuel to turn a propeller that moves the plane forward. And thereby make predictions like, if the plane runs out of fuel in the air, it can still glide for a while. We aren't at that level of understanding with LLMs yet.
We can model specific neurons in the human brain. Do we therefore know how the brain works?
Of course not. Because knowledge of mechanisms at one level of abstraction does not necessarily translate to an understanding at a higher level of abstraction. And at any level above the lowest, we don't understand how LLMs work.
The nature of those questions is completely different than "Nobody knows how it works so maybe it's unicorns typing on keyboards".
“Nobody yet knows how they work”
I see that line often ... in the back of my head, something quietly says "dismiss that." I'd never considered it a carefully crafted sales slogan. What I now see is a salesman stating "you'd better be in awe."
Bingo
In which I learn that ChatGPT is essentially a post-structuralist.
Not even joking: that's pretty much the reason Chomsky is so adamant that LLMs won't become artificial general intelligences, no matter how much data they're fed.
Observation: at first, I thought this was going to be a post about how bad the modern left is at naming things. You know, the "defund the police doesn't actually mean defunding the police" thing, with "almond butter" as an actually-intuitive counter-example.
I don't necessarily agree with the AI hype, but this does not show an actual understanding of how LLMs work.
It is true that LLMs are *trained* on a vast corpus of text, but when an LLM is completing prompts, it does not have direct access to any of that corpus. We don't know the size of GPT-4, but GPT-3 is only about 800GB in size.
GPT-3 is therefore NOT just looking up relevant text samples and performing statistical analysis - it does not have access to all that training data when it is completing prompts. Instead, it has to somehow compress the information contained in a vast training corpus into a relatively tiny neural net, which is then used to respond to prompts. Realistically, the only way to compress information at that level is to build powerful abstractions, i.e. a theory of the world.
Now, the theories that GPT comes up with are not really going to be theories of the physical world, because it has zero exposure to the physical world. They're probably more like theories about how human language works. But whatever is going on in there has to be much richer than simple statistical analysis, because there simply isn't enough space in the neural net to store more than a tiny fraction of the training corpus.
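A back-of-envelope calculation makes the compression point concrete. I'm using the publicly reported 175 billion parameter count for GPT-3 and the roughly 45 TB raw-corpus figure mentioned elsewhere in this thread; both are rough, and the bytes-per-parameter choice is my own assumption, so treat this as an order-of-magnitude sketch rather than exact sizes.

```python
# Rough numbers only; the point is the ratio, not the exact sizes.
params = 175e9            # reported GPT-3 parameter count
bytes_per_param = 2       # assuming 16-bit weights
model_bytes = params * bytes_per_param

raw_corpus_bytes = 45e12  # ~45 TB of raw text, per the figure cited in this thread

print(f"model size:  ~{model_bytes / 1e9:.0f} GB")
print(f"corpus size: ~{raw_corpus_bytes / 1e12:.0f} TB")
print(f"ratio:       ~{raw_corpus_bytes / model_bytes:.0f}x larger corpus than model")
```

Whatever the exact figures turn out to be, the model cannot be storing the corpus verbatim; it has to be storing something far more compressed, which means more general.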
This is kind of true but at the same time not all that impressive or meaningful. Your typical chess program "plays" chess because it was trained on billions of moves but it doesn't replay those billions of moves or consult a database of billions of moves during the course of a game.
Chess programs have been around for decades of course yet there hasn't been a lot of hype in the general public about whether they "think".
Your examples are decades out of date. That’s not how any of this works.
Go buy a chess program on Steam. I would guess that the actual chess playing bit is decades old.
Plus playing millions of games to refine strategy (which is what Google's go AI did) is essentially the same thing: brute force.
You'd be wrong about that. Stockfish, the best Chess AI, has had technical improvements within the last few years.
It also isn't the case that Chess programs don't "replay" games during the course of their reasoning. They can do forward rollouts from the current state and all sorts of other search methods because they have an internal model of how the world of chess works. They are "model-based" AI.
LLMs are not like that. They do not perform rollouts or explore counterfactuals based on a world model. So if you wanted to point out the lack of explicit reasoning based on a world model in LLMs, comparing them to Chess programs is actually a useful comparison.
The problem is, of course, that Chess is a deterministic, perfect information game so it is the kind of thing for which it is very easy to have a world model.
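To illustrate what "model-based" means here, a minimal sketch of the kind of forward rollout a game-playing program can do, using tic-tac-toe as a stand-in for any deterministic, perfect-information game. Nothing in this is taken from a real chess engine; it just shows search against an explicit world model.

```python
# Minimal minimax over an explicit world model (tic-tac-toe standing in for chess).
WIN_LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Roll forward every legal continuation and score the position for 'X'."""
    w = winner(board)
    if w:
        return 1 if w == "X" else -1
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0  # draw
    scores = []
    for m in moves:
        child = board[:m] + player + board[m+1:]   # the "world model": apply a move
        scores.append(minimax(child, "O" if player == "X" else "X"))
    return max(scores) if player == "X" else min(scores)

def best_move(board, player="X"):
    moves = [i for i, cell in enumerate(board) if cell == " "]
    def score(m):
        child = board[:m] + player + board[m+1:]
        return minimax(child, "O" if player == "X" else "X")
    return max(moves, key=score) if player == "X" else min(moves, key=score)

# X to move with two in a row on top: the rollout finds index 2, completing the row.
print(best_move("XX O O   "))
```

The crucial ingredient is the line that applies a move to produce a child state: the program has an explicit, queryable model of how its little world changes, and it explores counterfactuals against that model. A plain LLM forward pass has no analogous step.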
I think there's a qualitative difference going on here. In order to play chess, you really just need to understand chess. A handful of pieces and rules about how they move, and then the various implications of those rules.
But unlike chess, human language is an attempt to represent things about the real world. Therefore, in order to make accurate predictions about how a string of text might continue, you basically need some sort of theory of the world, at least if you are going to accomplish the task at a high level.
In terms of internal process, the two AIs might be doing similar-ish things. But I think there is a big, important difference between an entity constructing a theory of chess vs an entity constructing a theory of human linguistic expression.
"...in order to make accurate predictions about how a string of text might continue, you basically need some sort of theory of the world, at least if you are going to accomplish the task at a high level."
This is pure speculation. And since the evidence we have is that correlation is sufficient to the task at hand, it's completely unwarranted to my mind as well.
Well, IMO this is the (literally) billion dollar question. As LLMs scale and get better, will they use the window of human language to build an increasingly realistic model of the real, physical world, or will they just become ever more elaborate bullshitters?
I personally think there is reasonable evidence out there that current LLMs are *sometimes* doing more than just correlation - they really do seem to know things sometimes and have abstractions that at least kinda relate to real-world things. But maybe not - LLMs are clearly also magnificent at generating bullshit, so maybe we're just hyping ourselves up and seeing something that isn't there.
To me, the epistemically humble thing seems to be to admit that it's possible LLMs are in some sense reasoning about some approximation of the real world, while also keeping firmly in mind that it is very easy to anthropomorphize something that is fundamentally not a human.
I am personally profoundly uncertain about what's really going on in there. But it's the type of uncertainty where I feel pretty confident nobody else really knows either. It's something I've thought about a lot and I find myself repelled by people who blithely assert things I don't think they could possibly know.
Since correlation is all these platforms are designed (and coded) for, you're suggesting that one day your toaster might decide to spread the jam.
I came here to say the same thing. Whatever one might think about the long-term scalability of LLMs as an approach, it's just fundamentally incorrect to describe them as searching a big text corpus every time, to a degree that makes it hard to follow the rest of Freddie's argument.
From what I've been able to find, GPT as a model is significantly smaller than its training data (about 500 GB compared to about 45 TB in the case of GPT-3), which alone should be convincing that the resulting model is made of simplified representations that are intended to generalize across concrete examples - that is to say, concepts and abstractions. That's exactly why it is able to speculate about how "flghsbbsk butter" might be made if you tell it that "flghsbbsk" is a type of nut.
That predictive models just search their training data is a fairly common misunderstanding (especially in the context of whether DALL-E and Midjourney are just "remixing existing art"; see https://blog.giovanh.com/blog/2023/04/08/so-you-want-to-write-an-ai-art-license/), so I understand where Freddie is coming from, but the particular criticism that GPT doesn't use abstractions just doesn't hold water.
Another common conflation I see people make, including Freddie here, is to mix up the vast resources involved in training a model with the relatively modest resources involved in running a model once it has been trained. Training an AI generally requires some sort of supercomputer, but my guess is Freddie's PC could *run* an AI without too much trouble (and it would be able to run just fine without an internet connection, because it doesn't need access to any external data sources once it has been trained).
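As a rough illustration of how modest the inference side can be, here's a sketch using the Hugging Face transformers library and the small, openly available GPT-2 model (obviously far less capable than GPT-4, but it makes the point about local, offline generation):

```python
# Runs a small, already-trained language model locally; no training involved,
# and no internet needed after the one-time model download.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Almond butter is made by", max_new_tokens=30)
print(result[0]["generated_text"])
```

Training the big models takes clusters of specialized hardware running for weeks; running a finished model of this size is an ordinary laptop-scale job, which is exactly the distinction being blurred.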