AZEEM AZHAR: Hello. I’m Azeem Azhar. Eight years ago, I started Exponential View to explore the underlying forces driving the development of exponential technologies. I’m a child of the exponential age, born the year after Intel released the 4004, the world’s first single-chip microprocessor. I got my first computer in 1981. I’m holding it as I speak. And as I moved into adulthood, the internet was growing exponentially, transforming from a tool of academia to the worldwide utility it is today. I, at least, did not grow exponentially with it. Now, back in 2015 when I started writing Exponential View, I’d noticed something curious was going on. You could feel the pace picking up. New technologies were clearly on some kind of accelerating trend. Today we’re revisiting my conversation with cognitive scientist and AI expert Gary Marcus. Drawing on his background in psychology and neuroscience, he highlights the role of understanding reasoning and learning in AI development, as well as the importance of incorporating prior knowledge into these systems. Gary shares his insights on the quest for a new programming approach that could enable safe and helpful AI, and we ponder the potential timeline for realizing artificial general intelligence. Here’s my pick of the week: it’s my 2019 discussion with Gary Marcus. Gary, welcome to Exponential View.
GARY MARCUS: Thanks very much for having me.
AZEEM AZHAR: Let’s start with some basics. How do you define artificial intelligence?
GARY MARCUS: I don’t. I think there’s a will to give sharp definitions to things that are hard. I always think of the old line about pornography from, I think it was Potter Stewart, the Supreme Court Justice, and he said, “I know it when I see it.” Artificial intelligence has to be broadly defined because intelligence itself has to be broadly defined. Intelligence is multi-dimensional. Some of it we have captured very well with current techniques. So I would say that playing Go or playing chess, that’s a form of intelligence, and current AI can do that very well. There’s another kind of intelligence, which is broad intelligence or general intelligence. Sometimes people call it artificial general intelligence, and that doesn’t really exist yet. So, the kind of intelligence that lets you go into a novel situation that you’ve never encountered before and figure out what you should do, that’s another strand of intelligence, and that’s what makes people reliable assistants in all sorts of ways and makes machines, because they lack it, really terrible in open-ended environments. We keep getting promised chatbots that we’ll be able to have arbitrary conversations with, and they’re still not really that good. So there have been huge advances in some ways. People thought Go would take 10 more years, and it arrived ahead of schedule. I don’t think it’s as big an accomplishment as some people think, but it’s certainly a very good engineering advance. We have a lot of things that are flashy and we have some things that are practical now. I mean, Google Translate is amazing. You know, it’s not perfect. I wouldn’t trust a legal contract to it, but you can just type anything in the world’s major languages and get it translated instantly for free. That means AI is actually here. That’s a form of AI. I think from a corporate perspective, not everything is as practical as people wish that it were or are led to believe. So Google can do certain kinds of AI because they have a huge amount of data, and then when a smaller company tries to do the same thing with a smaller amount of data, it’s not as reliable. The AI that we have right now is dependent on massive amounts of data. That’s different from people, who can do things with small amounts of data. And it means if you are running a small company and you have kind of a moderate amount of data, you may or may not get the returns that you’re kind of led to believe from the media.
AZEEM AZHAR: A number of these technologies that we see every day, like photo tagging or translation, have really improved in the last five or six years, and they’ve improved since the arrival in the commercial environment of this technology called deep learning. If you’re like me, you’ll have noticed that photo tagging has got much better in the last two or three years, and so has translation. Isn’t deep learning working right now?
GARY MARCUS: Absolutely. I mean, deep learning is a major advance. The ideas in it aren’t really that new, but the ability to use it practically is new. The trigger for it was really graphics cards that were made originally for computer games. People figured out how to adapt them to this thing called neural networks, which had actually been around since the fifties or even the forties. People knew how to build them before, but they didn’t know how to use them in a practical environment. They took too long. They required too much data. And then suddenly we had the internet, which improved the amount of data available, and we had these graphics cards that allowed the models, the neural network systems, to work a lot faster. And that’s what deep learning is really about: taking really impressive hardware with really impressive amounts of data and getting good results. So up until about 2011, when people started using the graphics cards, deep learning wasn’t really doing that well. There were these annual competitions, for example, to recognize objects, and deep learning was not at the head of the pack. And then Fei-Fei Li introduced this thing called ImageNet. So suddenly there was a lot more labeled data to work with and suddenly there were new processors to do things on, and that made a huge advance in the effectiveness of the technology.
AZEEM AZHAR: The critical juncture seemed to be the combination of Fei-Fei Li’s labeled data, ImageNet, and the arrival of these graphics processors. Is that right?
GARY MARCUS: That’s absolutely right. There were other things too. So there were some small technical tricks like replacing a sigmoid with a ReLU. Those are two different ways of calculating the activation of a neural net. And the new way was just a lot faster. So there were some technical tricks that made things much quicker.
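The sigmoid-versus-ReLU point is easy to see concretely. Here is a minimal sketch in Python, assuming NumPy is available; the two functions below are the standard textbook definitions, not anything specific to the systems discussed in this conversation.

```python
import numpy as np

def sigmoid(x):
    # Classic squashing activation: outputs lie in (0, 1) and saturate for
    # large |x|, which makes gradients vanish at the extremes.
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Rectified linear unit: zero for negative inputs, identity otherwise.
    # Cheaper to compute, and its gradient stays at 1 for positive inputs.
    return np.maximum(0.0, x)

x = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
print(sigmoid(x))  # approx. [0.018, 0.269, 0.5, 0.731, 0.982]
print(relu(x))     # [0., 0., 0., 1., 4.]
```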
AZEEM AZHAR: Now, one of my previous guests was Jack Clark from OpenAI and he publishes something called…
GARY MARCUS: Good friend.
AZEEM AZHAR: … the AI Index every year, and it shows a really rapid improvement of deep learning in these different domains, whether it is understanding sentences or answering questions or recognizing pictures, objects, and images, and it’s really, really fast. With that rate of acceleration, perhaps one argument is, well, if that just continues for another few years, we will get these systems that will be indistinguishable from humans in these particular domains.
GARY MARCUS: Many people make that argument, but I would, first of all, note that on natural language understanding it’s not really true. So in natural language understanding, we’re still doing the kinds of stuff that Alexa can do and Siri can do, which are very limited. They’re not conversational. I would say that there has not been anything like exponential growth in natural language understanding. We still don’t have any kind of a system that can read a news story and tell you who did what to whom, when, where, and why. We couldn’t do that in 1950 and we can’t do that now in 2019. So, there are some things for which there’s exponential growth. I actually gave Jack my comments, and what I said is, “You’re highlighting all the things that the systems have gotten better at, but people aren’t measuring the things that the systems aren’t very good at.” So, there aren’t good measures of dialogue understanding. People measure what they can do well. That’s one example. Another example is what kind of progress have we made in robots in the home? We had Rosie the Robot on television in the 1960s on The Jetsons, but we still don’t have anything like that in the real world. There’s not been exponential progress in building a robot butler or robot housekeeper.
AZEEM AZHAR: And all the efforts to build self-driving cars seem to have been pushed back by a decade. Right? People are saying it’s going to be the mid-2020s or even the early 2030s before we see them, and five years ago the promise was we’d have them today.
GARY MARCUS: That’s right. There’s some linear progress in driverless cars, but there are all these outlier cases, weird cases with weird weather or weird light, or things that happen infrequently, like tractor trailers going across the road or tow trucks stopped on the side of the road, where you don’t get a lot of data, and there’s really not that much progress there. So each year, you know, you look at the intervention rates and they get a little better, one in 8,000 hours, one in 12,000 hours, or something like that, but we’re not really making enough progress, I think, in driverless cars to use them safely. And certainly the expectations were totally wrong. In 2012, I wrote a piece in the New Yorker called Moral Machines about what would happen if a school bus was spinning out of control and you were in a driverless car, an example a lot of people have used since. When I wrote that article in 2012, I thought by 2020 we would have driverless cars. They were legal in three states when I wrote that in 2012. It looked like there was a lot of progress, and then that progress hit a wall. And so there are some things you can draw on a graph that look great, and there are some things people don’t usually draw on the graphs, because we don’t have the same benchmark or because fewer people are working on it. Where there’s actually much less progress is on general intelligence, being able to understand an arbitrary question and cope with it like the Star Trek computer could do. There’s no progress. It’s flat.
AZEEM AZHAR: And some people would argue that it’s just a matter of time, that given the rate at which computational power is increasing for a fixed dollar cost, I mean, we’re beyond Moore’s Law now and we’re coming out with these clever architectures. We’ve had companies like Graphcore and Cerebras that have built out these enormous chips just for AI, and we seem to be able to increase the amount of computational power that we can deliver by many orders of magnitude every decade or so. And with every bit of extra computational power, we can create more data from the internet of things and devices and sensors. So one argument would be, well, what if we have a million or a billion times as much compute and therefore a billion times as much data? That’ll be enough. Won’t it?
GARY MARCUS: I doubt it. I mean, that’s really why we wrote the book Rebooting AI. It’s not that we think that AI is impossible. Right? My co-author, Ernie Davis, and I both want to see AI happen. We both think it can happen, but we don’t think that more and more data and more and more compute by itself is the solution to the problem. Obviously you would like to have more data. You would like to have more compute, and that helps with problems like speech recognition. But it goes back to what I said earlier in the podcast about intelligence being multidimensional. One dimension of intelligence is essentially perceptual classification. It’s a very handy dimension of intelligence that gets used in all kinds of things. So when AlphaGo plays a mean game of Go, it’s because it has very good perceptual classification combined with something else, which we call tree search, or their version of it, Monte Carlo Tree Search. Those are two different aspects of intelligence that get synthesized into a hybrid model that has sort of the best of both worlds. We need a lot more hybrid models, including ones that hybridize some techniques we have with some that don’t exist yet. The problem is that intelligence involves many different things. You classify things, but you also make inferences. You do reasoning. When you read a children’s story, or anything that you read really, some things are spelled out, but most of them aren’t. Right? A story, a news story or a fiction story, that gave you every detail of what was going on and explained every inference, you know, this person must have been hungry, that’s why they got their food, would be the most tedious thing imaginable. So, any good writer tries to kind of say the things that are not obvious and let you infer the things that are obvious. For a machine to cope with that, it has to be able to infer the things that are obvious to people. On that we have made no progress, and having more and more data by itself is not going to solve the problem. We need to go back to some things that people thought about in the early history of AI, which is about reasoning and inference and how you combine different kinds of knowledge. And I’m not at all saying that’s impossible. I’m saying it’s neglected and that we need to come back to those questions using all these wonderful tools for deep learning and classification that have been developed in the last few years, but supplementing them with systems that can reason logically.
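To make the AlphaGo point concrete, here is a highly simplified sketch of the kind of hybrid Gary describes: a Monte Carlo tree search loop in which a learned policy-and-value function supplies priors and position estimates, and the search supplies the look-ahead. Everything here is a toy stand-in for illustration; the five-move game and the policy_value heuristic are made up, not AlphaGo’s actual components.

```python
import math

# Toy "game" for illustration only: a state is a tuple of moves (0 or 1), the
# game ends after 5 moves, and the payoff is the fraction of 1s played.
# A real system would put board positions here and a deep network in policy_value().
ACTIONS = (0, 1)

def is_terminal(state):
    return len(state) == 5

def terminal_value(state):
    return sum(state) / len(state)

def policy_value(state):
    # Stand-in for the learned "perceptual" component: a prior over moves and
    # a value estimate for the position. Here it is a uniform prior and an
    # optimistic guess.
    priors = {a: 1.0 / len(ACTIONS) for a in ACTIONS}
    value = (sum(state) + (5 - len(state))) / 5.0
    return priors, value

class Node:
    def __init__(self, prior):
        self.prior, self.visits, self.value_sum = prior, 0, 0.0
        self.children = {}  # action -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.5):
    # Balance a child's average value (exploitation) against its policy prior,
    # scaled by how rarely it has been visited (exploration).
    def score(child):
        return child.q() + c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
    return max(node.children.items(), key=lambda kv: score(kv[1]))

def search(root_state, n_simulations=500):
    root = Node(prior=1.0)
    for _ in range(n_simulations):
        node, state, path = root, root_state, [root]
        # 1. Selection: walk down the tree using the rule above.
        while node.children and not is_terminal(state):
            action, node = select_child(node)
            state = state + (action,)
            path.append(node)
        # 2. Expansion and evaluation: ask the "network" for priors and a value.
        if is_terminal(state):
            value = terminal_value(state)
        else:
            priors, value = policy_value(state)
            node.children = {a: Node(p) for a, p in priors.items()}
        # 3. Backup: credit the value to every node on the path.
        for n in path:
            n.visits += 1
            n.value_sum += value
    # Recommend the most-visited move at the root.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]

print(search(()))  # should print 1: the search learns to prefer playing 1s
```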
AZEEM AZHAR: You make the point that there’s no real one way for the mind to work because actually our minds are not one thing. There are many different parts. They operate differently and we have to somehow synthesize them and bring them together.
GARY MARCUS: A lot of animals can do what deep learning can do. A lot of animals can do perceptual classification and navigate their worlds pretty nicely. Only humans can acquire the kind of rich culture that we have. That’s because we have language, and having the kind of deep learning perceptual classification by itself is not enough to give you language. Herb Terrace tried to raise a chimpanzee, which he called Nim Chimpsky in honor of Noam Chomsky, and Nim just didn’t get it. The chimp has something like deep learning. It can do perceptual classification; there are lots of experiments that demonstrate that. But it didn’t have enough of an understanding of how other people work or what language is supposed to be, and so it didn’t get there. And so to think that deep learning is going to solve everything is like thinking that a chimpanzee is going to be able to write Shakespeare. It’s not really going to happen, even with all the random typing.
AZEEM AZHAR: So if deep learning is a useful tool, but it’s just one tool to take us towards powerful general artificial intelligence, what else do we need to get there? Do we even know? Is this the realm of engineering? Or is it the realm of research?
GARY MARCUS: We know some. It’s a lot easier just to take a lot of correlations, which is what deep learning does, than to figure out how to teach a system to understand all of the aspects of time, and how time relates to the lives of human beings and the events that they participate in, or to understand the causality of what happens when you pour water in a glass. One of our favorite examples in the book is that a human being can look at a grater and understand what its parts are. A current system can draw a 3D model of a grater, but there’s no system out there that can look at a grater and say this is the function of those holes and why they’re sharp and so forth. So we need systems that can represent ideas about function and cause and so forth, which is not really a focus of current research, but it needs to be there.
AZEEM AZHAR: This approach of trying to make sense of causality and time and space does remind me of earlier efforts to build artificial intelligence, what I suppose we used to call good old-fashioned AI, and this was this idea of taxonomies and relationships and networks where you would say a finger is a part of a hand and a hand is part of a body and a body is part of a living being, and you have these complex trees of knowledge. Now those sorts of approaches were things we were trying in the sixties and seventies and early eighties, but they didn’t seem to yield lasting results except in some very narrow expert domains. So why would they be appropriate approaches now?
GARY MARCUS: Well, I think there are two things to realize there. One is they’re actually still useful. I mean, people make taxonomies all the time. They’re very widely used, for example, in medicine. You have to also reason about a lot of uncertain knowledge, and the techniques that we had in the sixties and seventies really weren’t set up for that. They didn’t have probabilities. They didn’t have distributions of information and so forth, and so they weren’t very good at representing things that are kind of statistically true, but not a hundred percent true. So there were weaknesses in the specific techniques. The techniques actually do have some value. And then the other thing I would say is that it’s a totally unfair comparison. Deep learning was terrible in the 1950s. It was useless. There were papers about it. It took 60 years before it was actually useful. And there’s this kind of weird historical accident that at a particular moment people finally figured out how to make deep learning useful by having bigger data, by having the GPUs, by making a couple of technical tweaks. And then they’re comparing that with a technology that was built on machines that had like 8K of RAM, not eight gigabytes of RAM, but 8K of RAM, with data sets that were a hundred examples that were hand-annotated by a graduate student, instead of billions of examples. So it’s really not a fair comparison. The people working on good old-fashioned AI didn’t have modern tools to work with, but I think the questions that a lot of those folks were trying to ask were very much the right questions to ask. And those questions have been abandoned. Now we have deep learning to do the perceptual classification, we have computers with memory that would’ve been undreamed of at that time, compute that would’ve been undreamed of at that time, plus techniques for representing probabilities and doing Bayesian reasoning. We have all these tools those guys didn’t have. Let’s go back and ask their questions again and see if we can’t do better.
AZEEM AZHAR: And we also have a lot of data that has been created by humans living their everyday lives, their quotidian experiences, whether it is sharing on Instagram and tagging something as a burrito or updating a list of Wimbledon winners on Wikipedia. There’s a lot of data out there, and relationships that can then be gleaned, for us to bootstrap any sort of data sets from-
GARY MARCUS: I’m suddenly reminded of, I think his first name was Jacques, Derrida, the famous literary deconstructionist, and his obituary in the New York Times. I don’t have it in front of me, but roughly speaking, somebody asked him, you have a lot of books in your library, have you read them all? And he said, only one, very carefully. Right now we have systems that can read all the world’s data, but they read it very sloppily. They don’t really understand it. If we could build a machine that could read one source, let’s call it Wikipedia, but really understand it, you could probably get away with not reading so much of the other data, or at least then you’d be in a much better position to read the other data. If we get machines to understand one deep source in a deep fashion, it would be fantastic and an utter revolution compared to where we are now. And I do think that’s possible. I just think we have to go about asking a different set of questions, questions that aren’t kind of what statistical correlation can I find from X, Y, and Z, but rather, how do I get a machine to actually understand running text?
AZEEM AZHAR: And how would you do that?
GARY MARCUS: I think the first, most important one is that right now there’s a huge bias in the field towards looking for solutions that are entirely done by learning. But I think you want that learning to be constrained by some prior understanding of the world. So you can think about what Kant wrote in the Critique of Pure Reason, that you start with, I mean, I’m psychologizing him, but you start with space and time and causality. We need to build systems that have enough basic understanding of those so that they can constrain what they learn. And they’re not just sort of looking for any random correlation, but they’re looking for what causes what in the world. What is the function of this? So if you know in advance that there are objects, that there are people, that there are places, that people have goals and so forth, then you can make sense of what you read. Otherwise, it’s just a bunch of random characters. I just did a fun experiment. So there’s a system called GPT-2…
AZEEM AZHAR: GPT-2. Yeah. Yeah.
GARY MARCUS: …which is probably the best language generation system in the world right now. It’s the one that OpenAI was so excited about. They said we’re not going to share it with anybody because it’s dangerous, which I find silly. I don’t think it’s really all that dangerous, but it was great PR for them. So OpenAI released this GPT-2 thing, and we took a story, a Laura Ingalls Wilder story. She wrote Little House on the Prairie, and we just took two paragraphs from it. A man loses his wallet. A little boy finds it and returns it to the man. The guy gets the wallet back. He counts the money and it’s all there. And he’s excited. So, I fed that into GPT-2, which is kind of like an improvisation device. You put in some text and it continues the text, and it makes perfectly fluent text that makes no sense at all. So one example we got was, right after the sentence about him finding his wallet, it says he went to look for his money in a safe place. Well, that makes no sense at all. If he’s got the money in his wallet, he’s got it back. He doesn’t need to go look for the safe place. The system knows that wallet is correlated with safe place, in that a lot of sentences have them both, but it doesn’t understand that a found wallet means that the money is there and not in some other place. There’s no understanding whatsoever of the text that it’s processing.
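For anyone who wants to try the same kind of experiment, GPT-2 has since been released openly. A minimal sketch, assuming the Hugging Face transformers library and PyTorch are installed; the prompt is a paraphrase of the kind of passage described above, not the exact text Gary used.

```python
# pip install transformers torch   (assumed; the GPT-2 weights download on first use)
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# A paraphrase of the kind of passage described above, not the original text.
prompt = ("The man had lost his wallet on the road. A little boy found it and "
          "returned it to him. The man counted the money and it was all there.")

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=120,                        # keep the continuation short
    do_sample=True,                        # sample rather than greedy decoding
    top_k=50,
    pad_token_id=tokenizer.eos_token_id,   # silence the missing-pad-token warning
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# The continuation is typically fluent but often incoherent about what actually
# happened to the wallet and the money, which is the point being made here.
```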
AZEEM AZHAR: I have three kids and we sometimes watch nature documentaries on television, and there was this amazing one a couple of years ago on the BBC, which showed some baby, I’m going to say mountain goats, maybe baby ibexes, that’s right, and they had to climb this near-vertical surface, and there were a matter-
GARY MARCUS: That’s one of my favorite videos.
AZEEM AZHAR: Right. So you know it. Okay. So they don’t have a chance to gather much training data, as we call it in the sort of machine learning community, and they certainly don’t have a chance to make a mistake and run that experiment again with new parameters.
GARY MARCUS: That’s right. There’s a technique called reinforcement learning where you try things and you get feedback. It worked, it didn’t. How AlphaGo works is it gets a lot of feedback: it worked or it didn’t. Those baby ibexes get zero chance for error. They make one mistake and they’re off the mountain, and that’s the end of it.
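The “try things and get feedback” loop can be shown in its smallest form, a multi-armed bandit with an epsilon-greedy rule. This is a generic textbook sketch, not how AlphaGo is trained; the payout probabilities are invented for illustration. The relevant contrast is that this learner needs thousands of trials to find the good action, while the ibex gets roughly one.

```python
import random

# Hypothetical slot-machine arms: each pays out 1 with the given probability.
TRUE_PAYOUT_PROBS = [0.2, 0.5, 0.8]

def epsilon_greedy_bandit(n_trials=10_000, epsilon=0.1):
    estimates = [0.0] * len(TRUE_PAYOUT_PROBS)   # running value estimate per arm
    counts = [0] * len(TRUE_PAYOUT_PROBS)
    for _ in range(n_trials):
        # Explore occasionally; otherwise exploit the best current estimate.
        if random.random() < epsilon:
            arm = random.randrange(len(TRUE_PAYOUT_PROBS))
        else:
            arm = max(range(len(TRUE_PAYOUT_PROBS)), key=lambda a: estimates[a])
        # Try it and get feedback: pull the arm, observe a reward.
        reward = 1.0 if random.random() < TRUE_PAYOUT_PROBS[arm] else 0.0
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]  # incremental mean
    return estimates, counts

estimates, counts = epsilon_greedy_bandit()
print(estimates)  # roughly [0.2, 0.5, 0.8] after many trials
print(counts)     # pulls concentrate on the best arm, but only after lots of trial and error
```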
AZEEM AZHAR: So how do they get up the mountain, Gary?
GARY MARCUS: So the way they do it is natural selection has picked a distribution of animals that are built with knowledge about three-dimensional geometry, about their own bodies, about the relation between the forces that they apply and the things they’re climbing on. I mean, if you watch the video carefully, they’re not a hundred percent perfect, but they don’t make any major mistakes; they slip a little bit. So there are also some feedback mechanisms that are built in so they can compensate, but they know not to go too far off the cliff. There’s a little calibration that happens. So nature gives a rough draft to their brains: natural selection has chosen a set of genes that builds an ibex that can cope as soon as it’s outside the womb. That rough draft does get refined. Right? They have to compensate for the strength of their own limbs and the weight of their bodies and stuff like that. But there’s a really good first draft, and it’s never nature versus nurture. It’s nature and nurture working together. And what happens in the baby ibex is that nature has given them a really good first draft that allows them to survive, by giving them enough understanding of three-dimensional geometry and motor control and so forth. It’s not of course conscious knowledge, but it’s there. It’s exactly what machine learning is not doing right now. So there is one built-in thing in machine learning right now, which is the ability to recognize objects in different places, and everything else is learned. So it’s a rough draft that’s woefully incomplete compared to what the baby ibex is getting.
AZEEM AZHAR: I suppose getting the baby ibex pre-wired in that way has been quite an expensive process. Right? It’s taken many billions of years, and many, many other types of ibexes and pre-ibexes have slipped down the mountain or been eaten by mountain lions in order to evolve a particular configuration. I’m just curious again about why, as we get more and more efficient at producing compute cycles and giving those things experience, that same path can’t get us to where we want to get to. If I think about a human playing Go, right, a human playing Go requires about 20 watts of energy. Right? That’s what the brain takes up. And a Go-playing supercomputer, racks of GPUs, takes up tens of thousands of watts. It’s hugely, hugely inefficient by comparison. But I just roll forward and think, if it was the application of many, many cycles of iteration through an evolutionary process and throwing a bunch of energy into the problem over the course of these episodes, why wouldn’t the same path work again, but in silico?
GARY MARCUS: So, there are a whole bunch of questions in what you just asked. One observation that comes out of what you just said is that in biology, a lot of the energy consumption is front-loaded, in the sense that it happened over evolutionary time. So now the baby ibex probably uses 20 watts. So it’s incredibly efficient compared to any machine that we know how to build to do motor control. But, yes, an enormous amount of energy was expended in getting there. And in some sense, that’s my whole argument: we need to allocate energy to front-loading the systems instead of doing it all on the back end. Now, another part of what you’re asking is could we do evolutionary search in order to create better AI systems? And there’s a whole school of that kind of work, and in my last company I had some of the best people in the world who were working on that. The problem for now is it would require kind of replicating the whole billion years of evolution. So my suggestion is that we might start with points that are a little bit more sophisticated, by looking at how the ibex works or how the human toddler works, and try to have starting points that are further on instead of trying to replicate the whole thing. Well, one way to think about it is that the human brain only took 50,000 years, depending on your perspective, relative to a chimpanzee brain, which took a long time to evolve; and the primate brain took a long time to evolve from a mammalian brain, and that took a long time to evolve from the vertebrate brain. If we could figure out how a vertebrate brain works, build in that innate structure in our systems and then do the evolutionary modeling, we might move a lot faster. Or if we could figure out, even better, how a primate brain works, incorporate those insights and innate ideas from those things into our systems and then do evolutionary search, that might work great.
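The evolutionary-search idea, and the argument for seeding it with a good “first draft”, can be sketched with a toy mutate-and-select loop. The fitness function and the seeded starting point below are made-up stand-ins, purely to illustrate the shape of the argument, not anything from Gary’s company.

```python
import random

def fitness(genome):
    # Toy objective: how close the genome is to an arbitrary target behaviour.
    target = [0.7, -0.3, 0.9, 0.1]
    return -sum((g - t) ** 2 for g, t in zip(genome, target))

def evolve(first_draft, generations=20, pop_size=20, mutation=0.1):
    population = [first_draft[:] for _ in range(pop_size)]
    for _ in range(generations):
        # Mutate every individual, then keep the best half (truncation selection).
        offspring = [[g + random.gauss(0, mutation) for g in ind] for ind in population]
        population = sorted(population + offspring, key=fitness, reverse=True)[:pop_size]
    return fitness(population[0])

random_start = [random.uniform(-5, 5) for _ in range(4)]   # evolving "from scratch"
seeded_start = [0.5, 0.0, 1.0, 0.0]                        # a rough innate "first draft"

print(evolve(random_start))  # with only 20 generations, usually still far from the target
print(evolve(seeded_start))  # the seeded run typically gets much closer, much sooner
```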
AZEEM AZHAR: What you describe though sounds like it’s an engineering effort that encompasses a bunch of different scientific research. Right? It’s engineering in the sense that we have a working example of how intelligence works in the wild through various different types of mammals, and we’ve got some understanding about how their brains work and the different regions and components and networks and sub-networks within that. And perhaps we need to study them in some sense and then think about what that then helps us to design and build in an artificial context.
GARY MARCUS: I’d say, yes, but…
AZEEM AZHAR: Right.
GARY MARCUS: So, I think that’s approximately correct, except that we don’t know enough about how the brain actually works to do the reverse engineering of the brain that you just described. If we did, if we understood neuroscience better, I think that’d be a great way to build AI. The sad fact is that neuroscience has been kicking around for a few hundred years and we still don’t really know some very basic things, like how short-term memory works. We just haven’t figured it out yet. And I think, in fact, we’re going to need AI to figure out neuroscience rather than the other way around. However, the good news is we actually know a lot about psychology, and we have good tools to understand psychology. They’re not perfect, but we can figure out the psychology of how things break down into modules, of what information people use. We know lots about different forms of memory. We know something about how people understand sentences and produce them. We understand something about cognitive development. And what I would argue is that if we look at that, which I would collectively call cognitive science, it can really help us build our AI. Right now, as a kind of historical artifact, AI is dominated by former physicists who are good at math, who don’t really know that much cognitive science, don’t really have that much respect for it, and they’ve got some neat equations and math to run with the big data, and they’re taking basically correlations rather than causation out of that data. And I think if we brought cognitive scientists into the fold and stopped having people like Geoff Hinton going around saying, don’t have symbols in there, it’s ancient technology, and instead said, hey, how can we work together, I think we could really do great stuff.
AZEEM AZHAR: It’s interesting. When I look at the debate that is going on, it always brings to mind Luigi Galvani, the Italian scientist in the 18th century who discovered animal electricity and could get a frog’s leg to twitch. He was a contemporary of Volta, and they could build some things, but it wasn’t until we had Maxwell’s theory of electromagnetism that we could really start to build and harness the technology. And when I look at this debate that goes on, I get the sense that perhaps what we’re missing is a working theory of how intelligence works.
GARY MARCUS: I think that’s right. I don’t even think a lot of people are trying right now. They have this tool and they’re trying to figure out what they can do with the tool, rather than trying to figure out what intelligence is. And I guess what I’ve done in a lot of my career is to try to lay out what I think intelligence is, because I think until we understand what we’re trying to build, just working from the bottom up, saying how can I tweak these layers and add more neurons, is not really getting us to where we want to be.
AZEEM AZHAR: It seems like common sense is an important component in all of this. How do you go about building common sense in an artificial system? And who decides what common sense is?
GARY MARCUS: I think those are really good questions that we don’t have answers to yet. The challenge starts with the fact that common sense is not one thing, just like intelligence is not one thing. We define it as the kind of knowledge that’s ordinarily held, that you can expect any adult, for example, to have. But some of that knowledge is about how objects work. Some of that knowledge is about how people work, how animals work. There’s, like, material science. I mean, even if you’re not a sophisticated person, you know the difference between a napkin that you can tear and a pen that’s difficult to break. So there are all kinds of things that go into it. It’s not just one thing, and that means there’s not going to be one single source for it. The first question is how do you even represent it? How do you program it into the machine? The biggest effort tried to do it all with logic. And logic is not very good at handling uncertainty. So somebody spent 30 years trying to program all this stuff in logically, and it hasn’t really been that effective. We would like to find ways of programming in the uncertainty that we have around the things that we know. That would help. So there’s a question about how you even store this. It’s a kind of question like how do you write a computer programming language? Deep learning’s Achilles heel is it doesn’t really have a way of directly incorporating explicit knowledge. So there’s no way to tell deep learning that a bottle is something that can carry liquids and it might leak and all this kind of thing; you just can’t put it in there. So the first mission is to build a language for representing that stuff at all. The other part of your question is, of course, who decides, and I mean you could imagine a version of the French Academy legislating it. I don’t think that the boundaries are so fixed. I think, for example, different companies will build common sense reasoning engines that have different boundaries, and ultimately there’s not even a fixed line between common sense knowledge and expert knowledge. If you’re building a practical thing like a robot, you’re going to want stuff that everybody knows. And you might want some stuff that the robot knows, for example, about chemistry, that the average person doesn’t, but that would be really useful so that your robot doesn’t blow up the building. But common sense knowledge is a really good place to start, and even that is beyond what most systems have.
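One very small illustration of the contrast between brittle logic and knowledge held with uncertainty is to attach rough confidences to facts rather than treating them as absolutely true or false. This is only a toy, not a proposal for how such a representation should actually work; the facts and the numbers are invented.

```python
# Toy knowledge base: each common-sense statement carries a rough confidence
# instead of being absolutely true or false.
knowledge = {
    ("bird", "can_fly"): 0.90,       # most birds fly...
    ("penguin", "can_fly"): 0.01,    # ...but specific knowledge overrides the default
    ("penguin", "is_a", "bird"): 0.99,
    ("napkin", "tears_easily"): 0.95,
    ("pen", "tears_easily"): 0.02,
}

def belief(subject, predicate):
    # Prefer specific knowledge about the subject itself.
    if (subject, predicate) in knowledge:
        return knowledge[(subject, predicate)]
    # Otherwise fall back on what we believe about the categories it belongs to.
    for key, conf in knowledge.items():
        if len(key) == 3 and key[0] == subject and key[1] == "is_a":
            category = key[2]
            if (category, predicate) in knowledge:
                return conf * knowledge[(category, predicate)]
    return None  # no opinion, rather than a confidently wrong guess

print(belief("penguin", "can_fly"))  # 0.01: the specific fact wins over the default
print(belief("bird", "can_fly"))     # 0.9
print(belief("pen", "can_fly"))      # None: the system simply doesn't know
```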
AZEEM AZHAR: A number of times during our conversation, you’ve used the words probabilistic and distributional, and I guess those are terms that are very distinct from the machines that we’ve traditionally had for the last hundred years or so. I mean, the machines that we’ve had for the last hundred years have been deterministic. If I pick up a hammer and I hit a nail, it does the same thing every time, unless there’s actually a flaw or a fault or a failure. As we start to build in probabilistic things, we start to have machines whose behavior is perhaps not as certain as we would like. And that starts to, I suppose, create issues around how do we test these things? How do we sign these things off? How will we go about determining what reliability or robustness means in this AI world?
GARY MARCUS: So, if you think about how an airplane works nowadays, there’s actually provable, verifiable software for certain parts of the kernel. So that’s going to work. It’s not going to crash. Just like we know that about USB drivers now; it didn’t use to be true. Right? Windows used to crash all the time, and then they figured out how to actually verify that USB drivers were built to a certain standard and so forth. But we don’t have formal verification to show that cars will do as they’re instructed. And the regulations on cars are pretty loose, and we mostly rely on the fact that it seems to do what it’s supposed to do, and, you know, you want to do better than that. But it’s hard to come up with strict criteria. And some of it really does have to ultimately rely on real-world experience, testing as much as you can in different kinds of cases. And we kind of do that with people: some people are reliable and some aren’t, and you observe them over time and you say this person gets stuff done and this person flakes out, and you gather some empirical data about how they respond to different environments. This one can handle stress and this one can’t.
AZEEM AZHAR: In your book, Rebooting AI, which I enjoyed reading, you talk at one point about a robot butler. So let’s imagine this robot butler and we need it to pour wine in a typical restaurant. When do you think we’re going to get that butler?
GARY MARCUS: Well, a restaurant is maybe easier than a home, if you can constrain the restaurant enough. The key question is really about how open-ended the environment is. So, if you had a fast food joint that just has to pour drinks and there are no people around, that’s not so hard. If you’re talking about a cocktail party where people might be wearing funny hats and they might be bumping into each other, then the robot has to be more resourceful. It has to realize that if the glass fell on the floor, maybe I should pick it up right now, but maybe I shouldn’t, depending on where people are moving and how much of a scene it’s going to cause and so forth. And so, the more open-ended the situation is, the more challenging it is. And part of what we’re trying to build in my company are tools to allow people to make robots that work in the open-ended world, but no such tools exist now, so we’re taking on a pretty big challenge. There’s nothing, there’s no off-the-shelf software right now that you can use to make a general-purpose pouring system. And so what you have are demos. Somebody shows that they can get a beer from a refrigerator and pour that beer, but they do it in an empty room where they know exactly where the refrigerator is and what the distance is from the places that they’re going. It’s as fixed and structured as possible. And then probably you see a video where there were 30 takes and you see the best take. Getting to the point where you can do this in general in a crowded environment is years away.
AZEEM AZHAR: There have been many forecasts and prognostications about when we will get artificial general intelligence. They range from 2035, which seems a bit soon to me, to be honest, through to some people saying hundreds of years. If you were a betting man, and I’m sure you’re not, what’s the kind of date range you would give us for when we would have artificial general intelligence as smart and flexible as a human brain?
GARY MARCUS: I think in 16 years we’ll have arguments about whether this or that constitutes artificial general intelligence. We might see little examples of it, but real artificial general intelligence means you can solve any problem, and not just a few problems that seem kind of cool. And I think that’s at least 20 years away, because I don’t see enough kind of laboratory demonstrations now that could be commercialized to say that we’re close. I don’t think it’s hundreds of years away. I don’t think the problems are that hard. I think the incentives are strong right now. There’s a lot of commercial interest, a lot of money being invested. I think scientific progress is to some degree correlated with the size of the investment, with the amount of interest in the problem, and there is a lot of interest in the problem. So I mean, if I had to sort of guess a number, I would say 30 to 50.
AZEEM AZHAR: Wonderful. 30 to 50 years. I’ll have to get you back on in 2049 and see how you did. Gary Marcus, thank you…
GARY MARCUS: Awesome.
AZEEM AZHAR: … for taking the time to speak today.
GARY MARCUS: Thanks very much.
AZEEM AZHAR: Well, thank you for listening. That was another episode of the Exponential View podcast. I hope you enjoyed it. Please check out the archives for some more conversations. I also have a weekly newsletter where I discuss these matters as well, and you can subscribe at www.ExponentialView.co. My name is Azeem Azhar. This podcast is produced by Marija Gavrilov and Fred Casella. Katie Irani was the researcher. Our sound editor is the brilliant Bojan Sabioncello, and Exponential View is a production of E to the Pi I Plus One, Limited.