Hi everyone! I'm really excited to be presenting one of my favorite AI topics: self-programming artificial intelligence. I'd like to start this off with a simple question. Is it possible for a computer program to write its own programs?
Before I get started, let me quickly introduce myself. My name is Kory Becker. I'm a software developer at Bloomberg LP. I design web applications and web services, although my core interest in computing has always been artificial intelligence and data science. I'm also the author of the book, "Building Voice-Enabled Apps with Alexa" http://shop.oreilly.com/product/9781939902443.do, which is my first published book. In fact, the final release was just published this year! Wow! If you're interested in conversational UI, chat bots, or building your first Amazon Alexa app, take a look at my book online at Safari Books or O'Reilly.
I want to briefly start this presentation off by clearing up some media hype that has been steadily growing over the past year or two, surrounding what AI is and all of the amazing things it's going to do. The news likes to say things like "chatbots are going to take over everyone's jobs" and "machine learning is changing everything". This is not to belittle machine learning, as it truly is an amazing branch of AI that has made significant leaps in accuracy over the past few years (largely due to massive online datasets, increased computing speed, and deep learning). However, AI is not just machine learning. There is a lot more to it! AI encompasses many different branches. There is logical AI, which deals with representing knowledge as logical sentences. There is symbolic AI (also called classical AI), which uses human-readable representations of problems (my STRIPS planning library is an example of this). There is knowledge-based AI like the Cyc database, and pattern recognition, such as image recognition of cats and dogs with the CIFAR dataset. There is also AI planning (see http://stripsfiddle.herokuapp.com for an example of this, where I demonstrate AI for solving Starcraft build orders! How cool is that?). There are heuristics like A* Search. But the focus of this presentation is the last topic on the slide, which is genetic algorithms, also called evolutionary computation.
So, why genetic algorithms? Genetic algorithms are a branch of AI that deals with simulating evolution among thousands of different genomes to produce the best offspring, the ones most fit to solve a particular task. In the case of self-programming AI, this task is writing its own computer programs! I like to compare the overall idea to the Infinite Monkey Theorem. The idea is that if you have a hundred monkeys banging on a hundred typewriters, given enough time, they'll produce a written work by Shakespeare. I know this seems crazy, but consider it. The monkeys are hitting random keys, of course. However, at some point over an infinite span of time, they'll manage to randomly type a real word. Perhaps a real sentence. And at some point in infinite time, a complete book! Now, consider this difference. What if you gave a banana to the monkeys each time they typed a correct key? Perhaps you could guide the random key banging into a more structured approach, effectively training the monkeys to write out a work by Shakespeare. This is exactly the idea behind program synthesis.
Program synthesis is the more formal name for self-programming AI. It simply refers to computer programs that can automatically generate other computer programs for a specific goal. Program synthesis can be done in different ways, such as with neural networks, deep learning, and, as we're about to see, genetic algorithms.
For an AI to write its own computer programs with genetic algorithms, it needs a programming language that is easy enough for it to manipulate. The selected programming language is a very strange and esoteric one. It's very difficult for humans to understand and use. However, its simple nature makes it perfect for a self-programming AI. First, the programming language is Turing complete. This means that it's theoretically capable of computing anything that any other programming language can compute, given enough time and memory. That means the AI would be theoretically capable of writing its own computer programs for any computable problem. Next, it consists of just 8 instructions. This makes it easy for a genetic algorithm to manipulate, as the number of possible instructions is kept to a minimum. Third, it's easy to build an interpreter for and to expand upon.
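To give a feel for how small such an interpreter can be, here is a minimal sketch in Python. The talk does not spell out the instruction set at this point, so the eight symbols used here (`>`, `<`, `+`, `-`, `.`, `,`, `[`, `]`), the tape size, and the `max_steps` budget are all assumptions made for illustration. The step budget is there because randomly generated programs can easily loop forever.

```python
def run(program: str, input_bytes: bytes = b"", max_steps: int = 10_000) -> bytes:
    """Interpret a tiny 8-instruction tape language (assumed set: > < + - . , [ ])."""
    # Pre-compute matching bracket positions; unbalanced brackets are an error.
    jumps, stack = {}, []
    for pos, op in enumerate(program):
        if op == "[":
            stack.append(pos)
        elif op == "]":
            if not stack:
                raise SyntaxError("unmatched ]")
            start = stack.pop()
            jumps[start], jumps[pos] = pos, start
    if stack:
        raise SyntaxError("unmatched [")

    tape = [0] * 30_000          # assumed tape size
    ptr = pc = steps = 0
    inp = iter(input_bytes)
    out = bytearray()
    while pc < len(program):
        steps += 1
        if steps > max_steps:    # guard against endless loops
            raise TimeoutError("step budget exceeded")
        op = program[pc]
        if op == ">":
            ptr = (ptr + 1) % len(tape)
        elif op == "<":
            ptr = (ptr - 1) % len(tape)
        elif op == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif op == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif op == ".":
            out.append(tape[ptr])
        elif op == ",":
            tape[ptr] = next(inp, 0)   # reads 0 once the input is exhausted
        elif op == "[" and tape[ptr] == 0:
            pc = jumps[pc]             # skip the loop body
        elif op == "]" and tape[ptr] != 0:
            pc = jumps[pc]             # jump back to the loop start
        pc += 1
    return bytes(out)
```

For example, `run(",.,.", b"hi")` reads two bytes and writes them straight back out. A program that raises `SyntaxError` or `TimeoutError` can simply be treated as a failed candidate.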
To generate a program, we first create a genome. The genome is encoded as an array of doubles (floating point values). Each value corresponds to a programming instruction in the language. Each genome is converted to a program and executed, and its fitness is scored according to the program's output. The closer a generated program comes to solving the task, the more likely it is to continue to the next generation. At each generation epoch, we use roulette selection, crossover, and mutation to create child programs that are slightly different (and hopefully better!) than their parents.
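Roulette selection means that each genome's chance of becoming a parent is proportional to its fitness, like a roulette wheel where fitter genomes own bigger slices. The following is a minimal sketch of that idea; the function name `roulette_select` and the fallback for an all-zero population are assumptions for illustration, not details from the project.

```python
import random

def roulette_select(fitnesses, rng=random):
    """Return the index of one parent, chosen with probability proportional to fitness."""
    total = sum(fitnesses)
    if total <= 0:
        # Assumed fallback: nobody scored anything yet, so pick uniformly.
        return rng.randrange(len(fitnesses))
    pick = rng.random() * total      # a point somewhere on the wheel
    running = 0.0
    for index, fitness in enumerate(fitnesses):
        running += fitness           # right-hand edge of this genome's slice
        if pick < running:
            return index
    return len(fitnesses) - 1        # only reached through floating point rounding
```

With fitness scores of `[1.0, 9.0]`, the second genome should be picked roughly nine times out of ten, while the first still gets an occasional chance to reproduce.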
Here is an example of constructing a genome from an array of doubles. Notice how each value maps to a specific instruction in the programming language. Initially, these values are all random and the programs won't do much except throw errors and fail, most likely immediately. However, one or two are bound to run and do, at least, something! It's these that move on to produce offspring with code that gets better and better.
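One simple way to picture the mapping is to split the range 0 to 1 into eight equal bands, one per instruction. The sketch below does exactly that. The instruction symbols, their order in `INSTRUCTIONS`, and the equal-width bands are assumptions chosen for illustration; the actual project may assign values differently.

```python
INSTRUCTIONS = "><+-.,[]"  # assumed symbols and order, one per 0.125-wide band

def decode(genome):
    """Turn a list of doubles in [0, 1] into a program string."""
    program = []
    for gene in genome:
        index = int(gene * len(INSTRUCTIONS))
        # Clamp so out-of-range genes (including exactly 1.0) still map to an instruction.
        index = max(0, min(index, len(INSTRUCTIONS) - 1))
        program.append(INSTRUCTIONS[index])
    return "".join(program)

print(decode([0.7, 0.55, 0.7, 0.55]))  # prints ,.,.
```

Because nearby values fall into the same band, a small change to a gene often leaves the instruction unchanged, and only occasionally tips it over into a neighboring one.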
To create offspring, a parent genome contributes part of its genome to the child. This is called crossover, and you can see it in action in the slide above. We can also randomly apply a little mutation, which slightly modifies the value of a particular gene, resulting in a change to the corresponding programming instruction. These changes copy forward potentially beneficial parts of the parent and mutate certain instructions, which may or may not end up making the child programs more fit.
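Both operations are only a few lines each. The following is a minimal sketch, assuming single-point crossover between two equal-length parents and a mutation that nudges a gene by a small random amount; the `rate` and `step` values are hypothetical defaults, not settings taken from the project.

```python
import random

def crossover(parent_a, parent_b, rng=random):
    """Single-point crossover: the child takes the head of one parent and the tail of the other."""
    # Assumes both parents have the same length, and at least two genes.
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(genome, rate=0.05, step=0.1, rng=random):
    """Nudge each gene by up to +/- step with probability `rate`, keeping it inside [0, 1]."""
    child = []
    for gene in genome:
        if rng.random() < rate:
            gene = min(1.0, max(0.0, gene + rng.uniform(-step, step)))
        child.append(gene)
    return child

child = mutate(crossover([0.1] * 8, [0.9] * 8))
```

The child keeps long runs of instructions intact from each parent, which is what lets a useful fragment of code survive from one generation to the next.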
Finally, the resulting programs that are generated get ranked according to how well they performed. You can see in this slide how the top program fails, and is removed from the pool of genomes. However, the bottom program succeeded and is carried forward to produce child programs. It just so happens, the bottom program is a valid running program that takes 2 bytes for input and prints them out.
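Ranking needs a score that rewards partial progress, so that a program whose output is "almost right" beats one that prints nothing. The sketch below shows one way this could work: each output byte earns more points the closer it is to the wanted byte. The scoring formula and the `fitness` function are assumptions for illustration, and a program that crashed can simply be scored as if it had produced no output.

```python
def fitness(output: bytes, target: bytes) -> int:
    """Score an output against the target: up to 256 points per byte, fewer the further off it is."""
    score = 0
    for position, wanted in enumerate(target):
        if position < len(output):
            score += 256 - abs(output[position] - wanted)
    return score

# Rank three hypothetical program outputs for the target "hi", best first.
outputs = [b"", b"hj", b"hi"]
ranked = sorted(outputs, key=lambda out: fitness(out, b"hi"), reverse=True)
print(ranked)  # prints [b'hi', b'hj', b'']
```

A graded score like this is the "banana" from the monkey analogy: it tells the genetic algorithm which candidates are getting warmer, even when none of them is correct yet.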
Now that we're all experts on genetic algorithms and evolutionary computation, let's see what the project has produced. The AI has been able to successfully generate programs for "hello world", reversing a string, addition, subtraction, multiplication, reading input from the user, if/then conditions, printing the Fibonacci sequence, and even writing a program to output the "Bottles of Beer on the Wall" song!
It's pretty amazing how the AI was able to produce these programs, and more! There is certainly some potential in what could be done. I've been writing software for many years and I've always felt that it's just something that humans shouldn't be doing. Computers should be writing the code. Humans should be designing the programs. Consider a future where humans create "lego blocks" of program designs. They give these designs to the computer, which then writes the programs to make these "lego blocks" real software. The human then designs a higher level problem, where the computer can then utilize the building blocks as sub-modules to create even more complex programs, generating software for web applications, databases, games, and much more.
If you'd like to learn more about my self-programming AI research, check out my blog at http://primaryobjects.com/cms/article149. I have a full write-up of the details behind the AI, example programs that it's successfully produced, and ideas for future work. It's really fascinating stuff! Also, be sure to check out my book, "Building Voice-Enabled Apps with Alexa", on Safari Books. Thank you!