The researcher was Richard Stallman, who in 1980 asked a professor at Carnegie Mellon University for the source code of a printer driver. The professor could not provide it due to a non-disclosure agreement (NDA) with Xerox; it was Stallman's first encounter with an NDA. In response, Stallman resolved to create a free operating system without such restrictions. He went on to launch the free software movement and develop the GNU operating system to ensure the freedom to share software. The hacker ethic from MIT inspired both the free software and open source movements, which provided common platforms like Python and NumPy for advancing open source artificial intelligence research after the "AI winter" of the 1980s-90s.
3. “Open source exists
because of Artificial
Intelligence. Artificial
Intelligence exists
because of Open Source”
6. The MIT researcher meets the
Carnegie Mellon professor
It's 1980. A 27-year-old artificial intelligence researcher from the
Massachusetts Institute of Technology (MIT) is at Carnegie Mellon
University's (CMU) computer science lab. It just so happens that a
professor there knows something about the malfunctioning printer
that's been giving him —and the whole MIT AI Lab— headaches for the
last several months.
8. A nice social
hack
His department, like those at many universities at the time, shared a PDP-10
computer and a single printer. One problem they encountered was that
paper would regularly jam in the printer, causing a string of print jobs to
pile up in a queue until someone fixed the jam. To get around this problem,
the MIT staff came up with a nice social hack: They wrote code for the printer
driver so that when it jammed, a message would be sent to everyone who
was currently waiting for a print job: "The printer is jammed, please fix it."
This way, it was never stuck for long.
12. Some context
of Richard
Stallman at
that time ...
Stallman enrolled as a graduate student at the Massachusetts Institute of
Technology (MIT). He pursued a doctorate in physics for one year, but
moved to the MIT AI Laboratory. As a research assistant at MIT under Gerry
Sussman, Stallman published a paper (with Sussman) in 1977 on an AI truth
maintenance system, called dependency-directed backtracking. This
paper was an early work on the problem of intelligent backtracking in
constraint satisfaction problems. The technique Stallman and Sussman
introduced is still the most general and powerful form of intelligent
backtracking.
15. The professor
signed an NDA
The software Xerox provided didn't allow that. It was distributed only in binary form, and no one at the lab could alter any of the commands.
Sproull had signed a non-disclosure agreement
(NDA) with Xerox. If he gave Stallman the source
code, there was indeed potential harm: a lawsuit
from a very large corporation.
NDAs were fairly new at that time, but they were
gaining popularity with major tech companies like
Xerox and IBM. This was Stallman's first
encounter with one. And, when he found out about
it, he came to see it in apocalyptic terms.
16. So he decided to
create an open
operating system
As a reaction to this, Stallman resolved that he
would create a complete operating system that
would not deprive users of the freedom to
understand how it worked, and would allow them
to make changes if they wished. It was the birth
of the free software movement.
17. The reaction stemmed from his lab's code of ethics
To understand why Stallman viewed
Sproull's NDA and refusal to hand over
the source code as such a threat, you
have to understand the ethic that
MIT's AI Lab had embraced for its
nearly 21 years of existence—an ethic
Stallman held dear.
That ethic was based on sharing. It
was based on freedom. And it was
based on the belief that individual
contributions were just as important
as the community in which they were
made.
18. MIT's Tech Model Railroad
Club (TMRC)
The TMRC changed the way all of us interact with machines. Without
it, our daily use of smartphones, laptops, and even self-driving cars might
look completely different.
We at TMRC use the term 'hacker' only in its original meaning: someone who applies ingenuity to create a clever result, called a 'hack'.
21. LISP
emerged at
the MIT AI
Lab
MIT electrical engineering professor John McCarthy offered a
course that charted a revolutionary path for computing at
MIT. It was a language course. The language was LISP. And
its inventor was the course's instructor, McCarthy.
LISP was designed to create artificial intelligence—
another term coined by McCarthy.
McCarthy and fellow MIT professor Marvin Minsky believed
machines designed to do simple calculations and told to
carry out other rudimentary tasks were capable of much,
much more. These machines could be taught to think for
themselves. And, in doing so, they could be made
intelligent.
22. The MIT AI lab
was created
before their
computer science
department
Out of this belief and following the
invention of LISP, McCarthy and
Minsky created the AI Lab, even
though the university wouldn't have a
formal computer science department
until 1975—when the electrical
engineering department became the
electrical engineering and computer
science department.
24. The first AI
was used
for …
In those early days of computing, when machines
were gigantic and expensive (the IBM 704 was
worth several million dollars), you needed to
schedule time to access them.
The Lab drew heavily from the TMRC hackers,
who were becoming increasingly interested in the
university's small collection of computers. And while
many appreciated McCarthy's and Minsky's dream
of teaching machines how to think, these
hackers really wanted to do something much
more basic with the machines.
They wanted to play with them.
25. Games were
an excellent
motivation
for
contribution
Two hackers (Peter Samson and Jack Dennis)
discovered the answer when they programmed the
Lab's TX-0 computer to play the music of Johann
Sebastian Bach.
With that feat accomplished, the hackers turned to
other potential hacks. Could a machine provide
other forms of entertainment?
In 1962, three other hackers (Steve "Slug" Russell,
Martin "Shag" Graetz, and Wayne Witaenem)
developed one of the world's first video games
on the Lab's PDP-1. It was called Spacewar!
27. And it also
incited
collaboration
for hardware
In terms of gameplay on the computer
itself, hitting the switches on the PDP-1
quickly enough was difficult and
cumbersome. And so, two fellow TMRC
hackers—Alan Kotok and Bob
Saunders—picked through the random
parts and electronics in the club's tool
room one day. They used those spare
parts to fashion the first joysticks.
28. It was an example of the
hacker ethic
When all was said and done, more than 10 hackers had left their marks
on Spacewar! Its gameplay was the result of successive hackers
improving upon previous hackers' works—hacks on hacks on hacks.
Spacewar! represented the best of the hacker ethic. It illustrated the type
of innovation and problem-solving that open collaboration brought.
29. Machines making music and
novel ideas
For their parts, McCarthy and Minsky supported these student hackers—whether
by developing a video game or teaching a machine to make music. It was these
hackers' clever ways of attacking problems that would enable AI research to
continue. As Minsky and McCarthy later reflected on this first class of researchers:
"When those first students came to work with us, they had to bring along a special
kind of courage and vision, because most authorities did not believe AI was
possible at all."
30. Hackers expanded the
conception of the
computer capabilities
Computers were marketed more as support
tools than as engines of change. For industry
leaders, hacking was less a contribution than
a distraction.
The hacker ethic yielded a unique form of
ingenuity that pushed the boundaries of AI
research and the power of computing forward.
31. The hacker ethic was
replicated in other places
In the meantime, as this first generation of
hackers graduated from MIT, the hacker ethic
passed to a new class. And, in the process, it
reached beyond the campus of MIT and the
borders of Massachusetts.
32. Hacker culture
In 1962, John McCarthy
took a position at Stanford
University and started the
Stanford Artificial
Intelligence Lab (SAIL).
With him came the hacker
ethic, which a new
generation soon adopted.
33. Hacker culture
It's safe to say that without the Homebrew Computer Club, the personal computer and, eventually, the smartphone would not exist as we know them today.
34. Some known
members
Among the members of the
Homebrew Computer Club
were Steve "Woz" Wozniak
and his friend Steve Jobs.
35. DARPA wanted weapons, not games
But while the hacker ethic was spreading to new adherents, AI research was
experiencing some tumultuous times. Funding from the Defense Advanced
Research Projects Agency (DARPA), which had propelled major AI research
projects during the 1960s, disappeared suddenly in 1969 with the passage of
the Mansfield Amendment. The amendment stipulated that funding would no
longer go to undirected research projects, which included the great majority
of AI projects.
36. The interactions were not
natural
In 1974, things got even worse when DARPA pulled nearly $3 million from Robert
Sproull's future home, Carnegie Mellon University's AI Lab. The Lab had been
working on a speech recognition program, which DARPA hoped could be installed in
planes for pilots to give direct commands. The only problem? Those commands had
to be spoken in a particular order and cadence. Pilots, it turns out, have a hard time
doing that in highly stressful combat situations.
37. So they moved to enterprise-
related AI projects
With government funds dwindling, researchers turned to the business world as
their primary source of funding and marketing for AI projects. In the late 1970s, these
enterprise-related AI projects centered on emulating expert decision-making, and
they became known as "expert systems." Using if-then reasoning, these projects
and resulting software became highly successful.
38. Proprietary
code is a
competitive
advantage
Software had boomed into a multimillion-dollar industry.
The innovation achieved in the AI Labs and computer
science departments at MIT, Stanford, Carnegie Mellon, and
other universities had gone mainstream. And, with the help
of new companies like Apple, computers were finally
becoming vehicles of transformation.
But, while the hacker ethic continued to thrive among
hobbyists, proprietary products became the standard in
this new burgeoning tech industry.
Businesses relied on competitive advantage. If a company
was going to be successful, it needed a product that no one
else had. That reality conflicted with the hacker ethic,
which prized sharing as one of its foundational principles.
39. You can make a high-end
salary by proprietary code
Among hobbyists, openness and transparency still thrived. But in the burgeoning tech industry, those ideals were verboten. If you wanted to make a high-end salary working with computers, you needed to cross that line. And many did. NDAs became the contract these former hackers signed to seal their transformation.
40. The hacker ethic
turns to dust as the
A.I. winter
approaches
In the months after Richard Stallman's
unfortunate meeting with Robert Sproull, most
of the hackers at MIT's AI Lab left for a company called Symbolics, started by the Lab's former administrative director, Russell Noftsker.
Stallman, who refused to go the proprietary
route, was one of the few who stayed behind at
MIT.
41. A.I.
businesses
did not
succeed
Within a year, the billion-dollar business of specialized LISP
hardware collapsed.
A few years later, in the early 1990s, the expert systems
market would follow suit. They eventually proved ineffective
and too costly.
Soon, artificial intelligence entered a long and seemingly
endless winter.
42. It was a shame to
work in AI
AI was associated with systems that had all too often failed to live up to their
promises. Even the term "artificial intelligence" fell out of fashion. In 2005,
The New York Times reported that AI had become so stigmatized that "some
computer scientists and software engineers avoided the term artificial
intelligence for fear of being viewed as wild-eyed dreamers."
43. The hacker
ethic reborn
While artificial intelligence endured a
long, slow decline, Richard Stallman—
the self-proclaimed "last true
hacker"—sought to resurrect the
hacker ethic.
44. So it was the A.I.
winter and Stallman
had plans
By late 1983, Stallman was ready to announce his project and recruit supporters and helpers on Usenet (a kind of pre-web Reddit). In September 1983, he announced the creation of the GNU Project (GNU stands for GNU's Not Unix, a recursive acronym). He called on individual programmers to contribute to his project, especially those "for whom knowing they are helping humanity is as important as money." The goal was to develop a new operating system based on the principle of sharing.
46. So he started the free
software movement
This was the beginning of what would become the free
software movement. For many, it was the
reincarnation of the hacker ethic. It countered the
proprietary model of development and emphasized the
value of sharing. And it was solidified with the
creation of the GNU General Public License (GPL),
which was released on February 25, 1989.
47. He developed GCC
In January 1984, he started working full-time on the project, first creating a compiler
system (GCC) and various operating system utilities. Early in 1985, he published
"The GNU Manifesto," which was a call to arms for programmers to join the effort,
and launched the Free Software Foundation in order to accept donations to support
the work.
48. Linux adopted his GPL v2 license
With Linus Torvalds' creation of Linux® in 1991 (and then
choosing to release it under version 2 of the GPL in 1992), the
free software movement gained even greater attention and
momentum outside the normal hacker channels.
49. Hacker Ethic 2.0
not as A.I. but as
free software
And around the same time that artificial intelligence entered
its long winter, the hacker ethic 2.0—in the form of free
software—surged in popularity.
50. Netscape shared
its source code
In 1998, Netscape made headlines
when it released the source code for
its proprietary Netscape Communicator
Internet Suite. This move prompted a
serious discussion among developers
about how to apply the Free Software
Foundation's ideals to the
commercial software industry.
Was it possible to develop software
openly and transparently, but still
make a profit?
51. Free software and
open source
discussions
For Stallman, the initial motivation must always
be the ethical idea of available and accessible
software for all—i.e. free as in free speech (one
of Stallman's favorite analogies). If you went on to
make a profit, great. There was nothing wrong with
that. In his view, though, this other branch's initial
motivation was profits, while the ethical idea of
accessibility was secondary.
52. So the open source term
was coined
In early February 1998, Christine Peterson gave this new branch its official name when she suggested
"open source" as an alternative to free software following the Netscape release. Later that same month,
Bruce Perens and Eric S. Raymond launched the Open Source Initiative (OSI). The OSI's founding
conference adopted Peterson's suggested name to further differentiate it "from the philosophically
and politically focused label 'free software.'"
53. The free software movement
inspired open source.
Yet, despite this divide, many who work in open source
recognize Stallman as a founding father. It was the free
software movement that would later inspire the emergence of
open source.
54. A.I. winter was about to end and
the open source movement was
strong
In the years that followed, open source would come to equal,
and in some cases rival, proprietary development.
This was especially true once AI's long winter came to an
end.
56. But the A.I. needed a common code
baseline
Several programming languages were used to develop artificial intelligence algorithms, but replicating existing code required a common platform. Most scientists preferred C or Python.
57. Python
In December 1989, Van Rossum had been looking for a "'hobby' programming project that would keep [him] occupied during the week around Christmas", as his office was closed, when he decided to write an interpreter for a "new scripting language [he] had been thinking about lately". He chose the name "Python" because he is a big fan of Monty Python.
58. NumPy
The Python programming language was not initially designed for numerical
computing, but attracted the attention of the scientific and engineering community
early on, so that a special interest group called matrix-sig was founded in 1995 with the aim
of defining an array computing package. Among its members was Python designer and
maintainer Guido van Rossum, who implemented extensions to Python's syntax (in
particular the indexing syntax) to make array computing easier.
An implementation of a matrix package was completed by Jim Fulton, then generalized by
Jim Hugunin to become Numeric, also variously called Numerical Python extensions or
NumPy.
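The array computing style that matrix-sig worked toward survives in today's NumPy. A minimal sketch of whole-array arithmetic and the extended indexing syntax mentioned above (the values here are purely illustrative):

```python
import numpy as np

# Whole-array arithmetic: one expression operates on every element,
# with no explicit Python loop
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([10.0, 20.0, 30.0, 40.0])
c = a * b + 1.0            # element-wise multiply-add over the whole array

# The extended indexing syntax: slices and boolean masks
first_two = c[:2]          # slice: the first two elements
evens = a[a % 2 == 0]      # boolean mask: elements of a that are even
```

This loop-free style is what made Python practical for numerical work despite not being designed for it.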
59. Scientific open source
SciPy is an open-source scientific computing library for the Python
programming language. Since its initial release in 2001, SciPy has
become a de facto standard for leveraging scientific algorithms in
Python, with over 600 unique code contributors, thousands of dependent
packages, over 100,000 dependent repositories and millions of
downloads per year. It is built on NumPy.
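As a small illustration of what "leveraging scientific algorithms in Python" looks like in practice, here is a hedged sketch using two well-known SciPy routines, numerical integration and root finding (the choice of functions and intervals is just an example):

```python
import numpy as np
from scipy import integrate, optimize

# Numerically integrate sin(x) over [0, pi]; the exact answer is 2
area, abs_error = integrate.quad(np.sin, 0.0, np.pi)

# Find the root of cos(x) in [0, 2]; the exact answer is pi/2
root = optimize.brentq(np.cos, 0.0, 2.0)
```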
60. Machine Learning
Modern AI tools and techniques were not gathered in a single tool. Scikit-learn (initially a SciPy plugin) was first developed by David Cournapeau as a Google Summer of Code project in 2007, and it gathered the common machine learning tools into a single open source toolkit.
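A minimal sketch of the kind of "common machine learning tools" scikit-learn gathered: load a bundled dataset, fit a classifier, and measure its accuracy (the specific estimator and settings here are only illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small bundled dataset and hold out part of it for evaluation
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit a classifier and score it on the held-out data
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
```

The fit/predict/score pattern shown here is the same across nearly all scikit-learn estimators, which is what made the toolkit so approachable.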
61. Deep Learning
Created in 2007, Theano is a Python library and optimizing compiler for manipulating and evaluating mathematical expressions, especially matrix-valued ones. In Theano, computations are expressed using a NumPy-esque syntax and compiled to run efficiently on either CPU or GPU architectures.
Theano is an open source project primarily developed by the Montreal Institute for Learning Algorithms (MILA) at the Université de Montréal. Development stopped in 2017 due to competing offerings from strong industrial players.
62. The rise of Corporate Open Source
Google open sourced its internal AI framework, TensorFlow, in 2015.
Facebook decided to support PyTorch in 2016. It is worth mentioning that Torch was created in 2002 for C++, but it did not gain popularity until the Python wrapper was released and Caffe2 was merged into it in 2018.
63. Open Neural
Network
Exchange
(ONNX)
In September 2017 Facebook and Microsoft
introduced a system for switching between
machine learning frameworks such as PyTorch
and Caffe2.
Framework interoperability: Allow developers to
more easily move between frameworks, some of
which may be more desirable for specific phases of
the development process, such as fast training,
network architecture flexibility or inferencing on
mobile devices.
In November 2019, ONNX was accepted as a graduated project in the Linux Foundation AI.
64. That's the history, so let's recap
The hacker ethic was crucial for early A.I. advancements, but when the winter came, it was adopted as the philosophy of free software. Once the A.I. winter ended, the open source movement was so strong that even the largest software companies opened their coding tools.
65. But … who actually writes that code?
While it is true that participants in most open source communities come from diverse backgrounds, A.I. projects are not that diverse. So, who are the main actors?
68. Graduate studies
After completing an undergraduate degree (bachelor's), you can pursue a graduate degree (master's or doctorate).
Richard Stallman was pursuing a graduate degree when he started the movement. Most creators of A.I. algorithms hold a Ph.D. degree.
69. They achieved that, not because of the Ph.D. degree itself, but:
Because of the time available to experiment
Because of the work in teams
Because of the computational power
Because they did not have to worry about work and money.
70. The Ph.D. is a job
Some people think that Ph.D. candidates are only students. In fact, most Ph.D. students earn a salary for their work as research assistants or teaching assistants, or they hold fellowships or sponsorships.
Some companies and governments run projects with the universities, and the graduate students are the people who execute them.
71. So are you saying
that I need a Ph.D. to
contribute?
Not exactly.
Contributing to A.I. frameworks that are collections of algorithms requires creating a state-of-the-art algorithm that merits inclusion. A Ph.D. can spend months on a solution that outperforms existing algorithms, so it is more likely that a Ph.D. will author it.
When you create your own project, or contribute to an existing one by coding the building blocks or adapting research code, there are a lot of opportunities to contribute.
73. So, How can I start?
In most A.I. repositories there are issues marked with a "good first issue" or similar label. Try to find them and start contributing.
GitHub search:
is:open label:"good first issue"
74. There are some A.I. frameworks per company
Here are some of the most popular A.I. frameworks maintained by companies. Contributing is a good opportunity to learn in depth how a framework works, and even to get a job if you are a committed contributor.
79. Some other Open Source and A.I.
curiosities
There are some other facts related to open source and A.I. that are
interesting to know.
80. Is OpenAI really open?
OpenAI is an independent research organization consisting of the for-profit
corporation OpenAI LP and its parent organization, the non-profit OpenAI Inc. The
corporation conducts research in the field of artificial intelligence (AI) with the stated aim to
promote and develop friendly AI in such a way as to benefit humanity as a whole; it is
considered a competitor to DeepMind. The organization was founded in San Francisco in
late 2015 by Elon Musk, Sam Altman, and others, who pledged US$1 billion.
OpenAI stated they would "freely collaborate" with other institutions and researchers by
making its patents and research open to the public. In 2019, OpenAI became a for-profit company.
81. Why is AI
succeeding this
time?
There are 3 main factors
● Modern AI approaches are not rule
based anymore but data-driven.
● Now we have enough data and effective synthetic data generation techniques.
● We have the computation power thanks
to GPUs and other high-performance
hardware.
82. And what about
DeepMind
DeepMind Technologies is a UK artificial intelligence
company founded in September 2010, and acquired by
Google in 2014. The company is based in London, with
research centres in Canada, France, and the United
States. In 2015, it became a wholly owned subsidiary of
Alphabet Inc.
DeepMind Technologies' goal is to "solve intelligence",
which they are trying to achieve by combining "the best
techniques from machine learning and systems
neuroscience to build powerful general-purpose
learning algorithms". They are trying to formalize
intelligence in order to not only implement it into machines,
but also understand the human brain
83. Let's explain what a GPU is
A graphics processing unit (GPU) is a specialized electronic circuit designed to rapidly
manipulate and alter memory to accelerate the creation of images in a frame buffer intended
for output to a display device. Their highly parallel structure makes them more efficient
than general-purpose central processing units (CPUs) for algorithms that process large
blocks of data in parallel.
With the emergence of deep learning, the importance of GPUs has increased. It was found
that while training deep learning neural networks, GPUs can be 250 times faster than
CPUs. The explosive growth of Deep Learning in recent years has been attributed to the
emergence of general purpose GPUs.
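The "large blocks of data in parallel" pattern can be sketched on the CPU with NumPy: the vectorized form expresses one operation over a whole array at once, which is the same style of computation GPUs are built to accelerate (the array size and formula here are arbitrary):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 100_000)

# Scalar style: one element at a time, like a single CPU core
y_loop = np.empty_like(x)
for i in range(x.size):
    y_loop[i] = x[i] * x[i] + 1.0

# Data-parallel style: one operation applied to the whole block at once,
# the access pattern that GPU kernels exploit across thousands of cores
y_vec = x * x + 1.0
```

Both forms compute the same result; the difference is that the second expresses the computation as a single bulk operation, exactly what a GPU can distribute across its many cores.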
85. And to explain
what CUDA is
CUDA (Compute Unified Device
Architecture) is a parallel computing
platform and application programming
interface (API) model created by Nvidia. It
allows software developers and software
engineers to use a CUDA-enabled graphics
processing unit (GPU) for general purpose
processing – an approach termed GPGPU
(General-Purpose computing on Graphics
Processing Units).
CUDA is used in the backend of all the major deep learning frameworks.
86. But CUDA is not open source
...
So why does every framework use CUDA if it is not open source? Why support NVIDIA's sales?
The answer is simple: NVIDIA was the first to offer optimized hardware with stable and fast results during intensive processing. But there is some good news: an open source alternative exists, and NVIDIA is starting to open part of its hardware documentation:
https://github.com/nvidia/open-gpu-doc
87. OpenCL is the open source
alternative
OpenCL (Open Computing Language) is a framework for writing programs that
execute across heterogeneous platforms consisting of central processing units
(CPUs), graphics processing units (GPUs), digital signal processors (DSPs), field-
programmable gate arrays (FPGAs) and other processors or hardware
accelerators.
It is gradually being adopted for some parts of existing frameworks.
https://www.khronos.org/opencl/
88. But a TPU is more optimized hardware
A tensor processing unit (TPU) is an AI accelerator application-specific integrated circuit (ASIC) developed by Google specifically for neural network machine learning. TPUs are deployed in pods (with up to 1,024 chips per pod).
But only Google has them in their data centers.
89. So how can I run A.I. without a GPU?
There are not many free options, but Google has an easy-to-use, notebook-based platform that runs GPUs and TPUs in the background for free. Its name is Google Colaboratory.
90. Google Colaboratory
Colaboratory (also known as Colab) is a free Jupyter notebook environment
that runs in the cloud and stores its notebooks on Google Drive. Colab was
originally an internal Google project; an attempt was made to open source all
the code and work more directly upstream, leading to the development of the
"Open in Colab" Google Chrome extension, but this eventually ended, and
Colab development continued internally.
It provides about 300 GB of disk space and 12 GB of GPU memory on an NVIDIA Tesla K80 GPU instance.
91. Some useful
resources to
learn A.I.
from code
Papers with code
https://paperswithcode.com/
A.I. Notebooks for colab
http://bit.ly/awesome-ai
92. Conclusions
The A.I. scene and the free software movements are intimately related.
In A.I. frameworks, Ph.D.s often write the algorithms, while enthusiastic committers build the application structure or migrate existing code to the framework format.
Getting started in the A.I. scene is easier by working on introductory issues in the repositories.
Learning and testing tasks are easier, and free, by using free cloud services such as Google Colaboratory.