A talk from The Combine 2011.
APIs Demystified is intended to take the magic out of APIs for people who aren't programmers. We will discuss what Application Programming Interface means, starting with a general overview and then moving the focus to web APIs and how they are becoming the building blocks of today's applications. A discussion of why a company might decide to build an API follows.
API stands for Application Programming Interface.
Glad I could clear that up. Thanks for coming, any questions?
An application is anything that lets you perform a specific task that isn’t a core function of the device. For example, receiving a phone call on a smartphone is a core function of the device, but if you want to do something more important like launch birds at pigs or make fart sounds, you need to download a specific application made for doing just that.

You can think of an application like a toaster. You plug it into your power source and it does a specific task (making toast).
Programming is not magic. Programming is getting a machine to perform an operation or a set of operations by giving it instructions. Let’s go back to our toaster. You’re all set up with your shiny new toaster, and it’s time to make some delicious sourdough toast. You need to give the toaster some instructions that tell it how dark you want the toast to be. So you yell “HEY TOASTER, MAKE MY TOAST VERY DARK.” But the toaster doesn’t listen to you, and stares at you like you’re a fool.
We tried yelling instructions at our toaster, but it didn’t work. That’s because we didn’t use an interface that the toaster understood. If we instead turn the toast darkness dial, we are now giving the toaster instructions in a way that it expects and understands. That’s an example of “application programming” - we’ve told our application how we expect it to behave by giving it instructions, but those instructions didn’t mean anything until we had an interface that delivered them in a way that is meaningful to both us and the toaster.

The most important part of “application programming interface” is in fact the last part, interface. An interface is a connection point that allows some interaction to take place between discrete components.

The cool thing is that the components don’t care about how the other is going to act, as long as it plays by the rules. For example, think of standard mains electrical wiring in your house. You can plug any device into a power outlet, and as long as it follows the rules defined by the system (it can operate at 60 Hz and 120 volts), it will work.

Interfaces are not just limited to hardware like a power outlet or a dial on a toaster. They can also describe connection points between software. Think of any software that uses plugins or add-ons, such as your browser extensions, or macros in Excel. Those extensions are interfacing with the rest of the browser or the rest of the spreadsheet via software connection points; in fact, they are talking to those applications via an API.

What’s the deal with toaster knobs going to 6?
APIs describe a set of expectations, but they don’t actually describe how those expectations are going to be implemented and used. You can’t open up an API on your iPhone or your laptop and run it like an application. APIs are generally invisible to end users, but you use them every day without realizing it. You interact with real, concrete implementations of APIs rather than with the APIs themselves.

Think back to the power outlet: you plug something in, and as long as it plays by the rules, it works. Your Twitter client on your smartphone plugs in to the Twitter API to perform certain tasks, such as showing tweets by people you follow, or posting a new tweet. As long as the Twitter client you are using adheres to the expectations of the Twitter API, it works. The API doesn’t care if you’re using Android or an iPhone or if the code using the API is written in Java or Objective-C. In geek speak we say that the interface is agnostic, because it doesn’t care at all about how it is implemented.
Let’s think back to the toaster. In order to make darker toast, we had to turn a knob that is hardwired to the heating element. But imagine if our toaster had an API to describe and handle instructions like “change the toast setting” or “start toasting”, and instead of being wired directly to the hardware, the knob interfaced with an API. The API doesn’t care what’s talking to it, only that whatever is talking to it follows the rules. The knob being turned would tell the API “change the darkness setting”, and the API would then apply that setting to the toaster.

Now that we have a layer that abstracts the interface, we could add a voice module that knows what we mean when we yell that we want darker toast. The voice module code would interpret the command, translate it so it can communicate with the API, and the API would again communicate with the toaster to effect the necessary changes. Now we’ve got a toaster you can yell at!

A voice-activated toaster sounds crazy, but would actually be possible using some programmable hardware such as an Arduino. People have done stuff more ridiculous than this - someone made a pair of sneakers that uses an accelerometer to post to Twitter every time you take a step.
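For anyone who wants to see the yell-at-your-toaster idea as code, here is a minimal Python sketch. Everything in it is made up for illustration: the `ToasterAPI` class, its method names, the 1-to-6 darkness range, and the voice vocabulary are assumptions, not anything a real toaster ships with. The point is only that the knob and the voice module both go through the same interface.

```python
class ToasterAPI:
    """Hypothetical interface layer between any controller and the toaster."""

    MIN_DARKNESS = 1
    MAX_DARKNESS = 6

    def __init__(self):
        self.darkness = 3
        self.toasting = False

    def set_darkness(self, level):
        # The rule every caller must follow: a whole number from 1 to 6.
        if not isinstance(level, int) or not (
            self.MIN_DARKNESS <= level <= self.MAX_DARKNESS
        ):
            raise ValueError("darkness must be an integer from 1 to 6")
        self.darkness = level

    def start_toasting(self):
        self.toasting = True


def knob_turned(api, position):
    """The knob talks to the API instead of being wired to the heating element."""
    api.set_darkness(position)


# Made-up vocabulary for the voice module.
VOICE_LEVELS = {"very light": 1, "light": 2, "medium": 3, "dark": 5, "very dark": 6}


def voice_command(api, spoken_text):
    """Translate a yelled sentence into the same API calls the knob uses."""
    text = spoken_text.lower()
    # Check longer phrases first so "very dark" is not mistaken for "dark".
    for phrase in sorted(VOICE_LEVELS, key=len, reverse=True):
        if phrase in text:
            api.set_darkness(VOICE_LEVELS[phrase])
            api.start_toasting()
            return True
    return False


toaster = ToasterAPI()
knob_turned(toaster, 4)
voice_command(toaster, "HEY TOASTER, MAKE MY TOAST VERY DARK")
print(toaster.darkness, toaster.toasting)
```

Notice that `ToasterAPI` has no idea whether a knob or a voice module is calling it. Anything that plays by the rules gets toast.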
The term API is general and broad in scope, and can be applied to many different situations. Sometimes this variability can lead to confusion. APIs at different scopes can seem very different, but they always do basically the same thing; they give you an interface to provide instructions.
Operating System -> Hardware (Device driver)

A device driver is like an API but with a hardware component. Operating systems don’t care what hardware you plug into them as long as there is a device driver to describe the interaction.
Code -> Code framework (Java core or .NET Framework)

A developer writing code in Java might use low-level methods that are built into the programming language to perform a task such as checking that a password is at least 8 characters or taking the square root of a number. These are tasks that could be coded manually, but it makes a lot more sense to abstract the details so developers can focus on the interesting stuff. This also allows for the code that handles these specific tasks to be optimized in ways that wouldn’t always happen if we had to code them up every time.
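Here is roughly what that looks like in practice. The notes mention Java; this sketch uses Python instead, where the built-in `len` and the standard `math.sqrt` play the same role. The function name `is_long_enough` and the sample passwords are made up for illustration.

```python
import math


def is_long_enough(password, minimum=8):
    # len() is supplied by the language; we never write the counting loop.
    return len(password) >= minimum


short_ok = is_long_enough("toast")
long_ok = is_long_enough("sourdough toast")
root = math.sqrt(144)  # the square-root algorithm is someone else's problem

print(short_ok, long_ok, root)
```

The developer states what they want (a length, a square root) and the framework decides how to get it.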
Program -> Operating System (Save this file, etc.)

Programs can interact with an operating system’s API to handle tasks like creating, opening, and writing to files, playing sounds, and outputting graphics to devices. Can you imagine if every developer had to handle these low-level operations in their code? Programs would be much larger, take longer to write, and there would be a lot more inconsistency. With these APIs, developers can tell the operating system “I need a progress bar and a few buttons” and the OS provides standard, consistent widgets for use.
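As a small illustration of “save this file”, the Python sketch below asks the operating system to create, write, read, and delete a file. The file name and its contents are made up; the calls themselves are from Python's standard library, which passes the requests along to whatever operating system it is running on.

```python
import os
import tempfile

# Ask the operating system for a scratch folder, then pick a file name in it.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "toast_log.txt")  # hypothetical file name

# "Create this file and write to it" - the OS handles the disk details.
with open(path, "w", encoding="utf-8") as handle:
    handle.write("Toast finished at darkness 6\n")

# "Open this file and read it back."
with open(path, "r", encoding="utf-8") as handle:
    contents = handle.read()

print(contents.strip())

# Tidy up by asking the OS to delete what we created.
os.remove(path)
os.rmdir(folder)
```

The same few lines work on Windows, macOS, and Linux, because the program talks to the interface rather than to the disk.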
App -> Web App (show all arcades near the user’s area code on a map)
APIs can refer to a lot of different types of interfaces, as we’ve just seen. But nowadays, when people are talking about APIs, there’s a really good chance that they are referring to web APIs unless they specify some other context.

A web API is an interface with additional layers that standardize communications and give options on how to format the input and output so that they can be commonly used across multiple systems.

A protocol layer defines and enables communication between multiple parts of a system - for example, the protocol that manages traffic on the internet is called TCP/IP, and it’s responsible for telling machines how to find one another and giving them a common language for trading data back and forth.

Web APIs make use of a widely implemented and well-known protocol called HTTP, or Hypertext Transfer Protocol. You’ve used HTTP a lot - every time you load something in your browser, chances are that you are sending your request and receiving the response via HTTP.
These requests return information that browsers know how to handle - HTML to describe how things are laid out for presentation, images to be rendered, and JavaScript files to add dynamic behavior and interactivity.

Each resource, whether it is an image or an HTML page or a JavaScript file, is requested by its own HTTP request and delivered in its own HTTP response. Loading a webpage is a big HTTP conversation made up of exchanges between your browser and the web server.

Let’s take a peek behind the curtain and look at an actual HTTP exchange.
This is a subset of the HTTP requests that my browser made to Google in order to load the Google homepage. This information is readily available for you to explore and play with.
All I had to do was go to the tools menu and select Developer Console, then click on the Network tab.

You can see a variety of information about each request... (explain some of the categories)

We’re going to examine the actual request and response for the highlighted row, which is the Google logo.
These are the headers of an HTTP exchange, both the request and the response. Headers are descriptive information about the requester and about the content being exchanged.

The top line is called the Request Line - it describes the HTTP method being invoked (in this case, “GET”, because we are getting a resource from the server), the location of the resource, and the protocol (HTTP/1.1).

After that, you see a collection of header lines. Each header is a piece of information that helps the server respond to the request appropriately. *speak about various headers*

And in the response, at the top you see the protocol and a status code. You’ve seen these status codes before - if you’ve ever seen “404 not found”, that’s an HTTP status code saying that the server couldn’t locate the resource that was requested.

The response headers describe the content to the requester so that the browser can handle it - you can see that it’s an image, so the browser can treat it as an image. You can also see information that helps the browser cache the image to reduce load time and bandwidth usage.

What you are seeing here is just metadata - data that describes the interaction. The actual payload, in this case the image file, is sent back to the browser after the response header has been sent.
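To make the pieces concrete, here is a short Python sketch that pulls apart the header portion of a request and a response. The two sample messages are invented for illustration (the host, path, and header values are assumptions, not the exchange captured on the slide), and `split_head` is a hypothetical helper, not part of any library.

```python
# Invented sample messages; the real ones on the slide have more headers.
REQUEST = (
    "GET /images/logo.png HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Accept: image/png\r\n"
    "\r\n"
)

RESPONSE_HEAD = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: image/png\r\n"
    "Cache-Control: max-age=3600\r\n"
    "\r\n"
)


def split_head(raw):
    """Return (first_line, headers) for the header portion of an HTTP message."""
    lines = raw.split("\r\n")
    first_line = lines[0]
    headers = {}
    for line in lines[1:]:
        if not line:
            break  # a blank line marks the end of the headers
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return first_line, headers


# Request line: method, location of the resource, protocol.
request_line, request_headers = split_head(REQUEST)
method, location, protocol = request_line.split(" ")

# Status line: protocol, status code, human-readable reason.
status_line, response_headers = split_head(RESPONSE_HEAD)
response_protocol, status_code, reason = status_line.split(" ", 2)

print(method, location, protocol)
print(status_code, reason)
print(response_headers["Content-Type"])
```

None of this touches the image itself. It is all metadata, which is exactly what the headers are for.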
We’ve got this great protocol for handling the transfer of resources between remote machines. It’s been around for 20 years, it’s widely supported, well-documented, and easy to use in code. All of that is pretty great and it’s pretty much exactly the sort of transport layer that we need to enable web API calls.

A browser uses HTTP to request resources like HTML files, images, and JavaScript from a server that knows how to respond to HTTP requests. HTTP is also used to send data to the web server, such as when we fill out a form on a website. In that case we would use an HTTP POST request instead of an HTTP GET, to indicate that we are sending information to the server rather than retrieving it.

A web API uses HTTP in an analogous way, but for different types of resources. Let’s use Klout’s API as an example. You can send a request via HTTP to Klout’s API that says “hey Klout, I’d like to retrieve my Klout score.” The API would take that request and pass it to Klout’s code that handles retrieving that information from a database, receive a set of results from the database, and then package the retrieved resources so they can be sent back to the requester in a way that makes sense.
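The request-to-database-to-response round trip can be sketched with a toy web API that runs entirely on your own machine. This is not Klout's API: the `/score/<user>` path, the `SCORES` dictionary standing in for the database, and the names `ScoreHandler` and `fetch_score` are all assumptions made up for this example. It uses only Python's standard library.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

SCORES = {"alice": 42}  # made-up data standing in for the database


class ScoreHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Expect paths shaped like /score/alice
        parts = self.path.strip("/").split("/")
        if len(parts) == 2 and parts[0] == "score" and parts[1] in SCORES:
            self._send(200, {"user": parts[1], "score": SCORES[parts[1]]})
        else:
            self._send(404, {"error": "not found"})

    def _send(self, status, payload):
        # Package the result so it can travel back over HTTP.
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_score(host, port, user):
    """Play the part of the client: one HTTP GET, one parsed answer."""
    connection = http.client.HTTPConnection(host, port, timeout=5)
    try:
        connection.request("GET", f"/score/{user}")
        response = connection.getresponse()
        return response.status, json.loads(response.read().decode("utf-8"))
    finally:
        connection.close()


# Port 0 lets the operating system pick any free port.
server = HTTPServer(("127.0.0.1", 0), ScoreHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

status, result = fetch_score("127.0.0.1", server.server_port, "alice")
missing_status, missing_result = fetch_score("127.0.0.1", server.server_port, "bob")
print(status, result)
print(missing_status, missing_result)

server.shutdown()
server.server_close()
```

The client never sees the dictionary, just as a Klout user never sees the database. It sends a request that follows the rules and gets back a packaged answer.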
Here is a look at the actual request we’d make to Twitter asking for the list of users that we follow, along with the response headers.
Just as our browser understands how to handle HTML and images that are returned by HTTP requests, the programs that we write to interact with APIs have to understand how to handle the resource payloads that the API returns. That means that the payload needs to be machine-readable so that code can easily interact with the resource. And ideally, we’d like the payload to be at least somewhat human-readable so that we can understand it outside of the context of code as well.

There have been several different approaches for achieving these goals over the years. For the most part, two formats have won out, and they are XML and JSON. It’s incredibly rare to find a web API that doesn’t use one of these formats nowadays, and more often than not APIs will offer both.

This is really powerful stuff. It’s taking something that would typically be complicated and making it incredibly easy. When you use your voice-activated toaster, you don’t really worry about how the toast is made, you simply tell the toaster to make toast and the toaster gives you toast; you don’t have to understand how the circuitry in the toaster works.

**Talk about Apigee/Mashery and how people can try out API calls**
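Here is what “machine-readable and somewhat human-readable” means in practice. The two payloads below are invented for illustration (the `user` and `score` fields are assumptions, not any real API's response), and they carry the same information in JSON and in XML. Python's standard library can read both.

```python
import json
import xml.etree.ElementTree as ET

# The same made-up resource, packaged two ways.
JSON_PAYLOAD = '{"user": "alice", "score": 42}'
XML_PAYLOAD = "<result><user>alice</user><score>42</score></result>"

# JSON maps straight onto a dictionary.
from_json = json.loads(JSON_PAYLOAD)

# XML arrives as a tree of elements, so we pick out the parts we want.
root = ET.fromstring(XML_PAYLOAD)
from_xml = {
    "user": root.findtext("user"),
    "score": int(root.findtext("score")),  # XML text is always a string
}

print(from_json)
print(from_xml)
print(from_json == from_xml)
```

You can read either payload with your own eyes, and a program can read either one in a couple of lines. That combination is why these two formats won out.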
Web APIs are usually described in terms of endpoints, methods, and calls.

You can think of endpoints as generally corresponding to types of resources; in Klout’s API, there are endpoints for score, for users, and for relationships.

Methods are just specific actions on those endpoints; for example, you can retrieve detailed information on a user, or you can retrieve a list of a user’s topics of influence. Both of these actions are referred to as methods on the User endpoint.

Calls are just instances of these actions. You can think of each HTTP request that you make to an API method as a call. It’s really no different than a phone call - you specify a destination, establish a connection, communicate, and then disconnect.
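One way to picture the three terms is as parts of the address you send a request to. The sketch below builds those addresses in Python. The base URL, the `users` endpoint, and the `show` and `topics` method names are hypothetical and chosen only to mirror the description above; they are not Klout's actual URL scheme.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.example.com/v1"  # hypothetical API address


def build_call(endpoint, method, **params):
    """Combine an endpoint, a method, and parameters into one call's URL."""
    url = f"{BASE_URL}/{endpoint}/{method}"
    if params:
        url += "?" + urlencode(params)
    return url


# Two methods on the same endpoint, each producing one call.
show_call = build_call("users", "show", user="alice")
topics_call = build_call("users", "topics", user="alice")

print(show_call)
print(topics_call)
```

The endpoint names the kind of thing you are asking about, the method names what you want done with it, and each finished URL that you actually send is one call.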
APIs make the world smaller - say you wanted to include maps on your website. Mapping is a tough problem in a lot of ways; finding and storing the map data, adding a nice user interface on top of it, etc. Luckily some really smart people at Google have solved that problem.

There are 4000+ APIs listed on Programmable Web.

Discover awesome use cases by mashing up APIs - for example, an API that crawls Craigslist for rental properties and mashes it up with Google Maps.

APIs give you access to a lot of features that wouldn’t otherwise be feasible - handle phone calls, send email, accept credit card payments without having to worry about PCI compliance.

APIs reduce the amount of infrastructure that you have to build and maintain.

APIs reduce the amount of code your developers have to write. Why write an authentication module when you can authenticate with Twitter or Facebook?

APIs increase customer loyalty - if code has been written around a specific API, there is a cost associated with making a change to a different provider.

If you build an API before you build your application, you enable flexibility if you decide to change course or do unexpected things.
Interesting to note that Klout actually built their whole business model on the fact that companies like Twitter and Facebook have an API - they use those APIs to pull data that allows them to apply mathematical models of influence to the users of those services. And then they have an API for accessing those results - which are then used in many social media clients.
Full Contact started out with a specific application that received inbound email, added a bunch of contact information about the sender, and then delivered it.

When they went to the TechStars program, David Cohen basically told them to stop worrying about the specific implementations and focus on the core problem - filling in contact information. Now they have over 1100 developers using their platform.