The Power of Open Data
Phillip J. Windley, Ph.D.
CTO, Kynetx
www.windley.com
Is your gas pump accurate?




          How do you know?
Where is the inspection data?




                       Johanna Kirk, Deseret News




         Unfortunately, not online...
The Deseret News built an app...

                        2003 Deseret News Publishing Company

      FAIL!
We don’t need the
Department of Agriculture to
 build the Web application...
We just need them to
make the data available
      so we can!
Here’s an example...
Tired of buying books on Amazon?




                so was Jon Udell...
Is the book in the local library?
Library lookup...

http://www.amazon.com/exec/obidos/ASIN/   0738206679

http://ksclib.keene.edu/search/a?searcharg=   0738206679
With a simple JavaScript
 bookmarklet, Jon was
  able to mash up two
independent Web sites
What made this possible?
Both systems
 referenced resources
using meaningful URLs
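A minimal sketch of the lookup idea, in Python rather than the JavaScript bookmarklet Jon actually wrote. It assumes the two URL patterns shown on the slides; the function and constant names are hypothetical, and a real catalog may expect a different query format.

```python
import re
from typing import Optional

# URL patterns copied from the slides; treat them as illustrative assumptions.
AMAZON_ISBN = re.compile(r"/ASIN/(\d{9}[\dXx])")
LIBRARY_SEARCH = "http://ksclib.keene.edu/search/a?searcharg={isbn}"


def library_url(amazon_url: str) -> Optional[str]:
    """Build a library search URL from the ISBN embedded in an Amazon URL."""
    match = AMAZON_ISBN.search(amazon_url)
    if match is None:
        return None  # no ISBN in the URL, so there is nothing to look up
    return LIBRARY_SEARCH.format(isbn=match.group(1))
```

The whole mashup is a string rewrite. It works only because the identifier is visible in one URL and accepted in the other.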
Small, scripted
   aggregations lead to
serendipitous applications
This is the power of open data
Two forces are making
open data important...
Disaggregation
is the new norm
Yahoo! is a “portal”
“Portal for the Global Pet Food Industry”
Web sites are cleaving
along functional lines
Just blog comments...
Integrated on my blog
Just events...
Integrated on the IIW site
Here’s another cool word:
  “deperimeterization”
Mommy, I’m
scared of the
outside world!
Constantinople City Wall
Fairs
Trebuchet, B. Windley
Markets
Which provided more
opportunity for merchants?
Which provides more
opportunity for you?
The architecture of the Web
 supports and encourages
      network effects
Network effects
   happen when
participating makes
 the entire network
   more valuable
The Web’s architecture
    is called REST
REpresentational
     State
   Transfer
Roy Fielding
               via duncandavidson on Flickr
At the heart of REST
    are resources
Web pages, XML documents,
images, JSON, SVG, and so on
  represent these resources
nouns

        verbs
URLs are the nouns
HTTP methods are the verbs
 POST            Create
 GET             Retrieve
 PUT             Update
 DELETE          Delete
 CRUD
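A toy sketch of the verb-to-CRUD mapping, assuming an in-memory collection addressed by URL-like paths. The class name, the /books path, and the status codes chosen are illustrative assumptions, not part of the talk.

```python
import itertools


class ResourceStore:
    """Toy in-memory collection addressed by URL-like paths (illustrative only)."""

    def __init__(self, base="/books"):
        self.base = base
        self.items = {}
        self._ids = itertools.count(1)

    def handle(self, method, path, body=None):
        """Dispatch one request and return (status_code, payload)."""
        if method == "POST" and path == self.base:   # Create
            url = f"{self.base}/{next(self._ids)}"
            self.items[url] = body
            return 201, url
        if path not in self.items:
            return 404, None
        if method == "GET":                          # Retrieve
            return 200, self.items[path]
        if method == "PUT":                          # Update
            self.items[path] = body
            return 200, body
        if method == "DELETE":                       # Delete
            del self.items[path]
            return 204, None
        return 405, None                             # verb not supported here
```

Every resource answers to the same four verbs, so a client that can talk to one resource can talk to all of them.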
Constraining the
application interface
      increases
   client flexibility
Ideally, transformations
are in the representation
Benefits of REST
  Simple
  Flexible
  Fast
Small
marginal
 cost...
You’re already building a
   Web application...
   just give it an API
Start by viewing every data
  element as a resource
Collections and queries too...
Every resource
should have a URL
Cool URLs don’t change!
Preserve the structure of data
until the last possible minute
  Use XML
  Use JSON
  Use RDFa
  Use microformats
  Use them all...
XML Example
JSON Example
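The original example slides were images and did not survive as text. As a stand-in, here is a small sketch that serializes one record both ways using only the Python standard library. The record and its field names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record; the field names are made up for illustration.
person = {"name": "Jane Doe", "born": "1850", "place": "Provo, Utah"}


def to_json(record):
    """Serialize the record as a JSON object."""
    return json.dumps(record, sort_keys=True)


def to_xml(record, tag="person"):
    """Serialize the record as one XML element with a child per field."""
    root = ET.Element(tag)
    for key, value in sorted(record.items()):
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Read the child elements back into a dictionary."""
    return {child.tag: child.text for child in ET.fromstring(text)}
```

Either format keeps the field boundaries intact, so a consumer can recover the record without scraping rendered HTML.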
RDFa
Microformats
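A rough sketch of how a tool might read microformat data out of ordinary HTML. It assumes a tiny hCard that uses the class names fn and org, and well-formed markup with no void elements such as br or img. The markup below is an assumption, not taken from the demo page.

```python
from html.parser import HTMLParser

# Assumed sample markup; the name and organization come from the title slide.
SAMPLE = """
<div class="vcard">
  <span class="fn">Phil Windley</span>,
  <span class="org">Kynetx</span>
</div>
"""


class HCardParser(HTMLParser):
    """Collect the text of elements whose class names an hCard property."""

    PROPERTIES = {"fn", "org", "locality", "region"}

    def __init__(self):
        super().__init__()
        self.card = {}
        self._stack = []  # property name (or None) for each open element

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self._stack.append(next((c for c in classes if c in self.PROPERTIES), None))

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        prop = self._stack[-1] if self._stack else None
        if prop and data.strip():
            self.card[prop] = self.card.get(prop, "") + data.strip()


def parse_hcard(html):
    """Return a dict of the hCard properties found in the given HTML."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.card
```

The page stays readable by people, and the class names carry enough structure for a program to extract the same facts.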
Demo time...
http://phil.windley.org/
Operator shows microformats
Export hCard data to Google Maps
Export the hCard data
to an address book
Play nice with HTTP’s verbs

 Queries should use a GET
 Use POST to create new resources
 Don’t forget PUT and DELETE
Use existing standards
   where you can
  RSS
  Atom
  OPML
  GEDCOM
Handle authentication and
authorization in standard ways
   HTTP AUTH
   OAuth
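One way to picture the HTTP AUTH option: a sketch that builds the Authorization header for HTTP Basic authentication. The function name is hypothetical. Basic credentials are only base64-encoded, not encrypted, so this assumes the request travels over HTTPS.

```python
import base64


def basic_auth_header(username, password):
    """Return the Authorization header for HTTP Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

Because the scheme is standard, nearly every HTTP client library can send this header without custom code.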
Document your API and
 data structure online
Use HTML documents
Use XML Schemas
Follow conventions
An Example:
Programming Twitter
Twitter is Microblogging
  Social network
  140 character limit
  Asymmetric follow
Building a Retweeter: The Algorithm
  Authenticate
  Find relevant tweets in friends timeline
  Post them to utahpolitics tweetstream
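A sketch of the middle step only, written in Python rather than the Perl used in the talk. It assumes the timeline has already been fetched and parsed into dictionaries with user and text keys. That shape, the keyword test, and the RT prefix are all assumptions for illustration; authentication and posting are left out.

```python
def select_retweets(timeline, keyword, limit=140):
    """Return retweet texts for timeline entries that mention the keyword."""
    retweets = []
    for tweet in timeline:
        if keyword.lower() not in tweet["text"].lower():
            continue  # not relevant, skip it
        text = f"RT @{tweet['user']}: {tweet['text']}"
        retweets.append(text[:limit])  # stay within the character limit
    return retweets
```

Usage, with made-up data:

```python
timeline = [
    {"user": "alice", "text": "Budget vote today #utpol"},
    {"user": "bob", "text": "Lunch was great"},
]
select_retweets(timeline, "#utpol")
```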
Twitter API: friends_timeline
Twitter API: update
Demo time...
http://twitter.com/statuses/friends_timeline.xml
First we authenticate
Here’s the timeline in XML
Here’s the same data in RSS
We can use a browser to
discover API functionality
The Perl Version
REST makes writing
  the library easy
Twitter’s creators didn’t foresee
  Retweeters
  Analytics
  Hash tags
  or a hundred other things
The value of data is unforeseen
Design for serendipity
Open data enables serendipity
Open data is a
radical innovation
that is generating
wealth
knowledge
& opportunity
at a small relative cost
        $
That’s the power of
    open data
       Contact info:
      phil@windley.org
      www.windley.com
          @windley
Talk given to the Family History Developers Conference on Mar 11, 2009

  • Because the data was obtained through GRAMA requests instead of being online, the application quickly became hopelessly out of date.












  • Both systems used meaningful URLs and a common format to refer to records about books
  • Jon’s application was unexpected by either Amazon or the creators of the library software. But he was able to take advantage of their openness to create a useful tool.










  • The second trend I want to talk about is deperimeterization...
  • Does this sound scary? It’s not the first time that it’s happened and it is a good thing...
  • Cities used to be surrounded by walls to keep the bad guys out.
  • Commerce happened at fairs that were held weekly, monthly, or seasonally
  • The trebuchet and other technological advances changed that by knocking down the walls. The cities were deperimeterized.
  • The consequence was that commerce now happened in markets that were open every day. Now 24/7.


  • The World Wide Web is the most successful information system ever invented.
    Why did the Web win?

  • 2 devices, 1 connection
    5 devices, 10 connections
    10 devices, 45 connections
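The counts above follow from the number of distinct pairs among n devices, n(n-1)/2. A one-function sketch (the function name is hypothetical):

```python
def connections(devices: int) -> int:
    """Number of distinct pairwise connections among the given devices."""
    return devices * (devices - 1) // 2
```

For 2, 5, and 10 devices this gives 1, 10, and 45, matching the figures in the note.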





  • A large, unlimited really, set of nouns and a small, fixed set of universal verbs

  • Affectionately known as “CRUD”. Who says computer scientists can’t make up cute acronyms?

  • Some verbs are more universal than others...
  • That is, the body of the POST or PUT. And thus, the state of the system is affected...








  • URLs are the nouns of this system. A lingua franca

  • Is more than one answer appropriate? Then support more than one.
  • XML: lots of tools,
    automatic formatting (with an XSL stylesheet),
    but heavyweight and hard to parse
  • JSON: supported by lots of languages,
    easy to parse,
    but no schema checking














  • Using HTTP verbs correctly is good for performance and ensures that caches will work properly. Universal verbs should be used whenever possible.
  • Using existing standards means that tools already exist to work with them.



  • Not just arguments, but input, output, and error codes. See the Flickr and Twitter APIs for good examples.

























  • ...but they didn’t have to, because they created an open API to access the data.
  • This isn’t unusual. The creators of data often fail to recognize the true value it contains.
  • If they lock it up, their data will never be used for anything more than what their limited imaginations can envision.
  • The answer is to design systems so that they can participate in serendipitous mashups.
  • But to do that, the data has to be out in the open...








