Stop making tools! Nobody likes them anyway...

This presentation, aimed at DH scholars, compares building tools with baked-in data to making data available via APIs.

Published in: Education

  1. Stop making tools! Nobody likes them anyway...
     Christophe Guéret (@cgueret)
     New Trends in eHumanities, 16 April 2015
     DANS (Data Archiving and Networked Services) is an institute of KNAW and NWO
  2. Data-driven research
     [Diagram labels: Data collection; Data cleaning and integration; Data processing; Tool; New data; Existing data; Happy users]
  3. What kind of tool?
     ● Could be
       – An interactive web site
       – An “app” for smart phones
       – A stand-alone piece of software
     ● The goal is always to let users consume the data for their needs
     ● The actual tooling will depend on the skills and preferences of the team member coding it!
  4. Well-behaved scholars do a bit more
     [Diagram labels: Data collection; Data cleaning and integration; Data processing; Tool; New data; Existing data; Happy users]
  5. The myths of long term use
     ● Data and software sent to a trusted digital repository will certainly be re-used later
     ● Tools can be maintained after the project and further improved to fit new needs
     ● If the tool is not being used enough, it should be adapted to fit more user needs
  6. In reality
     ● Data that is not easy to use is not used
     ● Tools are not maintained once the person who coded them has moved on to other things
     ● It is not possible to make everyone happy and fit all research questions with one tool
  7. Data re-use: could you do it?
     CEDAR is all open on GitHub: data, queries and scripts.
     ● Usage example:
       – Download dumps
       – Install triple store
       – Load data & wait
       – Recursively query for provenance
  8. Data is the important thing
     http://redmonk.com/jgovernor/2007/04/05/why-applications-are-like-fish-and-data-is-like-wine/
     [Diagram labels: Data; Tool]
  9. Where we're going we don't need “tools”
  10. So what needs to be done?
      Do not bake the data into the tool. Instead, build the tool on top of the data, and ensure others can do the same.
      [Diagram labels: Data collection; Data cleaning and integration; Data processing; Data exposition; Tool 1; Tool 2; ...]
  11. In fact, do not write any tool
      ● Focus on exposing the data
        – Less time spent coding and less code
        – Easier and cheaper to maintain
      ● To increase availability, expose your data on the Web
      ● Exposing != making a package and putting it somewhere
  12. The magic keyword 1: “API”
      ● “In computer programming, an application programming interface (API) is a set of routines, protocols, and tools for building software applications” - Wikipedia
      ● Regardless of data, all the software you use is a layered cake bound by software APIs
        – Presentation software > GUI toolkit > Rendering System > Operating System > Hardware
  13. Example (courtesy of Wikipedia)
      ● In this code “nextLine” and “close” are part of the API of “Scanner”
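The code shown on this slide did not survive in the transcript; per the slide text it used a “Scanner” with “nextLine” and “close”. As a rough analogue only (an assumption, not the slide's original code), the same idea can be sketched in Python, where `readline` and `close` are part of the API of file-like objects. The sample text is made up for illustration.

```python
import io

# A file-like object built from a made-up, in-memory string.
reader = io.StringIO("first line\nsecond line\n")

# readline() and close() belong to the file-object API: callers rely on
# these names and their behaviour, not on how StringIO is implemented.
first = reader.readline()
reader.close()
closed = reader.closed
```

The caller only needs to know the method names and what they promise, which is the point the slide makes about “Scanner”.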
  14. APIs can be on the Web too
      ● HTTP can be used as an API too.
      ● Get a specific record from a database
        – http://example.com/api?action=show&id=500
      ● Delete a record in a database
        – http://example.com/api?action=delete&id=500
      ● But don't do it that way! This is abusing the role of the “GET” method from HTTP
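To see why the slide calls this an abuse of GET, it may help to look at what the two URLs actually carry. The sketch below only parses the slide's two example URLs with Python's standard library; the helper name `describe` is hypothetical. Both requests would be plain GETs, and the only thing that separates a harmless read from a destructive delete is the value of a query parameter.

```python
from urllib.parse import parse_qs, urlsplit


def describe(url):
    """Return the (action, id) pair encoded in the query string of `url`."""
    query = parse_qs(urlsplit(url).query)
    return query["action"][0], query["id"][0]


# The two example URLs from the slide.
show = describe("http://example.com/api?action=show&id=500")
delete = describe("http://example.com/api?action=delete&id=500")
```

Anything that assumes GET is safe to repeat, such as a crawler or a link prefetcher, could trigger the delete just by following the second URL.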
  15. Generic design for tool + API
      ● Tools consume the data provided by a set of APIs over the Web
      ● If you are coding tools
        – Forget about server-side page rendering
        – Learn JavaScript
      [Diagram labels: Data (MySQL, R, ...); API (HTTP, JSON, ...); Tool]
  16. The magic keyword 2: “REST”
      ● “Representational State Transfer (REST) is a software architecture style consisting of guidelines and best practices for creating scalable web services” - Wikipedia
      ● For example: instead of using GET to do a delete, just use the DELETE method from HTTP on the target resource
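A minimal sketch of that advice, using only Python's standard library. Everything specific here is an assumption made for illustration: the `/records/<id>` path, the `RecordHandler` class, the in-memory record store and the `demo` helper are hypothetical, and a throwaway server on the loopback interface stands in for a real service. The resource is named by the URL and the action by the HTTP method.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class RecordHandler(BaseHTTPRequestHandler):
    """Serves /records/<id>; the HTTP method, not a query parameter, names the action."""

    records = {}  # hypothetical in-memory store, filled in by demo()

    def _record_id(self):
        parts = self.path.strip("/").split("/")
        return parts[1] if len(parts) == 2 and parts[0] == "records" else None

    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Reading a record never changes the store.
        record = self.records.get(self._record_id())
        if record is None:
            self._send_json(404, {"error": "not found"})
        else:
            self._send_json(200, record)

    def do_DELETE(self):
        # Deleting is only reachable through the DELETE method.
        if self.records.pop(self._record_id(), None) is None:
            self._send_json(404, {"error": "not found"})
        else:
            self.send_response(204)
            self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def call(method, port, path):
    """Send one request to the local demo server and return its status code."""
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, path)
        response = conn.getresponse()
        response.read()
        return response.status
    finally:
        conn.close()


def demo():
    """GET, DELETE, then GET again on the same resource; returns the three statuses."""
    RecordHandler.records = {"500": {"id": "500", "title": "Example record"}}
    server = HTTPServer(("127.0.0.1", 0), RecordHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        return [call(m, port, "/records/500") for m in ("GET", "DELETE", "GET")]
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` reads the record, deletes it, and then fails to read it again. A client that only ever issues GET requests cannot remove anything.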
  17. The magic keyword 3: “JSON”
      ● “JSON (/ˈdʒeɪsən/ JAY-sən), or JavaScript Object Notation, is an open standard format that uses human-readable text to transmit data objects consisting of attribute–value pairs” - Wikipedia
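As a small illustration of “attribute–value pairs” sent as human-readable text, the sketch below serialises a record to JSON and reads it back with Python's standard `json` module. The record's field names and values are made up for the example.

```python
import json

# A made-up record: attribute names mapped to values.
record = {"id": 500, "title": "Example record", "tags": ["census", "open data"]}

# Serialise to human-readable text, as it would travel over HTTP...
text = json.dumps(record, sort_keys=True)

# ...and parse it back into native data structures on the receiving side.
restored = json.loads(text)
```

Any language with a JSON parser can do the second step, which is what makes the format convenient for tools built on top of someone else's API.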
  18. A step further with JSON-LD
      ● JSON-LD is Linked Data expressed in JSON. It lets users follow links across datasets
      ● Example of JSON data that is not JSON-LD: part of the result from http://api.openonderwijsdata.nl/api/v1/get_document/duo/po_school/2013-20YF
      ● OK, but what is the API call to get more information about the board?
      ● Need to figure it out in some way
      ● With LD you would get a link
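The difference can be sketched with two made-up documents. The field names, identifiers and `example.org` URLs below are hypothetical and are not taken from the API named on the slide. In the plain JSON version the board is a bare identifier, so the client has to know how to turn it into an API call. In the JSON-LD-style version the board is referenced by an `"@id"` URL that the client can simply follow. A real JSON-LD document would normally also carry an `"@context"`, omitted here for brevity.

```python
def collect_links(node):
    """Recursively gather every "@id" value found in a JSON-LD-style structure."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "@id" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_links(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_links(item))
    return found


# Plain JSON: the board is just a number; which URL describes it is unknown.
plain = {"name": "Example school", "board_id": 12345}

# JSON-LD style: resources are identified by URLs a client can dereference.
linked = {
    "@id": "http://example.org/schools/1",
    "name": "Example school",
    "board": {"@id": "http://example.org/boards/12345"},
}

plain_links = collect_links(plain)
linked_links = collect_links(linked)
```

The plain document yields no links at all, while the linked one tells the client exactly where to go next.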
  19. Web APIs
      ● There are a lot of them (> 12k) and their number is increasing rapidly. See: http://www.programmableweb.com/
      ● Some examples:
        – https://dev.twitter.com/rest/public
        – http://www.slideshare.net/developers/documentation
        – http://developer.rottentomatoes.com/docs
        – https://www.flickr.com/services/api/
  20. Bonuses
      [Image © All Seeing, Flickr]
  21. Give less to share more
      ● Did you notice something about the examples given on the previous slide on Web APIs?
      ● None of them would give you a copy of their dataset, yet they have an API to let you access the data!
      ● => APIs enable fine-grained access to data
  22. Monetize a service, not a dataset
      ● APIs open up the opportunity for monetizing the usage of the data instead of the data itself
      ● Users can be charged per API call
      ● Similar “download vs. API” approaches
        – Paid game vs. free to play
        – Music download vs. streaming music
  23. Extra technical bonuses
      ● Most of the processing happens on the client side, so fewer resources are needed to serve the data
      ● Finer tracking of data usage
      ● Extra possibilities for caching, round-robin, CDNs etc. => easier to scale
  24. Ending on some more examples...
  25. Facilityregistry.org
      The website is the API. No interface of any kind
  26. Nlgis.nl
      API and a simple data visualisation tool using it
  27. Lod.cedar-project.nl/cedar
      Generic query interface + extra API
  28. To summarise
      ● When your data is ready to be shared, first make an API for it. This will minimise friction in re-use.
      ● If you want or need to write an end-user tool, make it use your own API (and others!)
      ● Plan maintenance for the API to keep it running.