Data Science London - Meetup, 28/05/15

Slides from my @ds_ldn talk about Ontologies in the Internet of Things. Note that this is a short version of a talk that I presented earlier this month on O'Reilly Webcasts, still viewable for a while at: http://www.oreilly.com/pub/e/3365

Published in: Data & Analytics

  1. 1. Semantic web warmed up: Ontologies for the IoT Dr. Boris Adryan @BorisAdryan @thingslearn Currently getting divorced from logic.sysbiol.cam.ac.uk
  2. 2. ‣Everything is connected ‣ Big, noisy, often unstructured data ‣ We are learning how biological entities depend on each other DNA > RNA > proteins have been
  3. 3. ‣ Everything is connected ‣ Big, noisy, often unstructured data www.thingslearn.com Analytics, context integration, machine learning and predictive modelling for the IoT.
  4. 4. 0 clean shirts left + washing machine estimates 97% of your last pack of powder used + it’s Wednesday, 23:55 + the last four Thursdays had a morning business meeting + the car is parked 20 m from a shop + last retail activity: 8 sec ago Send immediate text reminder to pick up washing powder + send tweet from @BorisHouse “need identified” AND “notification appropriate” Actionable insight. From everything.
  5. 5. NO ANALYTICAL FLEXIBILITY IN M2M/IOT Matt Hatton, Machina Research The BLN IoT ‘14 Internet replaces wire It’s all about the context M2M consumer IoT defined I-P-O like it’s 1975 context context context context context context context Is it hot?
  6. 6. LIFE SCIENCE STRATEGIES DON’T WORK IN THE IOT - There is no commonly accepted - ‘catalogue’ of things, - ‘ontology’ of things, - ‘data format’ of things, - ‘meta data’ for things. - Most businesses are driven by revenue, not long-term strategic vision - Service providers have no need to publish - Data can be highly personal (cheap excuse) unless they’re
  7. 7. META DATA, SHARING AND DATA REPOS founded in Nov. 1999 “But this is a complex and ambitious project, and is one of the biggest challenges that bioinformatics has yet faced. Major difficulties stem from the detail required to describe the conditions of an experiment, and the relative and imprecise nature of measurements of expression levels. The potentially huge volume of data only adds to these difficulties.” Nature, Feb. 2000 Nov. 2000 Oct. 2002 Wide adoption: as requirement for publication in scientific journals
  8. 8. THE LIFE SCIENCES FIXED THEIR KNOWLEDGE REPRESENTATION PROBLEM
  9. 9. FORMALISING KNOWLEDGE
  10. 10. FORMALISING KNOWLEDGE WITH GENE ONTOLOGY
  11. 11. CURRENT GOVERNMENT INVESTMENTS IN GENE ONTOLOGY NIH alone spent $44,616,906 on the ontology structure since 2001 (I don’t have data for UK/EU spending) ~100 full-time salaries for experts with domain-specific knowledge ~40,000 terms
  12. 12. story measurements + meta data open, public repositories human curators ontology terms community PUBLISH OR PERISH ok? journal informal exchange - no credit! funders assessment The majority of this infrastructure is paid for by governments and charities industry!
  13. 13. measurements + meta data storage & provenance human curators ontology terms user PUBLISH OR YOU’RE NOT DOING IOT ok? Maybe the majority of this infrastructure should be paid for by governments? company cloud device registration “ “ privileges data added value
  14. 14. WHAT IS AN ONTOLOGY?
  15. 15. ARE PEOPLE NOT ALREADY USING ONTOLOGIES IN THE IOT?
  16. 16. ONTOLOGIES HAVE TO BE PRAGMATIC COMPROMISES Gene Ontology annotation 15 years of research 47 publications 100+ authors 50+ PhDs 15 direct annotations ~150 inferred annotations
  17. 17. THE THREE BRANCHES OF Adapted from Anurag et al., Mol. BioSyst., 2012, 8, 346-352 Localization: Where is an entity acting? Function: What does the entity do? Process: When is the entity needed?
  18. 18. inferences on “is a” “part of” “regulates” “has part” from geneontology.org from Ashburner et al., Nat Genet. 2000, 25(1):25-9. GO AND CONTEXT
  19. 19. THE BRANCHES OF GO AND THE IOT Localization: inside, (my?) home, living room Function: measures temperature regulates temperature interacts with user directly interacts with user via app Process: regulation of temperature measurement of ambient temperature ‘is proxy / is avatar’ for presence? fire? ice age? winter?
  20. 20. A LAST WORD ON PRAGMATISM “perfect” ontology The SSN Ontology allows for inference entirely on the basis of its structure and annotation. In reality, many parameters are difficult to establish and the effort to annotate things outweighs the utility. “crude” ontology A simplified structure allows for quick annotation even by non-specialists. The lack of detail can lead to clashes in the ontology => more smartness has to go into software; more coding effort. 1 billion different things 1 million use cases
  21. 21. 0 clean shirts left + washing machine estimates 97% of your last pack of powder used + it’s Wednesday, 23:55 + the last four Thursdays had a morning business meeting + the car is parked 20 m from a shop + last retail activity: 8 sec ago Send immediate text reminder to pick up washing powder + send tweet from @BorisHouse “need identified” AND “notification appropriate” Actionable insight. From everything. “indicator of esteem” 3% left and not pressed “not home” “buying” credit card: “highly personal device” ~ alive and awake
  22. 22. Dr. Boris Adryan @BorisAdryan @thingslearn @SoftwareSaved Open software Open source Open data Fellow of the
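
Slides 18 and 19 suggest annotating a thing along three GO-style branches (localization, function, process) and inferring further annotations from "is a" / "part of" links. What follows is a minimal sketch of that idea, not code from the talk: the term names are borrowed from slide 19, while the `RELATIONS` table, the `Thing` class, the `ancestors` helper and the specific links between terms are all hypothetical. For simplicity, both relation types are treated as upward links along which annotations propagate.

```python
from dataclasses import dataclass, field

# Hypothetical mini-ontology: term -> list of (relation, parent term).
# Term names come from slide 19; the links between them are assumptions.
RELATIONS = {
    "living room": [("part_of", "home")],
    "home": [("is_a", "inside")],
    "measurement of ambient temperature": [
        ("part_of", "regulation of temperature"),
    ],
}


def ancestors(term, relations=RELATIONS):
    """Return every term reachable from `term` via is_a / part_of links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relation, parent in relations.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


@dataclass
class Thing:
    """A device with direct annotations in each of the three branches."""
    name: str
    localization: set = field(default_factory=set)
    function: set = field(default_factory=set)
    process: set = field(default_factory=set)

    def inferred(self, branch):
        """Direct annotations of one branch plus all inherited ancestor terms."""
        direct = getattr(self, branch)
        result = set(direct)
        for term in direct:
            result |= ancestors(term)
        return result


# A thermostat annotated by hand with a few direct terms.
thermostat = Thing(
    name="thermostat",
    localization={"living room"},
    function={"measures temperature", "regulates temperature"},
    process={"measurement of ambient temperature"},
)

print(sorted(thermostat.inferred("localization")))
print(sorted(thermostat.inferred("process")))
```

With only one direct localization term ("living room"), the thermostat also picks up "home" and "inside" by inference. This mirrors, on a very small scale, the ratio on slide 16 of a handful of direct annotations to many inferred ones.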