Ontodia Overview - Semantics and Wikis panel - SemTech West 2012


Presentation given at SemTech West 2012 during the "Semantics and Wikis" panel.

Published in: Technology

  1. Wikis and Semantics in NYC and Open Government. SemTech West, June 2012, San Francisco, CA. Joel Natividad, co-founder, @jqnatividad
  2. CROWDKNOWING: Human-powered, Machine-accelerated, Collective Knowledge Systems
  3. “Smart” Cities brand of Ontodia
  4. "Over the next decade, cities will continue to grow larger at a rapid pace. At the same time, new technologies will unlock massive streams of data about cities and their residents. As these forces collide, they will turn every city into a unique civic laboratory, a place where technology is adapted in novel ways to meet local needs."
     December 2010. Institute for the Future, 2020 Forecast: The Future of Cities, Information and Inclusion.
  5. Cities: Worldwide, city leaders and managers need cost-effective & smart solutions
     Population Growth:
     - 221 cities globally with more than 1 million citizens
     - China will move 300 million people to cities by 2020
     - 90% of these cities are in emerging markets
     - In 2008, more people lived in cities (3.3 billion); by 2030, 5 billion
     - Cities are more efficient and have less environmental impact
     Cost of City Services: Aging infrastructure, resource constraints & waste
     - Washington DC’s water system has elements that date to the Civil War
     - Inefficiency, leaks and waste rival maintenance and expansion costs
     - Legacy infrastructure in megacities like NYC that is too cost-prohibitive to replace
     Source: Gartner, "Is Smart Cities the Next Big Market?", March 2011
  6. Open Data in NYC: Council Member Gale Brewer
  7. Int. No. 29: Accessibility to Public Data Sets
     “...requires that all public data sets maintained by City agencies shall be made available on the Internet through a single web portal, formatted to enable viewing by web browsers and mobile devices and also in their raw or unprocessed form.”
  8. Why NYC?
     • City Population: 8.4 M (NYC estimate)
     • Metro NYC Population: 18.9 million (2010 Census)
     • City Density: 10,630/km² (2010 Census)
     • Metro NYC Density: 1,085.7/km² (2010 Census)
     • 50 million visitors a year
     • DoITT Annual Budget ~$325M
     • Gross Metropolitan Product: USD 1.28 trillion (Greyhill Advisors)
     • De facto Capital of the World
     • Fastest growing Tech Industry - “New Tech City” (Center for an Urban Future)
     • Second only to Silicon Valley for most startups
     • Emphasis on public-private partnerships
  9. Large Organization Award
  10. Grand Prize
  11. Crowdknowing
      0. Huge Open Data
      1. Extract Metadata
      2. Derive ExtraMetadata (Semantics + Statistics + Algorithm + Crowd)
      3. Do Federated Queries on both the Metadata AND the Data
  12. Crowdknowing: Human-powered, Machine-accelerated, Collective Knowledge Systems
      • Human-powered: Curation, Comments, Microcontributions, Feedback, Bug Reports, Likes, Shares, Profile, Votes, Subscribes, Tagging, etc.
      • Machine-accelerated: Ontology, Inferencing, Semantic Mapping, Query Federation, Statistics, Pattern Recognition, Multivariate Analysis & Forecasting, Automated linking, Feeds, Notifications, etc.
  13. a Semantic Data Dictionary
  14. Semantic Steroids (~3.5M facts, ~950 datasets/views)
      • Searchable
      • Faceted Search
      • Drilldown
      • Interlinked
      • Semantic Browsing
      • Queryable
      • Query Results Formats
  15. ExtraMetadata?
      • Derived using “Semantics, Statistics, Algorithm & the Crowd”
      • “Supercharacterize” each dataset by sampling not just the schema, but the underlying data as well
      • Score each dataset - Pediacities Rank
      • Virtuous Feedback Loop around the Data: micro-conversations/contributions
  16. ExtraMetadata
      Top Level ExtraMetadata:
      • Number of Rows
      • Pediacities Rank
      • Freshness Score
      • Sparseness Score
      • Social Score
      • Views Score
      • Download Score
      • Rating Score
      Detail ExtraMetadata:
      • Top Values
      • Descriptive statistics
      • Nulls/Non-nulls
      • Smallest Value
      • Largest Value
      • “Uniqueness”
      • Simple Visualization
  17. “Crowd” Microconversations/contributions
      • Overall Rating
      • Comments (comment rating)
      • Bug Reports (data quality)
      • Likes/Shares
      • Downloads
  18. NYC Open Data PoliHack Unconference - May 19, 2012
  19. nyc.gov/datastandards
  20. in time for 4.0
      • More Datasources!
      • Not just Metadata! Data too!
      • Federated Queries!
      • SPARQL support
      • Collaborative Ontology Modeling
      • Feeds / Subscriptions / Notifications
      • Microcontributions
      • Gamification
      • combine NYCDataWeb and NYCFacets
      • Support both Web 2.0 & Web 3.0 APIs
  21. [Figure: Linked Open Data cloud diagram, as of September 2011. Legend categories: Media, Geographic, Publications, User-generated content, Government, Cross-domain, Life sciences.]
  22. .NYC - the First Linked Open Data City
      [Figure: excerpt of the Linked Open Data cloud diagram with a ".NYC" node added.]
  23. We need your help & feedback
      A Smart Data Exchange for All Data NYC
      Find out more at http://nyc.pediacities.com/facets
      @jqnatividad @samimirzabaig @pediacities @ontodia
  24. CREDITS
      • Flickr User Weston Price, Paleo-Caveman-Omnivore-LowCarb-Meat-Diet-Info (http://www.flickr.com/photos/paleo-atkins-meat-diet-info/with/6718805047/)
      • Flickr User Gao Yi (http://www.flickr.com/photos/gaoyi/178514677/)
      • Senator Arlen Specter being confronted at a Town Hall meeting after passage of Healthcare Reform Act (Bradley C Bower-AP)
      • Several pictures taken from NYC.gov/NYCEDC properties, Tumblr and Flickr accounts
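
Slide 16 names the detail ExtraMetadata derived for each dataset (top values, nulls/non-nulls, smallest and largest value, "uniqueness") but the deck does not show how they are computed. The following is a minimal sketch in Python, assuming the data has already been sampled into a list of dicts. The names `profile_column` and `profile_dataset`, the sample rows, and the sparseness and uniqueness formulas are illustrative assumptions, not the actual Pediacities implementation.

```python
from collections import Counter
from statistics import mean


def profile_column(values, top_n=3):
    """Summarize one column of sampled values; None marks a missing cell.

    Assumes the non-null values in a column are mutually comparable.
    """
    non_null = [v for v in values if v is not None]
    nulls = len(values) - len(non_null)
    counts = Counter(non_null)
    profile = {
        "rows": len(values),
        "nulls": nulls,
        "non_nulls": len(non_null),
        # Assumed definition: share of cells that are empty.
        "sparseness": nulls / len(values) if values else 0.0,
        # Assumed definition: distinct values divided by non-null values.
        "uniqueness": len(counts) / len(non_null) if non_null else 0.0,
        "top_values": counts.most_common(top_n),
        "smallest": min(non_null) if non_null else None,
        "largest": max(non_null) if non_null else None,
    }
    # Add a descriptive statistic only when every value is numeric.
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null
    )
    if non_null and numeric:
        profile["mean"] = mean(non_null)
    return profile


def profile_dataset(rows):
    """Profile every column that appears in a list of row dicts."""
    columns = sorted({key for row in rows for key in row})
    return {col: profile_column([row.get(col) for row in rows]) for col in columns}


# Hypothetical sample rows, invented for illustration only.
sample = [
    {"borough": "Brooklyn", "complaints": 12},
    {"borough": "Queens", "complaints": 7},
    {"borough": "Brooklyn", "complaints": None},
    {"borough": None, "complaints": 3},
]
report = profile_dataset(sample)
```

Here `report["borough"]` and `report["complaints"]` each hold one column profile. A per-dataset score such as the Sparseness Score on slide 16 could be aggregated from these column profiles, though the deck does not say how the real scores are weighted.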