Notes for talk on 12th June 2013 to Open Innovation meeting, Glasgow



Published in: Business, Technology

Notes for Talk on 12th June to Open Innovation meeting: Peter Winstanley
06 June 2013 09:17

Introduction

1/ In any area of endeavour - business, administration, entertainment - the availability of resources for innovation can be the difference between business as usual with standard margins and returns, and business as exceptional in which the greater efficiency and effectiveness from innovative products and processes gives us more for less.

2/ There are standard methods applicable to project management (e.g. PRINCE, PRiSM etc), or quality management (e.g. TQM, Six Sigma) but many people approach the creativity and ideation process with a clear canvas frame of mind. There is a belief that creativity and invention are an open, free-form activity with brainstorming / 6 hats and simple cause-effect and fishbone diagrams as the tool sets. I want to mention TRIZ, as a systematised methodology for inventive problem solving, and highlight a few aspects which will act as a backdrop to this presentation. I think that this can and should be incorporated into the development of programmes of work to develop smart cities, just in the same way that quality improvement methods are.

2a/ There is often a mindset/problem of rampant incrementalism that starves radical innovation.

3/ TRIZ methodology asks designers to define what the ideal solution would look like - all the benefits of an efficient system, but without the negative or wasteful aspects. TRIZ also recognises that in many cases a problem in one domain has actually been solved and is well understood in another domain - so if we can remove the domain-specific language we can look for inventive solutions by analogy. If we have to guide the mouse to take a route to the cheese, isn't it easier to go from known solutions than to be faced with an infinite set of options?
TRIZ methodology prompts us to look at systems from the super-system level, from the bird's-eye view, and to look for contradictions that constrain solutions to curves of compromise, where an improvement in one dimension is associated with degradation in another dimension.

4/ With these elements of the TRIZ methodology in mind I'll take a tour of the data interoperability landscape, picking out some of the publicly available datasets and how they might fit into the context of a Smart City.

Bird's Eye View

Data is explicitly included in the notion of a smart city. It's assumed that those in a smart city will be using data and generating data. "Big data" is a relative and perhaps meaningless term. However, in order for the data to be shared and re-used by the various constituent parts of the smart city and its neighbours, the data needs to be interoperable. This is recognised, especially in EU Directives and Frameworks. The European Interoperability Framework describes interoperability at 4 levels:

1. Legal - align legislation so that exchanged data is accorded proper legal weight
2. Organisational - Co-ordinate processes across organisations
3. Semantic - Precise meaning of exchanged information is preserved and understood by all parties
4. Technical - Planning of technical issues involved in linking computer systems and services

Focussing on Semantic Interoperability, an ideal solution might be that we all use a shared common data model that allows data to be merged between parties without any "Extraction, Transformation and Loading" overhead. Semantic Web approaches that use the Resource Description Framework (RDF) model of data are at the heart of the strategic path of the EU and other organisations to achieve semantic interoperability. Note that RDF is a model, not a syntax.

This approach (described by Berners-Lee in a "1 star" to "5 star" classification scheme) is achieved by publishing data on the web, making it accessible by HTTP URLs using the RDF model, and then using that RDF formula of triples to describe our data and to link it out to other datasets so as to develop a network of Linked Data that can be queried by computers in a federated way using the SPARQL language.

In addition to being technology-neutral and non-proprietary, other key benefits of using RDF for a significant part of the data exchanges within a smart city are that RDF is capable of being self-describing and that it naturally de-duplicates. By making data self-describing and automatically de-duplicating, the whole process of Extraction, Transformation and Loading (ETL), which is generally the process with the biggest overhead in data exchanges and one that impedes the scalable interoperation of data, shrinks to negligible proportions.

Warning

Cities can easily lose leverage to private companies their citizens rely on, as the persistent battles of political leaders against telecom companies over price increases show.
And private-sector software can operate behind a veil: Townsend says that while cities have made lots of data freely available online, there’s less concern about opening up the proprietary tools used to analyze that data - software that might help a city official decide who is eligible for services, or which neighborhoods are crime hotspots. “It’s the algorithms in government that need to be brought out to the light of day, not the data,” he says. “What I worry about are the de facto laws that are being coded in software without public scrutiny.”

Pasted from <>

Identifiers of things

Any dataset describes some aspect of some thing. URLs are used in RDF to represent everything. URLs for information assets such as documents can resolve by returning these assets, but for abstract things such as concepts and for real world things such as people and places the URL can resolve to provide some information about the thing it represents.

So, how do I design these URLs, and how do I know if there are already URLs out there that I can re-use?

From other disciplines we know that the cookbook approach is a good one to use to help with the design. So, for URLs there is a cookbook
[] and there is already a service, that is harvesting URLs from RDF on the web and organising them into groups that refer to the same entity. The UK Government Linked Data Delivery Group is developing a registry for URLs [ ]

Clearly we don't want to have too many URLs that are identical in meaning, but it is happening at the moment because in many instances the organisations that have legal and moral authority to populate this space are not doing so, and others who need the identifiers for their datasets are just minting them. However, in time this redundancy will diminish and models will coalesce.

So what is out there in the wild, available for use? These are some sources.

• The Scottish Government has minted URLs for its buildings , e.g. as part of its RDF publication of half-hourly utilities consumption data
• Ordnance Survey have a whole swathe of URLs ( ) representing all the things in the OS dataset, the CodePoint postcode units, the features in the 50K maps, and the boundary lines for administrative boundaries etc.
• Companies House: URLs for every registered company. [ ]
• Charities Regulators: CHC and SC have a web page for each registered organisation that can act as an RDF URL, e.g.
• Legislation - laws and SIs in [ the RDF is available at]
• RCAHMS have a URL-based approach to identifying buildings and other monuments. e.g.
• The British Library through have URLs that identify authors and other people in their catalogues.
• Virtual International Authority File:,_Otto,_1891-1969
• Europeana: through its API, the Europeana project has URLs that identify things in the museums and galleries world.
• DBPedia: The RDF version of Wikipedia http://dbpedia.org

So there are identifiers for things at all scales, from the identification of countries or administrative units to an individual museum or gallery artefact.

Terminology

Once we have identifiers for things and concepts we can then describe and link them together with terminology.
This terminology can be similarly identified by URLs. There are many vocabularies in the wild available for re-use. Some are core, such as the vocabularies for RDF (and RDF Schema). Others are able to be very widely used, such as the vocabularies to describe people and their inter-relationships (FOAF).

There are vocabularies to organise the relationships between data elements. The RDF Cube vocabulary [] is a development from the SDMX XML vocabulary and it is used to describe cubes of data where datasets are arranged in slices that correspond to the different dimensions of the dataset.
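
To make the idea of describing identified things with vocabulary terms more concrete, here is a minimal sketch in Python. It is an illustration rather than a real RDF toolkit: statements are held as plain (subject, predicate, object) tuples in Python sets. The RDF and FOAF namespace URLs are the published ones; the example.org identifiers and the people described are hypothetical. It also shows why merging two RDF datasets de-duplicates naturally: an identical statement made in both sources collapses into one.

```python
# Illustrative sketch only: RDF statements modelled as plain Python tuples.
# A real system would use an RDF library and a triple store.

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical identifiers, invented for this example.
WILLIAM = "http://example.org/id/person/william"
MARY = "http://example.org/id/person/mary"

# Two datasets published independently. Each statement is a
# (subject, predicate, object) triple; names are simple string literals.
dataset_a = {
    (WILLIAM, RDF_TYPE, FOAF + "Person"),
    (WILLIAM, FOAF + "name", "William"),
    (WILLIAM, FOAF + "knows", MARY),
}
dataset_b = {
    (WILLIAM, RDF_TYPE, FOAF + "Person"),  # also stated in dataset_a
    (MARY, RDF_TYPE, FOAF + "Person"),
    (MARY, FOAF + "name", "Mary"),
}


def merge(*datasets):
    """Merge datasets by set union; identical triples collapse into one."""
    merged = set()
    for dataset in datasets:
        merged |= dataset
    return merged


def describe(triples, subject):
    """Return every (predicate, object) pair stated about one subject."""
    return sorted((p, o) for s, p, o in triples if s == subject)


merged = merge(dataset_a, dataset_b)
print(len(dataset_a) + len(dataset_b), "statements in,", len(merged), "after merging")
for predicate, obj in describe(merged, WILLIAM):
    print(predicate, "->", obj)
```

Because both datasets use the same URLs for the same things and the same terms, no mapping step is needed before the merge; that is the sense in which shared identifiers and vocabularies reduce the ETL overhead.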
There are vocabularies for classification. These are often expressed in terms of another RDF vocabulary, SKOS. I recently worked with the UN to develop the multilingual RDF representation of the COFOG [ ] and other related UN/OECD classification schemes.

Then there are vocabularies being developed as part of the EU ISA programme - "core vocabularies" that cover the minimum set of terms that are needed to describe a Person [] , a Business [] , a Location [ ] and a Public Service [ ].

There are many domain-specific vocabularies that are available as RDF and are very widely used: in the health area there is SNOMED-CT [] , in the cultural heritage area there are the RCAHMS, AAT, CLAROS, Europeana etc. Town Planning has the Towntology. Retail: GoodRelations [ ]

Ontologies, Schemas, etc

We all share models of the things around us in the world, and well designed things are obvious to operate - we don't need the manual most of the time.

Ontologies and other schemas are just like that - they formally encapsulate all the legitimate possibilities and constraints for a model.

By using a specific ontology we buy in to these options and constraints.

Many of the terminologies referred to earlier are also components of an ontology (others that are classification schemes, such as the UN COFOG classification, are concept schemes and don't have the tight constraints that are present in an ontology).

Ontologies allow inference and reasoning over the RDF. Inference allows the end user to create data relationships that were not explicitly stated. For instance, if in natural language we state that "William is married to Mary" then the understanding that "is married to" is a reciprocal relationship allows us to answer the question "Who is Mary married to?"
and to make the statement "Mary is married to William".

So, ontologies can help people to develop that feeling for what to expect with any thing: the possible relationships that there might be between this thing and the types of thing permitted within the scope of the ontology.

But, in contrast to the XML or the RDBMS world, which also use schemas as patterns and constraints, where what is not stated as true is false, the RDF world uses an Open World Assumption encapsulated in the phrase "Anyone can say Anything about Anything" and where what is not stated is unknown. The consequences of this are that RDF assertions might be made that are conflicting (the use of a schema doesn't by itself ensure the logical or the real truth), or a data element might be only partially described. However, the use of reasoners and other tools that query the data and test constraints would be able to help in this situation.
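
The "William is married to Mary" example can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how a production reasoner works: triples are plain tuples, all the example.org URLs are hypothetical, and the only rule applied is the reciprocal one (in OWL terms, a property declared as an owl:SymmetricProperty). The `ask` helper is also an invention for this sketch; it returns None rather than False for anything not stated, to mirror the Open World Assumption.

```python
# Illustrative sketch only: a single inference rule over tuple-based triples.
# All identifiers below are hypothetical.

MARRIED_TO = "http://example.org/def/marriedTo"
WILLIAM = "http://example.org/id/person/william"
MARY = "http://example.org/id/person/mary"
ANNE = "http://example.org/id/person/anne"

# The "ontology" here is just a set of properties declared to be reciprocal.
SYMMETRIC_PROPERTIES = {MARRIED_TO}


def infer_symmetric(triples, symmetric_properties):
    """Return the triples plus the reverse of every symmetric statement."""
    inferred = set(triples)
    for s, p, o in triples:
        if p in symmetric_properties:
            inferred.add((o, p, s))
    return inferred


def ask(triples, s, p, o):
    """Open-world check: True if stated or inferred, otherwise None (unknown)."""
    return True if (s, p, o) in triples else None


# Only one direction is stated explicitly.
stated = {(WILLIAM, MARRIED_TO, MARY)}
closed = infer_symmetric(stated, SYMMETRIC_PROPERTIES)

# "Who is Mary married to?" can now be answered from the inferred triple.
spouses_of_mary = [o for s, p, o in closed if s == MARY and p == MARRIED_TO]
print(spouses_of_mary)

# Nothing is stated about Anne, so the answer is unknown rather than false.
print(ask(closed, MARY, MARRIED_TO, WILLIAM))
print(ask(closed, ANNE, MARRIED_TO, WILLIAM))
```

A closed-world system such as an RDBMS would answer the last question with "false"; here the absence of a statement only means that nobody has said it yet.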
SPARQL

RDF has its own query language, SPARQL, and this can be delivered to endpoints over HTTP. Results are returned generally as XML or JSON but transformation to other formats is trivial.

Scottish Government GOLSPIE Utilities consumption data:

of RDF in real world systems
DBPedia
Ordnance Survey Linked Data
and Government GOLSPIE Bathing waters pilot
TellmeScotland pilot
World Cup and Olympic Games websites: Read about the dynamic publishing system at
CIPFA lists of Revenue Account headings: and

Cookbooks
URI design:
Linked Data:
Start for Decision Makers:

data that could be very easily used in this way
Scottish Neighbourhood Statistics: could have an API such as
Scotland: [raw timetabling data available in XML from ]
SQA Subjects data:
Government Statistics:
Flood warnings:
many others:

and learning from others and to keep an eye on