Future of controlled vocabularies: better content, new career opportunities

A presentation to the San Andreas chapter of the Special Libraries Association in November 2009. Covers the spectrum of controlled vocabularies, what ontologies and other semantic technologies have to offer, and how it dovetails with the skills of information professionals.

Published in: Technology, Travel, Spiritual
  • We have schemas into which we plug the terms from our various controlled vocabularies.
  • We have accession numbers, shelf numbers, international standard numbers and still...
  • … we’re limited in what we can find, and how we find it, whether in print or in online finding aids.
  • This is the card catalog room at the Sterling Memorial Library, Yale.
    Metadata goes back quite far, actually. In the British Museum are girginakku, Mesopotamian library boxes that have clay tablet labels on them - metadata. Go see David’s picture at http://www.flickr.com/photos/70494923@N00/2650269503/in/photostream/

    So what are taxonomies, ontologies etc.? Let’s talk about it.
  • Folksonomies provide personalized labels - they have meaning for the person that creates them, that is if they are memorable…
    Lists provide ambiguity control.
    Synonym rings provide equivalency control.
    Taxonomies provide ambiguity control, synonym control and hierarchical relationships (BT, NT).
    Thesauri provide ambiguity control, synonym control (USE/UF), hierarchical relationships (BT, NT), associative relationships (RT, See Also) and Scope Notes.
    Ontologies do all of the above but allow for custom relationship types (properties/predicates), inferencing, reasoning and “NOT.”
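To make the lower end of this continuum concrete, here is a minimal Python sketch of a synonym ring and a thesaurus entry, using the United Kingdom example from the slides. The names `UK_RING`, `expand_query` and `ThesaurusTerm` are illustrative assumptions, not part of any vocabulary tool mentioned in the talk.

```python
from dataclasses import dataclass, field

# A synonym ring: every member is an equivalent way of naming one concept.
UK_RING = frozenset({
    "United Kingdom",
    "UK",
    "United Kingdom of Great Britain and Northern Ireland",
})


def expand_query(term, rings):
    """Return all equivalent terms for `term`, or just the term if no ring holds it."""
    for ring in rings:
        if term in ring:
            return set(ring)
    return {term}


@dataclass
class ThesaurusTerm:
    """One preferred term with the relationship types a thesaurus supports."""
    label: str
    use_for: set = field(default_factory=set)   # UF: non-preferred synonyms
    broader: set = field(default_factory=set)   # BT
    narrower: set = field(default_factory=set)  # NT
    related: set = field(default_factory=set)   # RT / See Also
    scope_note: str = ""


uk = ThesaurusTerm(
    label="United Kingdom",
    use_for=set(UK_RING) - {"United Kingdom"},
    broader={"Europe"},
    narrower={"England", "Scotland", "Wales", "Northern Ireland"},
    related={"Great Britain", "Britain", "British Isles"},
    scope_note="Situated in north-west Europe.",
)
```

The ring only says "these strings mean the same thing", while the thesaurus entry adds a preferred label plus a fixed set of relationship types. An ontology would go further and let you name your own relationship types.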
  • Here’s an example of folksonomy. How many of you make a list when you go grocery shopping? This is like going to the store with no list. There are some staples that everyone needs, but everything is kind of random.
  • This is a list, typically used in forms. It reduces typos and the use of alternate terms, and once an appropriately designed list is learned it improves efficiency.
  • Here we have a taxonomy - hierarchical relationships. We see these used in file systems and in search tools, where if subsumption is turned on we get higher recall at the higher classes and better precision in the lower classes.
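A small Python sketch of what "subsumption turned on" might look like in a search tool. The NT links come from the Europe / United Kingdom slides (the hierarchy is partial). The document index `TAGGED` and the function names are hypothetical, invented only for this illustration.

```python
# NT (narrower term) links, taken from the Europe / United Kingdom slides.
NARROWER = {
    "Europe": ["United Kingdom"],
    "United Kingdom": ["England", "Scotland", "Wales", "Northern Ireland"],
}

# Hypothetical index: document id -> the single term it was tagged with.
TAGGED = {
    "doc-1": "Europe",
    "doc-2": "United Kingdom",
    "doc-3": "England",
    "doc-4": "Scotland",
}


def subsumed_terms(term, narrower):
    """Return `term` plus every term reachable through NT links."""
    found, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(narrower.get(current, []))
    return found


def search(term, subsumption=True):
    """Return ids of documents tagged with `term` (or any narrower term)."""
    wanted = subsumed_terms(term, NARROWER) if subsumption else {term}
    return sorted(doc for doc, tag in TAGGED.items() if tag in wanted)
```

With subsumption on, `search("Europe")` picks up everything tagged anywhere below Europe (higher recall), while `search("England")` returns only the documents tagged at that narrow level (better precision).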
  • In a polyhierarchical system we can find a term from many different starting points. Faceted search systems frequently make use of the rules of thumb regarding using isA, kindOf, and partOf relationships. But notice that we’re still not allowed to encode the relationships directly as “isA”, “kindOf” or “partOf” and instead still have to use NT/BT.
  • Our thesaurus. It fills requirements for many systems (see the list at right): enterprise search, content portals, business analytics packages.
  • I can do my BT/NT stuff, but I can also create the classes and properties that I need for my own application.

    I can do lots more in an ontology: use transitive, symmetric and functional properties, declare data types and cardinality, and I can also say NO, meaning that this object is NOT a member of a certain class or does NOT have a certain property. In this diagram, I could easily add a branch for the British Isles, define it as having England, Wales, Scotland, Northern Ireland, Ireland, the Isle of Man, the Channel Islands etc., and still state specifically that Ireland is not part of the United Kingdom, so as not to create confusion for a machine processing the concept base. The power of no.

    I can also now take this graph (presuming I’d encoded it properly) and link it to other graphs that define the UK if I choose. I do this to take advantage of the work others have done: to share, discover and have a more complete data set which, if I’ve chosen carefully, should provide better data and analysis.
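The "power of no" can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how an OWL reasoner works internally. The property name `partOf`, the fact set and the function `holds` are all invented for illustration. The sketch treats `partOf` as transitive, stores one explicit negative statement, and answers "unknown" when nothing has been stated either way.

```python
# Positive statements as (subject, property, object) tuples (illustrative only).
FACTS = {
    ("England", "partOf", "Great Britain"),
    ("Scotland", "partOf", "Great Britain"),
    ("Wales", "partOf", "Great Britain"),
    ("Great Britain", "partOf", "United Kingdom"),
    ("Northern Ireland", "partOf", "United Kingdom"),
    ("Great Britain", "partOf", "British Isles"),
    ("Ireland", "partOf", "British Isles"),
    ("Isle of Man", "partOf", "British Isles"),
}

# Explicit negative statements: things we know are NOT the case.
NEGATED = {("Ireland", "partOf", "United Kingdom")}

# Properties along which statements may be chained.
TRANSITIVE = {"partOf"}


def holds(s, p, o):
    """Return True if stated or inferred, False if explicitly negated, else None."""
    if (s, p, o) in NEGATED:
        return False
    seen, frontier = set(), [s]
    while frontier:
        node = frontier.pop()
        for fs, fp, fo in FACTS:
            if fs == node and fp == p:
                if fo == o:
                    return True
                if p in TRANSITIVE and fo not in seen:
                    seen.add(fo)
                    frontier.append(fo)
    return None  # nothing stated either way
```

Here `holds("England", "partOf", "United Kingdom")` is inferred through Great Britain, `holds("Ireland", "partOf", "United Kingdom")` is a definite no, and `holds("Isle of Man", "partOf", "United Kingdom")` comes back as unknown because nothing has been stated either way.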
  • one URI can be embedded everywhere vs. a web page which is maintained by one creator (entity)
  • why do this
    findability
    reuse
    share
    but most importantly to NEXT (analyze)
  • ANALYZE
  • What do I want them to do:

    manage your own data
    branding and for data reuse

    We have to think more now about what we do, and how that impacts our digital environment. The ripples that our actions create. Make statements. Give them URIs. Revise those statements. Maintain history. Maintain your integrity.

    spammers and false data will out because other people can link to it
  • need to work on trust and security

    someone asked yesterday “How does it decide what is the most accurate info?”
    1st - “it” shouldn’t decide - you should
    2nd - trust layer needs work - ontologies, algorithms etc

    first steps - True Knowledge


    can be one way

    myths -

    security - open

    Key advances needed in the semantic web are user interface/interaction design patterns and security. The semweb community knows this! They need your help - send in your use cases!!!
  • So where does one get some of this data? There are many data sets available, and I’d like to show you the growth as visualized in these graphs from Richard Cyganiak and Chris Bizer.

    Not all of these are completely FREE. I promise you that IEEE and ACM are NOT giving away their research papers.

    So what are some examples of applications using this data?
  • Data from MusicBrainz and Wikipedia are combined - with a bit of editorial oversight - with playlists and story data from BBC properties
  • OK, but why do we do all of this?
  • explore, discover, magic, enjoy, learn
  • false starts, circular paths - much like enterprise data and paths through the web of unstructured data
  • What’s in these silos? How do we get in safely and get back out cleanly?

    Silos are ok - as long as they are clearly marked, and can be connected to the preceding and following steps in the workflow.

    Open world vs closed world
  • Organizations need to put their data IN the web instead of simply ON the web.
    Just as we use ISBNs, ISSNs, AACR2 and other standards, we need to embrace the methods being considered by diverse working groups to allow the data to be consistent. A common framework will allow us to use the data dynamically - from mashups to annotations. Consistent frameworks allow us to reduce costs in a few ways - shorter time period for learning new models, lower software costs for non-custom, COTS products. Our patrons win as well - they don’t have to learn new techniques for each data set. We gain shared meaning for concepts, reducing confusion.



    Embed ability to manipulate data rather than expend effort scraping it back out
    Re-purpose data rather than re-create it
    Improve product development with a global business vocabulary that feeds right into downstream applications such as portals, reporting programs and CRMs
    Increase online revenue and improve your customers’ online experience by cross-referencing industry classification codes and brand names
    Compliance
    Increase delivery channels for data and services
  • Communication, after all, is frequently a root cause of many good, and bad, events.

    “The Shannon–Weaver model of communication embodies the concepts of information source, message, transmitter, signal, channel, noise, receiver, information destination, probability of error, coding, decoding, information rate, channel capacity, etc.”

    On the web, separate protocols and languages handle similar concepts such as transport, encoding, noise reduction, feedback; all in the name of clarifying communication in a virtual space in a manner quite similar to the model defined here by a mathematician and information theorist.

    http://en.wikipedia.org/wiki/Shannon-Weaver_model
  • Exposure, recognition for work
    Identify works that may be targets or victims of theft or misappropriation of assets
    Sharing ~ embedding, commenting, tagging
    “Curate the content, not the container”
    Audience involvement. The stories, the facts, the beauty or repulsiveness of the artefact draw people in, and they are more likely to appreciate the efforts that went into the collection and display of them. Engaged patrons are more likely to become loyal patrons, and more likely to become financially supporting patrons.

    What are some of the roles needed in this space?
  • There is much of interest in the paper by Mr. Vickery at this URL. I once agreed with the statement “This is the work of scholars and encyclopaedists.” I do not any longer. It hampered my personal career aspirations. I do agree that our jobs are to make knowledge available using tools that make it easier for people to find and use what they seek. Sometimes it’s good to stay inside the box. Sometimes you need to go outside the box.
  • If you are embedded with a group, or wish to be, building a knowledge base that can be consumed by existing or new systems, added to by team members - or not! - and is as easy to query as a datastore will make you eminently more valuable.

    if you love your current job AACR2 is going away, RDA and more -- learn this
    RDFa
  • if you love your current job you can still make great use of these capabilities.
    AACR2 is going away, RDA and more are coming
    RDFa
    B.1 insert “ontologies” before the “etc” and er, add “CRUD”
    B.2 ontologies allow one to add data as it is found without breaking the existing model, connect to other dynamic datasets and give serendipity a chance
    C.3 ontologies are not prose - synthesizing the necessary concepts and relationships is as important in this realm as in an executive summary of research findings;
    inherent in ontological data models are the ability to inference and reason across the data -- to ask questions, to analyze and synthesize in ad hoc or scheduled transactions
    ontologies serialized in an open standard format also allow your users to integrate the data into their applications -- dashboards, RSS feeds, databases, CMS, DAM etc -- for real-time or near-real-time updates
  • D.1 How many of you have integrated taxonomies and thesauri into systems other than your OPAC? Are you using RSS? Digital Object Identifiers? Dublin Core? These technologies will become just as important.
    D.2 Everyone I’ve asked thinks that the librarian’s skill at developing knowledge organization systems is a critical part of this next evolution of the web.
    D.3 Privacy, security, provenance, authority and trust are all becoming critical aspects of the system. Work has been done, but much is still needed. We need to share what we’ve learned, both our skills and our experience with determining authoritative sources of information and protecting patron records and interactions. There is still time to get involved and have an impact.
    D.4 These are emerging technologies. They aren’t young by internet standards, but they are yet to reach their prime. Librarians have always been at the forefront of technologies that will improve their services -- this is an incredible leap forward. Have you ever been frustrated by an anthology of works? Be it poetry, stories or professional papers, you may have just what your patron needs, but not know it’s there unless you have had time to look at every book in your collection. And who has time to do that anymore? Catalog records do not contain all of those article titles, but an ontology can.

    Now we’re going to look at some of the roles I believe are relevant to the industry and the professionals gathered here tonight.
  • SLA.org, Competencies for Information Professionals

    Information professionals have expertise in total management of information
    resources, including identifying, selecting, evaluating, securing and providing
    access to pertinent information resources. These resources may be in any
    media or format. Information professionals recognize the importance of people
    as a key information resource.

    B.1 Manages the full life cycle of information from its creation or acquisition
    through its destruction. This includes organizing, categorizing, cataloguing,
    classifying, disseminating; creating and managing taxonomies, intranet and
    extranet content, thesauri etc.

    B.2 Builds a dynamic collection of information resources based on a deep
    understanding of clients’ information needs and their learning, work and/or
    business processes.

    B.3 Demonstrates expert knowledge of the content and format of information
    resources, including the ability to critically evaluate, select and filter them.

    B.4 Provides access to the best available externally published and internally
    created information resources and deploys content throughout the organization
    using a suite of information access tools.

    B.5 Negotiates the purchase and licensing of needed information products and
    services.

    B.6 Develops information policies for the organization regarding externally
    published and internally created information resources and advises on the
    implementation of these policies.
  • There is a computationally complex view of the web that involves Boolean logic, Bayesian algorithms, syntax, pattern recognition, neural networks and more. There is another view that is concerned about meaning, categorization, classification and relationships. This view tends to require more human power. Neither is particularly practical – one requires heavy-duty processing and lots of monitoring. The other requires a great deal of handcrafting and maintaining. Using the best of each world will get you further in the long run. There are brilliant minds working in the artificial intelligence space, and we make great use of those tools in our own processing platform, but that’s not what we’re going to focus on today.

    Today, we’ll be talking about a web of data – linked data; the vision promoted by the World Wide Web Consortium. The semantic web is NOT a new web; in fact, the specifications are on average a decade old. It is an open framework designed to allow data to be shared by as many people, organizations and applications as is desired.

    Right now the majority of the data on the web is locked up in applications and markup languages that jumble the format, the style, delivery mechanism and the content all together. The semantic web is a group of standards that provide the common format for describing data so that data from different sources can easily be combined and integrated rather than siloed.



    http://en.wikipedia.org/wiki/File:Artificial_neural_network.svg
    http://en.wikipedia.org/wiki/File:Xbarst1.jpg
    http://en.wikipedia.org/wiki/Naive_Bayes_classifier
  • An RDF Graph; a statement
  • Resource = thing
    Literal = string of characters (?lang, ?datatype)
    Statement = Triple = (s, p, o)
    Property = (…, p, …)
    Graph = a set of Statements
    RDF Description (of some thing) = a set of Statements (about that thing)
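These definitions map almost directly onto plain data structures. The sketch below is one way to model them in Python without any RDF library. The `ex:` identifiers are made-up placeholders rather than real URIs, and the property values echo the United Kingdom slide (capital, currency, anthem).

```python
from typing import NamedTuple, Optional, Union


class Literal(NamedTuple):
    """A string of characters with an optional language tag or datatype."""
    value: str
    lang: Optional[str] = None
    datatype: Optional[str] = None


class Statement(NamedTuple):
    """A triple (s, p, o): subject resource, property, object resource or literal."""
    s: str
    p: str
    o: Union[str, Literal]


# A graph is simply a set of statements.
GRAPH = {
    Statement("ex:UnitedKingdom", "ex:capital", "ex:London"),
    Statement("ex:UnitedKingdom", "ex:currency", Literal("pound sterling", lang="en")),
    Statement("ex:UnitedKingdom", "ex:anthem", Literal("God Save the Queen", lang="en")),
    Statement("ex:London", "ex:label", Literal("London", lang="en")),
}


def describe(graph, thing):
    """RDF description of a thing: the statements that have it as subject."""
    return {st for st in graph if st.s == thing}


def properties(graph):
    """Every property (the p position) used anywhere in the graph."""
    return {st.p for st in graph}
```

Because a graph is just a set, adding an annotation is a set insertion and merging two graphs is a set union. Neither requires changing a schema first.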
  • Simply keep adding annotations
  • Let’s say we have a community ontology – Friends of the Library, church group, professional group, what have you.

    The key concepts in the domain of knowledge that describes this particular community are people, places, events, and time.

    To carry the notion of a triple into a more familiar realm, you have an element-attribute-value set. “Events” is an element, but where do we get the values from? We get the values from the Events Taxonomy.
  • It’s enough to make you go crackers… cue Shirley Temple
  • Future of controlled vocabularies: better content, new career opportunities

    1. The Evolution of Controlled Vocabularies: Better content, new career opportunities. A presentation to the SLA San Andreas Chapter at San Jose State University, November 18, 2009, by Christine Connors, among other things a librarian, information scientist, semantic web advocate and founder of TriviumRLG
    2. If you wish to Tweet whilst here... ✤ I am @cjmconnors ✤ Hashtag is #slasa
    3. Quick Survey ✤ More interested in controlled vocabularies or career opportunities? ✤ Perform ‘traditional’ library duties? (Research, Reference, Circ, ILL…) ✤ How many of you are building ontologies? Rule bases? ✤ Looking for a job? ✤ Have a position to fill or expect to this year? ✤ Business oriented? ✤ Technology focused? ✤ Content & Information Architecture specialist?
    4. ex libris | flickr
    5. blmurch | flickr
    6. yourpaldave | flickr
    7. ragesoss | flickr
    8. The Continuum We are building more complex and powerful data architectures; all types are available for use on the semantic web
    9. The Continuum (axes: Power, Complexity): Folksonomy, List, Synonym Ring, Taxonomy, Thesaurus, Ontology. We are building more complex and powerful data architectures; all types are available for use on the semantic web
    10. Form fields: Name, Address, City, State/Province, Country, Zip/Postal Code. Country list: Andorra, Austria, Belgium, Denmark, Finland, France, Germany, Hungary, Ireland, Italy, Liechtenstein, Monaco, Netherlands, Norway, Portugal, Spain, Sweden, Switzerland, United Kingdom. Label: Precision
    11. United Kingdom Synonym: UK Synonym: United Kingdom of Great Britain and Northern Ireland
    12. United Kingdom Synonym: UK Synonym: United Kingdom of Great Britain and Northern Ireland. Behind the scenes in search. Label: Recall
    13. Europe NT ... United Kingdom NT England Scotland Wales Northern Ireland See http://www.nlsearch.com/rssearch.php
    14. Europe NT ... United Kingdom NT England Scotland Wales Northern Ireland. Labels: Better Recall, Better Precision, Advanced Search. See http://www.nlsearch.com/rssearch.php
    15. Europe NT ... United Kingdom NT England Scotland Wales Northern Ireland
    16. Europe England NT BT ... Britain United Kingdom Great Britain NT United Kingdom England BT Scotland European Union Wales Group of Eight Northern Ireland United Nations Security Council NATO Faceted or Parametric Search; Guided Navigation See www.endeca.com or www.newssift.com
    17. Europe NT United Kingdom Scope Note Situated in north-west Europe, the island nation was formed January 1, 1801. Use For UK Use For United Kingdom of Great Britain and Northern Ireland See Also Great Britain See Also Britain See Also British Isles NT England Scotland Wales Northern Ireland
    18. Europe NT United Kingdom Scope Note Situated in north-west Europe, the island nation was formed January 1, 1801. Use For UK Use For United Kingdom of Great Britain and Northern Ireland See Also Great Britain See Also Britain See Also British Isles NT England Scotland Wales Northern Ireland. Uses: Categorization, Classification, Search, Advanced Search, Rules-based Coding. *Precision ? Recall ?
    19. NT England Britain BT NT NT BT BT Wales Great Britain NT NT BT Scotland BT United NT Northern Kingdom BT Ireland
    20. NT England Britain BT God and my right NT NT BT BT Wales motto Great Britain NT NT BT Scotland BT flag United NT Northern God Save the Queen Kingdom BT Ireland anthem official English language capital currency legislature London pound sterling Parliament
    21. OpenCyc Large ontology based on the Cyc Knowledge Base http://sw.opencyc.org/concept/Mx4rvViRhJwpEbGdrcN5Y29ycA
    22. DBpedia A sizable ontology derived from data in Wikipedia http://dbpedia.org/page/United_kingdom http://wiki.dbpedia.org/Datasets
    23. Umbel Subjects Concept Explorer http://umbel.zitgist.com/explorer.php?concept=http%3A%2F%2Fumbel.org%2Fumbel%2Fsc%2FUnitedKingdomOfGreatBritainAndNorthernIreland
    24. [Chart: Region 1 and Region 2, 2007 to 2010]
    25. http://dbpedia.org/page/Dow_Jones_Industrial_Average
    26. 26. Semantic Web Layer Cake Key components; time left to influence - publish your use cases http://www.w3.org/2007/03/layercake.png
    27. 27. As of March 2008
    28. 28. As of March 2008
    29. 29. As of March 2008
    30. 30. As of March 2008
    31. 31. Circle size indicates the # of triples in the dataset Circle Size Triple Count Very large > 1B Large 1B-10M Medium 10M-500k Small 500k-10k Very small <10k Arrow direction indicates that a given dataset contains concepts from the indicated dataset Arrow thickness indicates the # of shared triples Arrow Thickness Triple Count Thick >100k Medium 100k-1k Thin <1k As of March 2008
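The legend above amounts to two threshold lookups, sketched here in Python. The function names are hypothetical, and the legend does not say which bucket a boundary value (for example exactly 10M) belongs to, so assigning boundary values to the larger bucket is an assumption.

```python
def circle_size(triple_count):
    """Map a dataset's triple count to the legend's circle size.

    Assumption: values exactly on a boundary go to the larger bucket,
    except 1B itself, which the legend places under "Large" (> 1B is Very large).
    """
    if triple_count > 1_000_000_000:
        return "Very large"
    if triple_count >= 10_000_000:
        return "Large"
    if triple_count >= 500_000:
        return "Medium"
    if triple_count >= 10_000:
        return "Small"
    return "Very small"


def arrow_thickness(shared_triples):
    """Map the number of shared triples to the legend's arrow thickness."""
    if shared_triples > 100_000:
        return "Thick"
    if shared_triples >= 1_000:
        return "Medium"
    return "Thin"
```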
    32. 32. As of March 2008
    33. 33. MultimediaN Eculture Project
    34. 34. BBC MusicBeta Data from users of MusicBrainz and Wikipedia, with BBC editorial oversight. http://www.bbc.co.uk/music/
    35. 35. European Digital Library Access to the collection of the 48 libraries of the EU. http://search.theeuropeanlibrary.org/portal/en/index.html
    36. 36. Wonderful objects with no metadata A secret garden
    37. 37. Objects with can’t-be-bothered A maze
    38. 38. Lots of unmarked repositories Silos
    39. 39. Benefits ✤ Interoperable ✤ Increase delivery channels for data and services ✤ Consistent ✤ Dynamic ✤ Greater Return on Investment/Effort ✤ Re-purpose data rather than re-create it ✤ Improved discovery ✤ Improved analytics ✤ Shared meaning ✤ Compliance
    40. 40. Communication Clarity
    41. 41. Benefits of Communicating Clearly ✤ Authority ✤ Trust ✤ Provenance ✤ Joint research / build on existing research ✤ Larger audience ✤ User engagement
    42. 42. Winter 2003-2004
    43. 43. The Dream Team 2009 ✤ Information Scientist ✤ Information Architect/User Experience Designer ✤ Cognitive Scientists ✤ Developers ✤ Business Analyst ✤ Leader/Evangelist/Community Manager ✤ Project Manager ✤ Security Expert ✤ Legal Advisor
    44. 44. Where’s the box? ✤ To organise knowledge is to gather together what we know into a comprehensive organised structure, to show its parts and their relationships. This is the work of scholars and encyclopaedists. ✤ Our tasks are to make knowledge (whether organised or unorganised) available to those who seek it, to store it in an accessible way, and to provide tools and procedures that make it easier for people to find what they seek in those stores. ✤ Brian Vickery, http://www.lucis.me.uk/knowlorg.htm
    45. 45. Embedded Librarians ✤ Collaborated on or contributed to your customer group’s electronic communications and/or collaborative workspaces, including email, wikis, blogs and other web-based workspaces. ✤ Shumaker and Talley, http://www.sla.org/pdfs/EmbeddedLibrarianshipFinalRptRev.pdf, funded by an SLA Research Grant, 2007, published June, 2009.
    46. 46. Human Search Engines? It was cute once, but is this all we are?
    47. 47. Competencies for Info Pros ✤ From the SLA special report ✤ B.1 ✤ Manages the full life cycle of information from its creation or acquisition through its destruction. This includes organizing, categorizing, cataloguing, classifying, disseminating; creating and managing taxonomies, intranet and extranet content, thesauri etc. ✤ B.2 ✤ Builds a dynamic collection of information resources based on a deep understanding of clients' information needs and their learning, work and/or business processes. ✤ C.3 ✤ Researches, analyzes and synthesizes information into accurate answers or actionable information for clients, and ensures that clients have the tools or capabilities to immediately apply these.
    48. 48. Competencies for Info Pros ✤ D.1 ✤ Assesses, selects and applies current and emerging information tools and creates information access and delivery solutions ✤ D.2 ✤ Applies expertise in databases, indexing, metadata, and information analysis and synthesis to improve information retrieval and use in the organization ✤ D.3 ✤ Protects the information privacy of clients and maintains awareness of, and responses to, new challenges to privacy ✤ D.4 ✤ Maintains current awareness of emerging technologies that may not be currently relevant but may become relevant tools of future information resources, services or applications.
    49. 49. Information Architects & User Experience designers ✤ We need help ✤ Powerful, but UI/UX is not friendly ✤ Organizing elements, pathfinding, labeling, building relationships, consistent experiences ✤ How to design for n-dimensional space in a 2D or 3D environment?
    50. 50. 2001, SuccessFactors has multiple worldwide offices collaborating for strong local support of customers. SENIOR FRONT-END ENGINEER, USER INTERFACE The world of web development is quickly evolving from a thin client model to one with rich and robust browser-based user interface functionality, using the latest developments in front-end technology. At SuccessFactors, we strive to constantly improve the way we build user interfaces and not only employ the latest UI development methodologies, but also *push the envelope* to discover and establish our own, unique approaches to UI development. From that perspective, we are actively seeking candidates who want to flex UI development muscles and be part of a dynamic, growing team of engineers dedicated to defining and creating the next generation in front-end, UI technology here at SuccessFactors, Inc. Duties and Responsibilities: As a UI Engineer, your responsibilities will include working with a team to develop rich user interfaces for enterprise-level Software as a Service (SaaS) applications; constantly driving for consistent user interaction by not only building cutting-edge UI functionality, but also consolidating common JavaScript and DHTML code to improve our current user interface. In addition, you will be able to clearly communicate your ideas and both openly and honestly provide and receive regular input from your peers. While teamwork is of the utmost importance, you can also work independently with minimal supervision, and take the initiative to constantly keep yourself engaged. Also, you never settle for second best. You possess a strong focus on quality and attention to detail, and possess the ability to quickly understand and solve unique and even undocumented programming problems. Naturally curious, you have a penchant for and drive to quickly learn and master skills in new technologies. Finally, you have a strong sense of fun and a passion for being part of a movement. As part
    51. 51. of the SuccessFactors family, you have a unique opportunity to join an exciting, dynamic and closely-interactive group of people whose focus is to change the way the world works! Proven experience in the following areas (not necessarily in order of importance) * 6+ years hands-on experience in full development life cycle software development, predominately in User Interface development. * Hand-coded, W3C-compliant and semantic (X)HTML and CSS with an emphasis on CSS-driven page layouts. * High level of proficiency with JavaScript (including object-oriented JS), DHTML, XMLHttpRequest, XML, JSON. * Familiarity with best-practices for usability and accessibility standards. * Writing high-quality, testable, maintainable, and well-documented code. * Proficiency in a server-side scripting language such as JSP, PHP, or ASP. * Solid understanding of working within a Model-View-Controller program architecture paradigm. * Bachelor's degree in Computer Science, Engineering (any type) or a related field Desired skills * Experience with the integration and use of a third-party JavaScript library such as: Yahoo UI Library, Prototype, jQuery, DOJO, etc. * Experience in creating standalone, JavaScript-based UI widgets. * Java/J2EE server-side development * Experience with Flash, Flex, and SQL is a huge plus, but not required. Please visit us at www.SuccessFactors.com to learn more about us, to view all current job postings, and to apply.
    52. 52. Job Information Title: Data Architect IV Location: San Francisco, CA Job Type: Contract Compensation: per Hour Reference Code: 922481-NRC Description: Our client is seeking a Data Architect to analyze data requirements and create logical and physical models and specifications of data. The Senior Content Engineer will work directly with the editors, project managers, system architects and software developers to develop editorial tools and delivery products that utilize data markup. Functions: Analyze complex data and product requirements; Lead the development of data models and specifications in a variety of markup language syntaxes (W3C schema, RELAX NG Schema, XML DTDs, and RDF); Perform the change control and update process for maintaining modular data markup specifications; Lead the development and maintenance of data transformation scripts; Lead the development and maintenance of data conformance validation; Develop ontology/vocabulary to be shared across disparate content types; Work with editors, project managers, system architects, and software developers to define and develop editorial tools and products that utilize data markup; Deliver presentations and/or train users on use of data markup, as required; Organize and lead data modeling workshops to develop markup specifications; Write documentation for markup specifications and design principles; Research industry standards to contribute to recommendations for architectural direction. Candidates are preferred to work in either Dayton, OH, Bethesda, MD, Colorado Springs, CO, Albany, NY, New Providence, NJ, Charlottesville, or San Francisco, CA; however, strong candidates from anywhere should apply without consideration to relocation. Occasional travel is required.
    53. 53. Information Scientist ✤ Organizing information ✤ Cataloging and classification ✤ Knowledge sharing ✤ Primary & secondary research ✤ Searching & finding ✤ Presentation of results in user’s preferred format ✤ Metrics
    54. 54. ONTOLOGIST / SR. VOCABULARY SPECIALIST Rosslyn, VA Job Description: The Ontologist / Sr. Vocabulary Specialist will work in a team-oriented environment, directly interface with a Department of Defense (DoD) customer and be a member of an Enterprise Vocabulary Team. The Vocabulary specialist will be the primary interface to assigned communities or offices through development of vocabulary artifacts and provide specific support and guidance consistent with approved methodologies. Relationship with DoD community points of contact in support of their data strategy implementation. Development of a Standard Information Structure that includes a glossary, semantic model, and an object model. Provide glossary development support. Provide thesaurus development support. Provide ontology development support. Elicit knowledge from Subject Matter Experts. Support handoff of vocabulary products to XML developers. Support optimization of search engine tuning parameters based on content of vocabulary content. Conduct vocabulary analysis and harmonization activities. Develop documentation and presentations for delivery to clients. Help resolve problems and ensure customer satisfaction. Foster positive client relationships DoD customers on adopted approach to development of vocabularies that are based on business process definition and identifying authoritative data sources.
    55. 55. Jack-Of-All-Trades ✤ Employees who are early adopters frequently have to be able to do a little of this, a little of that ✤ Academia
    56. 56. ver Spring, MD 1 Title: Knowledge Manager Skills: self-starter, business systems analysis, databases, Microsoft Office SharePoint Server 2007, web part development, site definitions and workflow, information taxonomy and other features, Office 2007 pos705775 Date: 6-2-2009 X143098 Description: Job Summary The Knowledge Manager gathers, reviews, analyzes, and evaluates business systems and user needs and writes detailed description of user needs, program functions, and general requirements. This person should have good understanding of relational database concepts and client-server concepts. May lead and direct the work of others, including managing vendors and vendor contracts. Working with limited managerial oversight, the Knowledge Manager is responsible for fielding requests from users, analyzing those requests, producing business and functional requirements, creation of metrics and test plans to prove that functional requirements have been met, documenting, and training both the end-user and the User Support Department on the end product. This individual will determine and demonstrate whether SharePoint can be effectively used "out-of-the-box" to meet requirements or whether custom coding will be required. Mentoring IPM and IT on SharePoint policies and best practices as part of project planning and scoping as needed, including site structure, security and other areas. The Knowledge Manager will apply advanced problem-solving skills including hypothesis generation, testing, successful resolution and communication. Assist project manager with issue and risk identification. The Knowledge Manager will also serve on the IPM SharePoint Governance Board. Job Responsibilities
    57. 57. Position ID: 679636 Date: 5-21-2009 Dice ID: RTL84898 Travel Required: none Description: Telecommute: no We're looking for a Senior Manager/Front-End Engineer to serve as Team Lead for our client-side development efforts. Are you the person we're looking for? Maybe, if: You understand how to gracefully degrade styles and JavaScript behaviors, how to structure information and keep it and presentation style separate. You read sites like A List Apart and Ajaxian. You know what our yslow score is and have many suggestions for how to make it better. You have an opinion on the Semantic Web, Microformats, HTML5, jquery vs mootools vs ext, and what doctypes we should be using. You know your way around conditional CSS selectors and the DOM. You used Blueprint when it came out, but now find Fluid 960 a bit more to your taste. You know what "template inheritance" means and how extension differs from composition. Having done this before, you'll find it even more fun to lead a team of other front-end folks, and keep a whole bunch of back-end engineers intrigued with new things to figure out. You are going to look on our sites and despair. And then make them more efficient, functional, and beautiful, and show us how you did it. And you think that sounds like fun. ::About Us:: We're looking for someone with at least 5 years of front-end engineering experience, with a super-strong preference for team lead experience. We want someone who enjoys teaching others how to manage the front-end work, and has built a set of tools for doing so in the past. Our technology stack includes Linux, Postgres, Ruby+Rails, Python+Django, Javascript+Ext, with several ongoing experiments using Pylons, Jinja, jQuery, Dojo and other tools and frameworks. We use Subversion to manage our code, and Mercurial to manage our content. Our applications are designed for, built and run in high-availability environments. So this is not front-end for the sake of front-end, but front-end for the sake of results. Directly. You
    58. 58. Semantic Technology: Knowledge Engineering Associates Applications are invited for Research Associate, Graduate Student and PDF positions in the Knowledge Navigation Infrastructure Team (KNIT) led by the Innovatia Research Chair at the University of New Brunswick, Saint John http://www.unb.ca/news/view.cgi?id=1680 Researchers will collaboratively develop reusable infrastructures to support custom vertical search across multiple application domains in biomedical, health care, energy and telecom. Researchers will develop solutions for enterprise search using semantic web technologies, service oriented architecture, ontologies and text mining. Successful candidates will have demonstrated software development expertise and familiarity with knowledge engineering lifecycles. Candidates with demonstrated experience with two or more of the following technologies are encouraged to apply: 1) Web based Content Acquisition Strategies 2) Knowledge Representation with OWL / RDF 3) Text Mining / Natural Language Processing 4) Ontology based reasoning over instance / triple stores 5) Terminology Management and Curation 6) Literature and (Meta) Data Integration 7) Provenance Tracking / Traceability 8) Algorithm Design / Graph Matching 9) Human Computer Interaction 10) Semantically Enabled Software Architecture 11) Semantic Desktops and Publishing Enquiries and applications can be made by sending a full CV and cover letter to: bakerc@unbsj.ca
    59. 59. Thank You! Christine Connors TriviumRLG.com cjmconnors@triviumrlg.com Nick: CJMConnors
    60. 60. Backup
    61. 61. Machine learning, natural language processing, artificial intelligence Predicate Subject Object Ontologies, metadata and linked data
    62. 62. Predicate Subject Object ‘abcdefghijklmnop’ Predicate Literal Subject
    63. 63. ✤ <This book> is titled “Organising Knowledge” dc:title ‘Organising Knowledge’ Predicate Object Subject
    64. 64. ✤ Since we need to represent a physical object in the subject, we will use the book’s ISBN ✤ Prefix / QName ✤ Shorthand method of referencing a namespace ✤ dc:title = http://purl.org/dc/elements/1.1/title uri:isbn:1843342286 dc:title ‘Organising Knowledge’ Predicate Object Subject
    65. 65. ‘Organising Knowledge’ dc:title uri:isbn:1843342286 dc:creator dc:subject ‘Patrick Lambe’ ex:taxonomy rdf:type skos:Concept
    66. 66. Ontology People Places Events Time Authority Geographic Events ISO File Thesaurus Taxonomy 8601
    67. 67. HTML XML XHTML XTM SKOS RDFa Microformats OWL RDF/RDFS FOAF eRDF Dublin Core
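The triples on the last few slides can be sketched in plain Python to show how a prefix (QName) expands to a full URI. This is an illustration under stated assumptions, not an RDF library: `Literal`, `expand_qname` and `expand_triple` are hypothetical helpers, the `ex:` namespace is a made-up example address, and the subject identifier is copied from the slide as written.

```python
from dataclasses import dataclass

# Namespace table. dc: is given on the slide; rdf: and skos: are the
# standard W3C namespaces; ex: is a placeholder assumed for illustration.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "ex": "http://example.org/",  # hypothetical namespace
}


@dataclass(frozen=True)
class Literal:
    """A plain string value, as opposed to a resource identifier."""
    value: str


def expand_qname(name):
    """Expand prefix:local to a full URI; leave unknown prefixes untouched."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return name


BOOK = "uri:isbn:1843342286"  # subject identifier exactly as on the slide

graph = [
    (BOOK, "dc:title", Literal("Organising Knowledge")),
    (BOOK, "dc:creator", Literal("Patrick Lambe")),
    (BOOK, "dc:subject", "ex:taxonomy"),
    ("ex:taxonomy", "rdf:type", "skos:Concept"),
]


def expand_triple(triple):
    """Expand the subject, predicate and (non-literal) object of a triple."""
    s, p, o = triple
    if not isinstance(o, Literal):
        o = expand_qname(o)
    return (expand_qname(s), expand_qname(p), o)


expanded = [expand_triple(t) for t in graph]
```

Here `expand_qname("dc:title")` gives `http://purl.org/dc/elements/1.1/title`, matching the equivalence stated on the slide, while literals such as the title pass through unchanged.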
