Found in Space: Creating and Visualizing IEEE Document Space
Published in: Technology, Education

  • A 125-year-old professional society with over 148 journals, conference transactions, and magazines. Sponsors approximately 800 conferences annually. Total membership over 400,000 as of Dec 31, 2009. Spans the globe, with participation in 160 countries.
  • We knew there was “gold in them thare hills!” but how to unlock it? As a leading source of research materials, could we extract new directions? Are the societies living up to their charters and covering the topical areas they think they are? Are there trends that are just a spike in interest, or are they really emerging? Are they still vigorously being investigated, or were they just a flash in the pan? What other things might we learn? Introducing Dick Klavans.
  • Access Innovations and its software brand Data Harmony are known for high-caliber data: clean, well formed, and accurately semantically enriched. They updated the IEEE thesaurus in 2005, building a rule base for indexing at the same time. At the launch of the project, applying the terms to the IEEE content was 90% accurate (that is, 90% of the suggested terms are what well-trained indexers would choose from a controlled vocabulary) and 80% accurate on the more difficult proceedings data. Since then the rule base has improved, and the IEEE production team only needs to spot check about 10% of the documents to ensure that a high standard of indexing is maintained. This has allowed IEEE to process many more documents with the same team and made the process more fun at the same time. Because the Data Harmony software removes many of the clerical aspects of indexing, the indexers have time to think about the content, the thesaurus terms, what should be added, and what other information can be collected to keep enriching the files. The accuracy is high enough that we simply indexed the entire contents of the Xplore database, back to the earliest records, in a single overnight process. Then, to explore the edges of science, we also indexed the 1.2 million records using the Medical Subject Headings (MeSH) and Defense Technical Information Center (DTIC) thesauri, with similar accuracy results.
  • Two bases for the collaboration: our reputation for accuracy (in this case, how to do the layouts so that the ‘picture’ is accurate) and our commitment to peripheral vision (doing global vs. local maps).
  • Transcript

    • 1. SciTech Strategies, Inc. Better Maps Better Decisions. Found in Space: Creating and Visualizing IEEE Document Space. IEEE: William Pickering; Access Innovations / Data Harmony: Marjorie M.K. Hlava; SciTech Strategies: Dick Klavans. June 13, 2011
    • 2. Agenda
        • IEEE Challenge
            » Where are our publication strengths?
            » What are the emerging topics?
        • Access Innovations' Response
            » Expanding the IEEE Thesaurus
        • SciTech's Support
            » Mapping the Expanded IEEE Thesaurus
        • Lessons Learned
        SciTech Strategies Better Maps Better Decisions Well Formed Data • Semantic Enrichment • Access Innovations • Data Harmony
    • 3. Who is the IEEE?
    • 4. About IEEE… Founded in 1884, IEEE is the world’s largest professional association advancing technology for the benefit of humanity. We publish 150 technical journals, transactions and magazines, sponsor nearly 1200 conferences annually, develop technology standards, and support the professional interests of more than 400,000 members in over 160 countries. Members participate in 38 societies and 7 councils. The IEEE Xplore® digital library (approaching 3 million documents) provides access to IEEE journals, transactions, letters, magazines and conference proceedings, IET and other third-party journals and conference proceedings, IEEE Standards and IEEE educational courses.
    • 5. Specific Challenges
        • Is there a way, using our own information, to forecast our direction?
        • Where is the industry headed? What about by technology sector?
        • Does our coverage match our mission and vision?
        • Can we become smarter about our data and potential markets using our collection in new ways?
        • Are the societies publishing and talking about what their charter indicates they cover?
        • What are the trends: are topics emerging or cooling?
        • Can we use technology and our own data to explore these questions while enhancing our data?
    • 6. Access Innovations' Response
        • Access Innovations' Thesaurus
        • Expanding the IEEE Thesaurus
        • Requirements for Visualization
    • 7. Access Innovations / Data Harmony
        • Founded in 1978
        • Suite of Semantic Enrichment tools
        • Updated the IEEE Thesaurus in 2005
        • Built a rule base to auto index IEEE content
            » “90% accuracy out of the box on journal data”*
            » “80% out of the box on proceedings data”*
        • Auto indexed 1.2 million Xplore records
            » With the IEEE thesaurus terms rule base
            » With the MeSH rule base
            » With DTIC rule base
        *Adam D. Philippidis, Manager, Indexing & Database Production, IEEE
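The accuracy figures quoted on this slide measure how many of the automatically suggested terms a trained indexer would also have chosen. A minimal sketch of that measurement follows, assuming a toy phrase-matching rule base. The `RULES` table, the sample abstract, and the function names are hypothetical illustrations and do not reflect the actual Data Harmony rule format.

```python
import re

# Hypothetical rule base: each thesaurus term is triggered by one or more phrases.
# Illustrative only; this is not the Data Harmony rule format.
RULES = {
    "Antennas": ["antenna", "antenna array"],
    "Remote sensing": ["remote sensing", "satellite imagery"],
    "Power electronics": ["power converter", "inverter"],
}


def suggest_terms(text, rules=RULES):
    """Return the set of thesaurus terms whose trigger phrases occur in the text."""
    lowered = text.lower()
    return {
        term
        for term, phrases in rules.items()
        if any(re.search(r"\b" + re.escape(p), lowered) for p in phrases)
    }


def precision(suggested, human):
    """Share of suggested terms that a human indexer also assigned."""
    if not suggested:
        return 0.0
    return len(suggested & human) / len(suggested)


# Made-up abstract and made-up human indexing, for illustration only.
abstract = "A phased antenna array for satellite imagery downlinks using a compact inverter."
suggested = suggest_terms(abstract)
human = {"Antennas", "Remote sensing"}
score = precision(suggested, human)
```

Averaging `precision` over a spot-checked sample of documents would give a figure comparable in spirit to the quoted percentages, though the exact evaluation protocol is not described in the slides.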
    • 8. Mapping IEEE thesaurus space
        • We are more interested in an expanded map that includes adjacencies to the IEEE data
            » Expanded term set shows adjacent white space; opportunities for expansion
            » Similar process to that for simple map except …
            » We need additional terms to add
        • Criteria for additional terms
            » Low occurrence rate in IEEE documents
            » Linkage to terms in IEEE documents
            » Similar level of detail to current IEEE thesaurus terms
        • Where do we find these terms? How can we add them?
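The first two criteria on this slide (low occurrence in IEEE documents, linkage to IEEE terms) can be sketched as a simple filter. This is an illustration under stated assumptions, not the procedure the authors used: the function name, the thresholds `max_ieee_rate` and `min_links`, and the sample data are all hypothetical, and the third criterion (similar level of detail) is not modeled.

```python
from collections import Counter


def select_adjacent_terms(ieee_docs, external_docs, ieee_terms,
                          max_ieee_rate=0.05, min_links=2):
    """Pick external terms that are rare in IEEE documents but linked to IEEE terms.

    ieee_docs, external_docs: lists of sets of index terms, one set per document.
    ieee_terms: set of terms already in the IEEE thesaurus.
    max_ieee_rate, min_links: hypothetical thresholds chosen for illustration.
    """
    # Occurrence count of every term across the IEEE documents.
    ieee_counts = Counter(t for doc in ieee_docs for t in doc)
    n_ieee = max(len(ieee_docs), 1)

    # Linkage: number of external documents in which a candidate term
    # co-occurs with at least one IEEE thesaurus term.
    links = Counter()
    for doc in external_docs:
        if doc & ieee_terms:
            for t in doc - ieee_terms:
                links[t] += 1

    return sorted(
        t for t, n in links.items()
        if n >= min_links and ieee_counts[t] / n_ieee <= max_ieee_rate
    )


# Made-up sample data.
ieee_terms = {"Sensors", "Lasers"}
ieee_docs = [
    {"Sensors"},
    {"Lasers", "Sensors"},
    {"Lasers", "Biosensing Techniques"},
    {"Sensors"},
]
external_docs = [
    {"Sensors", "Biosensing Techniques"},
    {"Lasers", "Ophthalmology"},
    {"Sensors", "Ophthalmology"},
    {"Ophthalmology"},
    {"Lasers", "Biosensing Techniques"},
]

strict = select_adjacent_terms(ieee_docs, external_docs, ieee_terms)
loose = select_adjacent_terms(ieee_docs, external_docs, ieee_terms, max_ieee_rate=0.3)
```

With the default threshold, a term that already appears in a quarter of the sample IEEE documents is rejected as not rare enough; loosening `max_ieee_rate` admits it.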
    • 9. Defining expanded term space: 1. Select related corpus [Figure labels: IEEE (2k terms, 1.2M documents); 14k DTIC; 475k patents; PubMed; 525k docs; 24k MeSH]
    • 10. Defining expanded term space: 2. Identify related terms [Figure labels: IEEE, 2k terms, 1.2M documents]
    • 11. Defining expanded term space: 2. Identify related terms [Figure labels: IEEE, 2k terms, 1.2M documents]
    • 12. Defining expanded term space: 3. Resulting term set [Figure labels: IEEE, 2k terms, 1.2M documents]
    • 13. Defining expanded term space: 4. Term:Term Matrix
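A term:term matrix of the kind named on this slide can be built by counting how often two index terms are assigned to the same document. The sketch below stores the matrix sparsely as pair counts. The cosine normalization is an added assumption (a common choice for map layouts), not something the slides specify, and the sample documents are made up.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def term_term_matrix(docs):
    """Count, for each pair of terms, the documents indexed with both terms."""
    term_counts = Counter()
    pair_counts = Counter()
    for terms in docs:
        unique = sorted(set(terms))  # sorted so each pair has one canonical key
        term_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return term_counts, pair_counts


def cosine(pair_counts, term_counts, a, b):
    """Co-occurrence count normalized by the frequencies of both terms."""
    if term_counts[a] == 0 or term_counts[b] == 0:
        return 0.0
    key = tuple(sorted((a, b)))
    return pair_counts[key] / sqrt(term_counts[a] * term_counts[b])


# Made-up indexed documents.
docs = [
    ["Antennas", "Radar"],
    ["Antennas", "Radar", "Sensors"],
    ["Sensors", "Lasers"],
    ["Antennas"],
]
counts, pairs = term_term_matrix(docs)
similarity = cosine(pairs, counts, "Radar", "Antennas")
```

Keeping only nonzero pairs matters at the scale mentioned in the deck, since a dense matrix over tens of thousands of terms would be mostly zeros.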
    • 14. Requirements for Visualization
        • From a society / publisher perspective
            » Which topical areas form our core? periphery?
            » Where is the coverage dense? thin?
            » Which topical areas are most active? least active?
            » Which topical areas seem to be emerging? declining?
            » Which topical areas are interrelated? isolated?
            » What are the overlaps between journals / segments?
            » Where are the potential expansion points?
        • From a thesaurus perspective
            » What terms are too broadly defined?
            » How do actual topical relationships differ from the thesaurus structure?
    • 15. SciTech Strategies, Inc.
        • Founded in 1982 (Center for Research Planning)
        • Using Bibliometrics to Identify ‘Micro-communities’
        • Better Maps
            » Accuracy
        • Better Decisions
            » Peripheral Vision
    • 16. Conference Strategy
    • 17. Publication Strategy (JASIST reference)
    • 18. Requirements
        • From a society / publisher perspective
            » Identify Core, Boundary and Cross Border
            » Provides Indicators: Activity, Growth, Relatedness, Centrality
            » Locates Journal domains
        • From a thesaurus perspective
            » Identifies terms that are too broadly defined
            » Potential Improvements in thesaurus structure using topic structures
    • 19. Visualization Strategies [Figure labels: Visualization; Matrix; Software]
    • 20. Radial Visualization
    • 21. Instrumentation [Figure: profiles labeled Compon, Packag … Soc; Dielectr El Insul Soc; Instr Measur Soc; Ultrason, Ferro …; Electromag Compat Soc; Prod Saf Engng Soc; Supercond Council; Magnetics Soc; Sensors Council; Antennas Propag Soc; Nanotech Council; Oceanic Engng Soc; Geosci Rem Sens Soc; Nucl Plasma Sci Soc]
    • 22. Power / Circuits [Figure: profiles labeled Power Electron Soc; Power & Energy Soc; Industry Appl Soc; Industr Electr Soc; Electron Dev Soc; Circuits & Systems; Solid St Circuits Soc; Microwave Theory Soc]
    • 23. Additional Profiles [Figure: profiles labeled Photonics Soc; Eng Med Biol Sci; Electromag Compat Soc; Antennas Propag Soc; Commun Soc; Vehicular Techn Soc; Consumer Electr Soc; Broadcast Techn Soc; Aerosp Electr Sys Soc; Intell Transp Sys Soc; Info Theory Soc]
    • 24. Diverse Profiles [Figure: profiles labeled Reliability Society; Prof Commun Society; Education Society; Council Electr Design Auto; Robot Autom Soc; Social Impl Techn; Sys Man Cyber Society; Computer Intelligence Society; Control Sys Soc; Systems Council; Computer Society; Signal Proc Soc]
    • 25. IEEE Portfolio [Figure: map positioning the IEEE societies and councils named on the preceding profile slides]
    • 26. Lessons Learned
        • Map didn’t ‘feel right’
        • Many Terms are too broadly defined
        • Effective Maps require
            » More contextual data
            » More detailed data
            » Natural classification methods
    • 27. Maps didn’t feel right [Figure panels: Previous Experience; IEEE Experience]
    • 28. Terms are too Broadly Defined
    • 29. Use a Thesaurus to Label Maps [Figure: map with topic labels including Construction, Packaging, Welding, Gearing, Appliances, Turbines, Radiology, Agriculture, Pharma, Telecom, Semiconductors, Optics, Lasers, Textiles, Magnets, and Coatings, among others]
    • 30. Future Improvements [Figure panels: Current Term:Term Matrix; Proposed Term:Term Matrix]
    • 31. Future Improvements
        • Use citations and/or text to generate maps
        • Use thesaurus to label maps
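The second proposal, using the thesaurus only to label a map built from other signals, can be sketched as follows. The sketch assumes document clusters already exist (for example from citation links) and labels each cluster with the thesaurus terms most often assigned to its documents. The function name, the `top_n` parameter, and the sample data are hypothetical.

```python
from collections import Counter


def label_clusters(cluster_of, terms_of, top_n=2):
    """Label each cluster with its most frequently assigned thesaurus terms.

    cluster_of: dict mapping document id to cluster id (clustering done elsewhere).
    terms_of: dict mapping document id to the thesaurus terms indexed for it.
    """
    counts = {}
    for doc_id, cluster in cluster_of.items():
        # Count each term at most once per document.
        counts.setdefault(cluster, Counter()).update(set(terms_of.get(doc_id, [])))
    return {
        cluster: [
            term
            for term, _ in sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
        ]
        for cluster, c in counts.items()
    }


# Made-up clusters and index terms.
cluster_of = {"d1": 0, "d2": 0, "d3": 1, "d4": 1}
terms_of = {
    "d1": ["Lasers", "Optics"],
    "d2": ["Lasers"],
    "d3": ["Motors", "Pumps"],
    "d4": ["Motors"],
}
labels = label_clusters(cluster_of, terms_of)
top_label = label_clusters(cluster_of, terms_of, top_n=1)
```

Ties are broken alphabetically so that the labels are deterministic; a production version would more likely weight terms by how distinctive they are for a cluster rather than by raw frequency.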
    • 32. Thank you
    • 33. IEEE Portfolio [Figure: repeat of the society and council map from slide 25]
    • 40. SciTech Strategies, Inc. Better Maps Better Decisions. Thank You. IEEE: William Pickering; Access Innovations / Data Harmony: Marjorie M.K. Hlava; SciTech Strategies: Dick Klavans. June 13, 2011