InfoChem Copyright © 2017 Dr. Valentina Eigner-Pitto, Product Presentation ICIC 2017, Heidelberg, Germany, October 23
InfoChem Product Presentation
ICIC - International Conference for the Information Community
Heidelberg, Germany, October 23, 2017
Dr. Valentina Eigner-Pitto
InfoChem at a Glance
People
• 24 employees (Munich office)
• 1 consultant in the UK
• Team of freelance abstractors (residing offshore)
Company
• specialized in chemoinformatics
• founded in 1989
• based in Munich, Germany
• owned by Springer Nature
Business Areas
• Software products
• Projects
• Text/Data mining
• Database building
Customers
• Pharmaceutical industry
• Chemical industry
• Scientific publishers
• Academia
• IP Professionals
InfoChem News
Company overview
• Since 2015: 100% owned by Springer Nature
• Summer 2016: moved to the Springer Nature Munich offices
• Since 2017: reporting within the Nature Research Group
Business Areas
Software
• ICFSE, ICCARTRIDGE, ICCHEMDESK
• ICMAP, CLASSIFY, ICNameRXN
• ICSYNTH, ICFRP, ICEDIT, ICTOOLS
• Markush…
Services
• Project development
• Consulting
• Database building
Content
• SPRESIweb, SPRESImobile
• Patents / Structures
• Chemisches Zentralblatt Structural Database
Data Mining Projects
• Chemical entity recognition
• Name to structure conversion
• Image to structure conversion
• ChemDraw CDX files work-up
Full-Text Annotation: ICANNOTATOR - ICN2S
Full-Text Annotation Service
Data mining
• Buildup and maintenance of dictionaries and taxonomies for annotation
• Development of formats for annotation input and output (standoff and inline annotation)
• Integration of annotation modules in flexible workflows (e.g. KNIME, Hadoop)
• Processing of mass data with Docker technology
• Development of workflows for real-time annotation of modified or new documents
• Results evaluation with high-quality, manually curated gold standards
• Specific dictionaries in the fields of organic, inorganic, pharmaceutical and biochemistry, diseases, medicine, biology
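The difference between standoff and inline annotation can be illustrated with a minimal Python sketch. It is not InfoChem's format: the `Annotation` class, the `to_inline` helper, the `<ann>` tag and the entity labels are all hypothetical names chosen for illustration. Standoff annotation keeps the text untouched and stores character offsets separately; inline annotation embeds the markup in the text itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One standoff annotation: offsets into the unmodified text (hypothetical schema)."""
    start: int   # character offset where the mention begins
    end: int     # character offset just past the mention
    label: str   # illustrative entity type, e.g. "chemical"


def to_inline(text: str, annotations: list[Annotation]) -> str:
    """Render standoff annotations as inline tags.

    Assumptions of this sketch: spans do not overlap, and the text
    contains no characters that would need XML escaping.
    """
    out = []
    cursor = 0
    for ann in sorted(annotations, key=lambda a: a.start):
        if ann.start < cursor:
            raise ValueError("overlapping annotations are not supported in this sketch")
        out.append(text[cursor:ann.start])  # untouched text before the mention
        out.append(f'<ann type="{ann.label}">{text[ann.start:ann.end]}</ann>')
        cursor = ann.end
    out.append(text[cursor:])  # remaining text after the last mention
    return "".join(out)


text = "Aspirin inhibits cyclooxygenase."
standoff = [Annotation(0, 7, "chemical"), Annotation(17, 31, "protein")]
inline = to_inline(text, standoff)
print(inline)
```

Standoff output is usually easier to re-run or merge because the source document never changes; inline output is easier to display.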
Language Extension Packages
Data mining
• German
• French
• Russian
• Chinese
• Korean
• Japanese
Language 1: German and French
Data mining
Annotation of German and French documents with translation rules and dictionaries
German:
• Proprietary translation rules and dictionaries
• CZB (Chemisches Zentralblatt) processing: OCR and annotation challenges
• DB: 1 million unique names, 500,000 unique structures
French:
• Translation rules collected by French-speaking chemists
• Fragment dictionary created for parts of chemical names that differ from English
Language 1: Russian
Data mining
Annotation of Russian patent documents
• Stemming (identification of the chemical name without declension endings)
• Transliteration
• Translation
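How the three steps above could be chained is sketched below in Python. This is a toy illustration under stated assumptions, not InfoChem's rule set: the ending list, the known-stem lexicon, the transliteration table and the dictionary are tiny hypothetical samples, and the order of the steps simply follows the order of the bullets.

```python
# All tables are illustrative assumptions with only a handful of entries.
KNOWN_STEMS = {"бензол", "толуол"}                   # hypothetical lexicon of base forms
ENDINGS = ("ами", "ом", "ов", "а", "у", "е", "ы")    # a few case endings, longest first
TRANSLIT = {"б": "b", "е": "e", "н": "n", "з": "z",
            "о": "o", "л": "l", "т": "t", "у": "u"}  # partial Cyrillic-to-Latin table
DICTIONARY = {"benzol": "benzene", "toluol": "toluene"}  # transliterated stem -> English name


def stem(word: str) -> str:
    """Strip a declension ending, but only if the result is a known base form."""
    word = word.lower()
    if word in KNOWN_STEMS:
        return word
    for ending in ENDINGS:
        candidate = word[: -len(ending)]
        if word.endswith(ending) and candidate in KNOWN_STEMS:
            return candidate
    return word  # unknown word: leave it unchanged


def transliterate(word: str) -> str:
    """Map each Cyrillic letter to Latin; unmapped characters pass through."""
    return "".join(TRANSLIT.get(ch, ch) for ch in word)


def normalize(word: str):
    """Stem, transliterate, then look up the English name (None if not found)."""
    return DICTIONARY.get(transliterate(stem(word)))


print(normalize("бензола"))   # genitive form of "бензол"
print(normalize("толуолом"))  # instrumental form of "толуол"
```

Restricting stemming to known base forms avoids cutting letters off names that merely happen to end in the same characters as a case ending.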
Language 2: Annotation of Asian Patents
Data mining
Annotation of Korean, Chinese and Japanese documents
• ChemNameTranslate by NextMove Software for the initial translation of systematic names
• InfoChem proprietary dictionaries for trade, brand and trivial names, including WIPO machine-translated names
• Various filters to reduce the number of false positive annotations
尿酸 (uric acid) should be annotated, but not inside 高尿酸血症 (hyperuricemia)
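One simple way to implement a filter of the kind shown in the uric acid example is to suppress dictionary hits that fall inside a longer, non-chemical term. The Python sketch below is an assumption about how such a filter might look, not the filter actually used; `annotate`, `terms` and `blockers` are hypothetical names, and the sample sentence was made up for this illustration.

```python
def annotate(text: str, terms: list[str], blockers: list[str]) -> list[tuple[int, int, str]]:
    """Return (start, end, term) for every term occurrence that is not
    fully contained in an occurrence of a blocker term."""
    # Collect the spans covered by blocker terms (e.g. disease names).
    blocked = []
    for blocker in blockers:
        pos = text.find(blocker)
        while pos != -1:
            blocked.append((pos, pos + len(blocker)))
            pos = text.find(blocker, pos + 1)

    hits = []
    for term in terms:
        pos = text.find(term)
        while pos != -1:
            end = pos + len(term)
            # Keep the hit only if no blocker span encloses it.
            if not any(start <= pos and end <= stop for start, stop in blocked):
                hits.append((pos, end, term))
            pos = text.find(term, pos + 1)
    return sorted(hits)


sentence = "尿酸和高尿酸血症"  # "uric acid and hyperuricemia"
result = annotate(sentence, terms=["尿酸"], blockers=["高尿酸血症"])
print(result)  # only the first, free-standing occurrence is kept
```

Languages written without spaces between words make this kind of containment check necessary, because a plain substring match cannot tell a free-standing name from a fragment of a longer word.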
Language 2: Quantitative QA Analysis
Data mining
QA analysis of false positive (FP) and false negative (FN) annotations was performed by native-speaking chemists:
• 2,941 annotations in 14 patents
• 3,961 annotations in 15 patents
• 4,352 annotations in 13 patents
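FP and FN counts of this kind are commonly summarized as precision, recall and F1 against a gold standard. The Python sketch below shows that calculation; the slide gives no FP or FN figures, so the spans used here are invented toy values, and exact span matching is an assumption of this sketch rather than the documented evaluation protocol.

```python
def evaluate(predicted: set, gold: set) -> dict:
    """Compare predicted annotation spans with gold-standard spans (exact match)."""
    tp = len(predicted & gold)   # annotated and correct
    fp = len(predicted - gold)   # annotated but not in the gold standard
    fn = len(gold - predicted)   # in the gold standard but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Toy (start, end) spans, invented purely for illustration.
gold_spans = {(0, 2), (10, 14), (20, 25)}
predicted_spans = {(0, 2), (10, 14), (30, 33)}
scores = evaluate(predicted_spans, gold_spans)
print(scores)
```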
WIPO Project: PATENTSCOPE
Data mining
Josef Eiblmaier, ICIC 2016 Heidelberg: Addition of Chemical Search Capabilities to the WIPO PATENTSCOPE Search System
Patent documents dealing with chemistry:
• Chinese 1,800,000
• Japanese 1,000,000
• Korean 500,000
• EPO (E, G, F) 800,000
• Russian 200,000
New in Reaction Handling and Prediction
Other news from InfoChem
• Simplified query form for beginners
• Tree comparison
• Improved cluster handling
• Collaborative platform enabling teamwork among users
• Reaction network graph is also offered as a stand-alone reaction analysis tool
InfoChem GmbH: www.infochem.de, www.spresi.com, info@infochem.de
Visit us at the InfoChem booth!