Prof. Eric Nyberg (CMU) poster for IBM Yorktown Cognitive Systems Institute (Oct 30, 2014)

Cognitive Information Agents: Effective Learning in the Wild

Eric Nyberg, Professor & Director, Master of Computational Data Science Program

“Architecture and applications to support intelligent, natural interaction with all kinds of information in support of complex human tasks.”

• Extended Configuration Description (ECD): Specification language to describe the space of analytic configurations for a task [1]
• Configuration Space Exploration (CSE): Evaluation and selection of the best-performing analytic configuration(s) for a task [1,2,3]
• Phased Ranking Models: Rank the outputs of any multi-phase, multi-strategy system based on the features of the derivation paths that produced them [4]
• Automatic Source Expansion: Multi-faceted machine reading to improve in-task performance on a specific topic [5]; pioneered in Watson [6]; trained on human-labeled relevance judgments

Architecture

1. Perform: Automatically build and execute analytic solutions
2. Reflect: Proactively evaluate task performance, analyze errors, propose learning tasks
3. Learn: Automatically execute learning tasks, update models, KBs, etc.

Analyst’s Information Need: Specification of required analytic input/output types, desired information sources, example dataset.

Other diagram labels: Configure, Optimize, Train, Measure, Subject Matter Experts (SMEs)

Machine Learning Agents
• Targeted Machine Reading
• E-R Extraction
• Set Extension
• Clarification Dialogs
• Type/instance knowledge
• Concept learning

Crowdsourcing
• Type instance labeling
• New feature extraction
• Relevance judgments

Sample Applications

• Bioinformatics Question Answering (BioQA): Document and passage retrieval which can be automatically optimized for new datasets (applied to TREC Genomics, CLEF and corporate sponsor datasets) [2,3]
• Question Answering for Decision Support (QUADS): Automatically learn how to leverage QA systems to support complex human decision-making with multiple decision factors (for gene target prediction and product ranking) [7]

References

1. Garduno, E., Yang, Z., Maiberg, A., McCormack, C., Fang, Y. and Nyberg, E. (2013). “CSE Framework: A UIMA-based Distributed System for Configuration Space Exploration”, Proceedings of the 3rd Workshop on Unstructured Information Management Architecture, International Conference of the German Society for Computational Linguistics and Language Technology.
2. Yang, Z., Garduno, E., Fang, Y., Maiberg, A., McCormack, C. and Nyberg, E. (2013). “Building Optimal Information Systems Automatically: Configuration Space Exploration for Biomedical Information Systems”, Proceedings of the ACM CIKM Conference.
3. Patel, A., Yang, Z., Nyberg, E. and Mitamura, T. (2013). “Building an Optimal QA System Automatically Using Configuration Space Exploration for QA4MRE”, Proceedings of CLEF 2013.
4. Liu, R. and Nyberg, E. (2013). “A Phased Ranking Model for Question Answering”, Proceedings of the ACM Conference on Information and Knowledge Management.
5. Schlaefer, N. (2012). Statistical Source Expansion for Question Answering, Ph.D. Thesis, Language Technologies Institute, School of Computer Science, Carnegie Mellon University.
6. Schlaefer, N., Chu-Carroll, J., Nyberg, E., Fan, J., Zadrozny, W. and Ferrucci, D. (2011). “Statistical Source Expansion for Question Answering”, Proceedings of the ACM CIKM Conference.
7. Yang, Z., Li, Y., Cai, J. and Nyberg, E. (2014). “QUADS: Question Answering for Decision Support”, Proceedings of the ACM SIGIR Conference on Information Retrieval.

Team

Eric Nyberg, Team Leader
Zi Yang, Ph.D. Candidate: CSE, BioQA, QUADS
Rui Liu, Ph.D. Candidate: Phased Ranking Models
Leo Boytsov, Ph.D. Candidate: BioQA, Semantic Retrieval
Hugo Rodriguez, Ph.D. Candidate: Question Generation
Di Wang, Ph.D. Candidate: Source Expansion
Avner Maiberg, MLT Candidate: ECD, CSE, BioQA
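
Illustrative sketch: Configuration Space Exploration

The poster describes CSE as the evaluation and selection of the best-performing analytic configuration(s) for a task, given an example dataset. The Python sketch below illustrates that general idea on a toy document-matching task, using plain exhaustive enumeration. It is not the CSE framework from [1], which is UIMA-based; every name here (CONFIG_SPACE, EXAMPLES, the tokenizer and scorer functions, explore) is hypothetical and chosen only for illustration.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

# Hypothetical example dataset: (query, candidate documents, index of the correct document).
EXAMPLES: List[Tuple[str, List[str], int]] = [
    ("gene target prediction?",
     ["gene expression levels overview", "methods for target prediction"], 1),
    ("passage retrieval",
     ["document and passage retrieval.", "product ranking"], 0),
]


def tokenize_whitespace(text: str) -> List[str]:
    """Lowercase and split on whitespace, keeping punctuation attached."""
    return text.lower().split()


def tokenize_stripped(text: str) -> List[str]:
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    return [tok.strip(".,?!") for tok in text.lower().split()]


def score_overlap(query: List[str], doc: List[str]) -> float:
    """Number of distinct tokens shared by query and document."""
    return float(len(set(query) & set(doc)))


def score_jaccard(query: List[str], doc: List[str]) -> float:
    """Shared tokens divided by all distinct tokens in query and document."""
    union = set(query) | set(doc)
    return len(set(query) & set(doc)) / len(union) if union else 0.0


# Hypothetical configuration space: each phase lists its interchangeable components.
CONFIG_SPACE: Dict[str, List[Callable]] = {
    "tokenizer": [tokenize_whitespace, tokenize_stripped],
    "scorer": [score_overlap, score_jaccard],
}


def run_pipeline(config: Dict[str, Callable], query: str, docs: List[str]) -> int:
    """Run one configuration and return the index of the top-scoring document."""
    tokenize, score = config["tokenizer"], config["scorer"]
    query_tokens = tokenize(query)
    scores = [score(query_tokens, tokenize(doc)) for doc in docs]
    return max(range(len(docs)), key=scores.__getitem__)


def accuracy(config: Dict[str, Callable], dataset) -> float:
    """Fraction of examples for which the configuration picks the correct document."""
    hits = sum(run_pipeline(config, q, docs) == gold for q, docs, gold in dataset)
    return hits / len(dataset)


def explore(space: Dict[str, List[Callable]], dataset) -> List[Tuple[float, Dict[str, str]]]:
    """Evaluate every combination of components and rank them by accuracy."""
    phases = list(space)
    results = []
    for combo in product(*(space[phase] for phase in phases)):
        config = dict(zip(phases, combo))
        names = {phase: fn.__name__ for phase, fn in config.items()}
        results.append((accuracy(config, dataset), names))
    results.sort(key=lambda item: item[0], reverse=True)
    return results


if __name__ == "__main__":
    for acc, names in explore(CONFIG_SPACE, EXAMPLES):
        print(f"{acc:.2f}  {names}")
```

Exhaustive enumeration is only practical for a space this small. It is used here to keep the sketch short, and the poster does not say which search strategy the actual framework uses.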