Using AI to unleash the power of unstructured government data

Applications and examples of natural language processing (NLP) across government. https://deloi.tt/305Zjt4

Published in: Business

  1. Using AI to unleash the power of unstructured government data: Applications and examples of natural language processing (NLP) across government. Deloitte Center for Government Insights. Copyright © 2019 Deloitte Development LLC. All rights reserved.
  2. Why is NLP required? Government agencies are awash in unstructured data, but unstructured text is difficult to analyze, and useful information may be trapped inside it. How can agencies tap into that information? The solution is natural language processing (NLP), which can help derive actionable insights, connect the dots, and facilitate policy analysis.
  3. What is NLP?
     • A branch of computer science, also known as computational linguistics
     • Allows computers to communicate with humans (text, audio, …)
     • Helps generate and absorb information
     • Uses technology from linguistics, AI, machine learning, and formal language theory
     • The ultimate goal is to help communication
     Sources: Ayn de Jesus, “AI for speech recognition,” August 2018; SAS Institute Inc., “Natural language processing: What it is and why it matters,” accessed December 19, 2018.
  4. The evolution of NLP and underlying algorithms
     [Timeline figure; year markers: 1949, 1950, 1954, 1960s, 1980s, 1990s, 2000s, 2003, 2006, 2015-16, 2017. Milestones shown on the timeline:]
     • IBM sponsored the “Index Thomisticus,” a computer-readable compilation of St. Thomas Aquinas’ works
     • Turing test of “Computing Machinery and Intelligence”
     • Georgetown Russian translation experiment
     • Pattern recognition and “nearest neighbour” algorithms
     • Machine learning algorithms introduced
     • Richer statistical models
     • Advanced speech recognition technologies
     • Natural language generation takes off
     • Topic modelling introduced
     • Advanced topic models such as LDA introduced
     • The term “deep learning” introduced
     • Neural machine translation gets implemented
     • Conversational AI gathers momentum
     Sources: “Roberto Busa, S.J., and the Invention of the Machine-Generated Concordance”; Bhargav Shah, “The power of natural language processing: Today's boom in artificial intelligence,” Medium; Chris Smith et al., “The history of artificial intelligence,” University of Washington; Eric Eaton, “Introduction to machine learning,” presentation, University of Pennsylvania; Kendall Fortney, “Pre-processing in natural language machine learning,” Towards Data Science; Clark Boyd, “The past, present, and future of speech recognition technology,” Medium; Hofmann, “Probabilistic latent semantic indexing,” Proceedings of the Twenty-Second Annual International SIGIR Conference on Research and Development in Information Retrieval; Robert Dale, Barbara Di Eugenio, and Donia Scott, “Introduction to the special issue on natural language generation,” September 1998; Medium, “History and frontier of the neural machine translation,” August 17, 2017; Ram Menon, “The rise of the conversational AI,” Forbes, December 4, 2017.
  5. Key capabilities of NLP
     • Information extraction: automatically extracts structured information from unstructured documents.
     • Text categorization: automatically assigns documents to a predefined set of categories.
     • Text clustering: groups documents into clusters so that documents within a cluster are similar to one another.
     • Relationship extraction: establishes semantic relations between entities.
     • Named entity resolution: automatically extracts named entities, classifies them into predefined labels, and links them to a specific ontology.
     • Topic modeling: automatically uncovers hidden topics in large collections of documents.
     • Sentiment analysis: decodes the meaning behind human language and helps analyze people’s sentiment.
     Sources: Fabrizio Sebastiani, “Text categorization,” 2005; Meaning Cloud, “What is text clustering?”; Paul A. Watters, “Named entity resolution in social media,” Elsevier, 2016; Nguyen Bach and Sameer Badaskar, “A review of relation extraction,” May 2011, Stanford.edu; “Sentiment Analysis - What is Sentiment Analysis.”
  6. NLP cuts across domains and can help address critical government issues
     • Domains: healthcare; defense and national security; financial services; energy and environment
     • Government issues: analyze public feedback; improve predictions; improve regulatory compliance; enhance policy analysis
  7. Improving predictions at the US Food and Drug Administration (FDA)
     The National Center for Toxicological Research (NCTR) at the FDA used NLP to identify relevant drug groups. Topic modeling was applied to:
     • 10 years of reports extracted from the FDA’s Adverse Event Reporting System (FAERS)
     • Over 60,000 drug-adverse event pairs
     The objective: to better predict potential adverse drug reactions.
     Source: Mitra Rocca, “Lessons Learned from NLP Implementations at FDA,” U.S. Food and Drug Administration, June 15, 2017.
  8. Improving forensic investigations at DARPA, Department of Defense
     In 2012, the DoD’s Defense Advanced Research Projects Agency (DARPA) launched the Deep Exploration and Filtering of Text (DEFT) program. DEFT uses NLP in an effort to uncover connections implicit in large text documents.
     The objective: to improve the efficiency of defense and intelligence analysts who investigate multiple documents to detect anomalies and causal relationships.
     Source: Boyan Onyshkevych, “Deep Exploration and Filtering of Text (DEFT),” US Department of Defense, Defense Advanced Research Projects Agency.
  9. Analyzing public feedback for the UK government
     The UK government uses Latent Dirichlet Allocation (LDA), a method of topic modeling, to analyze public comments on GOV.UK. LDA is designed to help the government uncover relations between customer complaints and comments. For example, mortgage complaints often contain allegations of racial discrimination.
     The objective: to enable the government to better address public feedback.
     Source: Dan Heron, “Understanding More from User Feedback,” Data in Government Blog, GOV.UK, November 9, 2016, https://dataingovernment.blog.gov.uk/2016/11/09/understanding-more-from-user-feedback/.
  10. Informing policy-making at the Food and Drug Administration (FDA)
      The Center for Tobacco Products (CTP), part of the FDA, uses topic modeling to identify key terms and cluster documents based on topics (menthol, youth consumption of menthol, etc.).
      The objective:
      • To understand the impact of the manufacture and distribution of tobacco products
      • To inform policy-making, particularly concerning the implicit marketing of tobacco products to youths
      Source: Hesha J. Duggirala et al., “Data Mining at FDA,” U.S. Food and Drug Administration, August 20, 2018, https://www.fda.gov/scienceresearch/dataminingatfda/ucm446239.html.
  11. Getting started with NLP
      • Define the problem: define the problem that the agency faces and identify which technologies, including NLP, might best address it.
      • Build the team: create a team at the beginning of the project and define specific responsibilities. Recruit data science experts from outside to build a robust capability.
      • Identify the data: clean the data, create labels, and perform exploratory analysis. Some datasets may be easily acquired; others may not be in a machine-readable format.
      • Develop models: develop the NLP models that best suit the needs of the initiative leaders. The data science team could develop ways to reuse the data and code in the future.
      • Test and deploy the model: amend the NLP model based on user feedback and deploy it after thorough testing. Most importantly, train the end users.
  12. Contacts
      • William D. Eggers, Executive Director, Deloitte Center for Government Insights, Deloitte Services LP, +1 571 882 6585, weggers@deloitte.com
      • Matt Gracie, Managing Director, Strategy and Analytics, Deloitte Consulting LLP, +1 410 507 7839, magracie@deloitte.com
  13. About Deloitte
      Deloitte refers to one or more of Deloitte Touche Tohmatsu Limited, a UK private company limited by guarantee (“DTTL”), its network of member firms, and their related entities. DTTL and each of its member firms are legally separate and independent entities. DTTL (also referred to as “Deloitte Global”) does not provide services to clients. In the United States, Deloitte refers to one or more of the US member firms of DTTL, their related entities that operate using the “Deloitte” name in the United States and their respective affiliates. Certain services may not be available to attest clients under the rules and regulations of public accounting. Please see www.deloitte.com/about to learn more about our global network of member firms.
      Disclaimer
      This publication contains general information only and Deloitte is not, by means of this publication, rendering accounting, business, financial, investment, legal, tax, or other professional advice or services. This publication is not a substitute for such professional advice or services, nor should it be used as a basis for any decision or action that may affect your business. Before making any decision or taking any action that may affect your business, you should consult a qualified professional advisor. Deloitte shall not be responsible for any loss sustained by any person who relies on this publication.
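
The text categorization capability listed on slide 5 (assigning documents to a predefined set of categories) can be illustrated with a small example. What follows is only a minimal sketch, not a method described in the deck: it assumes a multinomial Naive Bayes classifier with add-one smoothing, and the feedback comments, the category names ("passports", "tax"), and the function names are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep runs of letters (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def train(labeled_docs):
    """Count documents and words per category from (text, category) pairs."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, category in labeled_docs:
        doc_counts[category] += 1
        for tok in tokenize(text):
            word_counts[category][tok] += 1
            vocab.add(tok)
    return doc_counts, word_counts, vocab


def categorize(text, model):
    """Return the category with the highest Naive Bayes log-probability."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best, best_score = None, -math.inf
    for category in sorted(doc_counts):
        # Log prior: share of training documents in this category.
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            count = word_counts[category][tok] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best, best_score = category, score
    return best


# Hypothetical labeled feedback comments, invented for illustration only.
TRAINING = [
    ("The passport renewal form keeps timing out", "passports"),
    ("I could not upload my passport photo", "passports"),
    ("My tax return page shows the wrong amount", "tax"),
    ("Cannot find where to pay my tax bill", "tax"),
]

MODEL = train(TRAINING)
print(categorize("passport photo upload failed", MODEL))
print(categorize("where do I pay tax", MODEL))
```

A real deployment would need far more labeled data and proper evaluation, in line with the "Identify the data" and "Test and deploy the model" steps on slide 11.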
