The AI & I at Work


Presentation about AI given to MIS department students at the University of Texas at Dallas

Published in: Education

  1. The AI & I at Work. Tarek Hoteit, PhD, IT Director, TR Labs at Thomson Reuters. October 19, 2018, University of Texas at Dallas MIS Club
  2. Agenda • Short history of Data Science & AI • Data Science and AI together 4 ever • Practical techniques for data scientists when using AI • Real demos for work
  3. History of Data Science: 1962 – John W. Tukey predicted the effect of modern-day electronic computing on data analysis as an empirical science
  4. History of Data Science: 1965 – “Programma 101”, the first commercial programmable desktop calculator
  5. History of Data Science: 1981 – IBM released its first personal computer, followed by Apple in 1983 with a GUI
  6. Fast forward 20 years…
  7. History of Data Science: 2005 – National Science Board advocates data science career. Data Scientists (from Chapter 3: Roles & responsibilities of individuals and institutions): The interests of data scientists – the information and computer scientists, database and software engineers and programmers, disciplinary experts, curators and expert annotators, librarians, archivists, and others, who are crucial to the successful management of a digital data collection – lie in having their creativity and intellectual contributions fully recognized. In pursuing these interests, they have the responsibility to:
     • conduct creative inquiry and analysis; enhance through consultation, collaboration, and coordination the ability of others to conduct research and education using digital data collections;
     • be at the forefront in developing innovative concepts in database technology and information sciences, including methods for data visualization and information discovery, and applying these in the fields of science and education relevant to the collection;
     • implement best practices and technology; serve as a mentor to beginning or transitioning investigators, students and others interested in pursuing data science;
     • design and implement education and outreach programs that make the benefits of data collections and digital information science available to the broadest possible range of researchers, educators, students, and the general public.
  8. History of Data Science: 2010 – data science takes center stage in computer technology as customers use more devices, social media, and mobile, and machines become faster
  9. In the meantime…
  10. History of AI: 1956 – The Dartmouth summer research project on artificial intelligence was initiated by an August 31, 1955 proposal authored by J. McCarthy (Dartmouth College), M. L. Minsky (Harvard University), N. Rochester (I.B.M. Corporation), and C. E. Shannon (Bell Telephone Laboratories)
  11. History of AI: 1968 – “2001: A Space Odyssey” by Stanley Kubrick is released, featuring the intelligent computer HAL 9000
  12. History of AI • 1950s – 60s: reasoning AI, prototypes, high interest • 1971: first AI winter sets in • 1980s – 1990s: another hype cycle with expert systems and neural networks • 1990s: second AI winter
  13. History of AI • Late 1990s – 2000s: hype starts again (Deep Blue beats Kasparov at chess) • 2006: University of Toronto develops deep learning • 2011: Watson wins at Jeopardy • 2016: AlphaGo beats Go champions • 2017: AlphaGo Zero beats AlphaGo 100 to 0 after starting from scratch
  14. Now everyone is into artificial intelligence
  15. So where do data science and artificial intelligence cross each other?
  16. Data science and artificial intelligence cross paths everywhere
  17. We now have two types of data scientists • Data Scientist Type A – Analytical: focuses on the why; heavy on statistics, machine learning fundamentals, data wrangling; uses Python/R, SQL • Data Scientist Type B – Builder/Machine Learning Engineer: focuses on creating new products; heavy on machine learning, software engineering, linear algebra and differential equations; uses Python/Java/Scala, Docker, cloud computing (Source: Jesse Steinweg-Woods)
  18. Common coding grounds? Python is favored among machine learning and data science jobs (based on data last updated in late 2017)
  19. Python & data science libraries are heavily used for data analysis. “The number of Data Scientists is constantly growing and at the moment the number of Data Scientists is larger than the number of Web Developers among Python users.” – JetBrains, “The State of Developer Ecosystem Survey in 2018” osystem-2018/python/
  20. Note: Java and JavaScript are still the most popular programming languages for developers, but more people continue to learn Python (JetBrains, “The State of Developer Ecosystem Survey in 2018”)
  21. To move from Data Scientist Type A to Type B, you need to build a solid foundation for your data and move up the pyramid (Monica Rogati, “The AI Hierarchy of Needs”)
  22. Some practical coding techniques for data scientists
  23. Jupyter Notebooks for data scientists • Python Python Python • Core modules: NumPy, SciPy, Matplotlib • Work environment: Jupyter Notebooks, which work with Python, R, C++, Julia and more • Anaconda or virtualenv to isolate the Python work environment
  24. Complete coding experience using JupyterLab • Try JupyterLab, the next-generation web-based user interface • pip install jupyterlab or conda install -c conda-forge jupyterlab
  25. Or cloud-based development solutions: Google Colab. Google Colab is free, and you can leverage its GPUs
  26. More useful resources for researchers • Nurture AI – curated summary of research papers • AutoML • Public Data Search • Google Dataset Search
  27. AI & I at Work
  28. Sentiment analysis on Twitter using Django, Docker containers, Python & Google NLP • Twitter API using the Tweepy Python library & a Twitter dev account • Local Docker container running a PostgreSQL database • Django & Python to run and manage the code and data • Google Cloud Natural Language Processing SDK to run sentiment analysis • GitHub source code:
  29. Training Google AutoML for categorizing customer reviews • Searched for a dataset on setsearch • Found “100K+ Scraped Course Reviews from the Coursera Website (As of May 2017)” • Analyzed the data, cleaned when necessary (pretraining step) • Created a Google Cloud AutoML project & activated NLP APIs, uploaded data • No AI expertise needed! • Dataset import • Train/Evaluate/Predict model • GitHub source code
  30. Fun time with AWS DeepLens, a deep learning-enabled video camera • Chose a project template on • Registered DeepLens device & deployed project • Model configured using SageMaker, in this case an SSD architecture with a ResNet-50 feature extractor, on S3, accessible via Lambda
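
The Twitter sentiment demo on slide 28 could look roughly like the sketch below. This is not the presenter's code: the threshold of 0.25, the function names, and the offline `fake_scorer` are all assumptions made for illustration, and the Tweepy, Django and PostgreSQL parts are left out. The `google_nlp_score` function assumes the current `google-cloud-language` client library and configured Google Cloud credentials; it is imported lazily so the rest of the sketch runs without them.

```python
from typing import Callable, Iterable, List, Tuple


def label_sentiment(score: float, threshold: float = 0.25) -> str:
    """Map a sentiment score in [-1.0, 1.0] to a coarse label.

    The 0.25 threshold is an illustrative assumption, not a value from the talk.
    """
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def google_nlp_score(text: str) -> float:
    """Score one text with the Cloud Natural Language API.

    Assumes the google-cloud-language package is installed and that
    application default credentials are configured.
    """
    # Imported here so the offline parts of the sketch need no extra packages.
    from google.cloud import language_v1

    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    response = client.analyze_sentiment(request={"document": document})
    return response.document_sentiment.score


def classify_tweets(
    tweets: Iterable[str], scorer: Callable[[str], float]
) -> List[Tuple[str, float, str]]:
    """Return (text, score, label) for each tweet, using the given scorer."""
    results = []
    for text in tweets:
        score = scorer(text)
        results.append((text, score, label_sentiment(score)))
    return results


def fake_scorer(text: str) -> float:
    """Hypothetical offline stand-in for the API, for demonstration only."""
    lowered = text.lower()
    if "love" in lowered:
        return 0.8
    if "hate" in lowered:
        return -0.7
    return 0.0


if __name__ == "__main__":
    sample = ["I love this product", "I hate waiting", "It shipped today"]
    for text, score, label in classify_tweets(sample, fake_scorer):
        print(f"{label:>8}  {score:+.1f}  {text}")
```

Passing the scorer in as a function keeps the labeling logic testable offline; swapping `fake_scorer` for `google_nlp_score` sends each tweet to the real service.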
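
The "analyzed the data, cleaned when necessary" step on slide 29 might be sketched as follows. The slides do not show the cleaning rules or the dataset's columns, so everything here is an assumption: the (review text, star rating) input shape, the rating-to-label mapping, and the two-column `text,label` CSV layout for upload, which should be checked against the AutoML documentation before use.

```python
import csv
import io
from typing import Iterable, List, Tuple


def rating_to_label(rating: int) -> str:
    """Map a 1-5 star rating to a label (an illustrative mapping only)."""
    if rating >= 4:
        return "positive"
    if rating <= 2:
        return "negative"
    return "neutral"


def clean_reviews(rows: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Normalize whitespace, drop empty, unparsable and duplicate reviews.

    Each input row is a hypothetical (review_text, rating) pair of strings.
    """
    seen = set()
    cleaned = []
    for text, rating in rows:
        try:
            stars = int(rating)
        except (TypeError, ValueError):
            continue  # skip rows whose rating is not an integer
        normalized = " ".join(text.split())  # collapse runs of whitespace
        if not normalized:
            continue  # skip empty reviews
        key = normalized.lower()
        if key in seen:
            continue  # skip case-insensitive duplicates
        seen.add(key)
        cleaned.append((normalized, rating_to_label(stars)))
    return cleaned


def to_csv(rows: Iterable[Tuple[str, str]]) -> str:
    """Serialize (text, label) rows as CSV text with no header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    raw = [
        ("  Great   course! ", "5"),
        ("great course!", "5"),
        ("", "3"),
        ("Too slow", "1"),
        ("Okay", "n/a"),
    ]
    print(to_csv(clean_reviews(raw)), end="")
```

Using the `csv` module rather than string joins means reviews that contain commas or quotes are escaped correctly in the output file.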