Presented by David Taieb, Architect, IBM Cloud Data Services
Along with Spark Streaming, Spark SQL and GraphX, MLlib is one of the four key architectural components of Spark. It provides powerful Machine Learning APIs that are easy to use, even for beginners, and are designed to work in parallel using Spark RDDs. In this session, we’ll introduce the different algorithms available in MLlib, including supervised learning with classification (binary and multiclass) and regression, as well as unsupervised learning with clustering (K-means) and recommendation systems. We’ll conclude the presentation with a deep dive into a sample machine learning application built with Spark MLlib that predicts whether a scheduled flight will be delayed or not. This application trains a model using data from real flight information. The labeled flight data is combined with weather data from the “Insight for Weather” service available on IBM Bluemix Cloud Platform to form the training, test and blind data. Even if you are not a black belt in machine learning, you will learn in this session how to leverage the powerful Machine Learning algorithms available in Spark to build interesting predictive and prescriptive applications.
About the Speaker: For the last 4 years, David has been the lead architect for the Watson Core UI & Tooling team based in Littleton, Massachusetts. During that time, he led the design and development of a Unified Tooling Platform to support all the Watson Tools including accuracy analysis, test experiments, corpus ingestion, and training data generation. Before that, he was the lead architect for the Domino Server OSGi team responsible for integrating the eXpeditor J2EE Web Container in Domino and building first-class APIs for the developer community. He started with IBM in 1996, working on various globalization technologies and products including Domino Global Workbench (used to develop multilingual Notes/Domino NSF applications) and a multilingual Content Management system for the WebSphere Application Server. David enjoys sharing his experience by speaking at conferences. You’ll find him at various events like the Unicode conference, EclipseCon, and Lotusphere. He’s also passionate about building tools that help improve developer productivity and overall experience.
5. Business Analytics Types
Descriptive Analytics: Look at the reasons for past success or failure
Predictive Analytics: What is probably going to happen in the future?
Prescriptive Analytics: What are my best actions?
• Use interactive querying and visualization to explore and communicate data
• Discover insights and trends
• Find correlations between 2 seemingly unrelated variables
• Data mining
• Generate hypotheses and models
• Predict occurrence of future events using probability (confidence)
• Product recommendations
• Classification
• Help make the right decision based on the data
• Find the optimal solution to a given problem
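To make "predict occurrence of future events using probability (confidence)" concrete, here is a minimal sketch of binary classification in plain Python. It is not Spark MLlib code and not the application from the talk: it is a hand-written logistic regression trained by batch gradient descent, and the feature names, toy data, and function names (`train_logistic`, `predict_delay_probability`) are all hypothetical, chosen only to echo the flight-delay theme.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, learning_rate=0.1, epochs=2000):
    """Fit weights and a bias with batch gradient descent on the log loss."""
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(score) - y  # predicted probability minus label
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict_delay_probability(weights, bias, x):
    """Return the model's confidence that the flight is delayed."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical toy data: [wind speed, precipitation], both scaled to 0..1.
# Label 1 means "delayed", 0 means "on time".
training_features = [[0.1, 0.0], [0.2, 0.1], [0.3, 0.2],
                     [0.7, 0.8], [0.8, 0.6], [0.9, 0.9]]
training_labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic(training_features, training_labels)
calm = predict_delay_probability(weights, bias, [0.15, 0.05])
stormy = predict_delay_probability(weights, bias, [0.85, 0.9])
print(f"calm weather:   P(delayed) = {calm:.2f}")
print(f"stormy weather: P(delayed) = {stormy:.2f}")
```

The model returns a probability rather than a bare yes/no answer, which is the "confidence" the slide refers to. A prescriptive application would go one step further and use that probability to choose an action.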
6. Taking Analytics a step further with Cognitive Systems
‣ Use natural language processing and machine learning algorithms to unlock knowledge from massive amounts of structured and unstructured data
Decide
• Ingest and analyze domain sources, info models
• Generate evidence-based decisions with confidence
• Learn with new outcomes and actions
• e.g. - Next generation Apps Probabilistic Apps
Ask
• Leverage vast amounts of data
• Ask questions for greater insights
• Natural language inquiries
• e.g. - Next generation Chat
Discover
• Find the rationale for given answers
• Prompt for inputs to yield improved responses
• Inspire considerations of new ideas
• e.g. - Next generation Search Discovery
IBM Watson
7. IBM Cloud Data Services
Resources for developers to get, build, and analyze on the IBM Cloud