AIaaS - Artificial Intelligence off the Shelf
by Jana Kludas, Data Scientist @ *um
Data Natives 2017
Outline
• Introduction
• Low level vs. High level AI
• Market overview of AIaaS
• Usage Examples
• Services as Black Box
• Conclusions
27.11.2017 The unbelievable Machine Company 2
Introduction
Definition and Scope of AI
"The study of the computations that make it possible to perceive, reason, and act" (Winston, 1992)
• Reasoning, Problem solving
• Knowledge representation
• Planning
• Learning (i.e. from example/experience)
• Natural Language Processing
• Perception (Vision and other Sensors)
• Acting and Manipulating
• Social Intelligence
• General Intelligence
Everything-as-a-Service (X-a-a-S)
Software-aaS
Platform-aaS
Infrastructure-aaS
Cloud hosters offer simple access and on demand usage of computational resources (can be really anything!)
Pros
- focus on the core business
- transparent costs through pay as you use
- reduced development time and investment risk
- increased strategic flexibility
Cons
- dependency on the service provider and a working and fast data connection
- offer being limited to standard solutions (standardization instead of innovation)
- reduced security of data and transactions
Artificial-Intelligence-as-a-Service (AIaaS)
AIaaS
Pros
• improves the Time-to-Value significantly
• fast testing of new approaches to your problems
• no big investments in hardware or software required
… but there are also some downsides that will be discussed in the following
Low level vs. High level AI
Low level AI
Classic Machine Learning algorithms
– Classification, Regression, Clustering
– Bayesian Networks
– Reinforcement Learning, Representation Learning
– Genetic Algorithms
• Solutions or algorithms for classes of problems
• processing pipeline: data collection and pre-processing, training, parameter
optimization, testing
LL AIaaS:
• requires expert knowledge
• allows creation of innovative solutions for unsolved, non-standardized problems
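The processing pipeline listed above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not what any AIaaS platform runs internally: the synthetic "temperature vs. rentals" data, the single-feature ridge model, and the helper names (`standardize`, `fit_ridge_1d`, `mean_abs_error`) are all invented for the example.

```python
import random
from statistics import mean, pstdev

def standardize(values, mu=None, sigma=None):
    """Pre-processing: scale to zero mean and unit variance."""
    mu = mean(values) if mu is None else mu
    sigma = pstdev(values) if sigma is None else sigma
    return [(v - mu) / sigma for v in values], mu, sigma

def fit_ridge_1d(x, y, alpha):
    """Training: one-feature ridge regression in closed form (intercept not penalized)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / (sxx + alpha)
    return slope, y_bar - slope * x_bar

def mean_abs_error(model, x, y):
    slope, intercept = model
    return mean(abs(slope * a + intercept - b) for a, b in zip(x, y))

# Data collection: a synthetic stand-in (hypothetical data, fixed seed).
rng = random.Random(0)
temps = [rng.uniform(0, 30) for _ in range(300)]
rentals = [5 + 2.0 * t + rng.gauss(0, 4) for t in temps]

# Split into training, validation, and test parts.
train_x, val_x, test_x = temps[:200], temps[200:250], temps[250:]
train_y, val_y, test_y = rentals[:200], rentals[200:250], rentals[250:]

# Pre-processing: fit the scaler on the training data only, then reuse it.
train_x, mu, sigma = standardize(train_x)
val_x, _, _ = standardize(val_x, mu, sigma)
test_x, _, _ = standardize(test_x, mu, sigma)

# Parameter optimization: keep the penalty with the lowest validation error.
candidates = [0.0, 0.1, 1.0, 10.0, 100.0]
best_alpha = min(
    candidates,
    key=lambda a: mean_abs_error(fit_ridge_1d(train_x, train_y, a), val_x, val_y),
)

# Testing: report the error on data never used for fitting or tuning.
model = fit_ridge_1d(train_x, train_y, best_alpha)
test_mae = mean_abs_error(model, test_x, test_y)
print(best_alpha, round(test_mae, 2))
```

Each step involves a choice (which scaler, which split, which penalty grid) that changes the result, which is one reason this kind of service calls for expert knowledge.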
High level AI
Deep learning with Artificial Neural Networks (ANNs)
– Convolutional neural network (CNN)
– Autoencoder
– Recurrent neural network (RNN)
– Long Short-Term Memory (LSTM)
– Generative Adversarial Network (GAN)
• problem oriented solutions/algorithms, e.g. face recognition, text-to-speech
• solve a standardized problem
HL AIaaS:
• have a simple interface
→ easy to handle by non AI experts
Market overview of AIaaS
AIaaS providers
… and more (http://www.butleranalytics.com/20-machine-learning-service-platforms/), Apr. 2017
AI Services: Low level vs. High level

MS Azure
Low level: ML Studio
- Anomaly Detection
- Classification (binary/multi class)
- Clustering
- Statistical Functions
- Text Analytics
- Computer Vision
High level: Cognitive Services
- Vision
- Speech
- Knowledge
- Language
- Search

AWS
Low level: AWS Machine Learning
- Algorithms: Regression, classification (binary/multiclass)
High level:
- Amazon Lex: natural language understanding (NLU), automatic speech recognition (ASR)
- Amazon Rekognition: visual search and image recognition
- Amazon Polly: text-to-speech (TTS)
AWS Deep Learning AMI: custom AI models

Google Cloud
Low level: Large Scale Machine Learning Service
- Custom models from regression models to image classification based on deep learning
High level:
- Google Cloud Job Discovery
- Google Cloud Video Intelligence
- Google Cloud Vision
- Google Cloud Speech
- Google Natural Language
- Google Cloud Translation

IBM Watson
Low level: IBM Data Science Experience
- Spark ML algorithms
- RStudio
- Deep Learning libraries
High level: Watson Developer
- Conversation
- Knowledge (Discovery, NLU, Document Conversion)
- Vision
- Speech
- Language (Translator, Classifier, Retrieve & Rank)
- Empathy (Personality Insights, Tone Analyzer)
Stability and Availability of the Services
Lively growth market:
• waves of selection and consolidation across the market that yield winners
and losers
• new and improved services emerge regularly
• old ones are updated (including the API) or shut down, even within a platform
Search for the best service can be daunting
Challenging for usage in production systems
Usage Examples
Usage of High Level AI
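High-level services are normally reached through an API rather than a web demo. As a rough sketch, the snippet below only builds the JSON body of a label-detection request in the shape used by the Google Cloud Vision REST API; nothing is sent. The endpoint constant, the helper name `build_label_request`, and the placeholder image bytes are assumptions for illustration, and since such APIs change over time the provider's current documentation is the reference.

```python
import base64
import json

# Assumed endpoint of the Cloud Vision REST API; verify against current docs.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"

def build_label_request(image_bytes, max_results=5):
    """Build the JSON body for a label-detection request (nothing is sent)."""
    return {
        "requests": [
            {
                # The image travels base64-encoded inside the JSON body.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "LABEL_DETECTION", "maxResults": max_results}],
            }
        ]
    }

# Placeholder bytes stand in for a real JPEG file.
body = build_label_request(b"not-a-real-jpeg")
print(json.dumps(body, indent=2))
```

The caller needs no knowledge of the underlying model: one image goes in, a list of labels with scores comes back.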
Usage of Low Level AI
                     Casual bikers                  Registered bikers
                     Validation Err  Testing Err    Validation Err  Testing Err
Python 3 (Sklearn)   16.70 (0.42)    11.71 (0.06)   70.39 (0.25)    92.86 (0.21)
Dataiku (standard)   24.33 (0.45)    17.28 (0.01)   89.20 (0.30)    106.80 (0.18)
Dataiku (manual)     24.47 (0.45)    17.16 (0.06)   92.47 (0.27)    93.35 (0.20)
Azure ML Studio      33.62 (0.002)   23.20 (-0.58)  97.32 (0.26)    104.23 (0.15)
Bike Rental Data (UCI) – number of bike rentals in Washington D.C. for casual and registered
users; 2 years of data
Features: weather, season, holidays, weekdays, time
Testing Data: December 2012
Training Data: Rest
Metrics: median or mean absolute error, with R2 (coefficient of determination) in parentheses
Model: Ridge Regression (L2 Norm)
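The evaluation setup described above (hold out one month, report absolute errors and R2) can be sketched as follows. This is a minimal illustration in plain Python, assuming daily rows with placeholder counts rather than the real UCI file; the helper names and the toy numbers are hypothetical and do not reproduce the figures in the table.

```python
from datetime import date, timedelta
from statistics import mean, median

def split_by_month(rows, year, month):
    """Hold out one calendar month for testing; all other rows are training data."""
    test = [r for r in rows if (r["day"].year, r["day"].month) == (year, month)]
    train = [r for r in rows if (r["day"].year, r["day"].month) != (year, month)]
    return train, test

def mean_absolute_error(y_true, y_pred):
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))

def median_absolute_error(y_true, y_pred):
    return median(abs(t - p) for t, p in zip(y_true, y_pred))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Two years of daily rows with placeholder counts (hypothetical, not the UCI data).
start = date(2011, 1, 1)
rows = [{"day": start + timedelta(days=i), "rentals": 100 + i % 7} for i in range(731)]
train, test = split_by_month(rows, 2012, 12)
print(len(train), len(test))  # 700 31

# Toy predictions to show how the three metrics behave.
y_true = [10, 20, 30, 40]
y_pred = [12, 18, 33, 30]
print(mean_absolute_error(y_true, y_pred))    # 4.25
print(median_absolute_error(y_true, y_pred))  # 2.5
print(round(r2_score(y_true, y_pred), 3))     # 0.766

# A model that does worse than always predicting the mean gets a negative R2.
print(r2_score(y_true, [40, 30, 20, 10]))     # -3.0
```

A negative R2 on the test month, as in one cell of the table, therefore means the model predicted worse than a constant equal to the mean of the test targets.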
Usage of Low Level AI
                     Casual bikers                  Registered bikers
                     Validation Err  Testing Err    Validation Err  Testing Err
Python 3 (Sklearn)   37.15           23.90          128.09          136.59
Dataiku (standard)   36.53           23.33          124.00          138.80
Dataiku (manual)     36.98           22.65          129.00          137.28
Azure ML Studio      49.95           29.44          130.54          140.84
AWS                  32.58           20.93          91.86           105.21
Bike Rental Data (UCI): number of bike rentals in Washington D.C. for casual and registered
users; 2 years of data
Features: weather, season, holidays, weekdays, time
Testing Data: December 2012
Training Data: Rest
Metric: root mean square error (RMSE) – only metric offered by AWS
Model: Ridge Regression (L2 Norm)
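Because RMSE squares each error before averaging, it weights large errors more heavily than the mean absolute error does, so figures reported in the two metrics are not directly comparable. The sketch below illustrates this with two made-up error lists that share the same MAE; the numbers are hypothetical and unrelated to the table.

```python
from math import sqrt

def mae(errors):
    """Mean absolute error of a list of prediction errors."""
    return sum(abs(e) for e in errors) / len(errors)

def rmse(errors):
    """Root mean square error: squaring emphasizes the large errors."""
    return sqrt(sum(e * e for e in errors) / len(errors))

even = [4.25, 4.25, 4.25, 4.25]   # all errors the same size
skewed = [2.0, 2.0, 3.0, 10.0]    # same average size, one large miss

print(mae(even), rmse(even))                  # 4.25 4.25
print(mae(skewed), round(rmse(skewed), 3))    # 4.25 5.408
```

RMSE is never smaller than MAE on the same errors, and the gap grows with the spread of the errors.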
User Interfaces
Services as Black Box
?
What’s behind the services?
• mostly not open source
• often not even clear what algorithm is implemented, what parameters are
used
Like any other software: implementations can be buggy
Systematic tests recommended!
• different aspects of AIaaS hinder comparability of services between providers
– no versioning (except: Algorithmia)
– HL AI: specialized tasks, different output formats
– LL AI: Limitations of what is (easily) possible in the environment
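One way to act on the recommendation of systematic tests is to keep a fixed set of inputs with known expected outputs and re-run it against the service regularly, which also helps to notice unannounced changes when a service has no versioning. The following is a minimal sketch; `fake_service_v1` and `fake_service_v2` are hypothetical stand-ins for a remote call, and the tolerance is an arbitrary choice.

```python
def check_service(predict, cases, tolerance):
    """Return every case where the black-box output deviates from the expected value."""
    failures = []
    for inputs, expected in cases:
        actual = predict(inputs)
        if abs(actual - expected) > tolerance:
            failures.append((inputs, expected, actual))
    return failures

def fake_service_v1(features):
    """Hypothetical stand-in for a call to a remote regression service."""
    return 2.0 * features[0] + 1.0

def fake_service_v2(features):
    """The same service after a silent update that shifted its outputs."""
    return 2.0 * features[0] + 1.5

# Fixed reference cases: (inputs, expected output).
cases = [((0.0,), 1.0), ((1.0,), 3.0), ((10.0,), 21.0)]

print(check_service(fake_service_v1, cases, tolerance=0.01))       # []
print(len(check_service(fake_service_v2, cases, tolerance=0.01)))  # 3
```

The expected values would ideally come from an independent reference implementation or from hand-checked examples, not from the service under test.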
What’s the data behind the services? (HL AI)
Important: most algorithms are “learning-by-example”
• What is the model trained on?
– Data biases
– Corner cases
– Missing examples
• Adversarial attacks
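Since the training data behind a high-level service is usually not visible, one practical check is to evaluate the service on your own labelled examples and break the accuracy down by group. The sketch below assumes the predictions have already been collected; the group names, labels, and the helper `accuracy_by_group` are hypothetical.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Compute the share of correct predictions separately for each group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, expected, predicted in records:
        total[group] += 1
        if expected == predicted:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}

# Hypothetical evaluation records: (group, expected label, label returned by the service).
records = [
    ("group_a", "face", "face"),
    ("group_a", "face", "face"),
    ("group_a", "face", "face"),
    ("group_b", "face", "no face"),
    ("group_b", "face", "face"),
]

print(accuracy_by_group(records))  # {'group_a': 1.0, 'group_b': 0.5}
```

A large gap between groups hints at biased or missing training examples, even when the overall accuracy looks acceptable.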
The Black Box in the Black Box
General Black Box AI problem
– White Box AI: see talk of my colleague Ulf Schöneberg at Predictive Analytics
World this Tuesday 14.11.
General Data Protection Regulation (GDPR) - goes into effect on May 25, 2018
– defines and strengthens data protection for consumers and harmonizes data security
rules within the EU
– controls on data processing and consumer profiling
– right to an explanation
– prevent discriminatory effects based on racial or ethnic origin, political opinion, religion or
beliefs, trade union membership, genetic or health status or sexual orientation
Conclusions

DN 2017 | AIaaS - Artificial Intelligence off the Shelf | Jana Kludas | um*


Editor's Notes

  • #4 … get a picture!
  • #5 A lot of things that are easy for humans are difficult for machines/computers, e.g. recognizing cats in pictures, walking, grabbing things from a table, making decisions. AI comprises a wide range of algorithms and application areas, roughly everything that makes machines seem smart. In the development of the area, intelligence has been broken down into subproblems:
    - Reasoning, problem solving: deduction, inference, guessing, decision-making, also in the presence of incomplete knowledge about the world
    - Knowledge representation: ways to represent knowledge about the world (objects, properties, categories and relations between objects, situations, events, states and time, causes and effects) and how to reason logically with that knowledge
    - Planning: planning several steps ahead, i.e. setting goals and developing a strategy for achieving them, for example in navigating across country or playing chess
    - Learning: methods for generating knowledge; the study of computer algorithms that improve automatically through experience
    - Natural Language Processing: read and understand human language
    - Perception: understand the surrounding world, e.g. object recognition, speech recognition
    - Acting and manipulating: robotics, e.g. navigation, walking like a human, grabbing objects
    - Social intelligence: affective computing, i.e. recognize, interpret, process, and simulate human affects
    - General intelligence: combination of all of the above … not yet there, and not even close!
    Basic research started in the 50s. End of the 90s: early adopters such as the military, the financial industry, aviation (dynamic pricing), … Usage is spreading to more and more application areas and industry sectors due to progress on AI algorithms, mainly driven by deep learning. Now: AI hype, everyone wants to do AI and make data driven business decisions; this paves the way for AIaaS.
  • #6 IaaS: operation of a technological infrastructure
    PaaS: development and runtime environment to develop, test, run, and manage applications, supporting the whole life cycle of software; can be public, private or hybrid
    SaaS: supply and maintenance of software
  • #7 Previously, companies needed a lot of time and money to build up the technical know-how and infrastructure to develop AI applications. Particularly for data-intensive and computationally intensive methods, both common in Deep Learning, the service requester can simply use preconfigured big data infrastructure (GPUs/Hadoop clusters) and pretrained neural nets on demand. Being aware of the problems with AIaaS prevents you from making bad business decisions …
  • #8 When talking about AI, it makes sense to discern between high level and low level AI …
  • #9 Applicable to a multitude of different tasks, e.g. churn prediction or fraud detection. Before modeling, the data needs to be cleaned (garbage in, garbage out), a subset of variables needs to be selected or new ones created from the existing ones, and the data needs to be prepared according to the selected algorithm, e.g. by normalization.
  • #10 … not necessarily required to understand the underlying algorithm … when developed from scratch, takes even more resources and know-how than low level AI
  • #11 very active research area; new approaches published regularly AIaaS lively growth market with a multitude of players and newly founded start-ups; high speed development
  • #12 .. More than 20 providers!
  • #13 Overview of the services provided by the 4 big providers … jungle of AIaaS offerings. All offer, to a greater or lesser extent: access to data and computing resources, evaluation of algorithms, visualisations of input data and results, some amount of pre-processing, a UI, code integration.
    Microsoft Azure, Machine Learning and Cognitive Services: https://docs.microsoft.com/en-gb/azure/machine-learning/studio/studio-overview-diagram https://azure.microsoft.com/en-us/services/cognitive-services/
    AWS AI: https://aws.amazon.com/blogs/ai/ai-tech-talk-an-overview-of-ai-on-the-aws-platform/
    AWS ML: https://aws.amazon.com/aml/details/
    Google Cloud AI/ML API (https://cloud.google.com/products/machine-learning/?hl=en) and Analytics (https://cloud.google.com/api-analytics/)
    IBM Watson Developer: https://www.ibm.com/watson/products-services/
    IBM Data Science Experience: https://datascience.ibm.com/
    Jungle of offerings: merely getting a full overview of the market can be a daunting task.
  • #14 .. decision for or against a provider is not as easy as with SaaS, where the size of the provider correlates with the stability of the service
  • #16 Simple interface, usable by non AI experts: https://cloud.google.com/vision/?hl=en
    This is only the test interface; normal access is via command line and API.
    Grumpy cat jpeg … Make a screenshot of the results!
  • #17 Experiments not 100% comparable: different cv strategies, different solvers, different parameter tuning strategies, different evaluation metrics.
    Azure: tried to tweak datatypes manually, but results on training data were bad, with a really huge MAE.
    R2: coefficient of determination
  • #18 AWS: the most limited platform of all the tested ones! Only a single evaluation measure, only one model type, no parameter tuning. No data preprocessing possible, not even column selection. But it has data transform recipes: grouping and quantization (which actually leads to a loss of information).
    Can the results be trusted? Much better than the rest!? Highly unlikely …
    RMSE: weights larger errors higher …
  • #19 Screen shots …
  • #21 documentation gives formulas or links to a publication that the implementation is based on
  • #22 Algorithms do not "understand" data: the most frequent observations are treated as more "true" than rare ones. Examples:
    - Face recognition trained on Caucasian faces may fail to recognize faces of other ethnic groups.
    - Automatic prediction of criminal relapse: data gathered in court is biased against Black persons because they have been discriminated against, so algorithms trained on it learn to discriminate as well.
    - Microsoft's chatbot Tay started posting Nazi content because most of its input was content of this kind.
    - Data shows that more males than females are directors on corporate boards: DL will simply prefer male candidates in job applications.
    Attacks: local generalization makes algorithms vulnerable to minor changes that are invisible to the human eye …
  • #24 Wrapping up …
    AIaaS allows everyone, independent of their knowledge, to utilize Artificial Intelligence. Developers get simple APIs; users without coding skills get graphical user interfaces with detailed instructions, with which a data processing pipeline can be clicked together. The convenience, as well as the self-marketing of the service providers, suggests that everyone can easily apply AI algorithms.
    That is OK for HL AI and for standardized problems like logistic regression with 3 numerical input features. However, tasks in practice are much more complex than textbook examples … Without in-depth knowledge of how the underlying algorithm and data processing pipeline work, there is a danger of systematic errors in the results of the application and thus in the business decisions derived from them.
    If AI projects fail because of a lack of competence, confidence in AI can be permanently damaged. Another example of this is the decades-long misuse of p-values in the social sciences and humanities, as well as in medical studies, which caused a considerable loss of trust in these research areas.
    Low level AI should not be applied to non-standardized problems without in-depth knowledge of the algorithm. There is a wide variety of very good services for high level AI, such as face recognition, speech-to-text and text-to-speech, that can easily be applied by non-experts.
    Nice example of testing with AIaaS first and then developing a custom solution: https://blogs.dropbox.com/tech/2017/04/creating-a-modern-ocr-pipeline-using-computer-vision-and-deep-learning/