IBM Middle East Data Science Connect 2016 - Doha, Qatar

•

1 like•359 views

The document discusses the growth of connected devices and the Internet of Things. It then summarizes machine learning approaches for IoT data, including online learning which allows real-time model updates and historic learning which enables algorithmic improvements but requires high storage. Deep learning techniques for IoT like neural networks, convolutional neural networks, and LSTMs are also overviewed. Finally, the document presents challenges of deep learning and potential solutions like distributed hardware.

Technology

Unless stated otherwise all images are taken from wikipedia.org or openclipart.org

Why IoT (now) ?
• 15 Billion connected devices in 2015
• 40 Billion connected devices in 2020
• World population 7.4 Billion in 2016

Machine Learning on
historic data
Source: deeplearning4j.org

Online Learning
Source: deeplearning4j.org

online vs. historic
• Pros
• low storage costs
• real-time model update
• Cons
• algorithm support
• software support
• no algorithmic improvement
• compute power to be inline
with data rate
• Pros
• all algorithms
• abundance of software
• model re-scoring / re-
parameterisation (algorithmic
improvement)
• batch processing
• Cons
• high storage costs
• batch model update

DeepLearning
DeepLearning
Apache Spark
Hadoop

Learning of a function
A neural network can basically learn any
mathematical function

LSTM
“vanishing error problem” == inﬂuence of past inputs decay
quickly over time

http://karpathy.github.io/2015/05/21/rnn-effectiveness/

• Outperformed traditional methods, such as
• cumulative sum (CUSUM)
• exponentially weighted moving average (EWMA)
• Hidden Markov Models (HMM)
• Learned what “Normal” is
• Raised error if time series pattern haven't been seen
before

Learning of an algorithm
A LSTM network is touring complete

Problems
• Neural Networks are computationally very complex
•especially during training
•but also during scoring
CPU (2009) GPU (2016) IBM TrueNorth (2017)

IBM TrueNorth
• Scalable
• Parallel
• Distributed
• Fault Tolerant
• No Clock ! :)
• IBM Cluster
• 4.096 chips
• 4 billion neurons
• 1 trillion synapses
• Human Brain
• 100 billion neurons
• 100 trillion synapses
• 1.000.000 neurons
• 250.000.000 synapses

DeepLearning
the future in cloud based analytics
Storage Layer (OpenStack SWIFT / Hadoop HDFS / IBM GPFS)
Execution Layer (Spark Executor, YARN, Platform Symphony)
Hardware Layer (Bare Metal High Performance Cluster)
GraphXStreaming SQL MLLib BlinkDB
DeepLearning4J 
 
ND4J
R MLBase H2O
Y
O
U
GPUAVX
Intel Xeon E7-4850 v2 48 core, 3 TB RAM, 72 GB HDD, 10Gbps, NVIDIA TESLA M60 GPU
(cu)BLAS
jcuBLAS
S
T
R
E
A
M
S

data
https://github.com/romeokienzler/pmqsimulator
https://ibm.biz/joinIBMCloud

IBM Middle East Data Science Connect 2016 - Doha, Qatar

At the recent Spark & Machine Learning Meetup in Brussels, François Garillot of Skymind delivered this lightning talk to a sold-out crowd. Specifically, François offered a tour of the DeepLearning4J architecture intermingled with applications. He went over the main blocks of this deep learning solution for the JVM that includes GPU acceleration, a custom n-dimensional array library, a parallelized data-loading swiss army tool, deep learning and reinforcement learning libraries — all with an easy-access interface. Along the way, he pointed out the strategic points of parallelization of computation across machines and gave insight on where Spark helps — and where it doesn't.

Metaflow was started at Netflix to answer a pressing business need: How to enable an organization of data scientists, who are not software engineers by training, build and deploy end-to-end machine learning workflows and applications independently. We wanted to provide the best possible user experience for data scientists, allowing them to focus on parts they like (modeling using their favorite off-the-shelf libraries) while providing robust built-in solutions for the foundational infrastructure: data, compute, orchestration, and versioning. Today, the open-source Metaflow powers hundreds of business-critical ML projects at Netflix and other companies from bioinformatics to real estate. In this talk, you will learn about: - What to expect from a modern ML infrastructure stack. - Using Metaflow to boost the productivity of your data science organization, based on lessons learned from Netflix. - Deployment strategies for a full stack of ML infrastructure that plays nicely with your existing systems and policies. https://www.aicamp.ai/event/eventdetails/W2021080510

Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018

Sri Ambati

This talk was recorded in London on Oct 30, 2018 and can be viewed here: https://youtu.be/p4iAnxwC_Eg The good news is building fair, accountable, and transparent machine learning systems is possible. The bad news is it’s harder than many blogs and software package docs would have you believe. The truth is nearly all interpretable machine learning techniques generate approximate explanations, that the fields of eXplainable AI (XAI) and Fairness, Accountability, and Transparency in Machine Learning (FAT/ML) are very new, and that few best practices have been widely agreed upon. This combination can lead to some ugly outcomes! This talk aims to make your interpretable machine learning project a success by describing fundamental technical challenges you will face in building an interpretable machine learning system, defining the real-world value proposition of approximate explanations for exact models, and then outlining the following viable techniques for debugging, explaining, and testing machine learning models Mateusz is a software developer who loves all things distributed, machine learning and hates buzzwords. His favourite hobby data juggling. He obtained his M.Sc. in Computer Science from AGH UST in Krakow, Poland, during which he did an exchange at L’ECE Paris in France and worked on distributed flight booking systems. After graduation he move to Tokyo to work as a researcher at Fujitsu Laboratories on machine learning and NLP projects, where he is still currently based.

Automated Hyperparameter Tuning, Scaling and Tracking

Databricks

Automated Machine Learning (AutoML) has received significant interest recently. We believe that the right automation would bring significant value and dramatically shorten time-to-value for data science teams. Databricks is automating the Data Science and Machine Learning process through a combination of product offerings, partnerships, and custom solutions. This talk will focus on how Databricks can help automate hyperparameter tuning. For both traditional Machine Learning and modern Deep Learning, tuning hyperparameters can dramatically increase model performance and improve training times. However, tuning can be a complex and expensive process. In this talk, we'll start with a brief survey of the most popular techniques for hyperparameter tuning (e.g., grid search, random search, and Bayesian optimization). We will then discuss open source tools that implement each of these techniques, helping to automate the search over hyperparameters. Finally, we will discuss and demo improvements we built for these tools in Databricks, including integration with MLflow: Apache PySpark MLlib integration with MLflow for automatically tracking tuning Hyperopt integration with Apache Spark to distribute tuning and with MLflow for automatic tracking Recording and notebooks will be provided after the webinar so that you can practice at your own pace. Presenters Joseph Bradley, Software Engineer, Databricks Joseph Bradley is a Software Engineer and Apache Spark PMC member working on Machine Learning at Databricks. Previously, he was a postdoc at UC Berkeley after receiving his Ph.D. in Machine Learning from Carnegie Mellon in 2013. Yifan Cao, Senior Product Manager, Databricks Yifan Cao is a Senior Product Manager at Databricks. His product area spans ML/DL algorithms and Databricks Runtime for Machine Learning. Prior to Databricks, Yifan worked on two Machine Learning products, applying NLP to find metadata and applying machine learning to predict equipment failures. He helped build the products from ground up to multi-million dollars in ARR. Yifan started his career as a researcher in quantum computing. Yifan received his B.S in UC Berkeley and Master from MIT.

Kubeflow and Data Science in Kubernetes

John Liu

Deploying machine learning pipelines robustly at scale is one of the biggest challenges within an organization. Kubeflow is an open-source platform for distributed training, tuning, and serving models on Kubenetes. As a comprehensive solution for deploying and managing end-to-end data science and machine learning pipelines, Kubeflow is rapidly accelerating analytics innovation and adoption. John provides an overview of Kubeflow and how he has been using it in the wild.

Adopting software design practices for better machine learning

MLconf

Hopsworks - The Platform for Data-Intensive AI

QAware GmbH

Cloud Native Night July 2019, Munich: Talk by Steffen Srohsschmiedt (@grohsschmiedt, Head of Cloud at LogicalClocks) === Please download slides if blurred! === Abstract: Machine Learning (ML) pipelines are the fundamental building block for productionizing ML code. Building such pipelines with Big Data is a complex process. The different stages in ML pipelines also need to be orchestrated, from data ingestion and data transformation, to feature engineering, to model training, serving and monitoring. Hopsworks is an open-source data platform that can be used to both develop and operate horizontally scalable machine learning (ML) pipelines. A key part of our pipelines is the world's first open-source Feature Store, that acts as a data warehouse for features, providing a natural API between data engineers - who write feature engineering code - and Data Scientists, who select features from the feature store to generate training/test data for models. Join us next time: https://www.meetup.com/Cloud-Native-muc/events

Best Practices for Engineering Production-Ready Software with Apache Spark

Databricks

SparkML: Easy ML Productization for Real-Time Bidding

Databricks

dataxu bids on ads in real-time on behalf of its customers at the rate of 3 million requests a second and trains on past bids to optimize for future bids. Our system trains thousands of advertiser-specific models and runs multi-terabyte datasets. In this presentation we will share the lessons learned from our transition towards a fully automated Spark-based machine learning system and how this has drastically reduced the time to get a research idea into production. We'll also share how we: - continually ship models to production - train models in an unattended fashion with auto-tuning capabilities - tune and overbooked cluster resources for maximum performance - ported our previous ML solution into Spark - evaluate the performance of high-rate bidding models Speakers: Maximo Gurmendez, Javier Buquet

No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...

Spark Summit

Building accurate machine learning models has been an art of data scientists, i.e., algorithm selection, hyper parameter tuning, feature selection and so on. Recently, challenges to breakthrough this “black-arts” have got started. In cooperation with our partner, NEC Laboratories America, we have developed a Spark-based automatic predictive modeling system. The system automatically searches the best algorithm, parameters and features without any manual work. In this talk, we will share how the automation system is designed to exploit attractive advantages of Spark. The evaluation with real open data demonstrates that our system can explore hundreds of predictive models and discovers the most accurate ones in minutes on a Ultra High Density Server, which employs 272 CPU cores, 2TB memory and 17TB SSD in 3U chassis. We will also share open challenges to learn such a massive amount of models on Spark, particularly from reliability and stability standpoints. This talk will cover the presentation already shown on Spark Summit SF’17 (#SFds5) but from more technical perspective.

Machine Learning for the Sensored IoT

Hank Roark

Production ready big ml workflows from zero to hero daniel marcous @ waze

Ido Shilon

More Data Science with Less Engineering: Machine Learning Infrastructure at N...

Ville Tuulos

Data Intensive Applications with Apache Flink

Simone Robutti

Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...

Databricks

This talk describes migrating a large random forest classifier from scikit-learn to Spark's MLlib. We cut training time from 2 days to 2 hours, reduced failed runs, and track experiments better with MLflow. Kount provides certainty in digital interactions like online credit card transactions. One of our scores uses a random forest classifier with 250 trees and 100,000 nodes per tree. We used scikit-learn to train using 60 million samples that each contained over 150 features. The in-memory requirements exceeded 750 GB, took 2 days, and were not robust to disruption in our database or training execution. To migrate workflow to Spark, we built a 6-node cluster with HDFS. This provides 1.35 TB of RAM and 484 cores. Using MLlib and parallelization, the training time for our random forests are now less than 2 hours. Training data stays in our production environment, which used to require a deploy cycle to move locally-developed code onto our training server. The new implementation uses Jupyter notebooks for remote development with server-side execution. MLflow tracks all input parameters, code, and git revision number, while the performance and model itself are retained as experiment artifacts. The new workflow is robust to service disruption. Our training pipeline begins by pulling from a Vertica database. Originally, this single connection took over 8 hours to complete with any problem causing a restart. Using sqoop and multiple connections, we pull the data in 45 minutes. The old technique used volatile storage and required the data for each experiment. Now, we pull the data from Vertica one time and then reload much faster from HDFS. While a significant undertaking, moving to the Spark ecosystem converted an ad hoc and hands-on training process into a fully repeatable pipeline that meets regulatory and business goals for traceability and speed. Speaker: Josh Johnston

Stacked Ensembles in H2O

Sri Ambati

Ensemble machine learning methods are often used when the true prediction function is not easily approximated by a single algorithm. Practitioners may prefer ensemble algorithms when model performance is valued above other factors such as model complexity and training time. The Super Learner algorithm, also called "stacking", learns the optimal combination of the base learner fits. The latest version of H2O now contains a "Stacked Ensemble" method, which allows the user to stack H2O models into a Super Learner. The Stacked Ensemble method is the the native H2O version of stacking, previously only available in the h2oEnsemble R package, and now enables stacking from all the H2O APIs: Python, R, Scala, etc. Erin is a Statistician and Machine Learning Scientist at H2O.ai. Before joining H2O, she was the Principal Data Scientist at Wise.io (acquired by GE Digital) and Marvin Mobile Security (acquired by Veracode) and the founder of DataScientific, Inc. Erin received her Ph.D. from University of California, Berkeley. Her research focuses on ensemble machine learning, learning from imbalanced binary-outcome data, influence curve based variance estimation and statistical computing.

Machine Learning Infrastructure

SigOpt

As data science workloads grow, so does their need for infrastructure. But, is it fair to ask data scientists to also become infrastructure experts? If not the data scientists, then, who is responsible for spinning up and managing data science infrastructure? This talk will address the context in which ML infrastructure is emerging, walk through two examples of ML infrastructure tools for launching hyperparameter optimization jobs, and end with some thoughts for building better tools in the future. Originally given as a talk at the PyData Ann Arbor meetup (https://www.meetup.com/PyData-Ann-Arbor/events/260380989/)

Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...

DataWorks Summit/Hadoop Summit

DeepLearning is not just a hype - it outperforms state-of-the-art ML algorithms. One by one. In this talk we will show how DeepLearning can be used for detecting anomalies on IoT sensor data streams at high speed using DeepLearning4J on top of different BigData engines like ApacheSpark and ApacheFlink. Key in this talk is the absence of any large training corpus since we are using unsupervised machine learning - a domain current DL research threats step-motherly. As we can see in this demo LSTM networks can learn very complex system behavior - in this case data coming from a physical model simulating bearing vibration data. Once draw back of DeepLearning is that normally a very large labaled training data set is required. This is particularly interesting since we can show how unsupervised machine learning can be used in conjunction with DeepLearning - no labeled data set is necessary. We are able to detect anomalies and predict braking bearings with 10 fold confidence. All examples and all code will be made publicly available and open sources. Only open source components are used.

"Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr...

Edge AI and Vision Alliance

For the full video of this presentation, please visit: http://www.embedded-vision.com/platinum-members/embedded-vision-alliance/embedded-vision-training/videos/pages/may-2015-embedded-vision-summit-baidu For more information about embedded vision, please visit: http://www.embedded-vision.com Dr. Ren Wu, former distinguished scientist at Baidu's Institute of Deep Learning (IDL), presents the keynote talk, "Enabling Ubiquitous Visual Intelligence Through Deep Learning," at the May 2015 Embedded Vision Summit. Deep learning techniques have been making headlines lately in computer vision research. Using techniques inspired by the human brain, deep learning employs massive replication of simple algorithms which learn to distinguish objects through training on vast numbers of examples. Neural networks trained in this way are gaining the ability to recognize objects as accurately as humans. Some experts believe that deep learning will transform the field of vision, enabling the widespread deployment of visual intelligence in many types of systems and applications. But there are many practical problems to be solved before this goal can be reached. For example, how can we create the massive sets of real-world images required to train neural networks? And given their massive computational requirements, how can we deploy neural networks into applications like mobile and wearable devices with tight cost and power consumption constraints? In this talk, Ren shares an insider’s perspective on these and other critical questions related to the practical use of neural networks for vision, based on the pioneering work being conducted by his former team at Baidu. Note 1: Regarding the ImageNet results included in this presentation, the organizers of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) have said: “Because of the violation of the regulations of the test server, these results may not be directly comparable to results obtained and reported by other teams.” (http://www.image-net.org/challenges/LSVRC/announcement-June-2-2015) Note 2: The presenter, Ren Wu, has told the Embedded Vision Alliance that “There was some ambiguity with the rules. According to the ‘official’ interpretation of the rules, there should be no more than 52 submissions within a half year. For us, we achieved the reported results after 200 tests total within a half year. We believe there is no way to obtain any measurable gains, nor did we try to obtain any gains, from an 'extra' hundred tests as our networks have billions of parameters and are trained by tens of billions of training samples.”

What's hot

Deep learning on a mixed cluster with deeplearning4j and spark

François Garillot

Future of ai on the jvm

Adam Gibson

Machine learning model to production

Georg Heiler

Bringing Deep Learning into production

Paolo Platter

Metaflow: The ML Infrastructure at Netflix

Bill Liu

Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018

Sri Ambati

Automated Hyperparameter Tuning, Scaling and Tracking

Databricks

Kubeflow and Data Science in Kubernetes

John Liu

Adopting software design practices for better machine learning

MLconf

Hopsworks - The Platform for Data-Intensive AI

QAware GmbH

Best Practices for Engineering Production-Ready Software with Apache Spark

Databricks

SparkML: Easy ML Productization for Real-Time Bidding

Databricks

No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...

Spark Summit

Machine Learning for the Sensored IoT

Hank Roark

Production ready big ml workflows from zero to hero daniel marcous @ waze

Ido Shilon

More Data Science with Less Engineering: Machine Learning Infrastructure at N...

Ville Tuulos

Data Intensive Applications with Apache Flink

Simone Robutti

Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...

Databricks

Stacked Ensembles in H2O

Sri Ambati

Machine Learning Infrastructure

SigOpt

What's hot (20)

Deep learning on a mixed cluster with deeplearning4j and spark

Future of ai on the jvm

Machine learning model to production

Bringing Deep Learning into production

Metaflow: The ML Infrastructure at Netflix

Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018

Automated Hyperparameter Tuning, Scaling and Tracking

Kubeflow and Data Science in Kubernetes

Adopting software design practices for better machine learning

Hopsworks - The Platform for Data-Intensive AI

Best Practices for Engineering Production-Ready Software with Apache Spark

SparkML: Easy ML Productization for Real-Time Bidding

No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...

Machine Learning for the Sensored IoT

Production ready big ml workflows from zero to hero daniel marcous @ waze

More Data Science with Less Engineering: Machine Learning Infrastructure at N...

Data Intensive Applications with Apache Flink

Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with MLlib, ...

Stacked Ensembles in H2O

Machine Learning Infrastructure

Similar to IBM Middle East Data Science Connect 2016 - Doha, Qatar

Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...

DataWorks Summit/Hadoop Summit

"Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr...

Edge AI and Vision Alliance

Deep Learning for Robotics

Intel Nervana

Yinyin Liu presents at SD Robotics Meetup on November 8th, 2016. Deep learning has made great success in image understanding, speech, text recognition and natural language processing. Deep Learning also has tremendous potential to tackle the challenges in robotic vision, and sensorimotor learning in a robotic learning environment. In this talk, we will talk about how current and future deep learning technologies can be applied for robotic applications.

OpenVINO introduction

Yury Gorbachev

Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...

Greg Makowski

Teaching Machine Learning with Physical Computing - July 2023

Hal Speed

Deep Learning with CNTK

Ashish Jaiman

Demystifying Machine Learning and Artificial Intelligence

EPCC, University of Edinburgh

AI on Greenplum Using  Apache MADlib and MADlib Flow - Greenplum Summit 2019

VMware Tanzu

2_Image Classification.pdf

FEG

Building High Available and Scalable Machine Learning Applications

Yalçın Yenigün

Ncku csie talk about Spark

Giivee The

Budapest Big Data Meetup Real-time stream processing

Gabor Boros

ICCV 2019 - A view

LiberiFatali

Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016

MLconf

DL4J and DataVec for Enterprise Deep Learning Workflows: Applications in NLP, sensor processing (IoT), image processing, and audio processing have all emerged as prime deep learning applications. In this session we will take a look at a practical review of building practical and secure Deep Learning workflows in the enterprise. We’ll see how DL4J’s DataVec tool enables scalable ETL and vectorization pipelines to be created for a single machine or scale out to Spark on Hadoop. We’ll also see how Deep Networks such as Recurrent Neural Networks are able to leverage DataVec to more quickly process data for modeling.

Deep Learning: DL4J and DataVec

Josh Patterson

Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...

Maurice Nsabimana

Volunteers around the world increasingly act as human sensors to collect millions of data points. A team from the World Bank trained deep learning models, using Apache Spark and BigDL, to confirm that photos gathered through a crowdsourced data collection pilot matched the goods for which observations were submitted. In this talk, Maurice Nsabimana, a statistician at the World Bank, and Jiao Wang, a software engineer on the Big Data Technology team at Intel, demonstrate a collaborative project to design and train large-scale deep learning models using crowdsourced images from around the world. BigDL is a distributed deep learning library designed from the ground up to run natively on Apache Spark. It enables data engineers and scientists to write deep learning applications in Scala or Python as standard Spark programs-without having to explicitly manage distributed computations. Attendees of this session will learn how to get started with BigDL, which runs in any Apache Spark environment, whether on-premise or in the Cloud.

ITCamp 2011 - Cristian Lefter - SQL Server code-name DenaliITCamp

Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...

confluent

Apache Kafka is now nearly ubiquitous in modern data pipelines and use cases. While the Kafka development model is elegantly simple, operating Kafka clusters in production environments is a challenge. It’s hard to troubleshoot misbehaving Kafka clusters, especially when there are potentially hundreds or thousands of topics, producers and consumers and billions of messages. The root cause of why real-time applications is lag may be due to an application problem – like poor data partitioning or load imbalance – or due to a Kafka problem – like resource exhaustion or suboptimal configuration. Therefore getting the best performance, predictability, and reliability for Kafka-based applications can be difficult. In the end, the operation of your Kafka powered analytics pipelines could themselves benefit from machine learning (ML).

Cloud computing and Hadoop introduction

christian.perez

Similar to IBM Middle East Data Science Connect 2016 - Doha, Qatar (20)

Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...

"Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr...

Deep Learning for Robotics

OpenVINO introduction

Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-...

Teaching Machine Learning with Physical Computing - July 2023

Deep Learning with CNTK

Demystifying Machine Learning and Artificial Intelligence

AI on Greenplum Using  Apache MADlib and MADlib Flow - Greenplum Summit 2019

2_Image Classification.pdf

Building High Available and Scalable Machine Learning Applications

Ncku csie talk about Spark

Budapest Big Data Meetup Real-time stream processing

ICCV 2019 - A view

Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016

Deep Learning: DL4J and DataVec

Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...

ITCamp 2011 - Cristian Lefter - SQL Server code-name Denali

Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...

Cloud computing and Hadoop introduction

More from Romeo Kienzler

Parallelization Stategies of DeepLearning Neural Network Training

Romeo Kienzler

Blockchain Technology Book Vernisage

Romeo Kienzler

Architecture of the Hyperledger Blockchain Fabric - Christian Cachin - IBM Re...

Romeo Kienzler

Real-time DeepLearning on IoT Sensor Data

Romeo Kienzler

Cloud scale predictive DevOps automation using Apache Spark: Velocity in Amst...

Romeo Kienzler

Scala, Apache Spark, The PlayFramework and Docker in IBM Platform As A Service

Romeo Kienzler

IBM Watson Technical Deep Dive Swiss Group for Artificial Intelligence and Co...

Romeo Kienzler

We are transitioning from the programmatic to the cognitive computing era.IBM Deep Blue won against the world champion in Chess 1996. IBM Watson won against the two world champions in the famous US quiz show "Jeopardy" 2011. Since then, the press heavily established the term "Cognitive Computing" to the public. I will explain how IBM Watson works internally and start with Algebraic Text Extraction. DeepQA is the heart of IBM Watson and I will explain each component of this pipeline, the linguistic preprocessor, hypothesis generation, hypothesis and evidence scoring, final matching based on supervised learning and confidence estimation. Finally, I conclude with an overview of actual use cases and outline the roadmap of future work.

TDWI_DW2014_SQLNoSQL_DBAASRomeo Kienzler

Cloudant Overview Bluemix Meetup from Lisa Neddam

Romeo Kienzler

The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...Romeo Kienzler

DBaaS Bluemix Meetup DACH 26.8.14Romeo Kienzler

Data Science Connect, July 22nd 2014 @IBM Innovation Center Zurich

Romeo Kienzler

Cloud Databases, Developer Week Nuernberg 2014

Romeo Kienzler

Cloudfoundry / Bluemix tutorials, compressed in 4 Hours

Romeo Kienzler

Cloudfoundry / Bluemix tutorials, compressed in 4 HoursRomeo Kienzler

SQL on Hadoop - 12th Swiss Big Data User Group Meeting, 3rd of July, 2014, ET...

Romeo Kienzler

The datascientists workplace of the future, IBM developerDays 2014, Vienna by...Romeo Kienzler

BlueMix – IBM CIO Leadership Exchange Europe 27/28.5.14 - Berlin - Romeo Kien...Romeo Kienzler

IBM Codename: Bluemix - Cloudfoundry, PaaS development and deployment trainin...Romeo Kienzler

Development in the cloud for the cloud – Guest Lecture - University of Applie...Romeo Kienzler

More from Romeo Kienzler (20)

Parallelization Stategies of DeepLearning Neural Network Training

Blockchain Technology Book Vernisage

Architecture of the Hyperledger Blockchain Fabric - Christian Cachin - IBM Re...

Real-time DeepLearning on IoT Sensor Data

Cloud scale predictive DevOps automation using Apache Spark: Velocity in Amst...

Scala, Apache Spark, The PlayFramework and Docker in IBM Platform As A Service

IBM Watson Technical Deep Dive Swiss Group for Artificial Intelligence and Co...

TDWI_DW2014_SQLNoSQL_DBAAS

Cloudant Overview Bluemix Meetup from Lisa Neddam

The European Conference on Software Architecture (ECSA) 14 - IBM BigData Refe...

DBaaS Bluemix Meetup DACH 26.8.14

Data Science Connect, July 22nd 2014 @IBM Innovation Center Zurich

Cloud Databases, Developer Week Nuernberg 2014

Cloudfoundry / Bluemix tutorials, compressed in 4 Hours

SQL on Hadoop - 12th Swiss Big Data User Group Meeting, 3rd of July, 2014, ET...

The datascientists workplace of the future, IBM developerDays 2014, Vienna by...

BlueMix – IBM CIO Leadership Exchange Europe 27/28.5.14 - Berlin - Romeo Kien...

IBM Codename: Bluemix - Cloudfoundry, PaaS development and deployment trainin...

Development in the cloud for the cloud – Guest Lecture - University of Applie...

Recently uploaded

From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...

Product School

Epistemic Interaction - tuning interfaces to provide information for AI support

Alan Dix

Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024 https://alandix.com/academic/papers/synergy2024-epistemic/ As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.

Designing Great Products: The Power of Design and Leadership by Chief Designe...

Product School

To Graph or Not to Graph Knowledge Graph Architectures and LLMs

Paul Groth

JMeter webinar - integration with InfluxDB and Grafana

RTTS

Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application. In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics. Length: 30 minutes Session Overview ------------------------------------------- During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana: - What out-of-the-box solutions are available for real-time monitoring JMeter tests? - What are the benefits of integrating InfluxDB and Grafana into the load testing stack? - Which features are provided by Grafana? - Demonstration of InfluxDB and Grafana using a practice web application To view the webinar recording, go to: https://www.rttsweb.com/jmeter-integration-webinar

Knowledge engineering: from people to machines and back

Elena Simperl

Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...

Product School

Generating a custom Ruby SDK for your web service or Rails API using Smithy

g2nightmarescribd

Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.

From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...

Product School

Encryption in Microsoft 365 - ExpertsLive Netherlands 2024

Albert Hoitingh

Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024

Tobias Schneck

As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other? Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.

FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf

FIDO Alliance

Neuro-symbolic is not enough, we need neuro-*semantic*

Frank van Harmelen

Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”. All of this illustrated with link prediction over knowledge graphs, but the argument is general.

Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...

Jeffrey Haguewood

Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows. We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases. This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams. Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.

Elevating Tactical DDD Patterns Through Object Calisthenics

Dorra BARTAGUIZ

After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!

FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf

FIDO Alliance

GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...

Sri Ambati

Securing your Kubernetes cluster_ a step-by-step guide to success !

KatiaHIMEUR1

Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster. However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks. In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.

Bits & Pixels using AI for Good.........

Alison B. Lowndes

GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...

James Anderson

Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management. The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM). Speakers: Bob Boule Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle. Gopinath Rebala Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.

Recently uploaded (20)

From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...

Epistemic Interaction - tuning interfaces to provide information for AI support

Designing Great Products: The Power of Design and Leadership by Chief Designe...

To Graph or Not to Graph Knowledge Graph Architectures and LLMs

JMeter webinar - integration with InfluxDB and Grafana

Knowledge engineering: from people to machines and back

Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...

Generating a custom Ruby SDK for your web service or Rails API using Smithy

From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...

Encryption in Microsoft 365 - ExpertsLive Netherlands 2024

Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024

FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf

Neuro-symbolic is not enough, we need neuro-*semantic*

Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...

Elevating Tactical DDD Patterns Through Object Calisthenics

FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf

GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...

Securing your Kubernetes cluster_ a step-by-step guide to success !

Bits & Pixels using AI for Good.........

GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...

IBM Middle East Data Science Connect 2016 - Doha, Qatar

1. Unless stated otherwise all images are taken from wikipedia.org or openclipart.org

3. Why IoT (now) ? • 15 Billion connected devices in 2015 • 40 Billion connected devices in 2020 • World population 7.4 Billion in 2016

10.

11.

12.

13.

14.

15. Machine Learning on historic data Source: deeplearning4j.org

16. Online Learning Source: deeplearning4j.org

17. online vs. historic • Pros • low storage costs • real-time model update • Cons • algorithm support • software support • no algorithmic improvement • compute power to be inline with data rate • Pros • all algorithms • abundance of software • model re-scoring / re- parameterisation (algorithmic improvement) • batch processing • Cons • high storage costs • batch model update

18.

19. DeepLearning DeepLearning Apache Spark Hadoop

20. Neural Networks

21. Neural Networks

22. Deeper (more) Layers

23. Convolutional

24. Convolutional + =

25. Convolutional

26.

27. Learning of a function A neural network can basically learn any mathematical function

28. Recurrent

29. LSTM “vanishing error problem” == inﬂuence of past inputs decay quickly over time

30. LSTM

31. http://karpathy.github.io/2015/05/21/rnn-effectiveness/

32.

33.

34.

35. • Outperformed traditional methods, such as • cumulative sum (CUSUM) • exponentially weighted moving average (EWMA) • Hidden Markov Models (HMM) • Learned what “Normal” is • Raised error if time series pattern haven't been seen before

36.

37.

38. Learning of an algorithm A LSTM network is touring complete

39. Problems • Neural Networks are computationally very complex •especially during training •but also during scoring CPU (2009) GPU (2016) IBM TrueNorth (2017)

40. IBM TrueNorth • Scalable • Parallel • Distributed • Fault Tolerant • No Clock ! :) • IBM Cluster • 4.096 chips • 4 billion neurons • 1 trillion synapses • Human Brain • 100 billion neurons • 100 trillion synapses • 1.000.000 neurons • 250.000.000 synapses

41. DeepLearning the future in cloud based analytics Storage Layer (OpenStack SWIFT / Hadoop HDFS / IBM GPFS) Execution Layer (Spark Executor, YARN, Platform Symphony) Hardware Layer (Bare Metal High Performance Cluster) GraphXStreaming SQL MLLib BlinkDB DeepLearning4J    ND4J R MLBase H2O Y O U GPUAVX Intel Xeon E7-4850 v2 48 core, 3 TB RAM, 72 GB HDD, 10Gbps, NVIDIA TESLA M60 GPU (cu)BLAS jcuBLAS S T R E A M S

42.

43.

44.

45.

46. data https://github.com/romeokienzler/pmqsimulator https://ibm.biz/joinIBMCloud

IBM Middle East Data Science Connect 2016 - Doha, Qatar

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to IBM Middle East Data Science Connect 2016 - Doha, Qatar

Similar to IBM Middle East Data Science Connect 2016 - Doha, Qatar (20)

More from Romeo Kienzler

More from Romeo Kienzler (20)

Recently uploaded

Recently uploaded (20)

IBM Middle East Data Science Connect 2016 - Doha, Qatar