Continuous delivery and machine learning

Find out about OVH’s approach to continuous delivery of machine learning models. Automated machine learning enables a company to efficiently build models and keep them up-to-date, using data refreshed by the “Data Collector” tool.

CONTINUOUS DELIVERY AND MACHINE LEARNING
PRESENTED BY GUILLAUME SALOU
MACHINE LEARNING SERVICES TEAM LEADER
@guillaume_salou
ROOM 3
3.30 PM

OUR AGENDA
• Machine Learning
• Disk Failure Detection: Problem
• Disk Failure Detection: Solution
• Scale-up: Need
• Scale-up: Opportunities
• Scale-up: Automated Machine Learning
• Prescience: Machine Learning Platform
• Prescience: The Future

MACHINE LEARNING
Machine learning is a subset of artificial intelligence that uses
statistical techniques to give computers the ability to learn (i.e.
progressively improve their performance of a specific task) from
data, without being explicitly programmed to do so.

DISK FAILURE DETECTION
• Detects if a disk is broken
• Works ‘on-the-fly’
• RTM: Real-time Monitoring (agent deployed on
servers)
• Supports bare metal
• Learns from bench and reinstall processes

DISK FAILURE DETECTION
[Diagram. Labels: Bench/reinstall, Train, Server, RTM, Data, Deploy, RTM, Query]

DISK FAILURE DETECTION: PROS
• Useful
• Already deployed on Public Cloud, IPLB, etc.
• KPI-driven
• Self-learning
• Detect → Predict

DISK FAILURE DETECTION: CONS
• Takes time
• Retraining is needed
• Upgrades are complicated
• Knowledge is in silos
• Specific to different disks

DISK FAILURE DETECTION: CDS
[Diagram. Cycle labels: Build/Evaluate, Deploy, Score, Feedback, Train]
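
The cycle named on this slide (train, build/evaluate, deploy, score, feedback) can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not OVH's pipeline: the one-feature threshold "model", the synthetic readings, and every function name (`train`, `evaluate`, `score`, `delivery_cycle`) are hypothetical. The point it shows is the gate: a retrained candidate replaces the deployed model only if it scores better on held-out data.

```python
import random
from statistics import mean


def train(samples):
    """TRAIN: fit a threshold halfway between the healthy and failed class means."""
    healthy = [x for x, failed in samples if not failed]
    broken = [x for x, failed in samples if failed]
    return (mean(healthy) + mean(broken)) / 2.0


def score(threshold, reading):
    """SCORE: flag a disk as failed when its reading exceeds the threshold."""
    return reading > threshold


def evaluate(threshold, samples):
    """BUILD/EVALUATE: accuracy of a threshold on labelled samples."""
    hits = sum(score(threshold, x) == failed for x, failed in samples)
    return hits / len(samples)


def make_samples(rng, n, broken_mean):
    """Synthetic stand-in for labelled data (hypothetical values, alternating labels)."""
    samples = []
    for i in range(n):
        failed = i % 2 == 0
        centre = broken_mean if failed else 10.0
        samples.append((rng.gauss(centre, 2.0), failed))
    return samples


def delivery_cycle(deployed, training_set, holdout):
    """One pass of the loop: retrain, evaluate, and DEPLOY only on improvement."""
    candidate = train(training_set)
    candidate_acc = evaluate(candidate, holdout)
    deployed_acc = evaluate(deployed, holdout) if deployed is not None else -1.0
    if candidate_acc > deployed_acc:
        return candidate, candidate_acc
    return deployed, deployed_acc


rng = random.Random(0)

# First delivery: nothing is deployed yet, so the candidate always ships.
history = make_samples(rng, 200, broken_mean=30.0)
model, _ = delivery_cycle(None, history, make_samples(rng, 100, broken_mean=30.0))

# FEEDBACK: newly labelled data arrives and the failure pattern has drifted.
feedback = make_samples(rng, 200, broken_mean=20.0)
holdout = make_samples(rng, 100, broken_mean=20.0)
old_acc = evaluate(model, holdout)
model, new_acc = delivery_cycle(model, history + feedback, holdout)
print(f"accuracy before retraining: {old_acc:.2f}, after: {new_acc:.2f}")
```

Because the deployed model is only replaced when the candidate does strictly better on the same hold-out set, accuracy after a cycle can never be lower than before it, which is one simple way to make automated redelivery safe.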

DISK FAILURE DETECTION
[Diagram. Labels: Machine Learning Platform, Bench/reinstall, Server, RTM, Data, Deploy, RTM, Query, Train]

SCALE-UP
• Automate continuous delivery
• Simplify the sharing of knowledge
• Use the power of the cloud: optimising/preprocessing/explanation
• Only one automated process must be updated and monitored
• Automate model rebuild processes
• Enable quick wins and fast fails
• Distribute processing: Sklearn
• Backend agnostic (Sklearn, Spark, etc.)
• Feature extraction can be done by data or business analysts
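
One way to read "backend agnostic" is that the pipeline talks to a small interface and each engine is plugged in behind it. The sketch below illustrates that idea only; the `Backend` protocol, both backend classes, and `run_pipeline` are hypothetical names, and the two backends are pure-Python stand-ins rather than wrappers around scikit-learn or Spark.

```python
from statistics import mean
from typing import Protocol, Sequence


class Backend(Protocol):
    """Hypothetical minimal contract every backend has to satisfy."""

    def fit(self, xs: Sequence[float], ys: Sequence[float]) -> None: ...

    def predict(self, x: float) -> float: ...


class MeanBackend:
    """Stand-in backend: always predicts the mean of the training targets."""

    def fit(self, xs, ys):
        self.value = mean(ys)

    def predict(self, x):
        return self.value


class LineBackend:
    """Stand-in backend: ordinary least-squares line through the data."""

    def fit(self, xs, ys):
        mx, my = mean(xs), mean(ys)
        sxx = sum((x - mx) ** 2 for x in xs)
        self.slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
        self.intercept = my - self.slope * mx

    def predict(self, x):
        return self.intercept + self.slope * x


def run_pipeline(backend: Backend, xs, ys, query):
    """The pipeline only uses the Backend contract, never a concrete engine."""
    backend.fit(xs, ys)
    return backend.predict(query)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
for backend in (MeanBackend(), LineBackend()):
    print(type(backend).__name__, run_pipeline(backend, xs, ys, 5.0))
```

Swapping engines then means adding a class that satisfies the same two methods, while the automated process around it stays unchanged.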

SCALE-UP: AUTO ML
Our focus:
• Supervised (e.g. classification, regression, etc.)
• Structured data (CSV)
• Starting on time series
R&D subject:
• Stack complex methods
• Unstructured data (graph, image, text, etc.)

PRESCIENCE: LABS
https://labs.ovh.com/machine-learning-platform

PRESCIENCE
• Optimiser: SMAC
• Download files: Preprocess data/models (PMML)
• Explain models/predictions: SHAP
• Automated preprocessing: Text
• Custom modeling: Scikit-Learn
• Distributed: Yes

PRESCIENCE: EXPLAIN
• SHAP (SHapley Additive exPlanations).
• A unified approach to explaining the output of any machine learning
model
• Designing/coding UX
• Allows us to explain one prediction, or a model (i.e. thousands of
predictions)
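
SHAP builds on Shapley values, which split the difference between one prediction and a baseline prediction across the input features. The sketch below computes exact Shapley values by enumerating feature subsets in plain Python. It is not the SHAP library's API and not necessarily how the platform computes explanations; `risk_model`, its three features, and the baseline values are hypothetical. It covers both uses named above: explaining one prediction, and summarising a model over many predictions.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of x, relative to a baseline input.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2**n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {i}) - value(coalition))
        phi.append(total)
    return phi


def global_importance(model, rows, baseline):
    """Explain a model: mean absolute Shapley value per feature over many rows."""
    per_row = [shapley_values(model, row, baseline) for row in rows]
    return [sum(abs(p[i]) for p in per_row) / len(per_row) for i in range(len(baseline))]


def risk_model(features):
    """Hypothetical disk-risk score with one interaction term."""
    reallocated, temperature, age = features
    return 0.5 * reallocated + 0.1 * temperature + 0.2 * reallocated * age


x = [4.0, 50.0, 3.0]          # the prediction to explain
baseline = [0.0, 40.0, 1.0]   # reference input (assumed)
phi = shapley_values(risk_model, x, baseline)
print("per-feature contributions:", phi)
print("sum of contributions:", sum(phi))
print("prediction minus baseline:", risk_model(x) - risk_model(baseline))
```

The contributions always add up to the prediction minus the baseline prediction, which is the "additive" property in the name. Here the interaction between `reallocated` and `age` is shared between those two features, while `temperature` receives exactly its own linear effect.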

PRESCIENCE: THE FUTURE
• Integrate other algorithms: XGBoost, TensorFlow algorithms…
• Solve other problems:
  • Time series forecasting
  • Anomaly detection
  • Image recognition
  • Natural language processing
  • And many more!
• Integrate other data sources: Databases, Kafka, PCS...
• Apply it everywhere in OVH
