Machine Learning for Big Data Analytics: Scaling In with Containers while Sc...Ian Lumb
Watch On Demand Anytime via http://www.univa.com/resources/webinar-machine-learning.php
Armed with nothing more than an Apache Spark toting laptop, you have all the trappings required to prototype the application of Machine Learning against your data-science needs. From programmability in Scala, Java or Python, to built-in support for Machine Learning via MLlib, Spark is an exceedingly effective enabler that allows you to rapidly produce results.
Of course, as soon as your prototyping proves successful, you'll want to scale out to embrace the volume, variety and velocity that characterizes today's Big Data demands... in production. Because Spark is as comfortable on an isolated laptop as it is in a distributed-computing environment, addressing Big Data requirements in production boils down to effectively and efficiently embracing containers and clusters for Big Data Analytics.
And this is where offerings from Univa shine - i.e., in making the transition from prototype to production completely seamless. For some use cases, it makes sense to scale-in Spark based applications within Docker containers via Univa Grid Engine Container Edition or Navops by Univa; whereas in others, Spark is interfaced (as a Mesos-compliant framework) with Univa Universal Resource Broker, to permit scaling out on a cluster. In both scenarios, your production Spark applications are scheduled alongside other classes of workload - without a need for dedicated resources.
Agenda:
• Overview of Apache Spark as a platform for Deep Learning - from Python-based Jupyter Notebooks to Spark's Machine Learning library MLlib
• Overview of prototyping Machine Learning via Apache Spark on a laptop - without and within Docker containers
• Introductions to Univa Grid Engine Container Edition and Univa Universal Resource Broker plus Navops by Univa
• Overview of production Big Data Analytics platforms for Machine Learning
• Docker-containerized Apache Spark and Univa Grid Engine Container Edition
• Docker-containerized Apache Spark and Navops by Univa
• Apache Spark plus Univa Universal Resource Broker
• Introducing support for GPUs without and within Docker containers
• Use case example - using Machine Learning to classify data from Twitter without and within Docker containers
• Summary and next steps
Watch On Demand Anytime via http://www.univa.com/resources/webinar-machine-learning.php
All major cloud service providers now have some ML offering. The startup costs are low-to-no. They provide seamless leverage of cloud resources for scale-ups = $’s.
Open source ML options are now common making the creation of very large models now possible. Open data sets are proliferating.
Presented at CF Machine Learning
Scaling machine learning as a service at Uber — Li Erran Li at #papis2016PAPIs.io
Machine learning as a service (MLaS) is imperative to the success of many companies as many internal teams and organizations need to gain business intelligence from big data. Building a scalable MLaS in a very challenging problem. In this paper, we present the scalable MLaS we built for a company that operates globally. We focus on several scalability challenges and our technical solutions.
Video at https://www.youtube.com/watch?v=MpnszJ_3Ong
Couldn't attend PAPIs '16? Get access to the other presentations' slides and videos at https://gumroad.com/products/fehon/
Machine Learning for Big Data Analytics: Scaling In with Containers while Sc...Ian Lumb
Watch On Demand Anytime via http://www.univa.com/resources/webinar-machine-learning.php
Armed with nothing more than an Apache Spark toting laptop, you have all the trappings required to prototype the application of Machine Learning against your data-science needs. From programmability in Scala, Java or Python, to built-in support for Machine Learning via MLlib, Spark is an exceedingly effective enabler that allows you to rapidly produce results.
Of course, as soon as your prototyping proves successful, you'll want to scale out to embrace the volume, variety and velocity that characterizes today's Big Data demands... in production. Because Spark is as comfortable on an isolated laptop as it is in a distributed-computing environment, addressing Big Data requirements in production boils down to effectively and efficiently embracing containers and clusters for Big Data Analytics.
And this is where offerings from Univa shine - i.e., in making the transition from prototype to production completely seamless. For some use cases, it makes sense to scale-in Spark based applications within Docker containers via Univa Grid Engine Container Edition or Navops by Univa; whereas in others, Spark is interfaced (as a Mesos-compliant framework) with Univa Universal Resource Broker, to permit scaling out on a cluster. In both scenarios, your production Spark applications are scheduled alongside other classes of workload - without a need for dedicated resources.
Agenda:
• Overview of Apache Spark as a platform for Deep Learning - from Python-based Jupyter Notebooks to Spark's Machine Learning library MLlib
• Overview of prototyping Machine Learning via Apache Spark on a laptop - without and within Docker containers
• Introductions to Univa Grid Engine Container Edition and Univa Universal Resource Broker plus Navops by Univa
• Overview of production Big Data Analytics platforms for Machine Learning
• Docker-containerized Apache Spark and Univa Grid Engine Container Edition
• Docker-containerized Apache Spark and Navops by Univa
• Apache Spark plus Univa Universal Resource Broker
• Introducing support for GPUs without and within Docker containers
• Use case example - using Machine Learning to classify data from Twitter without and within Docker containers
• Summary and next steps
Watch On Demand Anytime via http://www.univa.com/resources/webinar-machine-learning.php
All major cloud service providers now have some ML offering. The startup costs are low-to-no. They provide seamless leverage of cloud resources for scale-ups = $’s.
Open source ML options are now common making the creation of very large models now possible. Open data sets are proliferating.
Presented at CF Machine Learning
Scaling machine learning as a service at Uber — Li Erran Li at #papis2016PAPIs.io
Machine learning as a service (MLaS) is imperative to the success of many companies as many internal teams and organizations need to gain business intelligence from big data. Building a scalable MLaS in a very challenging problem. In this paper, we present the scalable MLaS we built for a company that operates globally. We focus on several scalability challenges and our technical solutions.
Video at https://www.youtube.com/watch?v=MpnszJ_3Ong
Couldn't attend PAPIs '16? Get access to the other presentations' slides and videos at https://gumroad.com/products/fehon/
Learn the advantages and disadvantages of machine learning algorithms versus traditional statistical modelling approaches to solve complex business problems.
How to Become a Thought Leader in Your NicheLeslie Samuel
Are bloggers thought leaders? Here are some tips on how you can become one. Provide great value, put awesome content out there on a regular basis, and help others.
Mikhail Ovchinnikov "Automated Machine Learning: building a conveyor"Fwdays
What is the most difficult part of the machine learning process? Data collection? Feature Engineering? Model selection and tuning? Deploy and monitoring? What if you have a whole bunch of models, and business requires you to continuously improve, experiment, re-train and integrate models? And what if you are not even a Data Scientist?
In this talk:
How to not be drown in chaos, and build structured ML-integration process in a large company
Taking a close look at what can be automated (spoiler: everything)
Discussing "conveyor" taking ideas as input can make a great impact on business metrics, through fast and convenient machine learning integration
What can we achieve by using very basic and simple models
22 ноября, Одессе
FOSS Sea 2014 (http://geekslab.co/events/21-foss-sea-2014-infrastructure-for-researchers)
Инструменты машинного знания, как сервис (Виктор Цикунов, Developers and Platform Evangelism Lead at Microsoft Ukraine)
Концепция применения онтологических структур в ERP-системахAnatoly Simkin
В данной статье поднята проблематика анализа информации, предоставляемой информационными системами. Рассмотрены актуальные способы ее структурирования и представления пользователю. Предложена концепция построения и применения онтологических структур в информационных системах для анализа данных.
This article is devoted to the problems of data analysis that is provided by information systems. The actual methods of structuring and representation for user were considered. There was proposed the principle of making and applying the ontology structure in information systems for data analysis.
Andrii Belas: Turning machine learning models into stuff that actually helps ...Lviv Startup Club
Andrii Belas: Turning machine learning models into stuff that actually helps people and makes money
AI & BigData Online Day 2021
Website - http://aiconf.com.ua
Youtube - https://www.youtube.com/startuplviv
FB - https://www.facebook.com/aiconf
Learn the advantages and disadvantages of machine learning algorithms versus traditional statistical modelling approaches to solve complex business problems.
How to Become a Thought Leader in Your NicheLeslie Samuel
Are bloggers thought leaders? Here are some tips on how you can become one. Provide great value, put awesome content out there on a regular basis, and help others.
Mikhail Ovchinnikov "Automated Machine Learning: building a conveyor"Fwdays
What is the most difficult part of the machine learning process? Data collection? Feature Engineering? Model selection and tuning? Deploy and monitoring? What if you have a whole bunch of models, and business requires you to continuously improve, experiment, re-train and integrate models? And what if you are not even a Data Scientist?
In this talk:
How to not be drown in chaos, and build structured ML-integration process in a large company
Taking a close look at what can be automated (spoiler: everything)
Discussing "conveyor" taking ideas as input can make a great impact on business metrics, through fast and convenient machine learning integration
What can we achieve by using very basic and simple models
22 ноября, Одессе
FOSS Sea 2014 (http://geekslab.co/events/21-foss-sea-2014-infrastructure-for-researchers)
Инструменты машинного знания, как сервис (Виктор Цикунов, Developers and Platform Evangelism Lead at Microsoft Ukraine)
Концепция применения онтологических структур в ERP-системахAnatoly Simkin
В данной статье поднята проблематика анализа информации, предоставляемой информационными системами. Рассмотрены актуальные способы ее структурирования и представления пользователю. Предложена концепция построения и применения онтологических структур в информационных системах для анализа данных.
This article is devoted to the problems of data analysis that is provided by information systems. The actual methods of structuring and representation for user were considered. There was proposed the principle of making and applying the ontology structure in information systems for data analysis.
Andrii Belas: Turning machine learning models into stuff that actually helps ...Lviv Startup Club
Andrii Belas: Turning machine learning models into stuff that actually helps people and makes money
AI & BigData Online Day 2021
Website - http://aiconf.com.ua
Youtube - https://www.youtube.com/startuplviv
FB - https://www.facebook.com/aiconf
1. Мы учим Облака – Облака учат нас
(Машинное Обучение как облачная услуга)
Олег Фатеев
эксперт по облачной разработке
Cloud & Digital Transformation
Москва, Digital October
23 марта 2017
2. Где применяется Машинное Обучение
Анализ финансовых рынков,
технический анализ, биржевая торговля
Обнаружение мошенничества,
кредитный скоринг
Защита от спама, адаптивные сайты,
контекстная реклама
Предсказание пользовательского
поведения, рекомендательные системы
Понимание естественного языка,
смысловой поиск, машинный перевод
Разработка программного обеспечения
Логические и азартные игры
Распознавание речи, жестов, образов,
рукописного ввода
Компьютерное зрение, машинное осязание,
движение роботов
Распознавание эмоций и эмоциональный
интерфейс
Анализ сложных последовательностей и
структур
Биоинформатика, исследование ДНК,
молекулярная информатика, химическая
информатика
Медицинская диагностика, лабораторные
анализы
Прогнозирование технического состояния
3. Паттерны Машинного Обучения
Machine Learning = Model Self-Learning
Обученная модель – это программа, «написанная» данными
Машинное Обучение невозможно без Больших Данных
Неявное обнаружения скрытых закономерностей
Эмпиризм, а не рационализм. Индукция, а не дедукция
Используется классический математический аппарат
Машинное Обучение не всесильно
4. Современные тенденции Машинного
Обучения
Машинное Обучение стало операционным
Для пользователей не требуется глубоких знаний ни в
математике, ни в программировании. Глубокие знания им
требуются только в собственной предметной области
Машинное Обучение идет в Облака
Машинное Обучение невозможно без открытого
программного обеспечения и открытых данных
Машинное Обучение невозможно без широкого
сотрудничества и честного соперничества
6. Netflix Prize – начало эпохи Машинного
Обучения в рекомендательных системах
7. Как работают рекомендательные
системы
Позволяют предсказать неизвестные
предпочтения пользователей по известным
предпочтениям других пользователей.
Предсказывает, что пользователю может
понравиться, на основании схожести его
поведения с другими пользователями
Обрабатывает прошлое поведение одних
пользователей, их активности или оценки,
чтобы строить прогнозы на будущее для
других пользователей
Позволяет рекомендовать товары/услуги, за
которые пользователь/покупатель заплатит с
наибольшей вероятностью
9. Почему Машинное Обучение идет в
Облака
Начиная разработку, вы еще не знаете, что вам будет нужно
Неравномерная структура потребления вычислительных ресурсов
Необходимость проводить большое количество экспериментов и
подключать различные источники данных
Желание переложить разработку моделей на провайдеров
Желание использовать практический опыт других
10. Процесс облачного Машинного Обучения
Выбор, обучение и
оценка моделей
Предоставление сервиса
Алгоритм Алгоритм
Очистка и
восстановление
Обогащение
Обученная
модель
Подготовка данных
(Big Data)
11. Как выбрать облачную платформу Машинного
Обучения (список требований)
Наличие большого набора готовых моделей и модулей предобработки
данных
Возможность бесплатно испытывать модели и наличие тестовых наборов
данных для этого
Возможность подключать различные источники данных
Интегрированные визуальные средства разработки (для управления
workflow, обработки данных, настройки моделей)
Средства сплита данных, скоринга и оценки моделей
Возможность дорабатывать модели, делать их тонкую настройку и
подключать свои модули предобработки данных
Возможность сразу развертывать обученную модель как облачный сервис
12. Начните использовать Машинное
Обучение в Облаках уже сейчас
Крупнейшие провайдеры (все с наличием бесплатных режимов работы)
Microsoft Azure Machine Learning: https://azure.microsoft.com/ru-
ru/services/machine-learning/
Amazon Machine Learning: https://aws.amazon.com/ru/machine-learning/
Google Cloud Machine Learning: https://cloud.google.com/products/machine-
learning/
TensorFlow – самая известная библиотека с открытым исходным кодом:
https://www.tensorflow.org/
Kaggle – самая популярная краудсорсинговая платформа и площадка для
проведения соревнований: https://www.kaggle.com/
13. Выводы
Машинное Обучение, базирующееся на Больших Данных,
обеспечивает реальные конкурентные преимущества компаниям,
которые его активно используют!
Справиться со сложностью математического аппарата Машинного
Обучения и потребностью в вычислительных ресурсах, при этом
минимизируя затраты и время внедрения, можно с помощью Облаков!