1. The document discusses the anatomy of a machine learning application, from defining the problem as a machine learning task, through collecting and preparing data and building and evaluating models, to applying predictions.
2. It provides examples of real-world machine learning applications like predicting the success of startups and reducing turbulence on flights.
3. A key point is that machine learning applications involve more than just algorithms and require properly formulating the problem, engineering features from raw data, and iterative evaluation and improvement of models.
MLSEV. Logistic Regression, Deepnets, and Time Series BigML, Inc
Supervised Learning (Part II): Logistic Regression, Deepnets, and Time Series, by BigML.
MLSEV 2019: 1st edition of the Machine Learning School in Seville, Spain.
MLSEV. Models, Evaluations and Ensembles BigML, Inc
Introduction to Machine Learning. Supervised Learning (Part I): Models, Evaluations and Ensembles, by BigML.
MLSEV 2019: 1st edition of the Machine Learning School in Seville, Spain.
MLSEV. Use Case: Smart Energy Management BigML, Inc
Anomaly Detection in the Real World: Smart Energy Management, by Talento Corporativo.
MLSEV 2019: 1st edition of the Machine Learning School in Seville, Spain.
Machine Learning automation. Advanced WhizzML workflows: feature selection, boosting, gradient descent, and stacking.
VSSML18: 4th edition of the Valencian Summer School in Machine Learning.
Machine learning is becoming widely used to automate decision making. While machine learning seems complex, it involves finding patterns in data that can be used to make useful predictions. The document discusses how factors like increased data availability, faster computation, and easier tools have led to the rise of machine learning applications. It also notes common pitfalls in early machine learning adoption, like overhyping results and failing to develop a clear strategy. Overall, machine learning is transforming industries by enabling cheaper and more data-driven decisions at scale.
Machine Learning: Business Perspective - Main Conference: Introduction to Machine Learning.
DutchMLSchool: 1st edition of the Machine Learning Summer School in The Netherlands.
From DevOps to MLOps: practical steps for a smooth transition Anne-Marie Tousch
Abstract: There has been tremendous progress in artificial intelligence recently. There's no doubt one day it will also power Datadog products and you'll have to deal with it in your pipelines. What is it going to change? In this talk, I'll explain what makes ML fundamentally different than software engineering, and present a few of the operational challenges of setting up a machine learning system in the real world. Most importantly, I’ll propose practical steps to prepare the transition that do not require you to have a machine learning model running yet.
This talk was given at a Ladies of Code Meetup in Paris, in May 2023.
Recording: https://www.youtube.com/watch?v=S9l8GO4wtdY
Meetup: https://www.meetup.com/fr-FR/ladies-of-code-paris/events/293711765/
DutchMLSchool. Introduction to Machine Learning with the BigML Platform BigML, Inc
Introduction to Machine Learning with the BigML Platform - ML for Executives Course.
DutchMLSchool: 1st edition of the Machine Learning Summer School in The Netherlands.
Defcon 21-pinto-defending-networks-machine-learning by pseudor00t pseudor00t overflow
1) The document discusses using machine learning to help with the challenges of security monitoring and log management. Specifically, it presents a case study of using machine learning to build a model to detect malicious external agents based on firewall block data.
2) The model calculates "badness" ranks for IP addresses, netblocks, and autonomous system numbers based on proximity, temporal decay, and other factors. It then trains a support vector machine classifier on these features to detect malicious behaviors with 80-85% accuracy on new data.
3) The author argues this type of machine learning approach could help analysts focus on the most important alerts and events, since the models are 5-8 times more likely to correctly identify truly malicious traffic.
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019 Dhiana Deva
The document discusses introducing machine learning and the challenges that come with it. It likens introducing machine learning to opening Pandora's box, as it brings problems like constraints, assumptions, risks, and issues. It recommends starting with simple approaches, addressing these challenges through iteration, and aiming high with vision while avoiding algorithmic bias. The overall message is to have fun on the journey of machine learning and focus on creating customer value.
Myth vs Reality: Understanding AI/ML for QA Automation - w/ Jonathan Lipps Applitools
** Full webinar recording -- https://youtu.be/ihpAsmRtGuM **
Artificial Intelligence and Machine Learning (AI/ML) have seen application in a variety of fields, including the automation of QA tasks. But what are they exactly? What distinguishes different instances and applications of AI, for example? What are the horizons of these technologies in the field of QA?
The promise of AI/ML must be understood correctly to be harnessed appropriately. As with any buzzword, many technologies and products are offered under the guise of AI/ML without satisfying the definition. The industry is reforming itself around the promise that AI/ML holds often without a clear understanding of the technical limitations that give the promise its boundaries.
In this webinar, test automation guru Jonathan Lipps gives a detailed overview of the concepts that underpin AI/ML, and discusses their ramifications for the work of QA automation.
In addition to a discussion of AI/ML in general, Jonathan looks at examples from the QA industry. These examples will help give attendees the basic understanding required to cut through the marketing language, so we can clearly evaluate AI/ML solutions and calibrate expectations about the benefit of AI/ML in QA, both as it stands today and in the future.
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat... Alok Singh
Alok Singh is a Principal Engineer at IBM CODAIT who has built multiple analytical frameworks and machine learning algorithms. The presentation provides an overview of building predictive models for imbalanced datasets using scikit-learn and XGBoost. It discusses challenges with imbalanced data, evaluation metrics like confusion matrix and ROC curves, and techniques for imbalanced learning including weighted classes, oversampling minorities and undersampling majorities, and SMOTE. The presentation concludes with a hands-on tutorial demonstrating these techniques on an imbalanced bank marketing dataset.
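As a rough illustration of the "weighted classes" idea mentioned above, the sketch below computes per-class weights in plain Python. It is not taken from the presentation: the function names and the 90/10 label split are hypothetical, and the formula is the common "balanced" heuristic (the one scikit-learn applies for `class_weight="balanced"`). The negative-to-positive ratio is a value often used as a starting point for XGBoost's `scale_pos_weight` parameter.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def negative_to_positive_ratio(labels, positive=1):
    """Ratio of negative to positive examples in a binary label list."""
    positives = sum(1 for y in labels if y == positive)
    if positives == 0:
        raise ValueError("no positive examples in labels")
    return (len(labels) - positives) / positives


# Hypothetical imbalanced labels: 90 negatives, 10 positives.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)    # minority class gets the larger weight
ratio = negative_to_positive_ratio(labels)  # 9.0 for this split
```

Weights like these would typically be passed to the learner so that errors on the minority class cost more; oversampling, undersampling, and SMOTE change the training data instead of the loss.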
This document summarizes the learnings and evolution of EZML, a proposed machine learning tool. It began by targeting data scientists, but interviews revealed that feature extraction was more important than automated model selection. The target then became companies lacking data science capabilities. Further interviews identified the ideal customer as a startup CTO with a recommendation or engagement problem. An MVP was developed with tiered pricing and consulting. Ongoing challenges around data privacy and costs were noted. The document concludes by questioning the business viability and next steps.
A few Challenges to Make Machine Learning Easy Pemo Theodore
The document discusses the challenges of making machine learning easy. It notes that while machine learning is a key technique for making data-driven decisions, developing smart applications, and building predictive analytics, it is also complex due to complicated tools that do not scale well, costly solutions, and a lack of experts. The talk presents BigML as a cloud-based service that aims to make machine learning simple by allowing users to create models with a few lines of code or via an easy-to-use web interface. It discusses the challenges BigML faces in achieving breadth and depth of techniques, supporting diverse users, maintaining simplicity, scaling to large amounts of data, measuring true impact, and determining fair pricing models.
Introduction to End-to-End Machine Learning: Classification and Regression - Mercè Martín, VP of Bindings and Applications at BigML.
*Machine Learning School in The Netherlands 2022.
Feature engineering is the process of using domain knowledge to create new features that allow machine learning algorithms to work better or work at all. It involves applying transformations to existing features, like splitting date-time fields or normalizing numeric values, as well as computing new features from existing ones. Flatline is a domain-specific language for programmatic feature engineering and filtering that allows creating new features using expressions over existing fields. Care must be taken to avoid leakage when creating new features.
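The two transformations named above, splitting a date-time field and normalizing a numeric field, can be sketched in plain Python. This is only an illustration and is not Flatline; the function names, field names, and sample values are hypothetical.

```python
from datetime import datetime
from statistics import mean, pstdev


def split_datetime(value):
    """Split an ISO 8601 timestamp string into separate numeric features."""
    ts = datetime.fromisoformat(value)
    return {
        "year": ts.year,
        "month": ts.month,
        "day_of_week": ts.weekday(),  # Monday is 0, Sunday is 6
        "hour": ts.hour,
    }


def z_scores(values):
    """Normalize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant field: nothing to scale
    return [(v - mu) / sigma for v in values]


parts = split_datetime("2019-03-07T14:30:00")
scaled = z_scores([10.0, 20.0, 30.0])
```

On the leakage point: one common precaution is to compute the mean and standard deviation on the training rows only and reuse them for test rows, so that no information from held-out data enters the features.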
Agile Analytics: Delivering on Promises by Atif Abdul Rahman Agile ME
Big Data is all the hype in town, yet the real value still remains with delivering analytics that create business impact. Agile Analytics sets out to unleash the true promise usually lost in lengthy, elephantine projects and years of data management purists' pursuits of perfection. That is exactly what separates these big data technologies: They promise greater agility. But is a supportive technology enough or even mandatory to become more agile? We will go through the value chain of delivering high impact analytics using agile practices and devise a jumpstarter kit for you to adopt and adapt.
This document provides an introduction to machine learning. It defines machine learning as systems that take in data to make predictions and decisions about unseen data without being explicitly programmed. Machine learning systems can label or classify data, predict numerical values, cluster similar data, infer patterns in data, and create complex outputs. The document discusses supervised and unsupervised learning and gives examples of machine learning applications in areas like early dementia diagnosis, wildlife surveys, pricing optimization, population tracking, and predicting social media engagement. It prompts the reader to consider scenarios where machine learning could be applied and how models may fail or be improved.
Jargon is an important aspect in the learning process of any new concept. Join us in our fourth session of the Explore ML series to learn more about the terminology associated with Machine Learning.
This use case showcases how Machine Learning can help you understand your customers to better develop personalized relationships. The lecturer is Arturo Moreno, Associate Professor at ICADE Business School, and a technology entrepreneur, investor, and innovative leader working on the intersection of venture capital and Machine Learning.
*Machine Learning School for Business Schools 2021: Virtual Conference.
This document discusses how analytics and data science projects can benefit from adopting agile principles and methods. It notes that analytics problems are often non-linear like scientific problems, requiring an agile approach with rapid experimentation and refinement of models and insights over time based on feedback. Adapting agile practices like user stories and incremental improvements can help analytics teams discover valuable insights and continuously learn from their work and data. The document also promotes the use of new technologies like data lakes and data virtualization to help provision agile data architectures that support rapid analytics experimentation.
Operationalizing Machine Learning in the Enterprise mark madsen
TDWI Munich 2019
What does it take to operationalize machine learning and AI in an enterprise setting?
Machine learning in an enterprise setting is difficult, but it seems easy. All you need is some smart people, some tools, and some data. It’s a long way from the environment needed to build ML applications to the environment to run them in an enterprise.
Most of what we know about production ML and AI comes from the world of web and digital startups and consumer services, where ML is a core part of the services they provide. These companies have fewer constraints than most enterprises do.
This session describes the nature of ML and AI applications and the overall environment they operate in, explains some important concepts about production operations, and offers some observations and advice for anyone trying to build and deploy such systems.
Digital Transformation and Process Optimization in Manufacturing BigML, Inc
Keyanoush Razavidinani, Digital Services Consultant at A1 Digital, a BigML Partner, highlights why it is important to identify and reduce human bottlenecks in order to optimize processes and focus on important activities. Additionally, Guillem Vidal, Machine Learning Engineer at BigML, completes the session by showcasing how Machine Learning is put to use in the manufacturing industry with a use case to detect factory failures.
The Road to Production: Automating your Anomaly Detectors - by jao (Jose A. Ortega), Co-Founder and Chief Technology Officer at BigML.
*Machine Learning School in The Netherlands 2022.
Similar to MLSEV. Machine Learning: Technical Perspective
DutchMLSchool 2022 - ML for AML Compliance BigML, Inc
Machine Learning for Anti Money Laundering Compliance, by Kevin Nagel, Consultant and Data Scientist at INFORM.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - Multi Perspective Anomalies BigML, Inc
Multi Perspective Anomalies, by Jan W Veldsink, Master in the art of AI at Nyenrode, Rabobank, and Grio.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - My First Anomaly Detector BigML, Inc
The document discusses building an anomaly detector model to identify unusual transactions in a dataset. It describes loading transaction data with 31 features into the BigML platform and creating an anomaly detector model. The model scores new data and identifies the most anomalous fields to help detect fraud. Creating the anomaly detector involves interpreting the data, exploring the dataset distribution, and setting a threshold score to define what is considered anomalous.
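The thresholding step described above can be sketched as follows. This is a minimal illustration, not the BigML workflow itself: it assumes anomaly scores between 0 and 1 where higher means more anomalous, and the scores, function names, and default threshold are made up for the example.

```python
def flag_anomalies(scores, threshold=0.6):
    """Return the indices of rows whose score is at or above the threshold."""
    return [i for i, score in enumerate(scores) if score >= threshold]


def threshold_for_top_fraction(scores, fraction):
    """Pick the cut-off score that flags roughly the top `fraction` of rows."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(scores, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return ranked[k - 1]


# Hypothetical anomaly scores for ten transactions.
scores = [0.41, 0.38, 0.72, 0.45, 0.91, 0.40, 0.36, 0.55, 0.43, 0.39]
flagged = flag_anomalies(scores, threshold=0.6)
cutoff = threshold_for_top_fraction(scores, 0.2)
```

A fixed threshold is easy to explain to reviewers, while a top-fraction cut-off keeps the review workload roughly constant; which one fits depends on how the flagged transactions are handled downstream.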
DutchMLSchool 2022 - History and Developments in ML BigML, Inc
History and Present Developments in Machine Learning, by Tom Dietterich, Emeritus Professor of computer science at Oregon State University and Chief Scientist at BigML.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - A Data-Driven Company BigML, Inc
A Data-Driven Company: 21 Lessons for Large Organizations to Create Value from AI, by Richard Benjamins, Chief AI and Data Strategist at Telefónica.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - ML in the Legal Sector BigML, Inc
How Machine Learning Transforms and Automates Legal Services, by Arnoud Engelfriet, Co-Founder at Lynn Legal.
*Machine Learning School in The Netherlands 2022.
This document describes a proposed solution using machine learning and artificial intelligence to help create a safer stadium experience. The solution involves two parts: 1) linking access to stadiums to a verified identity through a fan app for preregistration, and 2) using AI/ML to help detect unwanted behaviors or events early. The rest of the document provides more details on the proposed smart video review framework, including using computer vision and audio analysis techniques to help identify issues like flares, flags, banners, and chants, including monkey chants. The goal is to help reviewers identify potential problems more efficiently while preserving privacy, ethics, and human oversight.
DutchMLSchool 2022 - Process Optimization in Manufacturing Plants BigML, Inc
Process Optimization in Manufacturing Plants, by Keyanoush Razavidinani, Digital Business Consultant at A1 Digital.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - Anomaly Detection at Scale BigML, Inc
Lessons Learned Applying Anomaly Detection at Scale, by Álvaro Clemente, Machine Learning Engineer at BigML.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - Citizen Development in AI BigML, Inc
The document discusses the need for citizen developers and humans in the AI/ML process. It notes that while technology and talent are important, company culture must also support broad data analytics and AI/ML adoption. It then provides examples of how involving domain experts can help attribute meaning to correlations and build better causal models to improve AI systems. The document advocates for a systems thinking approach and having humans in the loop to help AI/ML systems consider the wider context and avoid issues like bias.
This new feature is a continuation of and improvement on our previous Image Processing release. Now, Object Detection lets you go a step further with your image data and allows you to locate objects and annotate regions in your images. Once your image regions are defined, you can train and evaluate Object Detection models, make predictions with them, and automate end-to-end Machine Learning workflows on a single platform. To make that possible, BigML enables Object Detection by introducing the regions optype.
As with any other BigML feature, Object Detection is available from the BigML Dashboard, API, and WhizzML for automation. Object Detection is extremely helpful to tackle a wide range of computer vision use cases such as medical image analysis, quality control in manufacturing, license plate recognition in transportation, people detection in security surveillance, among many others.
This new release brings Image Processing to the BigML platform, a feature that enhances our offering to solve image data-driven business problems with remarkable ease of use. Because BigML treats images as any other data type, this unique implementation allows you to easily use image data alongside text, categorical, numeric, date-time, and items data types as input to create any Machine Learning model available in our platform, both supervised and unsupervised.
Now, it is easier than ever to solve a wide variety of computer vision and image classification use cases in a single platform: label your image data, train and evaluate your models, make predictions, and automate your end-to-end Machine Learning workflows. As with any other BigML feature, Image Processing is available from the BigML Dashboard, API, and WhizzML, and it can be applied to solve use cases such as medical image analysis, visual product search, security surveillance, and vehicle damage detection, among others.
Machine Learning in Retail: Know Your Customers' Customer. See Your Future BigML, Inc
This session presents a common situation for those working in food and beverage (FnB) retail and highlights interesting insights for reducing waste.
Speaker: Stephen Kinns, CEO and Co-Founder at catsAi.
*ML in Retail 2021: Webinar.
Machine Learning in Retail: ML in the Retail Sector BigML, Inc
This is an introductory session about the role that Machine Learning is playing in the retail sector and how it is being deployed across the different areas of this industry.
Speaker: Atakan Cetinsoy, VP of Predictive Applications at BigML.
*ML in Retail 2021: Webinar.
ML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot BigML, Inc
This presentation analyzes the role that Machine Learning plays in legal automation with a real-world Machine Learning application.
Speaker: Arnoud Engelfriet, Co-Founder at Lynn Legal.
*ML in GRC 2021: Virtual Conference.
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac... BigML, Inc
This is a real-life Machine Learning use case about integrated risk.
Speakers: Thomas Rengersen, Product Owner of the Governance Risk and Compliance Tool for Rabobank, and Thomas Alderse Baas, Co-Founder and Director of The Bowmen Group.
*ML in GRC 2021: Virtual Conference.
ML in GRC: Cybersecurity versus Governance, Risk Management, and Compliance BigML, Inc
Some of these concepts (Cybersecurity, Governance, Risk Management, and Compliance) overlap and sometimes they can be confusing. This session helps us understand why those terms are key for any business to be successful.
Speaker: Jon Shende, Founding Investor at MyVayda.
*ML in GRC 2021: Virtual Conference.
3. BigML, Inc #MLSEV: ML a Technical Perspective
Sampling the Audience
Expert: Published papers at KDD, ICML, NIPS, etc., or developed own ML algorithms used at large scale
Aficionado: Understands pros/cons of different techniques and/or can tweak algorithms as needed
Practitioner: Very familiar with ML packages (Weka, Scikit, BigML, etc.)
Newbie: Just taking a Coursera ML class or reading an introductory book on ML
Absolute beginner: ML sounds like science fiction
7. What is Machine Learning?
Let’s start with what is NOT Machine Learning…
• Sentience
• Killer robots
• Generalized Artificial Intelligence
• Anything to do with the word “singularity”
8. Oh the Hype!
AlphaGo Zero beats a human at Go… killer robots far off?
• First of all, AlphaGo Zero is impressive!
• But, no need to fear killer robots powered by AlphaGo Zero:
  • Learning is not transferable: retrain for chess, etc.
  • Works only for rule-based systems / perfect simulator
  • Relies on games/systems with clear objectives (win/lose)
  • Cost $25 million [1]
“While AlphaGo Zero is a step towards a general-purpose AI, it can only work on problems that can be perfectly simulated in a computer, making tasks such as driving a car out of the question. AIs that match humans at a huge range of tasks are still a long way off” - Demis Hassabis, CEO of DeepMind [2]
[1] https://www.inc.com/lisa-calhoun/google-artificial-intelligence-alpha-go-zero-just-pressed-reset-on-how-we-learn.html
[2] https://www.theguardian.com/science/2017/oct/18/its-able-to-create-knowledge-itself-google-unveils-ai-learns-all-on-its-own
9. Three Domains
Artificial Intelligence: Cool/Scary things… that mostly don’t exist
Machine Learning: AI concepts applied to very specific problems
Deep Learning: Specific techniques of Machine Learning
10. What is Machine Learning?
Let’s start with what is NOT Machine Learning…
• Sentience
• Killer robots
• Generalized Artificial Intelligence
• Anything to do with the word “singularity”
• Something “new”
• First International Conference on ML held in 1980
• Top-performing algorithms have been around for decades
How do these things relate?
11. What is Machine Learning?
A practical definition…
Finding patterns in data that can be used to make inferences

AIRLINE  ORIGIN  DESTINATION  DEPARTURE DELAY  DISTANCE  ARRIVAL DELAY
AS       ANC     SEA          -11              1448      -22
AA       LAX     PBI          -8               2330      -9
US       SFO     CLT          -2               2296      5
AA       LAX     MIA          -5               2342      -9
AS       SEA     ANC          -1               1448      -21
DL       SFO     MSP          -5               1589      8
NK       LAS     MSP          -6               1299      -17
US       LAX     CLT          14               2125      -10
AA       SFO     DFW          -11              1464      -13
DL       LAS     ATL          3                1747      -15

Predictive Models
12. Machine Learning Terminology
[Diagram. Training / Learning: Data (Instances, Features, Label) → ML algorithm → Predictive model. Predicting / Scoring: New Instance → Predictive model → Prediction, Confidence.]
14. Why Machine Learning
[Chart: complexity of tasks (− to +) against time (20th century to 21st century)]
15. Traditional Programming
[Example shown: Lost Baggage Policy]
• Explicit rules defined by requirements and experience
• How do we program when the rules are unknown or very difficult to determine?
16. Programming with ML

AIRLINE  ORIGIN  DESTINATION  DEPARTURE DELAY  DISTANCE  ARRIVAL DELAY
AS       ANC     SEA          -11              1448      -22
AA       LAX     PBI          -8               2330      -9
US       SFO     CLT          -2               2296      5
AA       LAX     MIA          -5               2342      -9
AS       SEA     ANC          -1               1448      -21
DL       SFO     MSP          -5               1589      8
NK       LAS     MSP          -6               1299      -17
US       LAX     CLT          14               2125      -10
AA       SFO     DFW          -11              1464      -13
DL       LAS     ATL          3                1747      -15

Want: Flight Delay Prediction
Flight Delay Model????
What else can ML do?
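As a rough illustration of "programming with data", the sketch below learns a flight delay model from the ten table rows instead of hand-writing rules. It is a minimal sketch, not the method used in the talk: it assumes a straight-line relationship between departure delay and arrival delay, and the names `FLIGHTS`, `fit_line`, and `predict_arrival_delay` are made up for this example.

```python
# Ten instances from the flight table: (departure_delay, distance, arrival_delay).
FLIGHTS = [
    (-11, 1448, -22), (-8, 2330, -9), (-2, 2296, 5), (-5, 2342, -9),
    (-1, 1448, -21), (-5, 1589, 8), (-6, 1299, -17), (14, 2125, -10),
    (-11, 1464, -13), (3, 1747, -15),
]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Training / learning: the feature is departure delay, the label is arrival delay.
departure = [row[0] for row in FLIGHTS]
arrival = [row[2] for row in FLIGHTS]
SLOPE, INTERCEPT = fit_line(departure, arrival)


def predict_arrival_delay(departure_delay):
    """Predicting / scoring: apply the learned line to a new instance."""
    return SLOPE * departure_delay + INTERCEPT


if __name__ == "__main__":
    print(predict_arrival_delay(10))
```

With only ten rows and one feature the fit is not expected to be accurate; the point is that the "rule" comes out of the data rather than out of a requirements document.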
18. Machine Learning Tasks
UNSUPERVISED: Cluster Analysis, Anomaly Detection, Association Discovery, Topic Modeling
SUPERVISED: Classification and Regression, Time Series
19. Predictive Maintenance
CLASSIFICATION: Will this component fail?
REGRESSION: How many days until this component fails?
TIME SERIES FORECASTING: How many components will fail a week from now?
CLUSTER ANALYSIS: Which machines behave similarly?
ANOMALY DETECTION: Is this behavior normal?
ASSOCIATION DISCOVERY: What alerts are triggered together before a failure?
20. Personalized Music
CLASSIFICATION: Will this song be a hit?
REGRESSION: How many users will play this song next month?
TIME SERIES FORECASTING: How many downloads will this song have in 3 months?
CLUSTER ANALYSIS: Which songs are similar?
ANOMALY DETECTION: Is this song being played more than normal?
ASSOCIATION DISCOVERY: What songs do people like to play together?
21. Airline Revenue Management
CLASSIFICATION: Will this flight be booked at 80% 14 days out?
REGRESSION: How many passengers will book this flight 7 days out?
TIME SERIES FORECASTING: How many tickets will be cancelled this week?
CLUSTER ANALYSIS: Which flight booking patterns are similar?
ANOMALY DETECTION: Are these flights’ booking patterns normal?
ASSOCIATION DISCOVERY: What price changes help overbook sooner?
22. Network Security
CLASSIFICATION: Is this email part of a phishing attack?
REGRESSION: How many logins after work per week?
TIME SERIES FORECASTING: What will be the number of false alarms next week?
CLUSTER ANALYSIS: Are these users behaving similarly?
ANOMALY DETECTION: Is this user behavior worth inspecting?
ASSOCIATION DISCOVERY: What alerts were triggered before this attack?
24. All ML Models are WRONG
[Figure: a deepnet, an ensemble, a logistic regression, and a decision tree each predict TRUE or FALSE]
Same patient… different models… different predictions!
Some model(s) must be wrong… which one?
Insight: Need a way to measure model fitness
25. Evaluating Models
[Diagram: PATIENT DATA is split into TRAINING and TEST sets; an ENSEMBLE built from the training set gives a PREDICTION and CONFIDENCE (%) for the test instances, which are summarized in an EVALUATION (%)]
Stay Tuned: You will see this in Evaluations
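The train/test idea in the diagram can be sketched in a few lines. This is only an illustration under loud assumptions: the data is synthetic (generated by the hypothetical `make_data`), and the "model" is a single learned threshold rather than an ensemble.

```python
import random


def make_data(n, seed=0):
    """Synthetic, hypothetical instances: one numeric feature and a boolean label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.random() < 0.5
        # Positive instances tend to have larger feature values.
        feature = rng.gauss(2.0 if label else 0.0, 1.0)
        rows.append((feature, label))
    return rows


def train(training_rows):
    """Learn a threshold halfway between the two class means."""
    positives = [x for x, y in training_rows if y]
    negatives = [x for x, y in training_rows if not y]
    return (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2


def accuracy(threshold, rows):
    """Fraction of rows where 'feature > threshold' matches the label."""
    return sum((x > threshold) == y for x, y in rows) / len(rows)


data = make_data(200)
training, test = data[:160], data[160:]   # hold out 20% that the model never sees
threshold = train(training)
test_accuracy = accuracy(threshold, test)

if __name__ == "__main__":
    print(f"accuracy on held-out test data: {test_accuracy:.2f}")
```

The important design choice is that `accuracy` is computed on rows that played no part in `train`, so the number says something about new instances rather than about memory.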
26. Measuring ML Mistakes

              ACTUAL: TRUE     ACTUAL: FALSE
MODEL: TRUE   TRUE POSITIVE    FALSE POSITIVE
MODEL: FALSE  FALSE NEGATIVE   TRUE NEGATIVE

We can bend the rules a bit…
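The four cells of the table above can be counted directly from paired actual and predicted labels. A minimal sketch follows; the function name `confusion_counts` and the sample labels are made up for illustration.

```python
def confusion_counts(actual, predicted):
    """Count the four outcomes for boolean labels of equal length."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for truth, guess in zip(actual, predicted):
        if guess and truth:
            counts["TP"] += 1      # model TRUE, actual TRUE
        elif guess and not truth:
            counts["FP"] += 1      # model TRUE, actual FALSE
        elif not guess and truth:
            counts["FN"] += 1      # model FALSE, actual TRUE
        else:
            counts["TN"] += 1      # model FALSE, actual FALSE
    return counts


if __name__ == "__main__":
    actual = [True, True, False, False, True, False]
    predicted = [True, False, True, False, True, False]
    print(confusion_counts(actual, predicted))
```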
27. Operating Point
[Figure: a scale between TRUE and FALSE, running from 100% / 0% at one end to 0% / 100% at the other; moving the operating point one way gives more false positives, the other way more false negatives]
Why would you do this?
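Moving the operating point can be pictured as moving a threshold over the model's scores. The sketch below assumes the model emits a score between 0 and 1 for each instance; the scored examples and the name `errors_at` are hypothetical.

```python
# Hypothetical (score, actual_label) pairs produced by some model.
SCORED = [
    (0.9, True), (0.8, True), (0.6, False), (0.55, True),
    (0.4, False), (0.3, True), (0.2, False), (0.1, False),
]


def errors_at(threshold, scored=SCORED):
    """Return (false_positives, false_negatives) when predicting TRUE for score >= threshold."""
    false_positives = sum(1 for s, actual in scored if s >= threshold and not actual)
    false_negatives = sum(1 for s, actual in scored if s < threshold and actual)
    return false_positives, false_negatives


if __name__ == "__main__":
    for threshold in (0.25, 0.5, 0.75):
        fp, fn = errors_at(threshold)
        print(f"threshold {threshold}: {fp} false positives, {fn} false negatives")
```

A low threshold says TRUE more often, trading false negatives for false positives; a high threshold does the opposite. Which side to favor depends on what each kind of mistake costs.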
28. Comparing Models
[Plot: % TRUE POSITIVES against % FALSE POSITIVES. Marked on the plot: IDEAL MODEL, WORST(?) MODEL, the RANDOM diagonal, two TRIVIAL MODEL points, and two curves labeled GOOD and BETTER]
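A plot of true positive rate against false positive rate like the one described above can be traced by sweeping the threshold over every score, and the area under that curve gives one number for comparing models. This is a minimal sketch that assumes distinct scores and at least one instance of each class; `roc_points`, `area_under_curve`, and the sample data are made up for illustration.

```python
def roc_points(scored):
    """Return (false_positive_rate, true_positive_rate) points, one per threshold.

    `scored` is a list of (score, actual_label) pairs with distinct scores.
    """
    positives = sum(1 for _, actual in scored if actual)
    negatives = len(scored) - positives
    points = [(0.0, 0.0)]          # threshold above every score: nothing predicted TRUE
    tp = fp = 0
    for _, actual in sorted(scored, key=lambda pair: pair[0], reverse=True):
        if actual:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


PERFECT = [(0.9, True), (0.8, True), (0.2, False), (0.1, False)]
MIXED = [(0.9, True), (0.8, False), (0.7, True), (0.6, False)]

if __name__ == "__main__":
    print(area_under_curve(roc_points(PERFECT)))
    print(area_under_curve(roc_points(MIXED)))
```

An area near 1 corresponds to the ideal corner of the plot, and an area near 0.5 corresponds to the random diagonal.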
29. Mistakes can be Costly
[Images combined as “… + … = …”, captioned FUN! and DANGER!]
30. Cost Functions
[Plot: % TRUE POSITIVES against % FALSE POSITIVES, with points marked GOOD and BETTER?]
• What is the cost of predicting cancer incorrectly?
• What is the cost of labeling a fraudulent transaction as valid?
• What is the cost of incorrectly predicting an aircraft part is safe?
• Why can’t I just have a perfect model?
FALSE NEGATIVE COST
FALSE POSITIVE COST
One possibility
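One way to act on these questions is to put a price on each kind of mistake and pick the operating point with the lowest total. The sketch below is an assumption-laden illustration: the scored examples, the cost values, and the names `total_cost` and `cheapest_threshold` are all hypothetical.

```python
# Hypothetical (score, actual_label) pairs produced by some model.
SCORED = [
    (0.9, True), (0.8, True), (0.6, False), (0.55, True),
    (0.4, False), (0.3, True), (0.2, False), (0.1, False),
]


def total_cost(threshold, fp_cost, fn_cost, scored=SCORED):
    """Cost of all mistakes when predicting TRUE for score >= threshold."""
    false_positives = sum(1 for s, actual in scored if s >= threshold and not actual)
    false_negatives = sum(1 for s, actual in scored if s < threshold and actual)
    return false_positives * fp_cost + false_negatives * fn_cost


def cheapest_threshold(thresholds, fp_cost, fn_cost):
    """Pick the candidate threshold with the lowest total cost."""
    return min(thresholds, key=lambda t: total_cost(t, fp_cost, fn_cost))


if __name__ == "__main__":
    candidates = [0.25, 0.5, 0.75]
    # Missing a true case is 10x worse than a false alarm (e.g. a medical screen).
    print(cheapest_threshold(candidates, fp_cost=1, fn_cost=10))
    # A false alarm is 10x worse than a miss.
    print(cheapest_threshold(candidates, fp_cost=10, fn_cost=1))
```

The same model ends up at different operating points depending only on the costs, which is why a "perfect" threshold cannot be chosen without knowing the application.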
31. How it Goes All Wrong
• Over-fitting
• Under-fitting
32. Hunting Dog Image Classifier
[Images labeled TRUE and FALSE]
Which images are pictures of dogs that are bred to be hunters?
33. Over-fitting…
“Hunting dogs are short-haired spotted puppies that lie out on the grass”
34. A perfect model! How about some new images…
[Images labeled TRUE and FALSE]
35. Over-fitting
[Images: one with Model: true, Reality: false; one with Model: false, Reality: true]
• This is an example of poor generalization
• The model “fit” the training data perfectly
• But it does not generalize to new instances well
36. Under-fitting
Only use ear shape:
“Dogs with drop or pendant ears are hunters”
37. An imperfect model… now we are making some mistakes on the training data.
[Images labeled TRUE and FALSE]
38. Under-fitting
• This is an example of good generalization
• The model “under-fit” the training data
• But it is generalizing to new instances better
[Images: one with Model: true, Reality: true; one with Model: false, Reality: false]
39. Under-fitting
[Images: two with Model: false, Reality: true]
40. Learning Problems / Complexity
Under-fitting
• Low Complexity Model
• Not fitting the data very well
Over-fitting
• High Complexity Model
• Fitting the data too well
One way to mitigate this is with different types of models…
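The hunting dog example can be mimicked with a toy dataset. Everything in the sketch below is hypothetical: the three binary features (short-haired, spotted, drop ears), the labels, and both "models". The high-complexity model memorizes the training instances exactly, while the low-complexity model uses ear shape only.

```python
# Hypothetical instances: ((short_haired, spotted, drop_ears), is_hunting_dog).
TRAIN = [
    ((1, 1, 1), True), ((1, 1, 1), True), ((0, 0, 0), False),
    ((0, 1, 0), False), ((1, 0, 1), False),
]
NEW = [
    ((0, 0, 1), True), ((0, 1, 1), True), ((1, 0, 0), False), ((1, 1, 0), True),
]

# High complexity: a lookup table of every training instance.
MEMORY = {features: label for features, label in TRAIN}


def memorizing_model(features):
    """Returns the memorized label; anything unseen is assumed not to be a hunter."""
    return MEMORY.get(features, False)


def ear_rule_model(features):
    """Low complexity: 'dogs with drop ears are hunters'."""
    return features[2] == 1


def accuracy(model, rows):
    return sum(model(features) == label for features, label in rows) / len(rows)


if __name__ == "__main__":
    for name, model in (("memorizing", memorizing_model), ("ear rule", ear_rule_model)):
        print(name, "train:", accuracy(model, TRAIN), "new:", accuracy(model, NEW))
```

In this toy data the memorizing model is perfect on the training set and poor on new instances, while the ear rule makes a mistake on the training set but holds up better on new instances, and still misses a hunter without drop ears.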
41. Choosing the ML Algorithm
[Chart. Axes: Decreasing Interpretability / Better Representation / Longer Training; Increasing Data Size / Complexity. Stages: Early Stage (Rapid Prototyping), Mid Stage (Proven Application), Late Stage (Critical Performance). Models placed on the chart: Single Tree Model, Logistic Regression, Decision Forest, Random Decision Forest, Boosted Trees, Deepnets]
Hard?
44. BigML Deepnet
• The success of a Deepnet is dependent on getting the right network structure for the dataset
• But, there are too many parameters:
  • Nodes, layers, activation function, learning rate, etc…
• And setting them takes significant expert knowledge
• Solution: Metalearning (a good initial guess)
• Solution: Network search (try a bunch)
45. Automating Machine Learning
http://www.clparker.org/ml_benchmark/
46. Automating Machine Learning
• Each resource has several parameters that impact quality
  • Number of trees, missing splits, nodes, weight
• Rather than trial and error, we can use ML to find ideal parameters
• Why not make the model type (Decision Tree, Boosted Tree, etc.) a parameter as well?
• Similar to Deepnet network search, but finds the optimum machine learning algorithm and parameters for your data automatically
Key Insight: We can solve any parameter selection problem in a similar way.
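Treating the model type as just another parameter can be sketched as a search over (model type, parameter) pairs scored on validation data. This sketch uses plain exhaustive search, not the ML-guided search the slide refers to, and every name in it (`VALIDATION`, `build_model`, `SEARCH_SPACE`, `score`) is hypothetical.

```python
# Hypothetical validation instances: (feature, label).
VALIDATION = [
    (0.1, False), (0.2, False), (0.35, False),
    (0.6, True), (0.7, True), (0.9, True),
]


def build_model(model_type, param):
    """Turn one point of the search space into a callable model."""
    if model_type == "constant":
        return lambda x: param            # always predicts the same class
    if model_type == "threshold":
        return lambda x: x >= param       # predicts TRUE above a cut point
    raise ValueError(f"unknown model type: {model_type}")


# The model type sits in the search space next to its own parameter.
SEARCH_SPACE = [("constant", True), ("constant", False)]
SEARCH_SPACE += [("threshold", t / 10) for t in range(1, 10)]


def score(candidate, rows=VALIDATION):
    """Accuracy of the candidate configuration on the validation rows."""
    model = build_model(*candidate)
    return sum(model(x) == y for x, y in rows) / len(rows)


best = max(SEARCH_SPACE, key=score)

if __name__ == "__main__":
    print("best configuration:", best, "accuracy:", score(best))
```

Exhaustive search only works for tiny spaces like this one; with many parameters per model, a smarter strategy for choosing which configurations to try becomes the main concern.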
48. Fusions
Key Insight: ML algorithms each have unique strengths and weaknesses
Single Tree: output changes abruptly with inputs near decision boundary
Tree + Deepnet: output changes smoothly with inputs near decision boundary
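The effect near the decision boundary can be imitated with two stand-in functions. This is a toy sketch under stated assumptions: a step function stands in for the single tree, a sigmoid stands in for the deepnet, and the fusion is assumed to average the two outputs. None of the function names come from BigML.

```python
import math


def tree_output(x):
    """Stand-in for a single tree: a hard step at the decision boundary 0.5."""
    return 1.0 if x >= 0.5 else 0.0


def smooth_output(x):
    """Stand-in for a deepnet: a sigmoid centered on the same boundary."""
    return 1.0 / (1.0 + math.exp(-10.0 * (x - 0.5)))


def fusion_output(x):
    """Assumed fusion: the plain average of the two component outputs."""
    return (tree_output(x) + smooth_output(x)) / 2.0


if __name__ == "__main__":
    for x in (0.3, 0.49, 0.51, 0.7):
        print(x, tree_output(x), round(fusion_output(x), 3))
```

In this toy the averaged output still jumps at the boundary, but by roughly half as much as the tree alone, and it keeps changing gradually on either side instead of staying flat.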
49. Fusions
Model Skills: Some ML algorithms “generally” do better on some feature types:
• RDF for sparse text vectors
• LR/Deepnets for numeric features
• Trees for categorical features
[Figure labels: Full, Numeric, Text]
50. Summary
• Machine Learning is a subset of “Artificial Intelligence”
• Finds patterns in data that can be used to make inferences
• Can be thought of as “programming with data”
• Has been around for a long time (only recently practical)
• Already being used to solve real-world problems
• Caveat Emptor:
• Machine Learning mistakes are expected
• Care must be taken to address the cost of mistakes
• Automating Machine Learning
• Powerful application of ML to parameterizing ML
• Models can be fused to address specific data complexities