Our Summer 2017 release presents Deepnets, a highly effective supervised learning method that solves classification and regression problems, matching or exceeding human performance especially in domains where effective feature engineering is difficult. BigML Deepnets bring two unique parameter optimization options, Automatic Network Search and Structure Suggestion, which spare you the difficult and time-consuming work of hand-tuning the algorithm and help find the best-performing network for your problem. This new resource is available from the BigML Dashboard and API, as well as from WhizzML for automation. Deepnets are state-of-the-art in many important supervised learning applications.
Valencian Summer School in Machine Learning 2017 - Day 2
Lecture Review: Summary Day 2 Sessions. By Mercè Martín Prats (BigML).
https://bigml.com/events/valencian-summer-school-in-machine-learning-2017
Valencian Summer School in Machine Learning 2017 - Day 2
Lecture 6: Time Series and Deepnets. By Charles Parker (BigML).
https://bigml.com/events/valencian-summer-school-in-machine-learning-2017
VSSML17 L5. Basic Data Transformations and Feature Engineering (BigML, Inc)
Valencian Summer School in Machine Learning 2017 - Day 2
Lecture 5: Basic Data Transformations and Feature Engineering. By Poul Petersen (BigML).
https://bigml.com/events/valencian-summer-school-in-machine-learning-2017
VSSML16 LR1. Summary Day 1
Valencian Summer School in Machine Learning 2016
Day 1
Summary Day 1
Mercè Martin (BigML)
https://bigml.com/events/valencian-summer-school-in-machine-learning-2016
In this presentation, I talk about data science competitions. After an introduction to data science competitions, I go through their benefits, misconceptions, and best practices.
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D... (Sri Ambati)
Presented at #H2OWorld 2017 in Mountain View, CA.
Enjoy the video: https://youtu.be/-rGRHrED94Y.
Learn more about H2O.ai: https://www.h2o.ai/.
Follow @h2oai: https://twitter.com/h2oai.
- - -
Abstract:
Most machine learning systems enable two essential processes: creating a model and applying the model in a repeatable and controlled fashion. These two processes are interrelated and pose technological and organizational challenges as they evolve from research to prototype to production. This presentation outlines common design patterns for tackling such challenges while implementing machine learning in a production environment.
Sergei's Bio:
Dr. Sergei Izrailev is Chief Data Scientist at BeeswaxIO, where he is responsible for data strategy and building AI applications powering the next generation of real-time bidding technology. Before Beeswax, Sergei led data science teams at Integral Ad Science and Collective, where he focused on architecture, development and scaling of data science based advertising technology products. Prior to advertising, Sergei was a quant/trader and developed trading strategies and portfolio optimization methodologies. Previously, he worked as a senior scientist at Johnson & Johnson, where he developed intelligent tools for structure-based drug discovery. Sergei holds a Ph.D. in Physics and Master of Computer Science degrees from the University of Illinois at Urbana-Champaign.
Gender Prediction with Databricks AutoML Pipeline (Databricks)
As the nation’s leading advocate for people aged 50+, each month AARP conducts thousands of campaigns made up of hundreds of millions of emails, mails and phone calls to over 37 million members and a broader universe of non-members. Missing information on demographics results in less accurate profiling and targeting strategies. For example, there are 1.5 million active members and 15 million expired members missing gender information in AARP’s database. The name-gender model is a use case where the AARP Data Analytics team utilized the Databricks Lakehouse platform to create a fully automated machine learning model. The Random Forest classifier used 800 thousand existing distinct first names, ages, and over 700 variables derived from the letter composition of first names to predict gender. It leveraged MLflow to track the accuracy of models and log the metrics over time, and registered multiple model versions to pick the best for production. Model training and scoring were scheduled, and the AutoML pipeline significantly minimized manual working hours after the initial setup. As a result, AARP dramatically (and accurately) improved the coverage of gender information from about 92.5% to 99.5%.
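To make the "letter composition of first names" idea concrete, here is a small feature extractor in plain Python. The specific features below are hypothetical, invented for illustration; they are not AARP's actual 700+ variables:

```python
def name_features(name):
    """Turn a first name into simple letter-composition features
    (hypothetical feature set, for illustration only)."""
    n = name.strip().lower()
    vowels = set("aeiou")
    return {
        "length": len(n),
        "ends_in_vowel": int(n[-1] in vowels),
        "ends_in_a": int(n.endswith("a")),
        "vowel_ratio": sum(c in vowels for c in n) / max(len(n), 1),
        "last_two": n[-2:],
    }

feats = name_features("Maria")
```

Features like these would then be fed, alongside age, into a classifier such as a Random Forest.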
Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016 (MLconf)
Building a Machine Learning Platform at Quora: Each month, over 100 million people use Quora to share and grow their knowledge. Machine learning has played a critical role in enabling us to grow to this scale, with applications ranging from understanding content quality to identifying users’ interests and expertise. By investing in a reusable, extensible machine learning platform, our small team of ML engineers has been able to productionize dozens of different models and algorithms that power many features across Quora.
In this talk, I’ll discuss the core ideas behind our ML platform, as well as some of the specific systems, tools, and abstractions that have enabled us to scale our approach to machine learning.
Tech talk by Serena Signorelli (https://www.linkedin.com/in/serenasignorelli/) at the event "Tensorflow and Sparklyr: Scaling Deep Learning and R to the Big Data ecosystem", May 15, 2017 at ICTeam Grassobbio (BG). The event was part of the Data Science Milan Meetup (https://www.meetup.com/it-IT/Data-Science-Milan/).
Misha Bilenko, Principal Researcher, Microsoft at MLconf SEA - 5/01/15 (MLconf)
Many Shades of Scale: Big Learning Beyond Big Data: In the machine learning research community, much of the attention devoted to ‘big data’ in recent years has been manifested as development of new algorithms and systems for distributed training on many examples. This focus has led to significant advances in the field, from basic but operational implementations on popular platforms to highly sophisticated prototypes in the literature. In the meantime, other aspects of scaling up learning have received relatively little attention, although they are often more pressing in practice. The talk will survey these less-studied facets of big learning: scaling to an extremely large number of features, to many components in predictive pipelines, and to multiple data scientists collaborating on shared experiments.
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018 (Sri Ambati)
This talk was recorded in London on Oct 30, 2018 and can be viewed here: https://youtu.be/p4iAnxwC_Eg
The good news is that building fair, accountable, and transparent machine learning systems is possible. The bad news is that it’s harder than many blogs and software package docs would have you believe. The truth is that nearly all interpretable machine learning techniques generate approximate explanations, that the fields of eXplainable AI (XAI) and Fairness, Accountability, and Transparency in Machine Learning (FAT/ML) are very new, and that few best practices have been widely agreed upon. This combination can lead to some ugly outcomes!
This talk aims to make your interpretable machine learning project a success by describing fundamental technical challenges you will face in building an interpretable machine learning system, defining the real-world value proposition of approximate explanations for exact models, and then outlining viable techniques for debugging, explaining, and testing machine learning models.
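As one concrete instance of an approximate explanation technique, here is a minimal permutation importance sketch in plain Python: shuffle one feature column and measure how much the model's score drops. The toy model and data are invented for illustration; this is not necessarily one of the exact techniques covered in the talk:

```python
import random

def permutation_importance(predict, X, y, col, metric, n_repeats=5, seed=0):
    """Average drop in `metric` when feature column `col` is shuffled."""
    rng = random.Random(seed)
    base = metric(y, [predict(row) for row in X])
    drops = []
    for _ in range(n_repeats):
        shuffled = [list(row) for row in X]
        values = [row[col] for row in shuffled]
        rng.shuffle(values)
        for row, v in zip(shuffled, values):
            row[col] = v
        drops.append(base - metric(y, [predict(row) for row in shuffled]))
    return sum(drops) / n_repeats

# Toy model that only looks at feature 0, so feature 1 should get zero importance.
predict = lambda row: int(row[0] > 0.5)
accuracy = lambda y, p: sum(a == b for a, b in zip(y, p)) / len(y)
X = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.1, 0.9]]
y = [1, 0, 1, 0]
imp0 = permutation_importance(predict, X, y, 0, accuracy)
imp1 = permutation_importance(predict, X, y, 1, accuracy)
```

Note the approximation: the importance estimate depends on the shuffles drawn, which is exactly the kind of caveat the talk warns about.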
Mateusz is a software developer who loves all things distributed and machine learning, and hates buzzwords. His favourite hobby is data juggling.
He obtained his M.Sc. in Computer Science from AGH UST in Krakow, Poland, during which he did an exchange at L’ECE Paris in France and worked on distributed flight booking systems. After graduation he moved to Tokyo to work as a researcher at Fujitsu Laboratories on machine learning and NLP projects, where he is still currently based.
Logistic Regression is one of the most popular Machine Learning methods for solving classification problems. With Logistic Regressions in your Dashboard and in the BigML API, you will be able to easily create and download models to your environment for fast local predictions.
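For readers new to the method, the mechanics behind a logistic regression can be sketched in a few lines. Below is a minimal, illustrative batch gradient descent implementation in plain Python; the toy data and hyperparameters are invented for the sketch, and this is not BigML's implementation:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the logistic (cross-entropy) loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict(w, b, xi):
    """Classify by thresholding the predicted probability at 0.5."""
    return int(sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5)

# Linearly separable toy data: class is 1 when x0 > x1.
X = [[2.0, 1.0], [1.0, 2.0], [3.0, 0.5], [0.5, 3.0]]
y = [1, 0, 1, 0]
w, b = train_logistic(X, y)
```

The decision boundary is linear in the inputs, which is why logistic regression is fast to train and to apply for local predictions.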
Our Spring 2017 release brings Time Series, a supervised learning method for analyzing time based data when historical patterns can explain future behavior. This new resource is available from the BigML Dashboard, API, as well as from WhizzML for its automation. Time Series is commonly used for predicting stock prices, sales forecasting, website traffic, production and inventory analysis as well as weather forecasting, among many other use cases.
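As rough intuition for how such models exploit historical patterns, here is a minimal sketch of simple exponential smoothing, one member of the exponential smoothing family commonly used in time series forecasting. The series and smoothing factor are invented for illustration; this is not BigML's implementation:

```python
def exp_smooth_forecast(series, alpha=0.5):
    """Simple exponential smoothing: level = alpha*y + (1-alpha)*level.
    The one-step-ahead forecast is the final smoothed level."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

# Recent observations weigh more than older ones, controlled by alpha.
forecast = exp_smooth_forecast([10.0, 12.0, 11.0, 13.0], alpha=0.5)
```

Richer variants add trend and seasonal components, which is what makes the family useful for sales, traffic, and inventory forecasting.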
VSSML17 L7. REST API, Bindings, and Basic Workflows (BigML, Inc)
Valencian Summer School in Machine Learning 2017 - Day 2
Lecture 7: REST API, Bindings, and Basic Workflows. By jao - Jose A. Ortega - (BigML).
https://bigml.com/events/valencian-summer-school-in-machine-learning-2017
There is a wealth of valuable information hidden in data, but we tend to use only what we already know to look for. Data mining can help you identify that hidden value, but the absence of tools and knowledge is often the show stopper.
IBM SPSS Catalyst helps you unveil the hidden gems in three steps:
1. Upload your data
2. Run the program
3. All relationships and dependencies in the data are shown in easy-to-read graphical interfaces
Check it out and contact us for more information!
2017 AI and Cloud Brand Leader Mini-Report (IT Brand Pulse)
This IT Brand Pulse mini-report includes only market leader data from the independent, non-sponsored survey covering six categories of brand leadership—Market, Price, Performance, Reliability, Service & Support and Innovation—for nine AI and Cloud products.
Complete survey data for each product category is available. Please contact us at info@itbrandpulse.com for information and pricing.
How to implement a truly modular ecommerce platform on the example of Spryker... (Fabian Wesner)
The focus of my presentation is patterns, principles and, especially, practical solutions for implementing an ecommerce platform that consists of hundreds of decoupled and coherent modules. This talk is suitable for senior developers who design large-scale systems and are interested in architectural insights and inspiration.
Keywords: SOLID, Packaging Principles, Design Patterns, Macro- and Microservices, Semantic Versioning, Composer
IBM Connections Customizer – A Whole New World of Possibilities (LetsConnect)
Adapting the look, feel and behaviour of IBM Connections has never been easy, but the advent of a new tool called IBM Connections Customizer changes all that. This session demonstrates how to modify the Connections UI by taking you through use cases ranging from simple tweaks to all-new sophisticated visual layouts. Learn to add new capabilities to Connections by adding actions that call the latest APIs and integrate secure 3rd-party services like Watson. The Customizer development and governance models will also be fully explained – come learn how to take control of your IBM Connections world!
Presentation that was developed for IoT DevCon in April 2017, Santa Clara California USA.
In this deck, I go through the value of data and its derivatives (analytics) from an economic point of view, and then connect that to how we can package this value for monetization in the context of IoT. The key technology enabler is the Digital Twin, which is described along with the platforms that support this paradigm. It ends with recommendations on how to get ready and how to start using the GE Digital Predix platform.
The Scout24 Data Platform (A Technical Deep Dive) (RaffaelDzikowski)
The Scout24 Data Platform powers all reporting, ad hoc analytics and machine learning products at AutoScout24 and ImmobilienScout24. In this talk, we will take a technical deep dive into our modern, cloud-based big data platform. We will discuss our evolution of approaches to ingestion, ETL, access control, reporting, and machine learning with a focus on in-the-trenches learnings gained from our many failures and successes as we migrated from a traditional Oracle Data Warehouse to an AWS-based data lake.
Data engineering at the interface of art and analytics: the why, what, and ho... (Data Con LA)
Abstract:- Netflix has a growing presence in Hollywood, with technical teams working on everything from high-speed video editing pipelines to machine learning methods for categorizing films. Data is foundational across these efforts and in this talk Josh will take a tour through why we invest so much in data about content, what data engineering challenges we tackle, and the style in which we do it.
Bhadale group of companies projects portfolio - a list of publicly shareable projects from the past 10 years. Technologies used are AI/ML, Scala, Spark, Akka, Play, IoT, Hadoop, React, JavaScript, and several other related ones.
This IT Brand Pulse mini-report includes only market leader data from the independent, non-sponsored survey conducted in March 2017 covering six categories of brand leadership–Market, Price, Performance, Reliability, Service & Support and Innovation–for twelve Networking & Scale-out Storage products.
Complete survey data for each product category is available. Please contact us at info@itbrandpulse.com for information and pricing.
A video conversation with performance and capacity management veteran, Boris Zibitsker, about how I saved a multi-million dollar computing platform, using a 1-line performance model (at 21:50 minutes). "Best practices" caused the problem.
1) Learn about Myplanet's Headless CMS solution using Gatsby Preview and Contentful’s UI Extensions (https://www.contentful.com/resources/serverless/)
2) their Serverless project with IBM - using Apache OpenWhisk (https://www.ibm.com/cloud/functions)
3) how Myplanet got involved with AWS DeepRacer - a fun way to get started with Reinforcement Learning (RL), and their racing experience at re:Invent DeepRacer League (https://reinvent.awsevents.com/learn/deepracer/)
4) their Machine Learning (ML) research related to finding DeepRacer’s ideal line (https://medium.com/myplanet-musings/the-best-path-a-deepracer-can-learn-2a468a3f6d64).
BONUS: Two TED Talks referenced in the intro
5) When ideas have sex | Matt Ridley | Jul 14, 2010 https://www.ted.com/talks/matt_ridley_when_ideas_have_sex
6) Why The Best Leaders Make Love The Top Priority | Matt Tenney | Dec 5, 2019 https://www.youtube.com/watch?v=qCVoohdyI6I
VIDEO: https://youtu.be/ZH1xxmBNx5k
IMS on the mainframe hosts many enterprise-critical assets, transactional and batch applications as well as data. Analytics solutions apply to both!
Contact me for more details.
Optimizing your SparkML pipelines using the latest features in Spark 2.3 (DataWorks Summit)
This talk will highlight the recent additions of vectorized UDFs and parallel cross-validation in Apache Spark 2.3. Vectorized UDFs not only enhance performance, they also open up more possibilities by using Pandas for the input and output of the UDF. Parallel cross-validation speeds up tuning ML models by exploiting your Spark cluster resources to the max. Bryan will discuss the details of Apache Arrow in Spark, how it is relevant to the rest of the big data ecosystem, and what is in store for the future. We will share performance results from using parallelism in cross-validation and some ongoing work with optimizing ML pipelines.
This talk will also touch upon how we are leveraging this work to enhance the end-to-end Enterprise AI lifecycle in open source for developers and data scientists. Finally, we will highlight some of the other relevant projects at the Center for Open Source Data and AI Technologies and how you can contribute to them.
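The parallel cross-validation idea is simple to sketch outside Spark: independent train/test folds can be evaluated concurrently instead of one by one. Below is an illustrative pure-Python analogue using a thread pool (in Spark 2.3 itself this is exposed through CrossValidator's parallelism parameter; the toy "model" here is invented):

```python
from concurrent.futures import ThreadPoolExecutor

def cross_val_scores(fit_score, folds, parallelism=2):
    """Score each (train, test) fold concurrently; order is preserved."""
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        return list(pool.map(lambda fold: fit_score(*fold), folds))

# Toy "model": its score is just the mean of the test fold.
fit_score = lambda train, test: sum(test) / len(test)
folds = [([1, 2], [3, 4]), ([3, 4], [1, 2])]
scores = cross_val_scores(fit_score, folds)
```

Because folds share no state, the speedup scales with available executors, which is exactly what Spark exploits on a cluster.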
Speakers
Vijay Bommireddipalli, Program Director: Center for Open Source Data and AI Technologies, IBM
Bryan Cutler, Software Engineer - Center for Open Source Data and AI Technologies, IBM
Digital Transformation and Process Optimization in Manufacturing (BigML, Inc)
Keyanoush Razavidinani, Digital Services Consultant at A1 Digital, a BigML Partner, highlights why it is important to identify and reduce human bottlenecks so that you can optimize processes and focus on important activities. Additionally, Guillem Vidal, Machine Learning Engineer at BigML, completes the session by showcasing how Machine Learning is put to use in the manufacturing industry with a use case on detecting factory failures.
The Road to Production: Automating your Anomaly Detectors - by jao (Jose A. Ortega), Co-Founder and Chief Technology Officer at BigML.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - ML for AML Compliance (BigML, Inc)
Machine Learning for Anti Money Laundering Compliance, by Kevin Nagel, Consultant and Data Scientist at INFORM.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - Multi Perspective Anomalies (BigML, Inc)
Multi Perspective Anomalies, by Jan W Veldsink, Master in the art of AI at Nyenrode, Rabobank, and Grio.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - My First Anomaly Detector (BigML, Inc)
My First Anomaly Detector: Practical Workshop, by Mercè Martín, VP of Bindings and Applications at BigML.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - History and Developments in ML (BigML, Inc)
History and Present Developments in Machine Learning, by Tom Dietterich, Emeritus Professor of computer science at Oregon State University and Chief Scientist at BigML.
*Machine Learning School in The Netherlands 2022.
Introduction to End-to-End Machine Learning: Classification and Regression - Mercè Martín, VP of Bindings and Applications at BigML.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - A Data-Driven Company (BigML, Inc)
A Data-Driven Company: 21 Lessons for Large Organizations to Create Value from AI, by Richard Benjamins, Chief AI and Data Strategist at Telefónica.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - ML in the Legal Sector (BigML, Inc)
How Machine Learning Transforms and Automates Legal Services, by Arnoud Engelfriet, Co-Founder at Lynn Legal.
*Machine Learning School in The Netherlands 2022.
Machine Learning for Public Safety: Reducing Violence and Discrimination in Stadiums.
Speakers: Ramon van Ingen, Co-Founder at Siip, Entrepreneur, Researcher, and Pablo González, Machine Learning Engineer at BigML.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - Process Optimization in Manufacturing Plants (BigML, Inc)
Process Optimization in Manufacturing Plants, by Keyanoush Razavidinani, Digital Business Consultant at A1 Digital.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - Anomaly Detection at Scale (BigML, Inc)
Lessons Learned Applying Anomaly Detection at Scale, by Álvaro Clemente, Machine Learning Engineer at BigML.
*Machine Learning School in The Netherlands 2022.
DutchMLSchool 2022 - Citizen Development in AI (BigML, Inc)
Citizen Development in AI, by Jan W Veldsink, Master in the art of AI at Nyenrode, Rabobank, and Grio.
*Machine Learning School in The Netherlands 2022.
This new feature is a continuation of and improvement on our previous Image Processing release. Object Detection lets you go a step further with your image data, allowing you to locate objects and annotate regions in your images. Once your image regions are defined, you can train and evaluate Object Detection models, make predictions with them, and automate end-to-end Machine Learning workflows on a single platform. To make that possible, BigML enables Object Detection by introducing the regions optype.
As with any other BigML feature, Object Detection is available from the BigML Dashboard, API, and WhizzML for automation. Object Detection is extremely helpful for tackling a wide range of computer vision use cases such as medical image analysis, quality control in manufacturing, license plate recognition in transportation, and people detection in security surveillance, among many others.
This new release brings Image Processing to the BigML platform, a feature that enhances our offering to solve image data-driven business problems with remarkable ease of use. Because BigML treats images as any other data type, this unique implementation allows you to easily use image data alongside text, categorical, numeric, date-time, and items data types as input to create any Machine Learning model available in our platform, both supervised and unsupervised.
Now, it is easier than ever to solve a wide variety of computer vision and image classification use cases in a single platform: label your image data, train and evaluate your models, make predictions, and automate your end-to-end Machine Learning workflows. As with any other BigML feature, Image Processing is available from the BigML Dashboard, API, and WhizzML, and it can be applied to solve use cases such as medical image analysis, visual product search, security surveillance, and vehicle damage detection, among others.
Machine Learning in Retail: Know Your Customers' Customer. See Your Future (BigML, Inc)
This session presents a quite common situation for those working in food and beverage retail (FnB) and highlights interesting insights for reducing waste.
Speaker: Stephen Kinns, CEO and Co-Founder at catsAi.
*ML in Retail 2021: Webinar.
Machine Learning in Retail: ML in the Retail Sector (BigML, Inc)
This is an introductory session about the role that Machine Learning is playing in the retail sector and how it is being deployed across the different areas of this industry.
Speaker: Atakan Cetinsoy, VP of Predictive Applications at BigML.
*ML in Retail 2021: Webinar.
ML in GRC: Machine Learning in Legal Automation, How to Trust a Lawyerbot (BigML, Inc)
This presentation analyzes the role that Machine Learning plays in legal automation with a real-world Machine Learning application.
Speaker: Arnoud Engelfriet, Co-Founder at Lynn Legal.
*ML in GRC 2021: Virtual Conference.
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Mac... (BigML, Inc)
This is a real-life Machine Learning use case about integrated risk.
Speakers: Thomas Rengersen, Product Owner of the Governance Risk and Compliance Tool for Rabobank, and Thomas Alderse Baas, Co-Founder and Director of The Bowmen Group.
*ML in GRC 2021: Virtual Conference.
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. For more details, visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ... (Subhajit Sahu)
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
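For reference, the monolithic baseline mentioned in the abstract is ordinary power-iteration PageRank, where all vertices are processed in every iteration. A minimal Python sketch, assuming a dead-end-free graph as the abstract's precondition requires (the graph and constants below are invented for illustration):

```python
def pagerank(graph, d=0.85, iters=50):
    """Monolithic power-iteration PageRank.
    `graph` maps each vertex to its out-neighbours; assumes no dead ends,
    so the total rank mass stays at 1 across iterations."""
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    for _ in range(iters):
        nxt = {v: (1 - d) / n for v in graph}  # teleport contribution
        for v, outs in graph.items():
            share = d * rank[v] / len(outs)    # split rank over out-edges
            for u in outs:
                nxt[u] += share
        rank = nxt
    return rank

# Tiny strongly connected graph (a 3-cycle, so no dead ends).
g = {"a": ["b"], "b": ["c"], "c": ["a"]}
r = pagerank(g)
```

Levelwise PageRank instead runs this kind of iteration per strongly connected component, in topological order of the component DAG.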
Opendatabay - Open Data Marketplace.pptxOpendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay AI-driven features streamline the data workflow. Finding the data you need shouldn't be a complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with a dedicated, AI-generated, synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
2. BigML, Inc Summer Release Webinar - October 2017
Summer 2017 Release
Speaker: CHARLES PARKER, PH.D. - VP of Machine Learning Algorithms
Moderator: ATAKAN CETINSOY - VP of Predictive Applications
Questions: Please enter questions into the chat box; we will answer some via chat and others at the end of the session
Resources: https://bigml.com/releases/summer-2017
Contact support@bigml.com
Twitter @bigmlcom
3. Power To The People!
• Why another supervised learning algorithm?
• Deep neural networks have been shown to be state of the art in several niche applications:
• Vision
• Speech recognition
• NLP
• While powerful, these networks have historically been difficult for novices to train
4. Goals of BigML Deepnets
• What BigML Deepnets are not (yet):
• Convolutional networks
• Recurrent networks (e.g., LSTM networks)
• These solve a particular type of sub-problem, and are carefully engineered by experts to do so
• Can we bring some of the power of deep neural networks to your problem, even if you have no deep learning expertise?
• Let's try to separate deep neural network myths from realities
5. Myth #1
Deep neural networks are the next step in evolution, destined to perfect humanity or destroy it utterly.
6. Some Weaknesses
• Trees
• Pro: Massive representational power that expands as the data gets larger; efficient search through this space
• Con: Difficult to represent smooth functions and functions of many variables
• Ensembles mitigate some of these difficulties
• Logistic Regression
• Pro: Some smooth, multivariate functions are not a problem; fast optimization
• Con: Parametric; if the decision boundary is nonlinear, tough luck
• Can these be mitigated?
7. Logistic Level Up
[Diagram: logistic regression drawn as a network mapping inputs to outputs]
8. Logistic Level Up
[Diagram: each output class computed as logistic(w, b) over weighted inputs wi]
9. Logistic Level Up
[Diagram: a hidden layer inserted between inputs and outputs]
10. Logistic Level Up
[Diagram: each hidden node computes its own logistic(w, b)]
11. Logistic Level Up
[Diagram: generalizing to n hidden nodes per layer]
12. Logistic Level Up
[Diagram: generalizing to n hidden layers]
13. Logistic Level Up
[Diagram: the resulting deep network of stacked logistic units]
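The "level up" idea in these slides, stacking logistic units into hidden layers, can be sketched in a few lines. The weights below are hand-set for illustration (not trained, and not BigML's implementation) to show that one hidden layer of logistic units lets the network compute XOR, a boundary no single logistic unit can draw:

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def unit(inputs, weights, bias):
    """One logistic unit: logistic(w . x + b)."""
    return logistic(sum(w * x for w, x in zip(weights, inputs)) + bias)

def xor_net(x1, x2):
    """Two hand-set hidden logistic units feeding one output unit:
    h1 ~ OR(x1, x2), h2 ~ AND(x1, x2), output ~ h1 AND NOT h2 = XOR."""
    h1 = unit([x1, x2], [10, 10], -5)     # fires if either input is 1
    h2 = unit([x1, x2], [10, 10], -15)    # fires only if both inputs are 1
    return unit([h1, h2], [10, -20], -5)  # fires if h1 but not h2

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, round(xor_net(x1, x2)))
```

Deleting the hidden layer collapses this back to plain logistic regression, which cannot separate XOR's classes with any choice of weights.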
14. Myth #2
Deep neural networks are great for the established marquee applications, but less interesting for general use.
15. Parameter Paralysis

Parameter Name                   Possible Values
Descent Algorithm                Adam, RMSProp, Adagrad, Momentum, FTRL
Number of hidden layers          0 - 32
Activation Function (per layer)  relu, tanh, sigmoid, softplus, etc.
Number of nodes (per layer)      1 - 8192
Learning Rate                    0 - 1
Dropout Rate                     0 - 1
Batch size                       1 - 1024
Batch Normalization              True, False
Learn Residuals                  True, False
Missing Numerics                 True, False
Objective weights                Weight per class

. . . and that's ignoring the parameters that are specific to the descent algorithm.
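With BigML's Python bindings, these knobs are passed as creation arguments on the deepnet resource. The sketch below is an illustration, not a verified call: the field names are assumptions that mirror the parameter table above, so check them against the current BigML API documentation before use (the `api.create_deepnet` call is commented out because it needs account credentials):

```python
# Hypothetical deepnet configuration; field names assumed from the
# parameter table above. Verify against BigML's deepnet API docs.
deepnet_args = {
    "hidden_layers": [
        {"activation_function": "relu", "number_of_nodes": 64},
        {"activation_function": "relu", "number_of_nodes": 64},
    ],
    "learning_rate": 0.01,
    "dropout_rate": 0.2,
    "batch_normalization": False,
    "learn_residuals": False,
    "missing_numerics": True,
}

# from bigml.api import BigML   # needs BIGML_USERNAME / BIGML_API_KEY set
# api = BigML()
# deepnet = api.create_deepnet("dataset/<your-dataset-id>", deepnet_args)
print(len(deepnet_args["hidden_layers"]))  # two hidden layers configured
```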
16. What Can We Do?
• Clearly there are too many parameters to fuss with
• Setting them takes significant expert knowledge
• Solution: Metalearning (a good initial guess)
• Solution: Network search (try a bunch)
17. Metalearning
We trained 296,748 deep neural networks so you don't have to!
18. Bayesian Parameter Optimization
[Diagram: six candidate structures queued for model-and-evaluate]
19. Bayesian Parameter Optimization
[Diagram: Structure 1 modeled and evaluated, score 0.75]
20. Bayesian Parameter Optimization
[Diagram: Structure 2 modeled and evaluated, score 0.48]
21. Bayesian Parameter Optimization
[Diagram: Structure 3 modeled and evaluated, score 0.91]
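The loop these slides illustrate, evaluate a structure, update a surrogate model of structure quality, pick the next structure to try, can be sketched as follows. This is a toy stand-in, not BigML's implementation: the surrogate here is a crude distance-weighted average with an exploration bonus, whereas real systems fit a proper probabilistic model (e.g., a Gaussian process) over the evaluated structures:

```python
import random

def distance(a, b):
    # toy distance between structures (hidden_layers, nodes_per_layer)
    return abs(a[0] - b[0]) + abs(a[1] - b[1]) / 100.0

def surrogate(i, candidates, scores):
    # predicted score: distance-weighted mean of observed scores,
    # plus a bonus for being far from anything already evaluated
    num = den = 0.0
    nearest = float("inf")
    for j, s in scores.items():
        d = distance(candidates[i], candidates[j])
        nearest = min(nearest, d)
        w = 1.0 / (d + 1e-6)
        num += w * s
        den += w
    return num / den + 0.25 * nearest

def bayesian_search(candidates, evaluate, budget, seed=0):
    rng = random.Random(seed)
    first = rng.randrange(len(candidates))
    scores = {first: evaluate(candidates[first])}
    while len(scores) < budget:
        remaining = [i for i in range(len(candidates)) if i not in scores]
        pick = max(remaining, key=lambda i: surrogate(i, candidates, scores))
        scores[pick] = evaluate(candidates[pick])
    best = max(scores, key=scores.get)
    return candidates[best], scores[best]

# toy objective: quality peaks at 2 hidden layers of 128 nodes
structures = [(l, n) for l in range(1, 5) for n in (32, 64, 128, 256)]
def evaluate(s):
    return 1.0 - 0.2 * abs(s[0] - 2) - abs(s[1] - 128) / 512.0

best, score = bayesian_search(structures, evaluate, budget=8)
print(best, round(score, 3))
```

The point of the surrogate is to spend the evaluation budget where it matters: promising regions get exploited, unexplored regions still get occasional probes.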
23. Benchmarking
• The ML world is filled with crummy benchmarks:
• Not enough datasets
• No cross-validation
• Only one metric
• Solution: Roll our own
• 50+ datasets, 5 replications of 10-fold CV
• 10 different metrics
• 30+ competing algorithms (R, scikit-learn, weka, xgboost)
http://www.clparker.org/ml_benchmark/
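The "5 replications of 10-fold CV" protocol is simple to reproduce. A minimal standard-library sketch is below (scikit-learn ships this as `RepeatedKFold`); each replication reshuffles the data and splits it into k disjoint test folds:

```python
import random

def repeated_kfold(n_instances, k=10, replications=5, seed=0):
    """Yield (replication, fold, test_indices): each replication is an
    independent shuffle split into k disjoint test folds."""
    rng = random.Random(seed)
    for rep in range(replications):
        order = list(range(n_instances))
        rng.shuffle(order)
        for fold in range(k):
            yield rep, fold, order[fold::k]  # every k-th index -> one fold

splits = list(repeated_kfold(100))
print(len(splits))  # 5 replications x 10 folds = 50 test sets
```

Averaging a metric over all 50 test sets, rather than a single train/test split, is what makes the benchmark's per-dataset comparisons statistically meaningful.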
24. Myth #3
Deep neural networks have such spectacular performance that all other supervised learning techniques are now irrelevant.
25. Caveat Emptor
• Things that make deep learning less useful:
• Small data (where that could still be thousands of instances)
• Problems where you could benefit by iterating quickly (better features always beat better models)
• Problems that are easy, or for which top-of-the-line performance isn't absolutely critical
• Remember: deep learning is just another sort of supervised learning algorithm
"…deep learning has existed in the neural network community for over 20 years. Recent advances are driven by some relatively minor improvements in algorithms and models and by the availability of large data sets and much more powerful collections of computers." — Stuart Russell
26. Learn about Deepnets
https://bigml.com/releases/summer-2017