This document provides an overview of H2O.ai, an open source in-memory machine learning platform. It describes H2O.ai's product as an in-memory prediction engine, its team of 37 distributed systems engineers doing machine learning, and its headquarters in Mountain View, CA. It also provides details on how to use H2O with R and Python for scalable machine learning on large datasets across distributed systems.
How Deep Learning Will Make Us More Human Again
While deep learning is taking over the AI space, most of us are struggling to keep up with the pace of innovation. Arno Candel shares success stories and challenges in training and deploying state-of-the-art machine learning models on real-world datasets. He will also share his insights into what the future of machine learning and deep learning might look like, and how to best prepare for it.
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
Michal Malohlava talks about the PySparkling Water package for Spark and Python users.
Arno Candel, Chief Architect at H2O.ai, talks about what's new in H2O, including recent advances in its algorithms.
Migrating from Closed to Open Source - Fonda Ingram & Ken Sanford (Sri Ambati)
Ken and Fonda will talk through how organizations are embracing open source machine learning and AI platforms and what strategies to use to make the transformation easier.
Michal Malohlava from H2O.ai talks about the new features in Sparkling Water 2.0 and the future roadmap.
H2O Rains with Databricks Cloud - NY 02.16.16 (Sri Ambati)
Michal Malohlava's presentation on H2O Rains with Databricks Cloud, New York, NY 02.16.16
Slides from Matt Dowle's presentation at H2O Open Tour: NYC
Machine Learning with H2O, Spark, and Python at Strata 2015 (Sri Ambati)
Machine Learning with H2O, Spark, and Python at Strata SJ 2015, by Cliff Click and Michal Malohlava
Scalable Machine Learning in R and Python with H2O (Sri Ambati)
The focus of this presentation is scalable machine learning using the h2o R and Python packages. H2O is an open source, distributed machine learning platform designed for big data, with the added benefit that it's easy to use on a laptop (in addition to a multi-node Hadoop or Spark cluster). The core machine learning algorithms of H2O are implemented in high-performance Java; however, fully featured APIs are available in R, Python, Scala, REST/JSON, and also through a web interface.
Because H2O's algorithm implementations are distributed, the software can scale to very large datasets that may not fit into RAM on a single machine. H2O currently features distributed implementations of Generalized Linear Models, Gradient Boosting Machines, Random Forest, Deep Neural Nets, Stacked Ensembles (aka "Super Learners"), dimensionality reduction methods (PCA, GLRM), clustering algorithms (K-means), and anomaly detection methods, among others.
R and Python code examples for H2O machine learning will be demoed live and made available on GitHub for participants to follow along on their laptops if they choose. For those interested in running the code on a multi-node Amazon EC2 cluster, an H2O AMI is also available.
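As a rough illustration of the Python workflow this abstract describes, the sketch below starts a local H2O instance, loads a CSV file, and trains a Gradient Boosting Machine. It is not the code from the talk: the function names, the 80/20 split, and the hyperparameter values are assumptions made for illustration, and it presumes the `h2o` package and a Java runtime are installed.

```python
def predictor_columns(columns, response):
    """Return every column name except the response column."""
    return [name for name in columns if name != response]


def train_gbm(csv_path, response):
    """Train a GBM on a CSV file with a local H2O instance (illustrative sketch)."""
    # Imported inside the function so the helper above works without H2O installed.
    import h2o
    from h2o.estimators import H2OGradientBoostingEstimator

    h2o.init()  # start, or connect to, a local H2O cluster
    frame = h2o.import_file(csv_path)  # parse the file into an H2OFrame
    train, valid = frame.split_frame(ratios=[0.8], seed=1)  # assumed 80/20 split

    # Hyperparameters are arbitrary placeholders, not tuned values.
    model = H2OGradientBoostingEstimator(ntrees=50, max_depth=5, seed=1)
    model.train(
        x=predictor_columns(frame.columns, response),
        y=response,
        training_frame=train,
        validation_frame=valid,
    )
    return model, model.model_performance(valid)
```

Calling `train_gbm("data.csv", "target")`, where both names are hypothetical, would return the fitted model together with its validation metrics. The same script can point at a multi-node cluster by passing connection details to `h2o.init()` instead of starting a local instance.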
Author Bio:
Dr. Erin LeDell is a Machine Learning Scientist at H2O.ai, the company that produces the open source machine learning platform, H2O. Erin received her Ph.D. in Biostatistics with a Designated Emphasis in Computational Science and Engineering from UC Berkeley. Before joining H2O.ai, she was the Principal Data Scientist at Wise.io (acquired by GE in 2016) and Marvin Mobile Security (acquired by Veracode in 2012) and the founder of DataScientific, Inc.
Intro to H2O Machine Learning in Python - Galvanize Seattle (Sri Ambati)
Erin LeDell presents Intro to H2O Machine Learning in Python at Galvanize Seattle, 02.02.16
Making Multimillion-Dollar Baseball Decisions with H2O AutoML, LIME and Shiny (Jo-fai Chow)
Joe recently teamed up with IBM and Aginity to create a proof of concept "Moneyball" app for the IBM Think conference in Vegas. The original goal was to prove that different tools (e.g. H2O, Aginity AMP, IBM Data Science Experience, R and Shiny) could work together seamlessly for common business use-cases. Little did Joe know, the app would be used by Ari Kaplan (the real "Moneyball" guy) to validate the future performance of some baseball players. Ari recommended one player to a Major League Baseball team. The player was signed the next day with a multimillion-dollar contract. This talk is about Joe's journey to a real "Moneyball" application.
Deep learning algorithms have benefited greatly from the recent performance gains of GPUs. However, it has been unclear whether GPUs can speed up machine learning algorithms such as generalized linear modeling, random forests, gradient boosting machines, and clustering. H2O.ai, the leading open source AI company, is bringing the best-of-breed data science and machine learning algorithms to GPUs.
We introduce H2O4GPU, a fully featured machine learning library that is optimized for GPUs, with a robust Python API that is a drop-in replacement for scikit-learn. We'll demonstrate benchmarks for the most common algorithms relevant to enterprise AI and showcase performance gains compared to running on CPUs.
Jon’s Bio:
https://umdphysics.umd.edu/people/faculty/current/item/337-jcm.html
H2O World 2017 Keynote - Jim McHugh, VP & GM of Data Center, NVIDIA (Sri Ambati)
Presented at #H2OWorld 2017 in Mountain View, CA.
Enjoy the recording: https://youtu.be/NyaJ7uDroww.
Learn more about H2O.ai: https://www.h2o.ai/.
Follow @h2oai: https://www.twitter.com/h2oai.
This is my Deep Water talk for the TensorFlow Paris meetup.
Deep Water is H2O's integration with multiple open source deep learning libraries such as TensorFlow, MXNet and Caffe. On top of the performance gains from GPU backends, Deep Water naturally inherits all H2O properties in scalability, ease of use, and deployment.
Intro to H2O Machine Learning in R at Santa Clara University (Sri Ambati)
Erin LeDell's presentation on Intro to H2O Machine Learning in R at SCU
"Comparing Variable Importance from Ensemble and Deep Learning Methods for AdTech Data"
Variable importance brings interpretability to popular black-box modeling techniques. In this talk we study the performance of popular ensemble techniques such as Random Forest and Gradient Boosting alongside GLM. We observe certain traits that are magnified by non-linear techniques like Deep Learning but are otherwise missed by GBM or Random Forest.
We describe H2O, an open source, scalable machine learning package whose ease of use and speed make comparing models, picking the best of breed, and building ensembles more natural. H2O's implementations of these algorithms closely track popular open source and textbook implementations.
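One way to make this kind of comparison concrete is to check how many of the top-ranked variables two models share. The sketch below is an assumption about how such a comparison could be set up, not the method used in the talk: the helper names and the top-k overlap measure are illustrative, and the H2O part presumes a running cluster and an already loaded training frame.

```python
def top_variables(varimp_rows, k):
    """Return the names of the k most important variables.

    Each row is expected to follow the layout of H2O's model.varimp() output:
    (variable, relative_importance, scaled_importance, percentage).
    """
    ranked = sorted(varimp_rows, key=lambda row: row[1], reverse=True)
    return [row[0] for row in ranked[:k]]


def importance_overlap(rows_a, rows_b, k):
    """Fraction of the top-k variables that two models have in common."""
    shared = set(top_variables(rows_a, k)) & set(top_variables(rows_b, k))
    return len(shared) / k


def compare_gbm_and_deep_learning(train, predictors, response, k=10):
    """Train a GBM and a deep learning model, then compare their top-k variables."""
    # Imported inside the function so the helpers above work without H2O installed.
    from h2o.estimators import H2ODeepLearningEstimator, H2OGradientBoostingEstimator

    gbm = H2OGradientBoostingEstimator(seed=1)
    gbm.train(x=predictors, y=response, training_frame=train)

    # Variable importances are requested explicitly for the neural network.
    dl = H2ODeepLearningEstimator(variable_importances=True, seed=1)
    dl.train(x=predictors, y=response, training_frame=train)

    return importance_overlap(gbm.varimp(), dl.varimp(), k)
```

A low overlap suggests that the two model families rely on different variables, which is the kind of difference the abstract says deep learning can surface.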
Building Deep Learning Powered Big Data: Spark Summit East talk by Jiao Wang ... (Spark Summit)
AI plays a central role in today's Internet applications and emerging intelligent systems, which are driving the need for scalable, distributed big data analytics with deep learning capabilities. There is increasing demand from organizations to discover and explore data using advanced big data analytics and deep learning. In this talk, we will share how we work with our users to build deep learning powered big data analytics applications (e.g., object detection, image recognition, NLP, etc.) using BigDL, an open source distributed deep learning library for Apache Spark.
H2O Deep Water - Making Deep Learning Accessible to Everyone (Sri Ambati)
Deep Water is H2O's integration with multiple open source deep learning libraries such as TensorFlow, MXNet and Caffe. On top of the performance gains from GPU backends, Deep Water naturally inherits all H2O properties in scalability, ease of use, and deployment. In this talk, I will go through the motivation and benefits of Deep Water. After that, I will demonstrate how to build and deploy deep learning models with or without programming experience using H2O's R/Python/Flow (Web) interfaces.
Jo-fai (or Joe) is a data scientist at H2O.ai. Before joining H2O, he was in the business intelligence team at Virgin Media in UK where he developed data products to enable quick and smart business decisions. He also worked remotely for Domino Data Lab in the US as a data science evangelist promoting products via blogging and giving talks at meetups. Joe has a background in water engineering. Before his data science journey, he was an EngD research engineer at STREAM Industrial Doctorate Centre working on machine learning techniques for drainage design optimization. Prior to that, he was an asset management consultant specialized in data mining and constrained optimization for the utilities sector in the UK and abroad. He also holds an MSc in Environmental Management and a BEng in Civil Engineering.
These slides show how to approach a multi-class classification problem using H2O. The data is an aggregated log from multiple systems that constantly report their status, connections, and traffic. In large organizations, these log datasets can be very large and hard to attribute because of the number of sources, legacy systems, etc. In our example, we use a response created for each source and then use H2O to classify the source of the data.
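A multi-class setup like the one described above could be sketched in Python roughly as follows. This is not the code from the slides: the function names, the 80/20 split, and the choice of a GBM are assumptions, and the small pure-Python helper is included only to show what a mean per-class error measures.

```python
def mean_per_class_error(confusion):
    """Average the per-class error rates of a square confusion matrix.

    confusion[i][j] counts rows whose actual class is i and predicted class is j.
    """
    errors = []
    for i, row in enumerate(confusion):
        total = sum(row)
        if total:  # ignore classes that have no rows
            errors.append(1 - row[i] / total)
    return sum(errors) / len(errors)


def train_source_classifier(frame, response):
    """Train a multinomial GBM that predicts the log source (illustrative sketch)."""
    # Imported inside the function so the helper above works without H2O installed.
    from h2o.estimators import H2OGradientBoostingEstimator

    # The response must be categorical for H2O to treat this as classification.
    frame[response] = frame[response].asfactor()
    train, valid = frame.split_frame(ratios=[0.8], seed=1)  # assumed 80/20 split
    predictors = [name for name in frame.columns if name != response]

    model = H2OGradientBoostingEstimator(distribution="multinomial", seed=1)
    model.train(x=predictors, y=response, training_frame=train, validation_frame=valid)

    performance = model.model_performance(valid)
    return model, performance.mean_per_class_error()
```

Mean per-class error is a reasonable headline metric when some log sources are much rarer than others, because every class contributes equally regardless of its size.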
Author Bio: Ashrith Barthur is a Security Scientist at H2O, currently working on algorithms that detect anomalous behaviour in user activities, network traffic, attacks, financial fraud, and global money movement. He has a PhD from Purdue University in the field of information security, specializing in anomalous behaviour in the DNS protocol.
Don’t forget to download H2O!
http://www.h2o.ai/download/
Big Data with hadoop, Spark and BigQuery (Google cloud next Extended 2017 Kar... (Imam Raza)
Google Next Extended (https://cloudnext.withgoogle.com/) is an annual Google event focusing on Google Cloud technologies. This presentation is from a tech talk held at the Google Next Extended 2017 Karachi event.
Scalable and Automatic Machine Learning with H2O (Sri Ambati)
H2O is widely used for machine learning projects. A TechCrunch article, published in January 2017 by John Mannes, reported that around 20% of Fortune 500 companies use H2O.
Talk 1: Introduction to Scalable & Automatic Machine Learning with H2O
In recent years, the demand for machine learning experts has outpaced the supply, despite the surge of people entering the field. To address this gap, there have been big strides in the development of user-friendly machine learning software that can be used by non-experts. Although H2O and other tools have made it easier for practitioners to train and deploy machine learning models at scale, there is still a fair bit of knowledge and background in data science that is required to produce high-performing machine learning models.
In this presentation, Joe will introduce the AutoML functionality in H2O. H2O's AutoML provides an easy-to-use interface which automates the process of training a large, comprehensive selection of candidate models and a stacked ensemble model which, in most cases, will be the top performing model in the AutoML Leaderboard.
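For readers who have not seen the interface, a minimal sketch of an AutoML run from Python might look like this. It assumes a running H2O cluster and an already loaded training frame; the function name and the default limits are illustrative choices, not settings from the talk.

```python
def run_automl(train, response, max_models=10, seed=1):
    """Run H2O AutoML and return the leaderboard and the best model (sketch)."""
    # Imported inside the function so the definition loads without H2O installed.
    from h2o.automl import H2OAutoML

    predictors = [name for name in train.columns if name != response]

    # max_models caps the number of candidate models that are trained.
    aml = H2OAutoML(max_models=max_models, seed=seed)
    aml.train(x=predictors, y=response, training_frame=train)

    # The leaderboard ranks all trained models; the leader is the top entry.
    return aml.leaderboard, aml.leader
```

The leader can then be used like any other H2O model, for example by calling `predict` on a new frame.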
Talk 2: Making Multimillion-dollar Baseball Decisions with H2O AutoML and Shiny
Joe recently teamed up with IBM and Aginity to create a proof of concept "Moneyball" app for the IBM Think conference in Vegas. The original goal was to prove that different tools (e.g. H2O, Aginity AMP, IBM Data Science Experience, R and Shiny) could work together seamlessly for common business use-cases. Little did Joe know, the app would be used by Ari Kaplan (the real "Moneyball" guy) to validate the future performance of some baseball players. Ari recommended one player to a Major League Baseball team. The player was signed the next day with a multimillion-dollar contract. This talk is about Joe's journey to a real "Moneyball" application.
Bio: Jo-fai (or Joe) Chow is a data scientist at H2O.ai. Before joining H2O, he was in the business intelligence team at Virgin Media in the UK, where he developed data products to enable quick and smart business decisions. He also worked remotely for Domino Data Lab in the US as a data science evangelist promoting products via blogging and giving talks at meetups. Joe has a background in water engineering. Before his data science journey, he was an EngD research engineer at STREAM Industrial Doctorate Centre working on machine learning techniques for drainage design optimization. Prior to that, he was an asset management consultant specialized in data mining and constrained optimization for the utilities sector in the UK and abroad. He also holds an MSc in Environmental Management and a BEng in Civil Engineering.
Intro to H2O in Python - Data Science LA (Sri Ambati)
Erin LeDell's presentation on Intro to H2O Machine Learning in Python at Data Science LA meetup on 1.19.16
Big Data Applications Made Easy: Fact Or Fiction? (Glenn Renfro)
With Spring XD the answer is Fact. In short, Spring XD provides a one-stop shop for writing and deploying Big Data applications. It provides a scalable, fault-tolerant, distributed runtime for Data Ingestion, Analytics, and Workflow Orchestration using a single programming, configuration and extensibility model. By reducing the complexity of Big Data development, it lets developers focus on the business problem.
In this discussion, we will cover:
• The basics of Spring XD
• Show how to deploy streams that will handle data received from multiple sources, and write the results to various sinks
• Capture some analytics from a live data stream
• Show how to create and execute Jobs
• Demonstrate the failover capabilities of an XD cluster
• Discuss how to create your own custom modules
SQLSaturday #230 - Introduction to Microsoft Big Data (Part 1) (Sascha Dittmann)
In this session, we use a practical scenario to show how concrete tasks can be solved with HDInsight:
- Fundamentals of HDInsight for Windows Server and Windows Azure
- Working with Windows Azure HDInsight
- Implementing MapReduce jobs with JavaScript and .NET code
Sparkling Water Webinar October 29th, 2014 (Sri Ambati)
Sparkling Water is the newest application on the Apache Spark in-memory platform to extend Machine Learning for better predictions and to quickly deploy models into production. H2O is proud to partner with Cloudera and Databricks to bring this capability to a wide audience.
H2O is for data scientists and business analysts who need scalable and fast machine learning. H2O is an open source predictive analytics platform. Unlike traditional analytics tools, H2O provides a combination of extraordinary math and high performance parallel processing with unrivaled ease of use. H2O speaks the language of data science with support for R, Python, Scala, Java and a robust REST API. Smart business applications are powered by H2O’s NanoFast™ Scoring Engine. Learn more by going to http://www.h2o.ai and contact us for more information.
Similar to Introduction to data science with H2O-Chicago (20)
Building LLM Solutions using Open Source and Closed Source Solutions in Coher... (Sri Ambati)
Sandeep Singh, Head of Applied AI Computer Vision, Beans.ai
H2O Open Source GenAI World SF 2023
In the modern era of machine learning, leveraging both open-source and closed-source solutions has become paramount for achieving cutting-edge results. This talk delves into the intricacies of seamlessly integrating open-source Large Language Model (LLM) solutions like Vicuna, Falcon, and Llama with industry giants such as ChatGPT and Google's Palm. As the demand for fine-tuned and specialized datasets grows, it is imperative to understand the synergy between these tools. Attendees will gain insights into best practices for building and enriching datasets tailored for fine-tuning tasks, ensuring that their LLM projects are both robust and efficient. Through real-world examples and hands-on demonstrations, this talk will equip attendees with the knowledge to harness the power of both open and closed-source tools in a coherent and effective manner.
Patrick Hall, Professor, AI Risk Management, The George Washington University
H2O Open Source GenAI World SF 2023
Language models are incredible engineering breakthroughs but require auditing and risk management before productization. These systems raise concerns about toxicity, transparency and reproducibility, intellectual property licensing and ownership, disinformation and misinformation, supply chains, and more. How can your organization leverage these new tools without taking on undue or unknown risks? While language models and associated risk management are in their infancy, a small number of best practices in governance and risk are starting to emerge. If you have a language model use case in mind, want to understand your risks, and do something about them, this presentation is for you!
Dr. Alexy Khrabrov, Open Source Science Community Director, IBM
H2O Open Source GenAI World SF 2023
In this talk, Dr. Alexy Khrabrov, recently elected Chair of the new Generative AI Commons at Linux Foundation for AI & Data, outlines the OSS AI landscape, challenges, and opportunities. With new models and frameworks being unveiled weekly, one thing remains constant: community building and validation of all aspects of AI is key to reliable and responsible AI we can use for business and society needs. Industrial AI is one key area where such community validation can prove invaluable.
Michelle Tanco, Head of Product, H2O.ai
H2O Open Source GenAI World SF 2023
Learn how the makers at H2O.ai are building internal tools to solve real use cases using H2O Wave and h2oGPT. We will walk through an end-to-end use case and discuss how to incorporate business rules and generated content to rapidly develop custom AI apps using only Python APIs.
Applied Gen AI for the Finance Vertical Sri Ambati
Megan Kurka, Vice President, Customer Data Scientist, H2O.ai
H2O Open Source GenAI World SF 2023
Discover the transformative power of Applied Gen AI. Learn how the H2O team builds customized applications and workflows that integrate capabilities of Gen AI and AutoML specifically designed to address and enhance financial use cases. Explore real world examples, learn best practices, and witness firsthand how our innovative solutions are reshaping the landscape of finance technology.
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Sri Ambati
Pascal Pfeiffer, Principal Data Scientist, H2O.ai
H2O Open Source GenAI World SF 2023
This talk dives into the expansive ecosystem of Large Language Models (LLMs), offering practitioners an insightful guide to various relevant applications, from natural language understanding to creative content generation. While exploring use cases across different industries, it also honestly addresses the current limitations of LLMs and anticipates future advancements.
Introduction to Machine Learning with H2O-3 (1)Sri Ambati
In this virtual meetup, we give an introduction to H2O-3, the number one open source machine learning platform, and show you how you can use it to develop models to solve different use cases.
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...Sri Ambati
Numerai is an open, crowd-sourced hedge fund powered by predictions from data scientists around the world. In return, participants are rewarded with weekly payouts in crypto.
In this talk, Joe will give an overview of the Numerai tournament based on his own experience. He will then explain how he automates the time-consuming tasks such as testing different modelling strategies, scoring new datasets, submitting predictions to Numerai as well as monitoring model performance with H2O Driverless AI and R.
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...Sri Ambati
In this session, you will learn about what you should do after you’ve taken an AI transformation baseline. Over the span of this session, we will discuss the next steps in moving toward AI readiness through alignment of talent and tools to drive successful adoption and continuous use within an organization.
To find additional videos on AI courses and earn badges, join the courses at the H2O.ai Learning Center: https://training.h2o.ai/products/ai-foundations-course
To watch the YouTube video of this presentation: https://youtu.be/K1Cl3x3rd8g
Speaker:
Chemere Davis (H2O.ai - Senior Data Scientist Training Specialist)
2. Who am I?
Hank Roark (hank@h2oai.com, @hankroark)
Data Scientist & Hacker @ H2O.ai
Lecturer in Systems Thinking, UIUC
13 years at John Deere doing Research, New Product
Development, New High Tech Ventures
Previously at startups and consulting
Physics Georgia Tech
Systems Design & Management MIT
3. H2O.ai Overview
• Founded: 2011 venture-backed, debuted in 2012
• Product: H2O open source in-memory prediction engine
• Team: 37 - Distributed Systems Engineers doing ML
• HQ: Mountain View, CA
H2O.ai
Machine Intelligence
4. 25,000 commits / 3yrs
H2O World Conference 2014
Team Work @ H2O.ai
28
Join H2O World Nov 9-11 2015!
5. What is H2O?
Math Platform
• Open source in-memory prediction engine
• Parallelized and distributed algorithms making the most use
API
• Easy to use and adopt
• Written in Java – perfect for Java Programmers
• REST API (JSON) –
Big Data
• More data? Or better models? BOTH
• Use all of your data – model without down sampling
• Run a
9. Reading Data from HDFS into H2O with R (STEP 2)
[Diagram: R, an H2O cluster of three H2O nodes, and HDFS holding data.csv]
2.1 R function call: h2o.importFile()
2.2 HTTP REST API request to H2O has HDFS path
2.3 Initiate distributed ingest
2.4 Request data from HDFS
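The request in Step 2 can be sketched from the Python side. The snippet below only builds the URL a client might send and does not contact a cluster. The host, the port 54321, the HDFS path, and the use of the `/3/ImportFiles` route with a `path` query parameter are assumptions made for illustration; none of them appear on the slide.

```python
from urllib.parse import urlencode


def build_import_url(cluster_ip: str, cluster_port: int, hdfs_path: str) -> str:
    """Build the REST URL that asks H2O to ingest a file by its path.

    Only the path travels over HTTP; the H2O nodes request the data
    from HDFS themselves.
    """
    query = urlencode({"path": hdfs_path})
    return f"http://{cluster_ip}:{cluster_port}/3/ImportFiles?{query}"


# Hypothetical cluster address and file location.
url = build_import_url("localhost", 54321, "hdfs://namenode/data/data.csv")
print(url)
```

With the h2o Python package installed and a cluster running, the counterpart of the R call would be `h2o.import_file(path=...)`, which issues the REST requests on the user's behalf.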
10. Reading Data from HDFS into H2O with R (STEP 3)
[Diagram: R, an H2O cluster of three H2O nodes, and HDFS holding data.csv]
3.1 HDFS provides data
3.2 Distributed H2O Frame in DKV
3.3 Return pointer to data in REST API JSON Response
3.4 h2o_df object created in R
h2o_df holds: Cluster IP, Cluster Port, Pointer to Data
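Step 3 implies that the client-side object is a small handle, not a copy of the data. A minimal Python sketch of that idea follows. The class name, the JSON layout passed to `handle_from_response`, and the `/3/Frames/<key>` route are assumptions for illustration and not the actual h2o client internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameHandle:
    """Client-side stand-in for a frame that lives in the H2O cluster.

    It holds the three items listed on the slide and no rows.
    """
    cluster_ip: str
    cluster_port: int
    frame_key: str  # pointer to the distributed frame in the cluster's K/V store

    def rest_url(self) -> str:
        # Where a client could ask the cluster about this frame (assumed route).
        return f"http://{self.cluster_ip}:{self.cluster_port}/3/Frames/{self.frame_key}"


def handle_from_response(cluster_ip: str, cluster_port: int, response: dict) -> FrameHandle:
    """Create the handle from a parsed JSON response (hypothetical layout)."""
    return FrameHandle(cluster_ip, cluster_port, response["destination_frame"]["name"])


# Hypothetical response for data.csv.
h2o_df = handle_from_response("localhost", 54321, {"destination_frame": {"name": "data.hex"}})
print(h2o_df.rest_url())
```

Because the handle carries only an address and a key, operations on it turn into further REST calls while the rows stay distributed across the cluster.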
11. R Script Starting H2O GLM
[Diagram: call path from the R script to the GLM algorithm]
User process (standard R process): R script → h2o.glm() → .h2o.startModelJob() → POST /3/ModelBuilders/glm → HTTP REST/JSON → TCP/IP
H2O process: TCP/IP → HTTP REST/JSON → /3/ModelBuilders/glm endpoint → Job → GLM algorithm → GLM tasks → Fork/Join framework and K/V store framework
Legend: Network layer, REST layer, H2O - algos, H2O - core, User process, H2O process
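The POST shown on this slide can be prepared from Python with the standard library alone. The sketch below builds the request without sending it. The base URL, the frame key `data.hex`, the column name `y`, and the form-encoded body are assumptions for illustration; only the `/3/ModelBuilders/glm` route comes from the slide.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_glm_request(base_url: str, params: dict) -> Request:
    """Prepare (but do not send) the POST that starts a GLM job."""
    body = urlencode(params).encode("utf-8")
    return Request(
        url=f"{base_url}/3/ModelBuilders/glm",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical frame key and column name.
req = build_glm_request(
    "http://localhost:54321",
    {"training_frame": "data.hex", "response_column": "y", "family": "binomial"},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would submit it to a running cluster, which, as the diagram indicates, answers by starting a job and does not return the finished model in the same response.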
12. R Script Retrieving H2O GLM Result
[Diagram: call path from the R script to the stored model]
User process (standard R process): R script → h2o.glm() → h2o.getModel() → GET /3/Models/glm_model_id → HTTP REST/JSON → TCP/IP
H2O process: TCP/IP → HTTP REST/JSON → /3/Models endpoint → Fork/Join framework and K/V store framework
Legend: Network layer, REST layer, H2O - algos, H2O - core, User process, H2O process
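Retrieving the result is a plain GET against the `/3/Models` route named on this slide. A minimal Python sketch follows, assuming a cluster at a hypothetical base URL. The helper names are made up for illustration, and nothing is assumed about the shape of the returned JSON.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def model_url(base_url: str, model_id: str) -> str:
    """URL for GET /3/Models/<model_id>, with the id percent-encoded."""
    return f"{base_url}/3/Models/{quote(model_id, safe='')}"


def fetch_model(base_url: str, model_id: str, opener=urlopen) -> dict:
    """Fetch and decode the model JSON; `opener` can be swapped out in tests."""
    with opener(model_url(base_url, model_id)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical base URL; the model id is the placeholder used on the slide.
print(model_url("http://localhost:54321", "glm_model_id"))
```

`fetch_model` is only called against a live cluster, so the example prints the URL and leaves the network call to the reader.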
13. Step 1
• Download and install h2o: h2o.ai, hit download button
• SL Guest, password SerendipityOSW
• Only requirement is JDK 1.7+
• plus required packages if using R or Python
• Pick R, Python (2.7.x), or Standalone for tonight
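The JDK 1.7+ requirement can be checked before installing anything else. The sketch below parses text in the style printed by `java -version`. The helper names and the sample strings are illustrative assumptions, and the regular expression covers only the common `version "x.y..."` form.

```python
import re
from typing import Optional, Tuple


def parse_java_version(version_output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from text such as `java -version` prints."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    return (major, minor)


def meets_requirement(version_output: str, minimum: Tuple[int, int] = (1, 7)) -> bool:
    """True when the reported version is at least `minimum` (JDK 1.7 by default)."""
    version = parse_java_version(version_output)
    return version is not None and version >= minimum


# Illustrative sample strings, not captured from a real machine.
print(meets_requirement('java version "1.7.0_80"'))
print(meets_requirement('java version "1.6.0_45"'))
print(meets_requirement('openjdk version "11.0.2" 2019-01-15'))
```

On a real machine the text could be captured with `subprocess.run(["java", "-version"], capture_output=True, text=True).stderr`, since the JVM writes its version banner to standard error. Once Java is in place, Python users would typically install the `h2o` package and call `h2o.init()` to start or connect to a local cluster.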
15. Thank You
(final holdout test set for Chicago Meetup is at https://s3.amazonaws.com/0xdata-public/hank/mikeditka.csv)
Thank you Chicago for a great time!