Online Display Advertising Optimization with H2O talks about how to get lift with Machine Learning and Data Science on Big Data. Comparison of different algorithms, Gradient Boosting Machine, Random Forest, Generalized Linear Modeling.
Created a presentation in PowerPoint that examines the impact of the growth of mobile/tablet use on web design and convinces my prospective client that they should invest the time and resources to make sure their website is up-to-date.
Sparkling Water Applications Meetup 07.21.15Sri Ambati
Michal Malohlava's Sparkling Water Applications Meetup on 07.21.15, focusing on the Ask Craig use case.
http://h2o.ai/blog/2015/06/ask-craig-sparkling-water/
Created a presentation in PowerPoint that examines the impact of the growth of mobile/tablet use on web design and convinces my prospective client that they should invest the time and resources to make sure their website is up-to-date.
Sparkling Water Applications Meetup 07.21.15Sri Ambati
Michal Malohlava's Sparkling Water Applications Meetup on 07.21.15, focusing on the Ask Craig use case.
http://h2o.ai/blog/2015/06/ask-craig-sparkling-water/
Core principles for successful Ad monetization / Vlad Muntean (Google)DevGAMM Conference
- How do Ad Networks work
- Supply and demand of Ads
- Is a Detailed Mediation Waterfall Necessary
- How to minimize the amount of time necessary for setting up your mediation
- User is King
- The importance of paying attention to users rather than mediation setup
Onyx Beacon: technology and commercial presentation 2015Onyx Beacon
Complete presentation of our solution, including our iBeacons, CMS, SDK and mobile applications. Introducing the most common use cases of our solution: retail proximity marketing, events marketing, asset tracking, smart public transport and hospitality.
Deconstructing the Programmatic EcosystemKatana Media
Programmatic advertising is primed to have another monumental year in the digital marketing space, with a projected 24% growth to $27.47 billion in 2017. n this installment of our monthly webinar series, Katana’s Executive Chairman, Andreas Roell, and Media Director, Laura Wusthoff, will be sharing exclusive trends and tips in our webinar, Deconstructing the Programmatic Ecosystem. Read on to learn how your marketing strategy can embrace the efficiencies of programmatic advertising to meet your company’s revenue goals.
Connecting Applications from Mobile to Mainframe in the Application EconomyCA Technologies
The Application Economy continues to mount increasing customer demands on businesses worldwide, and for many in Mainframe organizations, success requires adopting a DevOps environment that allow development and operation teams to remain coordinated in designing and deploying software applications. Application Development solutions from CA Technologies provide a common set of tools that promote DevOps and allow customers to rapidly innovate and iterate critical applications and services.
Lear more: http://cainc.to/RgSy8t
Marketing in the Moment: Trends and Innovations in Real-Time Omni-Channel Mar...Ensighten
Presented by Dave Chaffey, CEO and Co-Founder, Smart Insights
Better understanding and acting on the holistic customer journey is paramount to success for today’s marketers. Reaching that goal, however, has become increasingly complicated due to an explosion of disparate technologies and fragmented data sources. How can marketers collect and stitch together the customer data they need, and then act upon that information to drive results? Join digital marketing expert Dave Chaffey as he discusses the latest trends, innovations and best practices for driving more timely and relevant interactions across touch points. Dave will discuss the opportunities and challenges for real-time personalization; give examples of what leading brands are doing today; and talk about how more and more brands are turning to the customer data platform (CDP) and other core technologies to accelerate their initiatives.
Devtodev is universal toolbox for mobile app developers, completely covering demands of both small teams and big publishers in analytics, user acquisition and retention, monetization and more.
The Evolution of OOH through Programmatic Drew Thachuk
Like most industries, out-of-home ad networks are undergoing a major digitization transition,affecting their network operations and sales processes. Global OOH ad spend is on the rise, driven by increased investment by media owners in technology and digital signage. This transition has made the medium more flexible and measurable, making it more easily executable and comparable alongside other digital channels. Most networks have begun steps to enable programmatic sales on their networks
Industry overview of the mobile user acquisition space going over the ad networks, attribution trackers, and in-app analytics tools.
Also see my Medium post for my information: http://bit.ly/1np5X6s
All recommendations are personal opinion and does not represent Flow State Media.
Core principles for successful Ad monetization / Vlad Muntean (Google)DevGAMM Conference
- How do Ad Networks work
- Supply and demand of Ads
- Is a Detailed Mediation Waterfall Necessary
- How to minimize the amount of time necessary for setting up your mediation
- User is King
- The importance of paying attention to users rather than mediation setup
Onyx Beacon: technology and commercial presentation 2015Onyx Beacon
Complete presentation of our solution, including our iBeacons, CMS, SDK and mobile applications. Introducing the most common use cases of our solution: retail proximity marketing, events marketing, asset tracking, smart public transport and hospitality.
Deconstructing the Programmatic EcosystemKatana Media
Programmatic advertising is primed to have another monumental year in the digital marketing space, with a projected 24% growth to $27.47 billion in 2017. n this installment of our monthly webinar series, Katana’s Executive Chairman, Andreas Roell, and Media Director, Laura Wusthoff, will be sharing exclusive trends and tips in our webinar, Deconstructing the Programmatic Ecosystem. Read on to learn how your marketing strategy can embrace the efficiencies of programmatic advertising to meet your company’s revenue goals.
Connecting Applications from Mobile to Mainframe in the Application EconomyCA Technologies
The Application Economy continues to mount increasing customer demands on businesses worldwide, and for many in Mainframe organizations, success requires adopting a DevOps environment that allow development and operation teams to remain coordinated in designing and deploying software applications. Application Development solutions from CA Technologies provide a common set of tools that promote DevOps and allow customers to rapidly innovate and iterate critical applications and services.
Lear more: http://cainc.to/RgSy8t
Marketing in the Moment: Trends and Innovations in Real-Time Omni-Channel Mar...Ensighten
Presented by Dave Chaffey, CEO and Co-Founder, Smart Insights
Better understanding and acting on the holistic customer journey is paramount to success for today’s marketers. Reaching that goal, however, has become increasingly complicated due to an explosion of disparate technologies and fragmented data sources. How can marketers collect and stitch together the customer data they need, and then act upon that information to drive results? Join digital marketing expert Dave Chaffey as he discusses the latest trends, innovations and best practices for driving more timely and relevant interactions across touch points. Dave will discuss the opportunities and challenges for real-time personalization; give examples of what leading brands are doing today; and talk about how more and more brands are turning to the customer data platform (CDP) and other core technologies to accelerate their initiatives.
Devtodev is universal toolbox for mobile app developers, completely covering demands of both small teams and big publishers in analytics, user acquisition and retention, monetization and more.
The Evolution of OOH through Programmatic Drew Thachuk
Like most industries, out-of-home ad networks are undergoing a major digitization transition,affecting their network operations and sales processes. Global OOH ad spend is on the rise, driven by increased investment by media owners in technology and digital signage. This transition has made the medium more flexible and measurable, making it more easily executable and comparable alongside other digital channels. Most networks have begun steps to enable programmatic sales on their networks
Industry overview of the mobile user acquisition space going over the ad networks, attribution trackers, and in-app analytics tools.
Also see my Medium post for my information: http://bit.ly/1np5X6s
All recommendations are personal opinion and does not represent Flow State Media.
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Sri Ambati
Sandeep Singh, Head of Applied AI Computer Vision, Beans.ai
H2O Open Source GenAI World SF 2023
In the modern era of machine learning, leveraging both open-source and closed-source solutions has become paramount for achieving cutting-edge results. This talk delves into the intricacies of seamlessly integrating open-source Large Language Model (LLM) solutions like Vicuna, Falcon, and Llama with industry giants such as ChatGPT and Google's Palm. As the demand for fine-tuned and specialized datasets grows, it is imperative to understand the synergy between these tools. Attendees will gain insights into best practices for building and enriching datasets tailored for fine-tuning tasks, ensuring that their LLM projects are both robust and efficient. Through real-world examples and hands-on demonstrations, this talk will equip attendees with the knowledge to harness the power of both open and closed-source tools in a coherent and effective manner.
Patrick Hall, Professor, AI Risk Management, The George Washington University
H2O Open Source GenAI World SF 2023
Language models are incredible engineering breakthroughs but require auditing and risk management before productization. These systems raise concerns about toxicity, transparency and reproducibility, intellectual property licensing and ownership, disinformation and misinformation, supply chains, and more. How can your organization leverage these new tools without taking on undue or unknown risks? While language models and associated risk management are in their infancy, a small number of best practices in governance and risk are starting to emerge. If you have a language model use case in mind, want to understand your risks, and do something about them, this presentation is for you!
Dr. Alexy Khrabrov, Open Source Science Community Director, IBM
H2O Open Source GenAI World SF 2023
In this talk, Dr. Alexy Khrabrov, recently elected Chair of the new Generative AI Commons at Linux Foundation for AI & Data, outlines the OSS AI landscape, challenges, and opportunities. With new models and frameworks being unveiled weekly, one thing remains constant: community building and validation of all aspects of AI is key to reliable and responsible AI we can use for business and society needs. Industrial AI is one key area where such community validation can prove invaluable.
Michelle Tanco, Head of Product, H2O.ai
H2O Open Source GenAI World SF 2023
Learn how the makers at H2O.ai are building internal tools to solve real use cases using H2O Wave and h2oGPT. We will walk through an end-to-end use case and discuss how to incorporate business rules and generated content to rapidly develop custom AI apps using only Python APIs.
Applied Gen AI for the Finance Vertical Sri Ambati
Megan Kurka, Vice President, Customer Data Scientist, H2O.ai
H2O Open Source GenAI World SF 2023
Discover the transformative power of Applied Gen AI. Learn how the H2O team builds customized applications and workflows that integrate capabilities of Gen AI and AutoML specifically designed to address and enhance financial use cases. Explore real world examples, learn best practices, and witness firsthand how our innovative solutions are reshaping the landscape of finance technology.
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Sri Ambati
Pascal Pfeiffer, Principal Data Scientist, H2O.ai
H2O Open Source GenAI World SF 2023
This talk dives into the expansive ecosystem of Large Language Models (LLMs), offering practitioners an insightful guide to various relevant applications, from natural language understanding to creative content generation. While exploring use cases across different industries, it also honestly addresses the current limitations of LLMs and anticipates future advancements.
Introducción al Aprendizaje Automatico con H2O-3 (1)Sri Ambati
En esta reunión virtual, damos una introducción a la plataforma de aprendizaje automático de código abierto número 1, H2O-3 y te mostramos cómo puedes usarla para desarrollar modelos para resolver diferentes casos de uso.
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...Sri Ambati
Numerai is an open, crowd-sourced hedge fund powered by predictions from data scientists around the world. In return, participants are rewarded with weekly payouts in crypto.
In this talk, Joe will give an overview of the Numerai tournament based on his own experience. He will then explain how he automates the time-consuming tasks such as testing different modelling strategies, scoring new datasets, submitting predictions to Numerai as well as monitoring model performance with H2O Driverless AI and R.
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...Sri Ambati
In this session, you will learn about what you should do after you’ve taken an AI transformation baseline. Over the span of this session, we will discuss the next steps in moving toward AI readiness through alignment of talent and tools to drive successful adoption and continuous use within an organization.
To find additional videos on AI courses, earn badges, join the courses at H2O.ai Learning Center: https://training.h2o.ai/products/ai-foundations-course
To find the Youtube video about this presentation: https://youtu.be/K1Cl3x3rd8g
Speaker:
Chemere Davis (H2O.ai - Senior Data Scientist Training Specialist)
Learn SQL from basic queries to Advance queriesmanishkhaire30
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
2. 2
OUTLINE
Introducing ShareThis
Online display advertising problem
Estimation of conversion rate using H2O
Results from live campaigns
Ongoing work
Q&A
3. SHARING TOOLS AT SCALE
23 Billion PAGE
VIEWS
120 SOCIAL
CHANNELS
1. comScore Media Matrix Report * Includes PC, Tablet, and Mobile sites.
210 MM US USERS1
95% REACH*
2.4 MM SITES AND
APPS
4. This is Missy!
She is busy chatting
and browsing on the
web…
USER
Missy reads an article and
shares it to her Facebook
page using the ShareThis
widget
SOCIAL ACTIVITY
ShareThis observes the
share and can then target
Missy and her friends with
advertising messages
tailored to their interests
SOCIAL DATA
MAKING SOCIAL DATA ACTIONABLE
5. CATEGORY TARGETING: TECHNOLOGY
TVS
1.1 MM
AUDIO
800K
SMARTPHONES
13.7 MM
TABLETS
5.3 MM
PCs
6 MM
GAMING
7 MM
CAMERAS
1.3 MM
28.6 MM
USERS
35 MM+
SOCIAL ACTIONS
1.2 MM
SOCIAL ACTIONS/DAY
7. 7
ONLINE DISPLAY ADVERTISING
Advertisers’ goal is to target the most receptive online audience
in the right context and right time, so that to influence users to
engage with the ad.
Publisher Web
Page
Ad Ad
Exchange
Model Pipeline
(Production)
Real Time
Bidding (RTB)
System
ShareThis Data
Campaign DataMeta Data
Models
8. 8
ONLINE DISPLAY ADVERTISING
Campaign Performance
Advertisers seek the optimal price to bid for each ad call.
Cost per Click (CPC) Model
Cost per Action (CPA) Model
9. 9
MODELING CONVERSION RATE (CVR)
CTR and CVR are directly related to the user interacting with the
ad in a given context.
Challenge
They are fundamentally difficult to directly model and predict.
Even CVR is harder than CTR since conversion are very rare
events
View-through conversions have longer delays in the logging
system.
10. 10
PROBLEM SETUP
Let define Users, Publishers, Ads, Devices, and Locations as:
Goal
Find the optimal ad such that the probability of conversion is the
highest.
11. 11
PROBLEM SETUP
At single user level, the problem is a binary problem: conversion
or no conversion.
Conversion event is a random binary event
Transactional (low-level) data features are poorly correlated with
user’s direct response on a display ad.
15. 15
PRACTICAL ISSUES
Data Imbalance
CVR is inherently very low
Need to up-sample conversions or down-sample non conversions
Remove Anomalies
Retargeting visit data as proxy for cnv when cnv data is not available
Remove outliers
Missing Features
Sometimes features are missing or not enough conversions
Impute features
Feature Selection
Discard feature if more than 70% of the training examples are missing
Variance of attribution is lower than a threshold (10e-9)
16. 16
WHY NEW MACHINE LEARNING TOOL?
Available large-scale ML tools such as Apache Mahout, Vowpal Wabbit, Hadoop
RMR, native Spark MLLib have their own issues.
Critical Features for a state-of-the-art ML package:
Ease of use
System reliability
In-memory (fast)
Distributed
Extensible (API/SDK)
Accurate algorithms
Visualization (data and results)
Easy to deploy to production
18. 18
H2O PLATFORM: GLM MODEL
Screen shot for the CPA model using the GLM algorithm.
19. 19
SCORE CALIBRATION
Calibrate Model Scores
Find best threshold from AUC
Ad server attributes a conversion to the last impression
RTB needs to deliver certain amount of impressions per day
There is a trade-off between wasting impressions and winning
conversions.
20. 20
BUILDING A CPA MODEL
RETARGETED VISITS AS A PROXY FOR CONVERSIONS
USER-CENTRIC
Focus on RT Users
Deliver Ads at the optimal
times
BETTER
PERFORMANCE
Leverage optimization
opportunities
OPTIMAL TIME
Target Users Who Likely
Convert
DON’T WASTE IMP.
21. 21
LIVE TEST ON A CAR INSURANCE CAMPAIGN
TESTED FOR TWO MONTHS AND MEASURED THE PERFORMANCE BY DFA.
The CPA test for a car insurance campaign showed 58% improvement on
eCPA and 57% on conversion rate (CVR).
23. 23
ONGOING WORK
Tests are expensive and time consuming
We need to evaluate models before deploying to production
Build many models and evaluate them offline
Different datasets
Different features
Different algorithms
24. 24
COMBINING ESTIMATORS
GRADIENT BOOSTING MACHINE
Let denote categorical features.
Goal
Estimate CVR using an ensemble of weak prediction models,
decision trees:
Gradient boosting combines weak learners into a single strong
learner, in an iterative fashion.
27. 27
OFFLINE SIMULATIONS
Selecting models in practice
Accuracy of prediction on unseen data
Scoring time at production
Remove anomalies using Deep Learning
Correlations with other campaign KPIs (CTR, Brand lift,
Viewability, Winning Price, …)
Performance Stability
31. 31
CONCLUSION
How H2O helped us?
Maximized ROI by optimizing campaign performance and
budget allocation.
Empowered advanced ML algorithms in Hadoop cluster
Used all data and build models much faster
Reduced R&D time significantly
Building a smooth model building pipeline (R and Spark API)
You’ve probably seen and used the ShareThis widget and tools … which isn’t surprising.
We allow content to be shared seamlessly at nearly ubiquitous scale web-wide.
As consumers are celebrating, entertaining and educating their circles of friends, colleagues and community members, ShareThis is at the center of each of those moments …
I can share an example of one of those moments and how it leads to your brand engaging with one of your customers …
Let’s take Missy, for example. As Missy is browsing the web, she comes across an article about laptops.
Now because she’s just started her research for that new laptop before heading off to college in the fall, she’s going to post and email that article to various networks.
ShareThis observes this share as well as the downstream social activity which allows us to effectively deliver the right message tailored on your behalf.
Let’s explore more about how this happens …
With our category shareblock, you can surround entire categories or subverticals that matter most to your brand.
As an example, EVERY DAY nearly 1M social actions happen around technology-related topics. Here, you could immerse your brand into that technology content & the actions that accompany it.
GET READY FOR MUTLIPLE FORMATS AND SCREENS …
MOVING AWAY FROM OUTDATED AUDIENCE TARGETING BUCKETS – TO UTILIZING “FRESHER” REAL-TIME DATA .
Other companies use standard audience targeting and bucket Dan as a “tech enthusiast”, we message him at the moments when it’s most relevant.