In this presentation, I explained that a simple machine learning model can be implemented in some situations even if we do not have the perfect conditions. We go over the first iteration of a click prediction model used in the delivery of CPC ads.
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...Carol McDonald
This discusses the architecture of an end-to-end application that combines streaming data with machine learning to do real-time analysis and visualization of where and when Uber cars are clustered, so as to analyze and visualize the most popular Uber locations.
Churn prediction is big business. It minimizes customer defection by predicting which customers are likely to cancel a service. Though originally used within the telecommunications industry, it has become common practice for banks, ISPs, insurance firms, and other verticals. More: http://info.mapr.com/WB_PredictingChurn_Global_DG_17.06.15_RegistrationPage.html
The prediction process is data-driven and often uses advanced machine learning techniques. In this webinar, we'll look at customer data, do some preliminary analysis, and generate churn prediction models – all with Spark machine learning (ML) and a Zeppelin notebook.
Spark’s ML library goal is to make machine learning scalable and easy. Zeppelin with Spark provides a web-based notebook that enables interactive machine learning and visualization.
In this tutorial, we'll do the following:
Review classification and decision trees
Use Spark DataFrames with Spark ML pipelines
Predict customer churn with Apache Spark ML decision trees
Use Zeppelin to run Spark commands and visualize the results
SparkML: Easy ML Productization for Real-Time BiddingDatabricks
dataxu bids on ads in real-time on behalf of its customers at the rate of 3 million requests a second and trains on past bids to optimize for future bids. Our system trains thousands of advertiser-specific models and runs multi-terabyte datasets. In this presentation we will share the lessons learned from our transition towards a fully automated Spark-based machine learning system and how this has drastically reduced the time to get a research idea into production. We'll also share how we: - continually ship models to production - train models in an unattended fashion with auto-tuning capabilities - tune and overbooked cluster resources for maximum performance - ported our previous ML solution into Spark - evaluate the performance of high-rate bidding models
Speakers: Maximo Gurmendez, Javier Buquet
Demystifying AI, Machine Learning and Deep LearningCarol McDonald
Deep learning, machine learning, artificial intelligence - all buzzwords and representative of the future of analytics. In this talk we will explain what is machine learning and deep learning at a high level with some real world examples. The goal of this is not to turn you into a data scientist, but to give you a better understanding of what you can do with machine learning. Machine learning is becoming more accessible to developers, and Data scientists work with domain experts, architects, developers and data engineers, so it is important for everyone to have a better understanding of the possibilities. Every piece of information that your business generates has potential to add value. This and future posts are meant to provoke a review of your own data to identify new opportunities.
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...Carol McDonald
This discusses the architecture of an end-to-end application that combines streaming data with machine learning to do real-time analysis and visualization of where and when Uber cars are clustered, so as to analyze and visualize the most popular Uber locations.
Churn prediction is big business. It minimizes customer defection by predicting which customers are likely to cancel a service. Though originally used within the telecommunications industry, it has become common practice for banks, ISPs, insurance firms, and other verticals. More: http://info.mapr.com/WB_PredictingChurn_Global_DG_17.06.15_RegistrationPage.html
The prediction process is data-driven and often uses advanced machine learning techniques. In this webinar, we'll look at customer data, do some preliminary analysis, and generate churn prediction models – all with Spark machine learning (ML) and a Zeppelin notebook.
Spark’s ML library goal is to make machine learning scalable and easy. Zeppelin with Spark provides a web-based notebook that enables interactive machine learning and visualization.
In this tutorial, we'll do the following:
Review classification and decision trees
Use Spark DataFrames with Spark ML pipelines
Predict customer churn with Apache Spark ML decision trees
Use Zeppelin to run Spark commands and visualize the results
SparkML: Easy ML Productization for Real-Time BiddingDatabricks
dataxu bids on ads in real-time on behalf of its customers at the rate of 3 million requests a second and trains on past bids to optimize for future bids. Our system trains thousands of advertiser-specific models and runs multi-terabyte datasets. In this presentation we will share the lessons learned from our transition towards a fully automated Spark-based machine learning system and how this has drastically reduced the time to get a research idea into production. We'll also share how we: - continually ship models to production - train models in an unattended fashion with auto-tuning capabilities - tune and overbooked cluster resources for maximum performance - ported our previous ML solution into Spark - evaluate the performance of high-rate bidding models
Speakers: Maximo Gurmendez, Javier Buquet
Demystifying AI, Machine Learning and Deep LearningCarol McDonald
Deep learning, machine learning, artificial intelligence - all buzzwords and representative of the future of analytics. In this talk we will explain what is machine learning and deep learning at a high level with some real world examples. The goal of this is not to turn you into a data scientist, but to give you a better understanding of what you can do with machine learning. Machine learning is becoming more accessible to developers, and Data scientists work with domain experts, architects, developers and data engineers, so it is important for everyone to have a better understanding of the possibilities. Every piece of information that your business generates has potential to add value. This and future posts are meant to provoke a review of your own data to identify new opportunities.
Artificial Intelligence for Travel - MyLittleAdventure / Welcome City Lab Dem...Valéry BERNARD
An overview of how MyLittleAdventure use Artificial Intelligence technologies to help travellers to choose their best memories : Smart inspiration engine, easy comparison of things to do in destination and many more
#MME15: How Mintigo, the Leading Predictive Marketing Platform for Enterprise...Mintigo1
Oracle Modern Marketing Experience 2015
Keynote Presentation
Description:
Is your demand generation machine broken? Are your goals unattainable with current conversion metrics? Is Sales demanding more leads and better leads? Is your budget shrinking but your goals are growing? If this sounds like you, don’t miss learning how Mintigo fixed Seagate’s demand gen machine and made the unattainable goal, attainable.
Presented by:
- Dr.Jacob Shama, CEO and Co-Founder, Mintigo
- Michael de la Torre, Sr. Director of Demand Generation, Seagate
Keynote presentation from ECBS conference. The talk is about how to use machine learning and AI in improving software engineering. Experiences from our project in Software Center (www.software-center.se).
Jakub Štěch na konferenci Data Restart 2020: Demonstrujeme propojení online a offline dat na případové studii České spořitelny. Jak vytvořit zákaznickou 360° a kde jsou možná omezení? Jak s cílením souvisí NLP a neuronové sítě? Proč je důležité mít celé datové prostředí u sebe jako klient?
Dear Sir/Ma’am
I am interested to work as a data specialist in your organization. I believe my experience, skills and work attitude will aid your organization in a great way. Please accept my enclosed resume with this letter.
I worked at Accenture for the last four years. My key responsibilities here were to collect, analyse, store and create data. I made sure that these data were accurate and not damaged. As far as my educational background is concerned, I have a bachelor's degree in EXTC. I am excellent at solving problems and have great analytical skills. I am capable of working well with network administration and can explain the technical problems.
I would appreciate if we could meet up for an interview wherein we can discuss more on this. I can be contacted at +919493377607 or you can email me at imtiaz.khan.sw39@gmail.com
Thank You.
Yours sincerely,
Imtiaz Khan
We introduce LEAD a formal CEP that was presented at DEBS 2019. Using a running example, we explain how the pattern matching job is generated using Aged Colored Petri Net as logical execution plan. The implementation is currently under Flink Streaming.
Big Data in Advertising Industry — Oleksandr Fedirko, Danylo StepanchukGlobalLogic Ukraine
This presentation deals with Big Data usage in advertising industry, namely in ad exchange business. It contains an overview of Big Data tools, a respective architectural approach and Java application for MapReduce.
This presentation was held by Oleksandr Fedirko (Lead Software Engineer, Consultant, GlobalLogic) and Danylo Stepanchuk (Lead Software Engineer, Consultant, GlobalLogic) at GlobalLogic Kyiv Java Career Day on August 11, 2018.
Learn more: https://www.globallogic.com/ua/events/globallogic-kyiv-java-career-day
Slidedeck for my session on Insider Dev Tour 2019 (Lisbon Jul 29th).
Mostly based on tools and platform support for AI workloads and the options for edge computing and cloud computing.
ML.NET, WinML, DirectML, Model Builder, Azure Cognitive Services, ...
Marketers have more data available than ever but struggle to pull in together in a usable format. Customer Data Platforms promise to solve this problem by offering easy-to-deploy systems specializing in data unification and sharing. But can CDP really deliver on its promise? This workshop will equip you to understand the definition of a CDP, how CDPs differ from other systems, which features are shared by all CDPs and which are found in only some, the most important CDP use cases, how to select the right CDP, how to manage a successful deployment and where to look next for more information.
Reviewing progress in the machine learning certification journey
𝗦𝗽𝗲𝗰𝗶𝗮𝗹 𝗔𝗱𝗱𝗶𝘁𝗶𝗼𝗻 - Short tech talk on How to Network by Qingyue(Annie) Wang
C𝗼𝗻𝘁𝗲𝗻𝘁 𝗿𝗲𝘃𝗶𝗲𝘄 𝗼𝗻 AI and ML on Google Cloud by Margaret Maynard-Reid
𝗔 𝗳𝗼𝗰𝘂𝘀𝗲𝗱 𝗰𝗼𝗻𝘁𝗲𝗻𝘁 𝗿𝗲𝘃𝗶𝗲𝘄 𝗼𝗻 𝗠𝗟 𝗽𝗿𝗼𝗯𝗹𝗲𝗺 𝗳𝗿𝗮𝗺𝗶𝗻𝗴, 𝗺𝗼𝗱𝗲𝗹 𝗲𝘃𝗮𝗹𝘂𝗮𝘁𝗶𝗼𝗻, 𝗮𝗻𝗱 𝗳𝗮𝗶𝗿𝗻𝗲𝘀𝘀 by Sowndarya Venkateswaran.
A discussion on sample questions to aid certification exam preparation.
An interactive Q&A session to clarify doubts and questions.
Previewing next steps and topics, including course completions and material reviews.
Artificial Intelligence for Travel - MyLittleAdventure / Welcome City Lab Dem...Valéry BERNARD
An overview of how MyLittleAdventure use Artificial Intelligence technologies to help travellers to choose their best memories : Smart inspiration engine, easy comparison of things to do in destination and many more
#MME15: How Mintigo, the Leading Predictive Marketing Platform for Enterprise...Mintigo1
Oracle Modern Marketing Experience 2015
Keynote Presentation
Description:
Is your demand generation machine broken? Are your goals unattainable with current conversion metrics? Is Sales demanding more leads and better leads? Is your budget shrinking but your goals are growing? If this sounds like you, don’t miss learning how Mintigo fixed Seagate’s demand gen machine and made the unattainable goal, attainable.
Presented by:
- Dr.Jacob Shama, CEO and Co-Founder, Mintigo
- Michael de la Torre, Sr. Director of Demand Generation, Seagate
Keynote presentation from ECBS conference. The talk is about how to use machine learning and AI in improving software engineering. Experiences from our project in Software Center (www.software-center.se).
Jakub Štěch na konferenci Data Restart 2020: Demonstrujeme propojení online a offline dat na případové studii České spořitelny. Jak vytvořit zákaznickou 360° a kde jsou možná omezení? Jak s cílením souvisí NLP a neuronové sítě? Proč je důležité mít celé datové prostředí u sebe jako klient?
Dear Sir/Ma’am
I am interested to work as a data specialist in your organization. I believe my experience, skills and work attitude will aid your organization in a great way. Please accept my enclosed resume with this letter.
I worked at Accenture for the last four years. My key responsibilities here were to collect, analyse, store and create data. I made sure that these data were accurate and not damaged. As far as my educational background is concerned, I have a bachelor's degree in EXTC. I am excellent at solving problems and have great analytical skills. I am capable of working well with network administration and can explain the technical problems.
I would appreciate if we could meet up for an interview wherein we can discuss more on this. I can be contacted at +919493377607 or you can email me at imtiaz.khan.sw39@gmail.com
Thank You.
Yours sincerely,
Imtiaz Khan
We introduce LEAD a formal CEP that was presented at DEBS 2019. Using a running example, we explain how the pattern matching job is generated using Aged Colored Petri Net as logical execution plan. The implementation is currently under Flink Streaming.
Big Data in Advertising Industry — Oleksandr Fedirko, Danylo StepanchukGlobalLogic Ukraine
This presentation deals with Big Data usage in advertising industry, namely in ad exchange business. It contains an overview of Big Data tools, a respective architectural approach and Java application for MapReduce.
This presentation was held by Oleksandr Fedirko (Lead Software Engineer, Consultant, GlobalLogic) and Danylo Stepanchuk (Lead Software Engineer, Consultant, GlobalLogic) at GlobalLogic Kyiv Java Career Day on August 11, 2018.
Learn more: https://www.globallogic.com/ua/events/globallogic-kyiv-java-career-day
Slidedeck for my session on Insider Dev Tour 2019 (Lisbon Jul 29th).
Mostly based on tools and platform support for AI workloads and the options for edge computing and cloud computing.
ML.NET, WinML, DirectML, Model Builder, Azure Cognitive Services, ...
Marketers have more data available than ever but struggle to pull in together in a usable format. Customer Data Platforms promise to solve this problem by offering easy-to-deploy systems specializing in data unification and sharing. But can CDP really deliver on its promise? This workshop will equip you to understand the definition of a CDP, how CDPs differ from other systems, which features are shared by all CDPs and which are found in only some, the most important CDP use cases, how to select the right CDP, how to manage a successful deployment and where to look next for more information.
Reviewing progress in the machine learning certification journey
𝗦𝗽𝗲𝗰𝗶𝗮𝗹 𝗔𝗱𝗱𝗶𝘁𝗶𝗼𝗻 - Short tech talk on How to Network by Qingyue(Annie) Wang
C𝗼𝗻𝘁𝗲𝗻𝘁 𝗿𝗲𝘃𝗶𝗲𝘄 𝗼𝗻 AI and ML on Google Cloud by Margaret Maynard-Reid
𝗔 𝗳𝗼𝗰𝘂𝘀𝗲𝗱 𝗰𝗼𝗻𝘁𝗲𝗻𝘁 𝗿𝗲𝘃𝗶𝗲𝘄 𝗼𝗻 𝗠𝗟 𝗽𝗿𝗼𝗯𝗹𝗲𝗺 𝗳𝗿𝗮𝗺𝗶𝗻𝗴, 𝗺𝗼𝗱𝗲𝗹 𝗲𝘃𝗮𝗹𝘂𝗮𝘁𝗶𝗼𝗻, 𝗮𝗻𝗱 𝗳𝗮𝗶𝗿𝗻𝗲𝘀𝘀 by Sowndarya Venkateswaran.
A discussion on sample questions to aid certification exam preparation.
An interactive Q&A session to clarify doubts and questions.
Previewing next steps and topics, including course completions and material reviews.
Similar to Using Machine Learning in the delivery of ads (20)
Online social networks support users in a wide range of activities, such as sharing information and making recommendations. In this talk, I will talk about a human-generated recommendations of friendship based on hashtags that emerged in Twitter around 2010. I will present a study of a large-scale corpus of human friendship recommendations, using a large corpus of tweets gathered over a 24 week period and involving a set of nearly 6 million users. I will show that these explicit recommendations had a measurable effect on the process of link creation, increasing the chance of link creation between two and three times on average, compared with a recommendation-free scenario. Also, ties created after such recommendations have up to 6% more longevity than other Twitter ties. Finally, I will talk about a supervised system that ranks our user-generated recommendations, surfacing the most valuable ones with high precision (0.52 MAP). We find that features describing users and the relationships between them are discriminative for this task. After the talk, we will carry out some examples on the collection of online data
Discovering Culture in Social Media and a Brief Case of Collective MemoryRuth Garcia Gavilanes
'Discovering Cultural Trails in Social Media and Collective Memory in Wikipedia', with speaker Dr Ruth García-Gavilanes, Oxford Internet Institute.
As a computational social science researcher, Ruth is interested in understanding online footprints, utilizing/developing computational methods and leveraging big data. In this seminar Ruth will present two case studies in this field: a) a study of how one’s action on Twitter (e.g., deciding when to post messages) is linked to one’s culture (e.g., country’s Pace of Life) and b) a case study of how collective memories can be measured using Wikipedia articles related to aircraft incidents and accidents.
To what extent do people tweet in other languages beyond English?
How do lingua groups interact with each other?
Is there an effect of language over online user interaction?
PhD defense contributions:
Providing a study of human generated recommendation on Twitter and its effect.
García-Gavilanes et al. Follow My Friends This Friday! An Analysis of Human-generated Friendship Recommendations. SocInfo’13 [Best paper award]
Describing the evolution of user behavior over time regarding the content they generate.
García-Gavilanes et al. Who are my Audiences? A Study of the Evolution of Target Audiences in Microblogs. SocInfo’14
Describing differences and similarities of users across countries regarding the way people tweet and connect with others.
García-Gavilanes et al. Microblogging without Borders: Differences and Similarities. Websci’11.
w/ Poblete et al. Do All Birds Tweet the Same? Characterizing Twitter Around the World. In CIKM’11
Proposing how to combine anthropological studies of culture with large scale data.
Correlating how and when people tweet with dimensions of national culture and pace of life
García-Gavilanes et al. Cultural Dimensions in Twitter: Time, Individualism and Power. ICWSM’13 [Honorable mention]
Improving the prediction of the communication strength between users from different countries by taking into account several cultural and socio-economic indicators taken from diverse sources.
García-Gavilanes et al. Twitter ain’t Without Frontiers: Economic, Social, and Cultural Boundaries in International Communication. CSCW’14.
Who are my Audiences? Evolution of Target Audiences in MicroblogsRuth Garcia Gavilanes
User behavior in online social media is not static, it evolves
through the years. In Twitter, we have witnessed a maturation of its platform and its users due to endogenous and exogenous reasons. While the research using Twitter data has expanded rapidly, little work has studied the change/evolution in the Twitter ecosystem itself. In this pa-
per, we use a taxonomy of the types of tweets posted by around 4M users during 10 weeks in 2011 and 2013. We classify users according to their tweeting behavior, and nd 5 clusters for which we can associate a different dominant tweeting type. Furthermore, we observe the evolution of
users across groups between 2011 and 2013 and interesting insights such as the decrease in conversations and increase in URLs sharing. Our findings suggest that mature users evolve to adopt Twitter as a news media rather than a social network.
Follow My Friends This Friday! An Analysis of Human-generated Friendship Reco...Ruth Garcia Gavilanes
In Twitter, the hashtag #FF, or #FOLLOWFRIDAY, arose as a popular convention for users to create contact recommendations for others. Hitherto, there has not been any quantitative study of the effect of such human-generated recommendations. This paper is the first study of a large-scale corpus of human friendship recommendations based on such hashtags, using a large corpus of recommendations gathered over a 24 week period and involving a set of nearly 6 million users. We show that these explicit recommendations have a measurable effect on the process of link creation, increasing the chance of link creation between two and three times on average, compared with a recommendation-free scenario. Also, ties created after such recommendations have up to 6\% more longevity than other Twitter ties. Finally, we build a supervised system to rank user-generated recommendations, surfacing the most valuable ones with high precision ($0.52$ MAP), and we find that features describing users and the relationships between them, are discriminative for this task.
Is the culture of a country associated with the way people use Twitter? The answer is a definite “Yes!” We analyzed more than 2.34 million geo-located user profiles in 30 countries plus their tweets for 10 weeks
Blogpost: http://crowdresearch.org/blog/?p=6767
Paper:http://www.aaai.org/ocs/index.php/ICWSM/ICWSM13/paper/view/6102
Acorn Recovery: Restore IT infra within minutesIP ServerOne
Introducing Acorn Recovery as a Service, a simple, fast, and secure managed disaster recovery (DRaaS) by IP ServerOne. A DR solution that helps restore your IT infra within minutes.
0x01 - Newton's Third Law: Static vs. Dynamic AbusersOWASP Beja
f you offer a service on the web, odds are that someone will abuse it. Be it an API, a SaaS, a PaaS, or even a static website, someone somewhere will try to figure out a way to use it to their own needs. In this talk we'll compare measures that are effective against static attackers and how to battle a dynamic attacker who adapts to your counter-measures.
About the Speaker
===============
Diogo Sousa, Engineering Manager @ Canonical
An opinionated individual with an interest in cryptography and its intersection with secure software development.
This presentation, created by Syed Faiz ul Hassan, explores the profound influence of media on public perception and behavior. It delves into the evolution of media from oral traditions to modern digital and social media platforms. Key topics include the role of media in information propagation, socialization, crisis awareness, globalization, and education. The presentation also examines media influence through agenda setting, propaganda, and manipulative techniques used by advertisers and marketers. Furthermore, it highlights the impact of surveillance enabled by media technologies on personal behavior and preferences. Through this comprehensive overview, the presentation aims to shed light on how media shapes collective consciousness and public opinion.
This presentation by Morris Kleiner (University of Minnesota), was made during the discussion “Competition and Regulation in Professions and Occupations” held at the Working Party No. 2 on Competition and Regulation on 10 June 2024. More papers and presentations on the topic can be found out at oe.cd/crps.
This presentation was uploaded with the author’s consent.
Competition and Regulation in Professional Services – KLEINER – June 2024 OEC...
Using Machine Learning in the delivery of ads
1. Using Machine
Learning in the
delivery of ads
Ruth Garcia
NDR Conference – June 07, 2018
Iasi, Romania
2. What is Skyscanner?
Skyscanner is a leading global travel search site offering a
comprehensive and free flight search service as well as online
comparisons for hotels, car hire and now trains.
3. Data Science at Skyscanner
Decision Science:1. Ensuring we have the right data; insights and
information to make the most impactful, scientific decisions in every aspect
of our operations.
Building Data Products:2. Leveraging our vast wealth of data to build more
contextual, relevant products for Travellers and Travel Suppliers (our
Partners)
D.S
Eng.
16. Expectation vs. reality
Tensorfiow
Optimization technique•
Embeddings•
Crossed columns•
Hashing•
Expectation
Flexible but with the risk of not
being fast enough.
Features:
User history,•
User features,•
Route features ,•
Ad features with,•
colors, text
17. Expectation vs. reality
Reality
Not flexible but fast and
easier to implement.
Tensorfiow
Optimization technique•
Embeddings•
Crossed columns•
Hashing•
Features:
User history,•
User features,•
Route features ,•
Ad features with,•
colors, text
18. Challenges in building the model
Goal: model, that estimates the likelihood of
whether an impression will result in a click or not.
19. Challenges: Which model to use?
Model Possibilities (easy to read in node.js):
• Logistic regression
• Random Forest : gets lost
• Neural networks: too slow hard to put it in json
Solvers:
• Logistic regression: Liblinear, sag
• SGDClassifier
Train all data at once
Train data in batches
Saves memory
Gridsearch for
hyperparameteres
20. Challenges: Categorical values
Categorical values:
One hot• encoding
One hot encoding grouping less frequent features•
Hashing trick•
One hot encoding•
• One hot encoding grouping
• Less frequent features
Creatives
C1
C2
C3
C1 C2 C3
1 0 0
0 1 0
0 0 1
Creatives
C1
C2
C3
C4
C1 C2 C3 C4
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
21. Challenges: Categorical values
id features
123 creative1,
advertiser2,mobile, etc.
321 creative2,
advertiser4,mobile, etc.
id Feat_1 Feat_2 Feat_3 …. Feat_k
123 1 0 1 …. 0
321 1 0 0 …. 1
Hashing Trick: Same hashing function in training, testing and production•
We gain
22. Machine Learning Performance
Precision at 1: based on
target groups.
Mean Reciprocal Rank
AUC: if caring about ranking
Log-Loss: if caring about the
value of CTR
Other metrics:
23. Challenges: How to monitor performance?
Updating model based on different
Sampling methods.
24. The road ahead: Balancing exploitation and exploration
Choose ad based on
ONLY CTR
Choose ad based on
OTHER criteria
25. The road even further ahead: quality
Post-click, colors, images, text,
landing page
26. Conclusions
1. Machine Learning can bring many benefits to the product the challenge is to prove it
2. Bringing ML into production could hard at the beginning
3. Cooperation between data scientists and engineers is crucial
D.S
Eng.