Russian ECommerce Portal Avito Uses Big Data to
Master Just-in-Time Ad Fraud Detection at Scale
Transcript of a BriefingsDirect podcast on how a Russian ECommerce and search engine site is
leveraging data analytics to grow at a rapid pace.
Listen to the podcast. Find it on iTunes. Sponsor: HP
Dana Gardner: Hello, and welcome to the next edition of the HP Discover Podcast Series. I'm
Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this
ongoing sponsored discussion on IT innovation and how it’s making an impact
on people’s lives.
Once again, we're focusing on how companies are adapting to the new style of
IT to improve IT performance and deliver better user experiences, as well as
better business results.
Our next innovation case study interview highlights how Avito in Moscow, an
eCommerce site and portal, is using Big Data technology to improve the placement of
advertisements and to better understand how their users are adapting to this new age of IT and
advertising.
With that, please join me in welcoming our guest. We're here with Nikolay
Golov, the Chief Data Warehousing Architect at Avito. Welcome.
Nikolay Golov: Hi.
Gardner: Tell us a little bit about your site and your business at Avito. Not that
many people here in North America probably know about it, but it sounds like
it's the Craigslist of Russia.
Golov: Yes, absolutely. Avito is a Russian Craigslist. It's a big site in Russia, and it's also the
biggest search engine for some goods. We have more searches, for example, for iPhones on
Avito than on Google or Yandex. Yandex is a Russian Google.
Gardner: So does this cover all of retail type of goods, services, business-to-business? Tell us
about the breadth of goods and services that are on your site at Avito.
Golov: On Avito, you can sell almost anything that can be bought in the market. You can sell
cars, you can sell houses, or rent them, for example. You can even find boats or business jets.
Now, we have about three business jets listed.
Gardner: So quite a diversity. What are your big data needs? It sounds as if in a country as large
as Russia, with that many goods and services, you have a volume-of-data issue. What is it that
attracted you to seeking a warehouse in a big-data implementation?
Size advantage
Golov: The first main advantage of Avito is its size. Everybody in Russia knows that if you
want to buy or sell something, the best place for it is Avito. That's first.
Second is speed. It is very easy to use. We have a very easy interface. So we
must keep these two advantages. But there are also some bad people who
want to use Avito to sell weapons, drugs, and prohibited medicines. It's absolutely
critical for Avito to keep it clean, to prevent such items from appearing in
our visitors' queries.
We're growing very fast, and if we use moderators, we'll have to increase our
expense on moderation in a linear progression as we grow. So, the only
solution to avoid a linear increase in expenses is to use some automation.
Gardner: So, in order to rapidly, and in an automated fashion, decide what should or should
not be appearing on your site, you've decided to use a data warehouse that provides a streaming
real-time data effect. Tell me what your requirements are for the technology.
Golov: Yes, you're right. We have various requirements. For example, we need to be able to
perform fast fraud detection. The warehouse has to have very little delay. Hours are not
permitted; it must be 10 minutes, no more.
Second, we have to have data for a long period of history to train our data mining algorithms, to
create reports, and to analyze trends. So our data warehouse has to be big. It has to store months,
possibly years, of data. So it has to be fast, or only slightly delayed, and it has to be big.
Third, we're developing very fast. We're adding some new services, and we're integrating with
partners. Not long ago, for example, we added information from Google AdWords to optimize
banners. So the warehouse must be very flexible. It must be able to grow.
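As a rough illustration of the first two requirements Golov lists (a delay of no more than 10 minutes, and months of stored history), the check below is a minimal sketch. The function name, the 90-day reading of "months", and the sample timestamps are assumptions made for the example, not details from the interview.

```python
from datetime import datetime, timedelta

# Thresholds based on the interview: data at most 10 minutes old,
# and at least "months" of history kept in the warehouse.
MAX_LOAD_DELAY = timedelta(minutes=10)
MIN_HISTORY = timedelta(days=90)  # assumption: "months" read as about 3 months


def meets_requirements(oldest_row, newest_row, now):
    """Return (fresh_enough, deep_enough) for a table's timestamp range."""
    fresh_enough = (now - newest_row) <= MAX_LOAD_DELAY
    deep_enough = (now - oldest_row) >= MIN_HISTORY
    return fresh_enough, deep_enough


# Hypothetical timestamps, for illustration only.
now = datetime(2015, 1, 1, 12, 0)
print(meets_requirements(datetime(2014, 6, 1), datetime(2015, 1, 1, 11, 55), now))
print(meets_requirements(datetime(2014, 12, 1), datetime(2015, 1, 1, 9, 0), now))
```

The first call describes a table that is five minutes behind and holds about seven months of data, so both checks pass; the second is three hours behind with one month of data, so both fail.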
Gardner: So, Nikolay, how long have you been using HP Vertica and how did you come to
choose that particular platform?
Golov: Over a year now. We chose Vertica for two main advantages. First, the speed of
data loading. The I/O speed provided by Vertica was awesome.
Second is its ability to upgrade, thanks to commodity hardware. So if you have some new
requirements that require you to increase performance, you can just buy new hardware,
commodity hardware, and its power just increases.
It's great and it can be done really fast. We're doing just that this year. So Vertica was the winner.
Measuring the impact
Gardner: And do you have any sense of what the performance and characteristics of Vertica
and your data warehouse have gotten for you? Do you have a sense of reduced fraud by X
percent or better analytics that have given you a business advantage of some sort? Are there any
ways to measure the impact?
Golov: I don't remember them all, but I do know that during the last year, Avito grew really fast. We
had a moderation team of about 250 people at the beginning of this process. Now, we have the
same moderation team, but the number of items has doubled. I suppose that's one of the
best measures that can be used.
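The arithmetic behind that measure can be worked through in a few lines. This is only a sketch using the two figures quoted in the interview (about 250 moderators, items doubled); the variable names are illustrative.

```python
moderators_start = 250  # "about 250" at the beginning, per the interview
item_growth = 2         # the number of items has doubled

# If moderation expense grew linearly with items, headcount would scale too.
moderators_if_linear = moderators_start * item_growth
moderators_now = moderators_start  # "the same moderation team"

avoided_hires = moderators_if_linear - moderators_now
items_per_moderator_ratio = item_growth * moderators_start / moderators_now

print(moderators_if_linear)       # 500
print(avoided_hires)              # 250
print(items_per_moderator_ratio)  # 2.0
```

Under the linear-growth assumption Golov describes earlier, the team would have needed roughly 500 people; holding it at about 250 means each moderator now covers about twice as many items.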
Gardner: Fair enough. Now, looking to the future, when you're working in a business where
your margins, your business, your revenue comes from the ability to provide advertisement
placements and value to your sellers and buyers, will there be a data warehouse and analytics
value to improving the performance and the value on the actual distribution of ads and the costs
associated with that?
That is to say, in addition to fraud protection, is there a value from your analytics over a period
of time by which you will be able to refine the business algorithms and/or actual ability to
provide value to your customers?
Golov: We're starting a few more products. Their main aim is to create our own tool for
optimizing the directions of advertising. We have banners, marketing campaigns, and SMS. So
we've achieved some results in our reporting and fraud prevention. We'll continue to work in that
direction, and we are planning to add some new types of functionality to our data warehouse.
Gardner: It certainly seems that a data warehouse is perhaps something that delivers a tactical
benefit or value, but then over time, very rapidly moves to multiple tactical benefits or a strategic
benefit. The more data, inference, and understanding you have of your processes, the more
powerful you can become as a total business.
Golov: Yes. One of my teachers in data warehousing explained the role of a data warehouse in an
enterprise. It's like a diesel drive inside a ship. It just works, works, and works, and it's hot
around it. You can create various tools to increase it, to make it better.
But there always must be something deep inside that provides all of the tools with the correct fuel
and clearer data gathered from all sides to follow the business.
Gardner: I wonder about others who are listening to you and saying, "We really need to have that
core platform in order to build out these other values over time." Do you have any lessons that
you have learned that you might share? That is to say, if you're starting out to develop your own
data warehouse and your own business intelligence (BI) and analytics capabilities, do you have
any advice that you would offer people?
Be flexible
Golov: First, you have to be ready to be flexible. If you ask the business about something, if
you ask them whether it's going to change, they'll tell you that it can't, that it will be absolutely this,
every time. And in two months, it will change. If you're not ready to change the ratio of your data
warehouse to get such data, it will be a disaster. That's first.
Second, there always will be errors in data, there will be gaps, and it's absolutely critical to start
building a data warehouse together with an automated data quality system that will automatically
control and monitor the quality of data and will help you to see the problems when they occur.
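A minimal sketch of the kind of automated check Golov describes, flagging errors (here, missing values) and gaps in loaded data, might look like the following. The row layout, the field names `loaded_at` and `item_id`, and the 10-minute gap threshold are hypothetical choices for the example; the interview does not describe how Avito's data quality system is built.

```python
from datetime import datetime, timedelta


def find_quality_problems(rows, max_gap=timedelta(minutes=10)):
    """Report missing values and time gaps in a batch of loaded rows.

    rows: list of dicts with hypothetical keys "loaded_at" (datetime)
    and "item_id"; both names are illustrative only.
    """
    problems = []
    ordered = sorted(rows, key=lambda r: r["loaded_at"])
    for i, row in enumerate(ordered):
        # An error in the data: a required field is empty.
        if row.get("item_id") is None:
            problems.append(("missing item_id", row["loaded_at"]))
        # A gap in the data: too long since the previous row arrived.
        if i > 0:
            previous = ordered[i - 1]["loaded_at"]
            if row["loaded_at"] - previous > max_gap:
                problems.append(("gap in data", previous))
    return problems


# Hypothetical batch, for illustration only.
t0 = datetime(2015, 1, 1, 12, 0)
batch = [
    {"loaded_at": t0, "item_id": 1},
    {"loaded_at": t0 + timedelta(minutes=5), "item_id": None},
    {"loaded_at": t0 + timedelta(minutes=30), "item_id": 3},
]
for problem in find_quality_problems(batch):
    print(problem)
```

Running such a check on every load, rather than on demand, is what lets problems surface when they occur instead of when a report looks wrong.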
Gardner: I'm afraid we'll have to leave it there. We've been discussing how Avito, a large
e-commerce portal and super site in Moscow, has been deploying a data warehouse and BI
capability to not only prevent fraud, but also to grow its business through a better understanding
of its customers and processes.
So, a big thank you to our guest. We've been here with Nikolay Golov, the Chief Data
Warehousing Architect at Avito. Thank you so much.
Golov: Thanks a lot.
Gardner: And I'd like to thank our audience as well for joining us today for our special new
style of IT discussion.
I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of
HP sponsored discussions. Thanks so much for listening, and don't forget to come back next
time.
Copyright Interarbor Solutions, LLC, 2005-2015. All rights reserved.
You may also be interested in:

 • How Waste Management Builds a Powerful Services Continuum Across Operations, Infrastructure, Development, and IT Processes
 • GSN Games hits top prize using big data to uncover deep insights into gamer preferences
 • Hybrid cloud models demand more infrastructure standardization, says global service provider Steria
 • Service providers gain new levels of actionable customer intelligence from big data analytics
 • How UK data solutions developer Systems Mechanics uses HP Vertica for BI, streaming and data analysis
 • Advanced cloud service automation eases application delivery for global service provider NNIT
 • HP network management heightens performance while reducing total costs for Nordic telco TDC
 • How Capgemini's UK financial services unit helps clients manage risk using big data analysis
 • Perfecto Mobile goes to cloud-based testing so developers can build the best apps faster
 • Software security pays off: How Heartland Payment Systems gains steep ROI via software assurance tools and methods
 • HP ART documentation and readiness tools bring better user experiences to Nordic IT solutions provider EVRY

Russian ECommerce Portal Avito Uses Big Data to Master Just-in-Time Ad Fraud Detection at Scale

  • 1. Russian ECommerce Portal Avito Uses Big Data to Master Just-in-Time Ad Fraud Detection at Scale Transcript of a BriefingsDirect podcast on how a Russian ECommerce and search engine site is leveraging data analystics to grow at a rapid pace. Listen to the podcast. Find it on iTunes. Sponsor: HP Dana Gardner: Hello, and welcome to the next edition of the HP Discover Podcast Series. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this ongoing sponsored discussion on IT innovation and how it’s making an impact on people’s lives. Once again, we're focusing on how companies are adapting to the new style of IT to improve IT performance and deliver better user experiences, as well as better business results. Our next innovation case study interview highlights how Avito in Moscow, an eCommerce site and portal, is using Big Data technology to improve the placement of advertisements and to better understand how their users are adapting to this new age of IT and advertising. With that, please join me in welcoming our guest. We're here with Nikolay Golov, the Chief Data Warehousing Architect at Avito. Welcome. Nikolay Golov: Hi. Gardner: Tell us a little bit about your site and your business at Avito, not that many people here in North America probably know about it, but it sounds like it's the Craigslist of Russia. Golov: Yes, absolutely. Avito is a Russian Craigslist. It's a big site in Russia and also it’s the biggest search engine for some goods. We have more searches, for example, from iPhone on Avitos and on Google or Yandex. Yandex is a Russian Google. Become a member of MyVertica Register now And gain access to the Free HP Vertica Community Edition. Gardner: So does this cover all of retail type of goods, services, business-to-business? Tell us about the breadth of goods and services that are on your site at Avito. Gardner
  • 2. Golov: On Avito, you can sell almost anything that can be bought in the market. You can sell cars, you can sell houses, or rent them, for example. You can even find boats or business jets. Now, we have about three business jets listed. Gardner: So quite a diversity. What are your big data needs. It sounds as if in a country as large as Russia with that many goods and services, you have a volume-of-data issue. What is it that attracted you to seeking a warehouse in a big-data implementation? Size advantage Golov: The main advantages of Avito is firstly its size. Everybody in Russia knows that if you want to buy or sell something, the best place for it is Avito. It’s first. Second is speed. It is very easy to use it. We have a very easy interface. So we must keep these two advantages. But there are also some  bad people which want to  use Avito to sell weapons, drugs, prohibited medicines. It's absolutely critical for Avito to keep it clean, to prevent such items from appearing in such queries of our visitors. We're growing very fast and if we use moderators, we'll have to increase our expense on moderation in a linear progressions as we grow. So, the only solution to avoid a linear increase in expenses is to use some automation. Gardner: So, in order to rapidly, and in an automated fashion, decide which should or should not be appearing on your site, you’ve decided to use a data warehouse that provides a streaming real-time data effect. Tell me what your requirements are for the technology? Golov: Yes, you're right. We have various requirements. For example, we need to be able to perform fast fraud detection. The warehouse have to have a very little delay. Hours are not permitted, it must be 10 minutes, no more. Second, we have to have data for long period of history to learn our data mining algorithms, to create reports,  and to analyze trends. So our data warehouse has to be big. It has to store months, possibly years, of data. 
So it has to be fast, or only slightly delayed, and it has to be big. Third, we're developing very fast. We're adding some new services, and we're integrating  with partners. Not long ago, for example, we added information from Google AdWords to optimize banners. So the warehouse must be very flexible. It must be able to grow. Gardner: So, Nikolay, how long have you been using HP Vertica and how did you come to choose that particular platform? Golov: Over a year now. We chose Vertica for two two main advantages. First, speed of load and data. The I/O speed provided by Vertica was awesome. Golov
Second is its ability to upgrade, thanks to commodity hardware. If you have new requirements that call for more performance, you can just buy new commodity hardware, and the warehouse's power increases. It's great, and it can be done really fast. We're doing just that this year. So Vertica was the winner.
Measuring the impact
Gardner: And do you have any sense of what the performance and characteristics of Vertica and your data warehouse have gotten for you? Do you have a sense of reduced fraud by X percent or better analytics that have given you a business advantage of some sort? Are there any ways to measure the impact?
Golov: I don't remember them all, but I do know that during the last year, Avito grew really fast. We had a moderation team of about 250 people at the beginning of this process. Now, we have the same moderation team, but the number of items has doubled. I suppose that's one of the best measures that can be used.
Gardner: Fair enough. Now, looking to the future, when you're working in a business where your margins, your business, your revenue come from the ability to provide advertisement placements and value to your sellers and buyers, will there be a data warehouse and analytics value to improving the performance and the value of the actual distribution of ads and the costs associated with that?
That is to say, in addition to fraud protection, is there a value from your analytics over a period of time by which you will be able to refine the business algorithms and/or the actual ability to provide value to your customers?
Golov: We're starting a few more products. Their main aim is to create our own tool for optimizing the directions of advertising. We have banners, marketing campaigns, and SMS. So we've achieved some results in our reporting and fraud prevention.
We'll continue to work in that direction, and we are planning to add some new types of functionality to our data warehouse.
Gardner: It certainly seems that a data warehouse is perhaps something that delivers a tactical benefit or value, but then over time, very rapidly moves to multiple tactical benefits or a strategic benefit. The more data, inference, and understanding you have of your processes, the more powerful you can become as a total business.
Golov: Yes. One of my teachers in data warehousing explained the role of a data warehouse in an enterprise. It's like a diesel drive inside a ship. It just works, works, and works, and it's hot around it.
You can create various tools to increase it, to make it better. But there always must be something deep inside that provides all of the tools with the correct fuel and clearer data gathered from all sides to follow the business.
Gardner: I wonder about others who are listening to you and saying, "We really need to have that core platform in order to build out these other values over time." Do you have any lessons that you have learned that you might share? That is to say, if you're starting out to develop your own data warehouse and your own business intelligence (BI) and analytics capabilities, do you have any advice that you would offer people?
Be flexible
Golov: First, you have to be ready to be flexible. If you ask the business about something, if you ask them whether it's going to change, they'll tell you that it can't, that it will be exactly this, every time. And in two months, it will change. If you're not ready to change the ratio of your data warehouse to get such data, it will be a disaster. That's first.
Second, there always will be errors in data, there will be gaps, and it's absolutely critical to start building a data warehouse together with an automated data quality system that will automatically control and monitor the quality of data and will help you see the problems when they occur.
Gardner: I'm afraid we'll have to leave it there. We've been discussing how Avito, a large e-commerce portal and super site in Moscow, has been deploying a data warehouse and BI capability to not only prevent fraud, but also to grow its business through a better understanding of its customers and processes.
So, a big thank you to our guest. We've been here with Nikolay Golov, the Chief Data Warehousing Architect at Avito. Thank you so much.
Golov: Thanks a lot.
Gardner: And I'd like to thank our audience as well for joining us today for our special new style of IT discussion. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of HP sponsored discussions. Thanks so much for listening, and don't forget to come back next time.
Copyright Interarbor Solutions, LLC, 2005-2015. All rights reserved.
You may also be interested in:
• How Waste Management Builds a Powerful Services Continuum Across Operations, Infrastructure, Development, and IT Processes
• GSN Games hits top prize using big data to uncover deep insights into gamer preferences
• Hybrid cloud models demand more infrastructure standardization, says global service provider Steria
• Service providers gain new levels of actionable customer intelligence from big data analytics
• How UK data solutions developer Systems Mechanics uses HP Vertica for BI, streaming and data analysis
• Advanced cloud service automation eases application delivery for global service provider NNIT
• HP network management heightens performance while reducing total costs for Nordic telco TDC
• How Capgemini's UK financial services unit helps clients manage risk using big data analysis
• Perfecto Mobile goes to cloud-based testing so developers can build the best apps faster
• Software security pays off: How Heartland Payment Systems gains steep ROI via software assurance tools and methods
• HP ART documentation and readiness tools bring better user experiences to Nordic IT solutions provider EVRY