Russian ECommerce Portal Avito Uses Big Data to Master Just-in-Time Ad Fraud Detection at Scale
Transcript of a BriefingsDirect podcast on how a Russian ECommerce and search engine site is
leveraging data analytics to grow at a rapid pace.
Listen to the podcast. Find it on iTunes. Sponsor: HP
Dana Gardner: Hello, and welcome to the next edition of the HP Discover Podcast Series. I'm
Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this
ongoing sponsored discussion on IT innovation and how it’s making an impact
on people’s lives.
Once again, we're focusing on how companies are adapting to the new style of
IT to improve IT performance and deliver better user experiences, as well as
better business results.
Our next innovation case study interview highlights how Avito in Moscow, an
eCommerce site and portal, is using Big Data technology to improve the placement of
advertisements and to better understand how their users are adapting to this new age of IT and
advertising.
With that, please join me in welcoming our guest. We're here with Nikolay
Golov, the Chief Data Warehousing Architect at Avito. Welcome.
Nikolay Golov: Hi.
Gardner: Tell us a little bit about your site and your business at Avito. Not that
many people here in North America probably know about it, but it sounds like
it's the Craigslist of Russia.
Golov: Yes, absolutely. Avito is a Russian Craigslist. It's a big site in Russia and it’s also the
biggest search engine for some goods. We have more searches, for example, for iPhone on
Avito than on Google or Yandex. Yandex is a Russian Google.
Become a member of MyVertica
Register now
And gain access to the Free HP Vertica Community Edition.
Gardner: So does this cover all types of retail goods, services, business-to-business? Tell us
about the breadth of goods and services that are on your site at Avito.
Golov: On Avito, you can sell almost anything that can be bought in the market. You can sell
cars, you can sell houses, or rent them, for example. You can even find boats or business jets.
Now, we have about three business jets listed.
Gardner: So quite a diversity. What are your big data needs? It sounds as if in a country as large
as Russia with that many goods and services, you have a volume-of-data issue. What is it that
attracted you to seeking a data warehouse in a big-data implementation?
Size advantage
Golov: The main advantage of Avito is, first, its size. Everybody in Russia knows that if you
want to buy or sell something, the best place for it is Avito. That’s first.
Second is speed. It is very easy to use. We have a very easy interface. So we
must keep these two advantages. But there are also some bad people who
want to use Avito to sell weapons, drugs, or prohibited medicines. It's absolutely
critical for Avito to keep it clean, to prevent such items from appearing in the
queries of our visitors.
We're growing very fast, and if we use moderators, we'll have to increase our
expense on moderation in a linear progression as we grow. So, the only
solution to avoid a linear increase in expenses is to use some automation.
Gardner: So, in order to rapidly, and in an automated fashion, decide what should or should
not be appearing on your site, you’ve decided to use a data warehouse that provides a streaming
real-time data effect. Tell me what your requirements are for the technology.
Golov: Yes, you're right. We have various requirements. For example, we need to be able to
perform fast fraud detection. The warehouse has to have very little delay. Hours are not
permitted; it must be 10 minutes, no more.
Second, we have to have data for a long period of history to train our data mining algorithms, to
create reports, and to analyze trends. So our data warehouse has to be big. It has to store months,
possibly years, of data. So it has to be fast, or only slightly delayed, and it has to be big.
Third, we're developing very fast. We're adding some new services, and we're integrating with
partners. Not long ago, for example, we added information from Google AdWords to optimize
banners. So the warehouse must be very flexible. It must be able to grow.
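The first requirement above, a delay of no more than 10 minutes, can be illustrated with a minimal Python sketch. This is not Avito's implementation; the function name `is_fresh`, the constant `MAX_DELAY`, and the sample timestamps are assumptions made up for illustration. Only the 10-minute limit comes from the interview.

```python
from datetime import datetime, timedelta, timezone

# The 10-minute limit is from the interview; everything else is hypothetical.
MAX_DELAY = timedelta(minutes=10)


def is_fresh(latest_loaded_at: datetime, now: datetime,
             max_delay: timedelta = MAX_DELAY) -> bool:
    """Return True if the newest loaded record is within the allowed delay."""
    return now - latest_loaded_at <= max_delay


# Made-up reference time so the example is deterministic.
now = datetime(2015, 1, 1, 12, 0, tzinfo=timezone.utc)

print(is_fresh(now - timedelta(minutes=4), now))  # True: 4 minutes behind
print(is_fresh(now - timedelta(hours=2), now))    # False: hours are not permitted
```

A check like this would typically run on a schedule against the timestamp of the most recently loaded row, so that a stalled load is noticed before fraud detection falls behind.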
Gardner: So, Nikolay, how long have you been using HP Vertica and how did you come to
choose that particular platform?
Golov: Over a year now. We chose Vertica for two main advantages. First, speed of data
loading. The I/O speed provided by Vertica was awesome.
Second is its ability to upgrade, thanks to the commodity hardware. So if you have some new
requirements that require you to increase performance, you can just buy new hardware,
commodity hardware, and its power just increases.
It’s great and it can be done really fast. We're doing just that this year. So Vertica was the winner.
Measuring the impact
Gardner: And do you have any sense of what the performance and characteristics of Vertica
and your data warehouse have gotten for you? Do you have a sense of reduced fraud by X
percent or better analytics that have given you a business advantage of some sort? Are there any
ways to measure the impact?
Golov: I don’t remember them all, but I do know that during the last year, Avito grew really fast. We
had a moderation team of about 250 people at the beginning of this process. Now, we have the
same moderation team, but the number of items has doubled. I suppose that's one of the
best measures that can be used.
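The arithmetic behind that measure can be written out in a short Python sketch. It assumes the linear staffing model Golov described earlier, in which moderation headcount grows in proportion to item volume; the function name `moderators_needed` is hypothetical. The team size of 250 and the doubling of items are the figures quoted in the interview.

```python
import math


def moderators_needed(base_team: int, growth_factor: float) -> int:
    """Headcount required if staffing scaled linearly with item volume."""
    return math.ceil(base_team * growth_factor)


base_team = 250      # moderation team size quoted in the interview
growth_factor = 2.0  # item volume doubled over the same period

linear_team = moderators_needed(base_team, growth_factor)
print(linear_team)              # 500 moderators under purely manual moderation
print(linear_team - base_team)  # 250 additional hires avoided
```

Under that assumption, keeping the team flat while items doubled means each moderator now covers twice as many listings as before.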
Gardner: Fair enough. Now, looking to the future, when you're working in a business where
your margins, your business, your revenue comes from the ability to provide advertisement
placements and value to your sellers and buyers, will there be a data warehouse and analytics
value to improving the performance and the value on the actual distribution of ads and the costs
associated with that?
That is to say, in addition to fraud protection, is there a value from your analytics over a period
of time by which you will be able to refine the business algorithms and/or actual ability to
provide value to your customers?
Golov: We're starting a few more products. Their main aim is to create our own tool for
optimizing the directions of advertising. We have banners, marketing campaigns, and SMS. So
we've achieved some results in our reporting and fraud prevention. We'll continue to work in that
direction and we are planning to add some new types of functionality to our data warehouse.
Gardner: It certainly seems that a data warehouse is perhaps something that delivers a tactical
benefit or value, but then over time, very rapidly moves to multiple tactical benefits or a strategic
benefit. The more data, inference, and understanding you have of your processes, the more
powerful you can become as a total business.
Golov: Yes. One of my teachers in data warehousing explained the role of the data warehouse
in the enterprise. It’s like a diesel drive inside a ship. It just works, works, and works, and it’s hot
around it. You can create various tools to increase it, to make it better.
But there always must be something deep inside that provides all of the tools with the correct fuel
and clear data gathered from all sides to follow the business.
Gardner: I wonder about others who are listening to you and saying, "We really need to have that
core platform in order to build out these other values over time." Do you have any lessons that
you have learned that you might share? That is to say, if you're starting out to develop your own
data warehouse and your own business intelligence (BI) and analytics capabilities, do you have
any advice that you would offer people?
Be flexible
Golov: First, you have to be ready to be flexible. If you ask the business about something, if
you ask them whether it's going to change, they'll tell you that it can’t; it will be exactly this,
every time. And in two months, it will change. If you're not ready to change the ratio of your data
warehouse to get such data, it will be a disaster. That's first.
Second, there always will be errors in data, there will be gaps, and it's absolutely critical to start
building a data warehouse together with an automated data quality system that will automatically
control and monitor the quality of data and will help you to see the problems when they occur.
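The kind of automated check described here, one that counts errors and finds gaps in loaded data, can be sketched in a few lines of Python. This is an illustrative sketch only, not the system Avito built: the `Row` type, its `item_id` column, the `quality_report` function, the 10-minute gap threshold, and the sample rows are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Row:
    loaded_at: datetime
    item_id: Optional[int]  # hypothetical column; None models a data error


def quality_report(rows, max_gap=timedelta(minutes=10)):
    """Count rows with a missing item_id and find gaps between consecutive loads."""
    missing = sum(1 for r in rows if r.item_id is None)
    times = sorted(r.loaded_at for r in rows)
    # A gap is any pair of consecutive load times further apart than max_gap.
    gaps = [(a, b) for a, b in zip(times, times[1:]) if b - a > max_gap]
    return {"rows": len(rows), "missing_item_id": missing, "gaps": gaps}


# Made-up sample data: one missing value and one 40-minute hole in the loads.
t0 = datetime(2015, 1, 1, 12, 0)
rows = [
    Row(t0, 1),
    Row(t0 + timedelta(minutes=5), None),
    Row(t0 + timedelta(minutes=45), 3),
]

report = quality_report(rows)
print(report["missing_item_id"])  # 1
print(len(report["gaps"]))        # 1
```

Running such a report after every load, and alerting when either count is nonzero, is one simple way to see problems when they occur rather than when a downstream report looks wrong.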
Gardner: I'm afraid we'll have to leave it there. We've been discussing how Avito, a large
e-commerce portal and super site in Moscow, has been deploying a data warehouse and BI
capability to not only prevent fraud, but also to grow its business through a better understanding
of its customers and processes.
So, a big thank you to our guest. We've been here with Nikolay Golov, the Chief Data
Warehousing Architect at Avito. Thank you so much.
Golov: Thanks a lot.
Gardner: And I'd like to thank our audience as well for joining us today for our special new
style of IT discussion.
I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of
HP sponsored discussions. Thanks so much for listening, and don't forget to come back next
time.
Copyright Interarbor Solutions, LLC, 2005-2015. All rights reserved.