Daum Communications Case Study

DAUM COMMUNICATIONS
Using big data analytics to understand and predict
user behavior
ESSENTIALS
Industry
Telecommunications
Company Size
2,000+ employees
Business Challenges
• Reduced responsiveness due to
inability to perform realtime
analysis
• Increased complexity from NoSQL
database management systems
• Reliance on resource-intensive
data analysis
• Reduced capability to make ad-
hoc queries on unstructured data
Solution
• EMC VNX unified storage
• Pivotal Greenplum Database
OVERVIEW
Daum Communications (Daum) is one of the leading providers of Korean-language
online services, including the news and information portal Daum.net, web-based email
service Hanmail.net, and the Daum Cafe online community. Headquartered in Jeju
Island, the company provides mobile web services, search marketing, and electronic
mapping. It also sells online advertising products through Daum.net. Daum is the
second largest web portal service provider in terms of daily visits in Korea and has
operating centers in Seoul and on Jeju Island.
Through its extensive range of Internet services and sale of online advertising
products, Daum generates vast amounts of unstructured data. The company has one
of the largest Apache Hadoop clusters in Korea, and analyzes its data to gain critical
competitive information in a number of areas, including user preferences and
behavior, search rankings, and advertisement targeting.
COMPLEX ENVIRONMENT IMPEDES DATA ANALYSIS
Facing intense domestic and global competition from a number of search engines that
are growing market share across desktop and mobile searches, Daum’s businesses
needed to make faster and better decisions to protect the company’s 20 percent share
of the Korean search market.
The company needed to analyze and make immediate decisions on its vast data stores
by extracting knowledge from its data in real time. But Daum was more interested in
solving analytic problems than in exploring relationships between data that are
available in traditional relational database systems. As a result, Daum was using
Hadoop to store data, and was using NoSQL non-relational database management
systems such as Cassandra and Storm as the Hadoop Distributed File System (HDFS)
to provide greater speed in performing Big Data analytics on unstructured data. This
solution landscape presented the company with serious challenges.
“Performing ad-hoc and multidimensional queries and analysis through Hadoop on our
unstructured data proved difficult,” says Jun-Sik Eom, Team Manager, Data
Technology Department, Daum Communications. “We were restricted in the speed of
data analysis due the batch processing of both unstructured and structured data,
which meant we relied heavily on the capability of our developers. Data analysis of
complex forms was also challenging in the NoSQL database.”
Because Daum’s data must be constantly reviewed, the company sought a solution
that would enable employees to perform high-speed queries on the data residing in
Hadoop. Additionally, Daum wanted to improve access through tools that were already
familiar to developers and database administrators.
CUSTOMER PROFILE

Benefits
• Increased data loading and
processing speeds
• Improved accuracy in generating
search results and predicting user
behaviour
• Increased efficiency by
performing rapid queries on the
data
• Reduced expenditures through
improved scalability
PIVOTAL GREENPLUM DATABASE ENABLES HIGH-SPEED
ANALYSIS OF UNSTRUCTURED DATA
Daum evaluated solutions that could address the limitations in the resource-intensive
analysis required by Hadoop and the NoSQL database management systems. To meet
the data analysis requirements for its search engine and Internet services businesses,
the company selected Pivotal Greenplum Database, which connects to Hadoop and
enables the co-processing of both structured and unstructured data within a single
solution.
“We were attracted to Pivotal Greenplum Database because of the advantage it had in
mixing the merits of database, data warehouse, and business intelligence,” says Eom.
“We can now use a single platform to run high-speed analytic queries on our most
appropriate data stores.”
“We were attracted to Pivotal Greenplum Database because of the
advantage it had in mixing the merits of database, data
warehouse, and business intelligence. We can now use a single
platform to run high-speed analytic queries on our most
appropriate data stores.”
Jun-Sik Eom,
Team Manager, Data Technology Department, Daum Communications
DELIVERING NEW BUSINESS INSIGHTS FROM REALTIME
ANALYSIS
To support its efforts to gain market share, Daum is using Pivotal Greenplum Database
to provide improved services and search accuracy to its users. Through realtime data
gathering and analysis of Internet searches and user behavior within its various online
services, the company can better predict future behavior and demand.
Daum can now make multiple queries—both in real time and over time as user patterns
and knowledge emerge—due to massively parallel processing (MPP) architecture, which
enables fast data loading and high-speed queries on the data. In addition to performing
real-time weblog analysis, the company can re-analyze data that has already been
processed and gain meaningful results with these various interpretations. Pivotal
helped Daum achieve an increased depth of knowledge, which is just as critical as
breadth in terms of delivering services.
ELIMINATING ROADBLOCKS TO SPEEDY QUERYING
Performing ad-hoc queries on the data stored in NoSQL databases from the Pivotal
Greenplum Database means administrators can use familiar SQL commands to perform
massive and multidimensional analysis. This reduces the company’s reliance on finding
specialist NoSQL and Hadoop skill sets, and minimizes the workload for employees.
“One of the most important elements in effectively using Big Data is securing the right
people,” says Eom. “We used to struggle with having the resources needed to perform
queries, which greatly reduced our processing efficiency. Today, instead of performing
queries on the NoSQL systems, we collect the data residing in Hadoop and NoSQL, and
then save it in Pivotal Greenplum Database to execute the analysis.”

ENABLING CONTINUOUS PROCESSING WHILE REDUCING
COSTS
Because Pivotal Greenplum Database is available as a software-only distribution, Daum
can run the data warehouse on any of its existing x86 servers running Hadoop. This
ensures scalability while eliminating the need for Daum to purchase new data center
infrastructure. Pivotal Greenplum Database enables gNet for Hadoop, a parallel
communications transport, to access the Hadoop cluster and query the data efficiently
using Hadoop servers rather than those running Pivotal Greenplum Database.
“By using our existing x86 servers, we were able to reduce expenditures and expand
capacity through linear scalability,” Eom explains. “We have continuous processing
across Pivotal Greenplum and Hadoop nodes. As the data increases, we can
conveniently expand our capacity just by adding standard x86 servers.”
LEARN MORE
To learn more about Pivotal products, services and solutions, visit gopivotal.com.
CONTACT US
To learn more about how EMC
products, services, and solutions can
help solve your business and IT
challenges, contact your local
representative or authorized reseller—
or visit us at www.EMC.com.
www.EMC.com
EMC2
, EMC and the EMC logo are registered trademarks or trademarks of EMC Corporation in the
United States and other countries. GoPivotal, Pivotal, and the Pivotal logo are registered
trademarks or trademarks of GoPivotal, Inc, in the United States and other jurisdictions. All other
trademarks used herein are the property of their respective owners. © Copyright 2013 EMC
Corporation. All rights reserved. Published in the USA. 12/13 Customer Profile H12705.
EMC believes the information in this document is accurate as of its publication date. The
information is subject to change without notice.

Daum Communications Case Study

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (15)

Similar to Daum Communications Case Study

Similar to Daum Communications Case Study (20)

More from VMware Tanzu

More from VMware Tanzu (20)

Recently uploaded

Recently uploaded (20)

Daum Communications Case Study