UBM Asia Reifier Marketing Use Case
Efficient Marketing by Customer Record Deduplication: UBM Asia
How Reifier is helping UBM Asia gain a single view of customers at low cost
About UBM Asia
UBM Asia (www.ubmasia.com) is Asia’s major trade fair and exhibition organizer. Owned by
UBM plc, which is listed on the London Stock Exchange, UBM Asia is headquartered in Hong Kong and
has subsidiary companies in Asia and the US, spanning 24 cities and 31 offices with a staff of 1,300
people. With a track record spanning over 30 years, UBM Asia operates in 20 market sectors
with 230 dynamic face-to-face exhibitions and high-level professional conferences, 21 targeted
trade publications and 18 round-the-clock online products for over 2,000,000 quality exhibitors,
visitors, conference delegates, advertisers and subscribers from all over the world.
The problem
UBM Asia collects contact information from visitors to its leading trade fairs. Due to the
international nature of the visitorship, the data is multilingual, with a mix of English,
Chinese, Thai, Turkish, Korean and Japanese. The contact information is collected at various
points in the cycle: registration desks, online portals, survey forms, social media, various
directories, etc. It is primarily obtained through paper forms, which are
digitized or entered manually into a database. Typical record volumes are about a million
entries. At the end of the fair, the data is collated and fed into a CRM (Client Relationship
Management) system for future correspondence, offerings and promotions. The CRM is the
cornerstone of UBM’s business. The sales and marketing team uses the system heavily to invite
attendees as well as to send marketing promotions, service emails and critical event planning
information to prospects, both electronically and via traditional means.
The contact information is riddled with poor-quality data: missing fields, typographical and
lexical differences, as well as field swapping across multiple entries for the same person.
Often, visitors provide a shared company sales or marketing email, phone number and
address instead of their own personal email IDs, phone numbers and addresses. At other times the same
visitor may provide different emails or phone numbers, or an official address in one case and a
personal address in another. There are also misspellings, partial names with missing first,
middle or last names, leading and trailing spaces and other typographical variations across all
the fields. As a consequence of having these duplicates, UBM Asia was missing cost-saving
opportunities and delivering a suboptimal customer experience, with the same customer being
approached multiple times for the same offer.
www.nubetech.co | info@nubetech.co
The sheer size of the data as well as the nuanced differences make manual deduplication
impossible. As exact matches are rare, database joins and filtering are ruled out too.
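The gap between an exact join and a fuzzy comparison can be illustrated with a minimal Python sketch. The two sample records and the field names are invented for illustration, and the standard library's `difflib.SequenceMatcher` merely stands in for a real matching engine; nothing here describes how Reifier works internally.

```python
from difflib import SequenceMatcher

# Hypothetical sample data: two entries for the same visitor, one with a
# misspelled name, stray spaces and a shared company email address.
record_a = {"name": "Jonathan Lee", "email": "jonathan.lee@example.com"}
record_b = {"name": " Jonathon  Lee", "email": "sales@example.com"}


def exact_match(a, b, field):
    """Mimic what a database join on `field` effectively tests."""
    return a[field] == b[field]


def similarity(a, b, field):
    """Character-level similarity in [0, 1] after collapsing whitespace
    and folding case."""
    left = " ".join(a[field].split()).casefold()
    right = " ".join(b[field].split()).casefold()
    return SequenceMatcher(None, left, right).ratio()


name_exact = exact_match(record_a, record_b, "name")
name_score = similarity(record_a, record_b, "name")
email_exact = exact_match(record_a, record_b, "email")

print("exact name match:", name_exact)
print("name similarity:", round(name_score, 2))
print("exact email match:", email_exact)
```

Neither field joins exactly, yet the name similarity stays high, which is the kind of signal a fuzzy matcher can act on and a plain join cannot.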
Requirements
UBM Asia wanted a solution for data matching and quality which could:
a. Handle different variations in fields across records: missing middle, first and last names,
abbreviations in different parts of names and addresses, typographical errors, etc. The
tool also needed to handle a mix of Chinese and English characters within the same
record as well as datasets containing both Chinese and English records
b. Support different geographies: even when the names are in English, there are regional
differences between an event held in India and one held in Singapore
c. Yield results faster
d. Work without data massaging, normalization and preprocessing
Approach
UBM Asia tried multiple existing solutions, but none of them could handle the complexity, volume
and variety of the data and provide a usable level of accuracy. Existing deduplication solutions are
rule- and dictionary-based, and defining and managing the rules is a complex and
time-consuming activity performed by a developer who has a background in matching
algorithms and tweaks the weights assigned to different fields. To create precise rules, a lot
of data cleansing and preprocessing is also needed. Rules and dictionaries need
modification when the context of the data changes or when the language or locale changes. A rule
mapping the English name Jonathan to Jo is invalid in an Asian context, where Jo is a name in
itself. Thus learnings from one set of data cannot easily be reused on another set, and doing so
requires costly and time-consuming intervention from an expert.
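The Jonathan/Jo problem can be made concrete with a small sketch of a rule- and dictionary-based matcher. The nickname dictionary, field weights, field names and sample records below are all hypothetical, chosen only to show how a rule written for one locale misfires in another.

```python
# Hypothetical nickname dictionary written for an English-language dataset.
NICKNAMES = {"jo": "jonathan", "dave": "david"}

# Hypothetical hand-tuned field weights of the kind a matching developer
# would have to maintain.
WEIGHTS = {"first_name": 0.6, "company": 0.4}


def canonical(first_name):
    """Apply the nickname rule after trimming and case folding."""
    key = first_name.strip().casefold()
    return NICKNAMES.get(key, key)


def rule_score(a, b):
    """Sum the weights of the fields that agree under the rules."""
    score = 0.0
    if canonical(a["first_name"]) == canonical(b["first_name"]):
        score += WEIGHTS["first_name"]
    if a["company"].strip().casefold() == b["company"].strip().casefold():
        score += WEIGHTS["company"]
    return score


# Two different people at the same company, where "Jo" is a full name.
jo = {"first_name": "Jo", "company": "Acme Trading"}
jonathan = {"first_name": "Jonathan", "company": "Acme Trading"}

# The nickname rule collapses both first names to "jonathan", so the pair
# receives the maximum score even though these are distinct visitors.
print(rule_score(jo, jonathan))
```

Fixing this for one dataset means editing the dictionary and re-tuning the weights by hand, which is the maintenance burden described above.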
UBM Asia’s Business Intelligence team uses the Reifier fuzzy machine engine for smart
matching. With minimal setup time, Reifier matches and links contact records containing
different languages as well as variations across fields, yielding an accuracy of 70% or
more. The same training model works with English and Chinese records. UBM Asia is also
able to successfully match and link Japanese records on the same setup without any
configuration changes. Using Reifier’s smart web interface, UBM
Asia’s Business Intelligence team performs its matching tasks
with ease, deduplicating and linking data within minutes
instead of days.[1]
[1] As per industry average and UBM Asia’s internal findings, a temporary worker can manually verify up to
1,000 records a day.
Reifier’s innovative fuzzy machine algorithms use machine
learning to overcome the limitations of traditional systems.
Reifier is directly managed by the business user, data
scientist or data engineer, who can train Reifier to identify duplicates just like a human would,
without the need for a data matching developer or expert.
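As a rough illustration of training a matcher from labeled examples instead of hand-written rules, the sketch below fits a tiny perceptron on pairs that a reviewer has marked as duplicate or not. This is a generic textbook technique, not Reifier's algorithm, which the source does not describe. The similarity scores, labels and function names are all invented for the example.

```python
# Each training pair is ((name_similarity, email_similarity), label), where
# label 1 means a reviewer marked the two records as the same person.
# All values are hypothetical.
LABELED_PAIRS = [
    ((0.95, 0.10), 1),  # same name, shared company email vs personal email
    ((0.90, 0.90), 1),
    ((0.30, 0.95), 0),  # colleagues sharing one sales mailbox
    ((0.20, 0.10), 0),
    ((0.85, 0.20), 1),
    ((0.40, 0.30), 0),
]


def predict(weights, bias, features):
    """Return 1 (duplicate) when the weighted score is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(pairs, epochs=100, lr=1.0):
    """Perceptron rule: nudge the weights after every misclassified pair and
    stop once a full pass produces no mistakes."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in pairs:
            error = label - predict(weights, bias, features)
            if error:
                mistakes += 1
                weights = [w + lr * error * x for w, x in zip(weights, features)]
                bias += lr * error
        if mistakes == 0:
            break
    return weights, bias


weights, bias = train(LABELED_PAIRS)

# Score an unseen pair: near-identical names, mostly different emails.
print(predict(weights, bias, (0.92, 0.15)))
```

The field weights here come from the labeled pairs rather than from a developer tuning them, so retraining on a new dataset only requires new labels.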
“Before Reifier we had to use a lot of manual efforts to identify potential
duplicates in customer data, now the system can learn patterns and find
duplicates for us intelligently. It’s a breakthrough to a longstanding issue of our
businesses.”
Mr. Dave Chan, Regional Director Business Intelligence, UBM Asia
Reifier’s automated learning engine brings up the deduplication system 5 times faster
and identifies 2 times more duplicates than conventional tools.[2] As Reifier learns from the
data itself, it works seamlessly with different datasets: products, people, organizations,
addresses, etc. Built on Apache Spark, Reifier is highly scalable to billions of records.
Reifier can be deployed on premise or on the cloud, providing sufficient ROI to the end user.
To see how Reifier can help you, write to us at info@nubetech.co / tweet to @nubetech / call
+918800541717 today.
[2] Comparison performed independently by another customer; reference available on request.