© Nube Technologies
Better decisions through better data
About Myself and Nube
- Big data - Hadoop, Spark
- Analytics, Data wrangling, Machine Learning
- Nube Products - Reifier, Crux and HIHO
- IIT Delhi, 98.
- Cofounder from IIT Kanpur, 97
Business Challenges
Business data is spread across many systems
● Discovering information is a challenge - which entities do we need to address?
● Consolidating information is a challenge - it is unclear whether the data ties back to a single entity
● Enhancing data is a challenge - are these new records genuine, or do they already exist?
The problem - lake or swamp?
According to Gartner, businesses lose up to 25% of potential revenue due to
the lack of a multichannel view of data. 67% of data scientists say cleaning, organizing
and linking data is their most time-consuming task, and 52.3% cite poor data
quality as their biggest challenge.
Technical Challenges for Matching
● Data volumes are high
● Each record has multiple dimensions
● Exact matches are rare
● Comparing each record with every other record is not feasible
● There are many disparate systems
● Languages have unique issues
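To make the "compare every record with every other" problem concrete, here is a minimal sketch in Python. It counts the pairs a naive comparison would need and shows how grouping records by a cheap key (often called blocking) shrinks the candidate set. The `blocking_key` function, the field names and the sample records are hypothetical and chosen only for illustration; this is not a description of how Reifier indexes data.

```python
from collections import defaultdict
from itertools import combinations


def all_pairs_count(n):
    """Number of unordered pairs among n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def blocking_key(record):
    """Hypothetical key: first letter of the name plus the postcode."""
    return record["name"][:1].lower() + "|" + record["postcode"]


def candidate_pairs(records):
    """Return index pairs of records that share a blocking key."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[blocking_key(rec)].append(idx)
    pairs = []
    for members in blocks.values():
        # Only records inside the same block are compared with each other.
        pairs.extend(combinations(members, 2))
    return pairs


# Made-up sample records.
records = [
    {"name": "Asha Rao", "postcode": "560001"},
    {"name": "Asha Rau", "postcode": "560001"},
    {"name": "Wei Chen", "postcode": "100000"},
    {"name": "Wei Chan", "postcode": "100000"},
    {"name": "Ravi Kumar", "postcode": "400001"},
]

print(all_pairs_count(len(records)))   # pairs a naive comparison needs
print(candidate_pairs(records))        # pairs left after blocking
print(all_pairs_count(1_000_000))      # naive pairs for one million records
```

The trade-off is that a poorly chosen key can place true duplicates in different blocks, which is one reason hand-written rules are hard to maintain.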
Technical Challenges for Matching (continued)
● Discovering and maintaining rules for data quality is extremely tough
● Custom coding and domain-specific logic make maintenance a nightmare
● No one size fits all: large custom implementations are needed every time, even when existing tools are used
Reifier Advantages
● Point and Shoot - Zero config
● Learns similarity definitions from data
● No hard coding of business rules
● Highly scalable - runs on open source Apache Spark
● Advanced Machine Learning algorithms pick the optimal solution
● Domain agnostic, can work with various kinds of data
● Utilities are available to create labeled data - just point them at the data
Reifier Advantages (continued)
● Handles different languages - English, Chinese, Japanese
● Highly accurate results
● Available as a library or as a private/public cloud deployment
● REST interface
● AJAX-based web front end
● Real-time as well as batch support
● Support and documentation through a web-based support portal: http://reifier.freshdesk.com
Customer Feedback
“Before Reifier we had to use a lot of manual efforts to identify potential duplicates
in customer data, now the system can learn patterns and find duplicates for us
intelligently. It’s a breakthrough to a long-standing issue of our businesses.”
- Mr. Dave Chan, Regional Director Business Intelligence, UBM Asia
Case Study - UBM Asia
- Deduplication of marketing data
- Combination of English, Chinese, Japanese and other languages
- Up to 1 million new records per week
- A temp doing the work manually can process only about 800 records per day
- AWS hosted, yearly license
- Reference customer
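A quick back-of-the-envelope calculation shows why manual review does not scale at this volume. The sketch below uses only the two figures on the slide. The working-day counts are assumptions made for illustration, not numbers from UBM Asia.

```python
import math

RECORDS_PER_WEEK = 1_000_000      # peak intake, from the slide
MANUAL_RECORDS_PER_DAY = 800      # one temp's daily throughput, from the slide


def temps_needed(working_days_per_week):
    """Temps required to keep up, assuming intake is spread evenly over the working days."""
    records_per_day = RECORDS_PER_WEEK / working_days_per_week
    return math.ceil(records_per_day / MANUAL_RECORDS_PER_DAY)


print(temps_needed(7))  # assumption: review runs seven days a week
print(temps_needed(5))  # assumption: review runs five days a week
```

Under either assumption, keeping up at peak volume would take a few hundred manual reviewers.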
Case Study - Government of India
- Invited for data matching for intelligence agencies
- Reifier outperformed leading international competition by 2x on accuracy and more than 10x on speed
- Matched 40 million records
Case Study - BFSI
A banking institution uses Reifier to run loan
applications against credit listing data to ensure
that they are not dealing with blacklisted
individuals and corporates.
Case Study - BFSI
A leading insurance provider uses Reifier to
prevent fraudulent claims. By creating a
centralized, consolidated data repository, the
company reduces its overexposure to an
individual who holds multiple policies. By
matching records, Reifier also helps determine the
average number of policies per individual and household.
Case Study - BFSI
A credit rating company uses Reifier to
consolidate personal credit histories from
different sources and provide accurate ratings
to its customers.
Case Study - Cross Selling
A telecom company offers various products and
services and wants to cross-sell to existing
customers. Existing customer information is fuzzily
matched for accurate customer segmentation
and marketing.
Case Study - Regulatory
Regulatory compliance of all kinds - including compliance related to
policies, taxes, privacy, anti-terror and anti-money-laundering
rules - requires matching up data pulled from a variety
of sources. With Reifier, organizations can meet regulatory
mandates with capabilities that support everything from
simple deduplication of customer lists to matching data
against government lists of suspected terrorists.
Case Study - Lead Generation
A services company sources organization and
people data from LinkedIn and Crunchbase and
uses Reifier to match it against existing in-house entities
to identify leads.
Case Study - Retail Operations
By consolidating vendor information from different
geographies, source systems and channels, a retail
operator gets a complete view of its supply chain and is
able to garner better deals and discounts from its vendors.
Reifier helps cut costs for the retailer.
Case Study - Telecom
Using Reifier, telecom companies can detect
delinquency patterns by identifying non-paying
customers who evade detection by enrolling
with similar-sounding names and
addresses with different formatting and
spellings.
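As a small illustration of what "similar-sounding names and addresses with different formatting" means in practice, the sketch below builds a comparison key from a phonetic code of the name (standard American Soundex) and a normalized address. It is a hand-written toy, not Reifier's method: Soundex is English-centric, the abbreviation table is hypothetical, and the names and addresses are made up.

```python
import re

# Standard American Soundex digit groups.
_SOUNDEX_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _SOUNDEX_CODES[_ch] = _digit

# Hypothetical abbreviation table; a real one would be far larger.
_ABBREVIATIONS = {"rd": "road", "st": "street", "apt": "apartment"}


def soundex(word):
    """Return the four-character American Soundex code of a word."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    encoded = letters[0].upper()
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = _SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            encoded += code
        prev = code  # vowels reset prev, so a repeated code is emitted again
    return (encoded + "000")[:4]


def normalize_address(address):
    """Lowercase, drop punctuation and expand a few abbreviations."""
    text = address.lower().replace(".", "")
    tokens = re.sub(r"[^a-z0-9]+", " ", text).split()
    return " ".join(_ABBREVIATIONS.get(tok, tok) for tok in tokens)


def customer_key(name, address):
    """Key that is equal for similar-sounding names at the same normalized address."""
    name_code = " ".join(soundex(part) for part in name.split())
    return name_code, normalize_address(address)


first = customer_key("Steven Meyer", "12, M.G. Rd., Apt 4")
second = customer_key("Stephen Meier", "12 MG Road Apartment 4")
print(first, second, first == second)
```

Rules like these have to be written and tuned by hand for every field and language, which is the maintenance burden that learning similarity from data is meant to avoid.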
Case Study - Directory Service
A local search company lists millions of regional
businesses, restaurants and contacts. It periodically
crawls the web to update its listing database. The crawled
information often contains similar entries from
different websites, as well as entries that duplicate ones already in the
database. Reifier helps the search company compare its
existing listings with potential listings from the crawled data,
and keeps its directory up to date and free from duplicate
data.
Case Study - Ecommerce
Matching for competitive pricing and catalog
enrichment
Reifier News
Invited to present at Strata Hadoop World 2015, Singapore.
Reifier - News
Reifier presented at Spark Summit 2015, San Francisco, USA.
Reifier - News
Reifier technology presented at Spark Summit 2014, San Francisco, USA.
Reifier - News
● Reifier 1.0 released in October 2014 with one international paying customer.
● Reifier 2.0 with interactive web GUI released in March 2015.
● Government of India (GOI) proof of concept in August - September 2015.
● Working on real-time matching, merging and GUI enhancements.
Reifier - News (image-only slide)
Reifier Industry Validation
● Part of the MapR App Gallery
● Covered by Databricks as a leading machine learning tool
● Part of Databricks Certified on Spark apps
Reifier Technology
● Accept or create training data with marked duplicates
● Identify similarity and indexing rules through Machine Learning
● Group near-similar records together
● Match and predict similar records
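The idea of learning a similarity definition from marked duplicates, rather than hard-coding it, can be sketched with a deliberately tiny stand-in. The sketch scores each labeled pair with a generic string similarity from Python's standard library and picks the cutoff that classifies the most training pairs correctly. This is a much simpler technique than the machine learning the slides refer to, and the function names and training pairs are invented for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Generic string similarity in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def learn_threshold(labeled_pairs):
    """Pick the similarity cutoff that gets the most labeled pairs right."""
    scored = [(similarity(a, b), is_dup) for a, b, is_dup in labeled_pairs]
    best_threshold, best_correct = 1.0, -1
    for threshold in sorted({score for score, _ in scored}):
        correct = sum((score >= threshold) == is_dup for score, is_dup in scored)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def is_match(a, b, threshold):
    """Predict whether two values refer to the same entity."""
    return similarity(a, b) >= threshold


# Made-up training data: (value_a, value_b, marked_as_duplicate).
training = [
    ("Acme Trading Co", "ACME Trading Company", True),
    ("Blue Lotus Hotel", "Blue Lotus Hotels", True),
    ("Acme Trading Co", "Blue Lotus Hotel", False),
    ("Green Valley Foods", "Red River Motors", False),
]

threshold = learn_threshold(training)
for a, b, label in training:
    print(a, "|", b, "|", label, "|", is_match(a, b, threshold))
```

A real system would learn over many fields and many similarity functions at once. The point here is only that the cutoff comes from the labeled data instead of a hand-written rule.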
Reifier - learn (three image-only slides)
Reifier Under The Hood
● Built using open source
● Apache Spark
● ElasticSearch
● Machine Learning
● Java
● Scala
Thank You
Thanks for your time. Please feel free to write to
sonal@nubetech.co for more details.
