BIGDATA and
HADOOP PROJECT
Name : Somappa Srinivasan
Phone: +91 8531963734
Email id : somappa@sparrowanalytics.com
Website : www.sparrowanlaytics.com
Real-Time Big Data Predictive
Analytics (Recommender System)
call : 8531963734 email id : somappa@sparrowanlytics.com
Our Goal
• Create a powerful, scalable recommendation
engine with minimal development effort.
• Make recommendations to users while they are
browsing movie-related offers.
• Recommendations must be relevant to the user's
interests.
How do we hope to accomplish this?
• Hadoop – Distributed file system and processing
platform. We will acquire data from various
sources and store it in a data hub, which we
call the data lake.
• Real-time engine: This engine receives requests
from users in real time and sends recommendations
back after modeling and scoring.
Data Lake
[Diagram: Data Sources (Enterprise, Unstructured, Informational, External, App Data, Customer) → Data Acquisition (Sqoop, Flume, Other) → Data Ingestion (Data Landing Zone, Validation & Cleansing, Transformation, Loading) → Data Hub (App Data, HDFS) → Business Applications (Query & Reporting, Modeling, Scorecard, Visualization, Analytics). Also shown: Load / Apply; Notification and Events Logging.]
Real-Time Engine
[Diagram: Browser, Data Hub, Validate Request, Model Selection, Scoring, Recommendation, Feature Extraction, UI Dashboard]
Data Lake Components
➢ Data Acquisition: In this module we will develop
components that receive data from various sources.
➢ Data Ingestion: In this module we will write
processors that process the received data and
load it into the data hub.
➢ Data Hub: The data hub is a collection of related
Hive and HBase tables.
➢ Business App: This layer will be exposed to the
business team for analytics.
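
The ingestion step described above (validation and cleansing, transformation, loading) could look roughly like the sketch below. It is only an illustration in plain Python: the record fields (`user_id`, `movie_id`, `rating`), the 0 to 5 rating range, and the in-memory lists standing in for the landing zone and the data hub are all assumptions, not part of the project.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RatingEvent:
    """Hypothetical record shape stored in the data hub."""
    user_id: str
    movie_id: str
    rating: float


def validate_and_cleanse(raw: dict) -> Optional[dict]:
    """Return a cleaned record, or None if the raw record is unusable."""
    try:
        user_id = str(raw["user_id"]).strip()
        movie_id = str(raw["movie_id"]).strip()
        rating = float(raw["rating"])
    except (KeyError, TypeError, ValueError):
        return None  # missing field or non-numeric rating
    # Assumed rule: ratings must lie between 0 and 5.
    if not user_id or not movie_id or not 0.0 <= rating <= 5.0:
        return None
    return {"user_id": user_id, "movie_id": movie_id, "rating": rating}


def transform(clean: dict) -> RatingEvent:
    """Normalize identifiers into the (assumed) data hub format."""
    return RatingEvent(clean["user_id"], clean["movie_id"].upper(), clean["rating"])


def ingest(landing_zone: List[dict], data_hub: List[RatingEvent]) -> int:
    """Load valid records into the data hub; return the number rejected."""
    rejected = 0
    for raw in landing_zone:
        clean = validate_and_cleanse(raw)
        if clean is None:
            rejected += 1
            continue
        data_hub.append(transform(clean))
    return rejected


landing = [
    {"user_id": " u1 ", "movie_id": "m10", "rating": "4.5"},
    {"user_id": "u2", "movie_id": "m11"},               # missing rating
    {"user_id": "u3", "movie_id": "m12", "rating": 9},  # out of range
]
hub: List[RatingEvent] = []
rejected_count = ingest(landing, hub)
```

In a real deployment the list `hub` would be replaced by writes to Hive or HBase tables, and rejected records would typically be logged rather than just counted.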
Real-Time Engine Components
➢ Validator: Validates user requests.
➢ Model Selector: Selects the model to execute
based on the user's request.
➢ Scoring: Calculates a score for each
product/offer.
➢ Recommendation: Based on the scores, recommends
the right product/offer to the right customer.
➢ Feature Extraction: Responsible for feature extraction.
➢ UI Dashboard: Visualizes insights.
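
The request flow through these components might be wired together as in the following sketch. Everything concrete here is assumed for illustration: the `USER_HISTORY` and `OFFERS` dictionaries stand in for the data hub, and the two toy models (genre affinity for known users, overall genre popularity as a fallback) are placeholders rather than the project's actual models.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory stand-ins for data that would live in the data hub.
USER_HISTORY: Dict[str, List[str]] = {
    "u1": ["action", "action", "comedy"],
    "u2": ["drama"],
}
OFFERS: Dict[str, str] = {"offer_a": "action", "offer_b": "comedy", "offer_c": "drama"}

Model = Callable[[Dict[str, float], str], float]


def validate(request: dict) -> bool:
    """Validator: accept only requests that carry a non-empty string user_id."""
    user_id = request.get("user_id")
    return isinstance(user_id, str) and bool(user_id)


def extract_features(user_id: str) -> Dict[str, float]:
    """Feature extraction: share of each genre in the user's viewing history."""
    history = USER_HISTORY.get(user_id, [])
    return {g: history.count(g) / len(history) for g in set(history)}


def genre_affinity_model(features: Dict[str, float], offer_genre: str) -> float:
    """Score an offer by how much of the user's history matches its genre."""
    return features.get(offer_genre, 0.0)


def popularity_model(features: Dict[str, float], offer_genre: str) -> float:
    """Fallback: score an offer by its genre's share across all users."""
    all_genres = [g for history in USER_HISTORY.values() for g in history]
    return all_genres.count(offer_genre) / len(all_genres)


def select_model(features: Dict[str, float]) -> Model:
    """Model selector (assumed rule): personalize only if the user has history."""
    return genre_affinity_model if features else popularity_model


def recommend(request: dict, top_n: int = 1) -> List[str]:
    """Validate, extract features, select a model, score, and rank offers."""
    if not validate(request):
        raise ValueError("invalid request: missing user_id")
    features = extract_features(request["user_id"])
    model = select_model(features)
    scores = {offer: model(features, genre) for offer, genre in OFFERS.items()}
    # Highest score first; ties broken by offer name for stable output.
    ranked = sorted(scores, key=lambda offer: (-scores[offer], offer))
    return ranked[:top_n]


top_for_u1 = recommend({"user_id": "u1"}, top_n=2)
```

Keeping each component as a separate function mirrors the list above, so any one stage (for example the scoring model) can be swapped without touching the others.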
Thank You