SlideShare a Scribd company logo
1 of 20
Download to read offline
What is Big Data?



A new generation of technologies and architectures
designed to economically extract value from very large
volumes of a wide variety of data, by enabling high
velocity capture, discovery and/or analysis
VELOCITY
                              VARIETY
                              VOLUME
                    +   VISUALIZATION

                                   VALUE



Big Data’s impact can be expressed by The Five V’s
 E-Commerce Site fed by outsourced Ad Servers
 Ads appear on a wide range of sites with various offers
 Massive amount of data is generated by these servers:
 • Web logs and click stream data from the E-Commerce Site
 • Ad logs and click stream data from the Ad Servers
 • Results in relational transactions on the site


 Goal: Maximize Traffic Analysis for Business Value
 • Velocity Demo: Pinpoint activity in real-time & react
 • Variety Demo: Examine historical trends across sources
 • Visualization Demo: Enable ad-hoc data analysis for insights



Demo Context
WEB SERVERS



                        How to identify when Ad clicks results in Site Traffic?
                         High volume stream of log activity coming in:
                           •   Web logs and Ad Server logs
                         Real-time stream analysis allows for pinpointing
                          data when it happens
      LOG FILES          Simultaneously join structured and unstructured
                          data in a persistent query
                         Can be used for A/B testing, Offer improvement,
                          Site Dynamic behavior, or Fraud Detection




     AD SERVERS

Velocity Architecture
DEMO: StreamInsight
WEB SERVERS
                       How to do historical analysis on unstructured data?




                        M/R
      LOG FILES


                        Ad Servers and Web Servers generate different log files with different formats
                         making them hard to analyze
                        Map/Reduce processing allows for us to execute a query across variant data
                         formats stored in Hadoop
                        Hive provides a traditional query interface to Map/Reduce
                        Correlate and connect high variety data for trend analysis
     AD SERVERS

Variety Architecture
Access Azure blob storage via a Hive “view” and aggregate session data
 CREATE EXTERNAL TABLE logs (
 date1 STRING,
 time1 STRING,
 action STRING,
 page_uri STRING,
 cookie STRING)
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ' '
 STORED AS TEXTFILE
 LOCATION 'asv://logs/logs/';
 CREATE TABLE log_summary AS
 SELECT l.cookie
 ,MAX(regexp_replace(cookie, '[-]', '') % 36) AS geo_hash
 ,MAX(l.time1) AS time1
 ,l.page_uri
 ,MAX(CASE LOWER(action) WHEN 'click' THEN concat(l.date1, ' ', l.time1) ELSE NULL END) AS click_time
 ,MIN(CASE LOWER(action) WHEN 'view' THEN concat(l.date1, ' ', l.time1) ELSE NULL END) AS view_time
 ,MAX(l.date1) AS date1
 FROM logs l
 GROUP BY l.cookie, l.page_uri;

Hive HQL Queries
DEMO: Azure HDInsight
Hadoop is an open source framework for building large scale,
             distributed, data- intensive applications

                                               • Hadoop is HDFS, the
                                                 kernel & M/R
                                               • MapReduce brings the
                                                 code to the data
                                               • Open set of tools exist to
                                                 extend its functional uses
                                                 and representations




Hadoop Ecosystem Overview
The "Map" step                                                     The "Reduce" step
 The mappers are responsible for reading the input data and         Each reducer executes a function on all values for a given
 emitting key/value pairs. The input file can be CSV, XML, or any   key. The framework ensures that all values for the same
 format as long as it can be converted into k/v pairs.              key are sent to the same reducer.




Map/Reduce Distributes Processing of Operations
WEB SERVERS
                     How to do ad-hoc data discovery and visualizations?




                      M/R
      LOG FILES


                      Ad Servers and Web Servers generate different log files with different formats
                       making them hard to analyze
                      Map/Reduce processing allows for us to execute a query across variant data
                       formats stored in Hadoop
                      Hive provides a traditional query interface to Map/Reduce
                      Correlate and connect high variety data for trend analysis
     AD SERVERS

Visualization Architecture
DEMO: Excel & Hive Adapter
 Big Data & Analytics Projects are often Additive
 • New Capabilities layered on top of existing data & apps
 • Analytics can drive Applications in new ways
 Visualizations put Big Data in the hands of the Business




Summary
We are BlueMetal Architects
Take the next steps – Imagine, Define, Build
 Envisioning & Strategy Briefing: Big Data, Analytics & Collaboration
 Envisioning Session: Data is the App – Envisioning the Next
  Generation, Data Driven Enterprise
 Architecture Design Session: Big Data & Analytics
 Healthcare / Life Sciences: Strategy Briefing or Architecture Design
  Session – Big Data Architecture, Cloud & Use Case Driven Analytics
  and applications, Portal, M-Health and UX design for Providers,
  Patients, Pharma & Biotechnology
 Financial Services: Strategy Briefing or Architecture Design Session –
  Big Data & Analytics for Banking, Capital Markets, Retail Brokerage or
  Insurance




Take the next steps - our offerings
Thank You
DESIGN            Differentiation




             UX   DATA     SOCIAL   Specialization




                  CODE              Foundation




Who We Are
DESIGN
                                                       Differentiation
              Strategy     Analysis      Creative



               UX          DATA        SOCIAL
              Desktop      Analytics   Web Content
                                                       Specialization
              Mobile       Big Data      Intranets

             Web Client    Core SQL    Collaboration



               .NET       SERVICES     On-Premise
                                                       Foundation
                Java         PPP          Cloud




Who We Are

More Related Content

What's hot

ESRI Mapping & Charting Solution: ArcGIS 10 Production Mapping
ESRI Mapping & Charting Solution: ArcGIS 10 Production MappingESRI Mapping & Charting Solution: ArcGIS 10 Production Mapping
ESRI Mapping & Charting Solution: ArcGIS 10 Production Mappingmmarques_esri
 
RDX Insights Presentation - Microsoft Business Intelligence
RDX Insights Presentation - Microsoft Business IntelligenceRDX Insights Presentation - Microsoft Business Intelligence
RDX Insights Presentation - Microsoft Business IntelligenceChristopher Foot
 
Cortana Analytics Suite
Cortana Analytics SuiteCortana Analytics Suite
Cortana Analytics SuiteJames Serra
 
Transitioning to a BI Role
Transitioning to a BI RoleTransitioning to a BI Role
Transitioning to a BI RoleJames Serra
 
Cepta The Future of Data with Power BI
Cepta The Future of Data with Power BICepta The Future of Data with Power BI
Cepta The Future of Data with Power BIKellyn Pot'Vin-Gorman
 
Day1 concurrent fellows
Day1 concurrent fellowsDay1 concurrent fellows
Day1 concurrent fellowstoptrails
 
Esri Ireland "ArcGIS - The Platform Story" Roadmap Session - Eamonn Doyle, Es...
Esri Ireland "ArcGIS - The Platform Story" Roadmap Session - Eamonn Doyle, Es...Esri Ireland "ArcGIS - The Platform Story" Roadmap Session - Eamonn Doyle, Es...
Esri Ireland "ArcGIS - The Platform Story" Roadmap Session - Eamonn Doyle, Es...Esri Ireland
 
Azure cafe marketplace with looker data analytics
Azure cafe marketplace with looker data analyticsAzure cafe marketplace with looker data analytics
Azure cafe marketplace with looker data analyticsMark Kromer
 
Semantic Web Application Development
Semantic Web Application DevelopmentSemantic Web Application Development
Semantic Web Application DevelopmentDaniel Slamowitz
 
Evolution of Esri Data Formats Seminar
Evolution of Esri Data Formats SeminarEvolution of Esri Data Formats Seminar
Evolution of Esri Data Formats SeminarEsri South Africa
 
ArcGIS
ArcGISArcGIS
ArcGISEsri
 
Overview of Microsoft Appliances: Scaling SQL Server to Hundreds of Terabytes
Overview of Microsoft Appliances: Scaling SQL Server to Hundreds of TerabytesOverview of Microsoft Appliances: Scaling SQL Server to Hundreds of Terabytes
Overview of Microsoft Appliances: Scaling SQL Server to Hundreds of TerabytesJames Serra
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?James Serra
 
Big Data Storage Challenges and Solutions
Big Data Storage Challenges and SolutionsBig Data Storage Challenges and Solutions
Big Data Storage Challenges and SolutionsWSO2
 
Customer Experience at Disney+ Through Data Perspective
Customer Experience at Disney+ Through Data PerspectiveCustomer Experience at Disney+ Through Data Perspective
Customer Experience at Disney+ Through Data PerspectiveDatabricks
 
Data Virtualization Primer - Introduction
Data Virtualization Primer - IntroductionData Virtualization Primer - Introduction
Data Virtualization Primer - IntroductionKenneth Peeples
 
Exploring Puerto Rico Open Data with Power BI
Exploring Puerto Rico Open Data with Power BIExploring Puerto Rico Open Data with Power BI
Exploring Puerto Rico Open Data with Power BIGuillermo Caicedo
 
Open Data Portals: 9 Solutions and How they Compare
Open Data Portals: 9 Solutions and How they CompareOpen Data Portals: 9 Solutions and How they Compare
Open Data Portals: 9 Solutions and How they CompareSafe Software
 
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020Mariano Gonzalez
 

What's hot (20)

ESRI Mapping & Charting Solution: ArcGIS 10 Production Mapping
ESRI Mapping & Charting Solution: ArcGIS 10 Production MappingESRI Mapping & Charting Solution: ArcGIS 10 Production Mapping
ESRI Mapping & Charting Solution: ArcGIS 10 Production Mapping
 
RDX Insights Presentation - Microsoft Business Intelligence
RDX Insights Presentation - Microsoft Business IntelligenceRDX Insights Presentation - Microsoft Business Intelligence
RDX Insights Presentation - Microsoft Business Intelligence
 
Cortana Analytics Suite
Cortana Analytics SuiteCortana Analytics Suite
Cortana Analytics Suite
 
Transitioning to a BI Role
Transitioning to a BI RoleTransitioning to a BI Role
Transitioning to a BI Role
 
Cepta The Future of Data with Power BI
Cepta The Future of Data with Power BICepta The Future of Data with Power BI
Cepta The Future of Data with Power BI
 
Day1 concurrent fellows
Day1 concurrent fellowsDay1 concurrent fellows
Day1 concurrent fellows
 
Esri Ireland "ArcGIS - The Platform Story" Roadmap Session - Eamonn Doyle, Es...
Esri Ireland "ArcGIS - The Platform Story" Roadmap Session - Eamonn Doyle, Es...Esri Ireland "ArcGIS - The Platform Story" Roadmap Session - Eamonn Doyle, Es...
Esri Ireland "ArcGIS - The Platform Story" Roadmap Session - Eamonn Doyle, Es...
 
Azure cafe marketplace with looker data analytics
Azure cafe marketplace with looker data analyticsAzure cafe marketplace with looker data analytics
Azure cafe marketplace with looker data analytics
 
Semantic Web Application Development
Semantic Web Application DevelopmentSemantic Web Application Development
Semantic Web Application Development
 
Modernizando plataforma de bi
Modernizando plataforma de biModernizando plataforma de bi
Modernizando plataforma de bi
 
Evolution of Esri Data Formats Seminar
Evolution of Esri Data Formats SeminarEvolution of Esri Data Formats Seminar
Evolution of Esri Data Formats Seminar
 
ArcGIS
ArcGISArcGIS
ArcGIS
 
Overview of Microsoft Appliances: Scaling SQL Server to Hundreds of Terabytes
Overview of Microsoft Appliances: Scaling SQL Server to Hundreds of TerabytesOverview of Microsoft Appliances: Scaling SQL Server to Hundreds of Terabytes
Overview of Microsoft Appliances: Scaling SQL Server to Hundreds of Terabytes
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?
 
Big Data Storage Challenges and Solutions
Big Data Storage Challenges and SolutionsBig Data Storage Challenges and Solutions
Big Data Storage Challenges and Solutions
 
Customer Experience at Disney+ Through Data Perspective
Customer Experience at Disney+ Through Data PerspectiveCustomer Experience at Disney+ Through Data Perspective
Customer Experience at Disney+ Through Data Perspective
 
Data Virtualization Primer - Introduction
Data Virtualization Primer - IntroductionData Virtualization Primer - Introduction
Data Virtualization Primer - Introduction
 
Exploring Puerto Rico Open Data with Power BI
Exploring Puerto Rico Open Data with Power BIExploring Puerto Rico Open Data with Power BI
Exploring Puerto Rico Open Data with Power BI
 
Open Data Portals: 9 Solutions and How they Compare
Open Data Portals: 9 Solutions and How they CompareOpen Data Portals: 9 Solutions and How they Compare
Open Data Portals: 9 Solutions and How they Compare
 
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
 

Viewers also liked

El "Internet de Todo" (IoT)
El "Internet de Todo" (IoT)El "Internet de Todo" (IoT)
El "Internet de Todo" (IoT)Egdares Futch H.
 
Iot- Construyendo negocios a través de la información - Carlos Calderón
Iot- Construyendo negocios a través de la información - Carlos CalderónIot- Construyendo negocios a través de la información - Carlos Calderón
Iot- Construyendo negocios a través de la información - Carlos CalderónCNT
 
Big data architectures
Big data architecturesBig data architectures
Big data architecturesDaan Gerits
 
User and IoT Data Analytics
User and IoT Data AnalyticsUser and IoT Data Analytics
User and IoT Data AnalyticsEricsson
 
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...SoftServe
 
Tableau Software - Business Analytics and Data Visualization
Tableau Software - Business Analytics and Data VisualizationTableau Software - Business Analytics and Data Visualization
Tableau Software - Business Analytics and Data Visualizationlesterathayde
 
Big Data & Analytics Architecture
Big Data & Analytics ArchitectureBig Data & Analytics Architecture
Big Data & Analytics ArchitectureArvind Sathi
 
Big Data: Architectures and Approaches
Big Data: Architectures and ApproachesBig Data: Architectures and Approaches
Big Data: Architectures and ApproachesThoughtworks
 
The Internet of Things is Here: Implementing IoT in Your Facility
The Internet of Things is Here: Implementing IoT in Your FacilityThe Internet of Things is Here: Implementing IoT in Your Facility
The Internet of Things is Here: Implementing IoT in Your FacilitySenseware
 
Internet of Things (IoT) - We Are at the Tip of An Iceberg
Internet of Things (IoT) - We Are at the Tip of An IcebergInternet of Things (IoT) - We Are at the Tip of An Iceberg
Internet of Things (IoT) - We Are at the Tip of An IcebergDr. Mazlan Abbas
 
Internet of things (IoT) and big data- r.nabati
Internet of things (IoT) and big data- r.nabatiInternet of things (IoT) and big data- r.nabati
Internet of things (IoT) and big data- r.nabatinabati
 
IoT in Agriculture
IoT in AgricultureIoT in Agriculture
IoT in AgricultureTibbo
 
The Future of Everything
The Future of EverythingThe Future of Everything
The Future of EverythingCharbel Zeaiter
 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerLuminary Labs
 

Viewers also liked (15)

El "Internet de Todo" (IoT)
El "Internet de Todo" (IoT)El "Internet de Todo" (IoT)
El "Internet de Todo" (IoT)
 
Iot- Construyendo negocios a través de la información - Carlos Calderón
Iot- Construyendo negocios a través de la información - Carlos CalderónIot- Construyendo negocios a través de la información - Carlos Calderón
Iot- Construyendo negocios a través de la información - Carlos Calderón
 
Big data architectures
Big data architecturesBig data architectures
Big data architectures
 
User and IoT Data Analytics
User and IoT Data AnalyticsUser and IoT Data Analytics
User and IoT Data Analytics
 
Big Data Architectural Patterns
Big Data Architectural PatternsBig Data Architectural Patterns
Big Data Architectural Patterns
 
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
 
Tableau Software - Business Analytics and Data Visualization
Tableau Software - Business Analytics and Data VisualizationTableau Software - Business Analytics and Data Visualization
Tableau Software - Business Analytics and Data Visualization
 
Big Data & Analytics Architecture
Big Data & Analytics ArchitectureBig Data & Analytics Architecture
Big Data & Analytics Architecture
 
Big Data: Architectures and Approaches
Big Data: Architectures and ApproachesBig Data: Architectures and Approaches
Big Data: Architectures and Approaches
 
The Internet of Things is Here: Implementing IoT in Your Facility
The Internet of Things is Here: Implementing IoT in Your FacilityThe Internet of Things is Here: Implementing IoT in Your Facility
The Internet of Things is Here: Implementing IoT in Your Facility
 
Internet of Things (IoT) - We Are at the Tip of An Iceberg
Internet of Things (IoT) - We Are at the Tip of An IcebergInternet of Things (IoT) - We Are at the Tip of An Iceberg
Internet of Things (IoT) - We Are at the Tip of An Iceberg
 
Internet of things (IoT) and big data- r.nabati
Internet of things (IoT) and big data- r.nabatiInternet of things (IoT) and big data- r.nabati
Internet of things (IoT) and big data- r.nabati
 
IoT in Agriculture
IoT in AgricultureIoT in Agriculture
IoT in Agriculture
 
The Future of Everything
The Future of EverythingThe Future of Everything
The Future of Everything
 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI Explainer
 

Similar to What is Big Data? The 5 V's and Real World Use Cases

Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016StampedeCon
 
Benefits of the Azure Cloud
Benefits of the Azure CloudBenefits of the Azure Cloud
Benefits of the Azure CloudCaserta
 
Benefits of the Azure cloud
Benefits of the Azure cloudBenefits of the Azure cloud
Benefits of the Azure cloudJames Serra
 
Logitech - LOGITECH ACCELERATES CLOUD ANALYTICS USING DATA VIRTUALIZATION
Logitech - LOGITECH ACCELERATES CLOUD ANALYTICS USING DATA VIRTUALIZATIONLogitech - LOGITECH ACCELERATES CLOUD ANALYTICS USING DATA VIRTUALIZATION
Logitech - LOGITECH ACCELERATES CLOUD ANALYTICS USING DATA VIRTUALIZATIONAvinash Deshpande
 
Microsoft cloud big data strategy
Microsoft cloud big data strategyMicrosoft cloud big data strategy
Microsoft cloud big data strategyJames Serra
 
Kyvos Insights
Kyvos Insights Kyvos Insights
Kyvos Insights rebeccatho
 
Bringing the Power of Big Data Computation to Salesforce
Bringing the Power of Big Data Computation to SalesforceBringing the Power of Big Data Computation to Salesforce
Bringing the Power of Big Data Computation to SalesforceSalesforce Developers
 
How does Microsoft solve Big Data?
How does Microsoft solve Big Data?How does Microsoft solve Big Data?
How does Microsoft solve Big Data?James Serra
 
QuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarQuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarRTTS
 
OpenSistemas Corporate Presentation
OpenSistemas Corporate PresentationOpenSistemas Corporate Presentation
OpenSistemas Corporate PresentationOpenSistemas
 
Data Virtualization: Introduction and Business Value (UK)
Data Virtualization: Introduction and Business Value (UK)Data Virtualization: Introduction and Business Value (UK)
Data Virtualization: Introduction and Business Value (UK)Denodo
 
Denodo Design Studio: Modeling and Creation of Data Services
Denodo Design Studio: Modeling and Creation of Data ServicesDenodo Design Studio: Modeling and Creation of Data Services
Denodo Design Studio: Modeling and Creation of Data ServicesDenodo
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dataconomy Media
 
Self service BI with sql server 2008 R2 and microsoft power pivot short
Self service BI with sql server 2008 R2 and microsoft power pivot shortSelf service BI with sql server 2008 R2 and microsoft power pivot short
Self service BI with sql server 2008 R2 and microsoft power pivot shortEduardo Castro
 
USQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventUSQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventTrivadis
 
SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview Rajesh Menon
 
Technology Overview
Technology OverviewTechnology Overview
Technology OverviewLiran Zelkha
 
Apache hadoop for windows server and windwos azure
Apache hadoop for windows server and windwos azureApache hadoop for windows server and windwos azure
Apache hadoop for windows server and windwos azureBrad Sarsfield
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionJames Serra
 

Similar to What is Big Data? The 5 V's and Real World Use Cases (20)

Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
 
Benefits of the Azure Cloud
Benefits of the Azure CloudBenefits of the Azure Cloud
Benefits of the Azure Cloud
 
Benefits of the Azure cloud
Benefits of the Azure cloudBenefits of the Azure cloud
Benefits of the Azure cloud
 
Logitech - LOGITECH ACCELERATES CLOUD ANALYTICS USING DATA VIRTUALIZATION
Logitech - LOGITECH ACCELERATES CLOUD ANALYTICS USING DATA VIRTUALIZATIONLogitech - LOGITECH ACCELERATES CLOUD ANALYTICS USING DATA VIRTUALIZATION
Logitech - LOGITECH ACCELERATES CLOUD ANALYTICS USING DATA VIRTUALIZATION
 
Microsoft cloud big data strategy
Microsoft cloud big data strategyMicrosoft cloud big data strategy
Microsoft cloud big data strategy
 
Kyvos Insights
Kyvos Insights Kyvos Insights
Kyvos Insights
 
Bringing the Power of Big Data Computation to Salesforce
Bringing the Power of Big Data Computation to SalesforceBringing the Power of Big Data Computation to Salesforce
Bringing the Power of Big Data Computation to Salesforce
 
How does Microsoft solve Big Data?
How does Microsoft solve Big Data?How does Microsoft solve Big Data?
How does Microsoft solve Big Data?
 
QuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarQuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing Webinar
 
OpenSistemas Corporate Presentation
OpenSistemas Corporate PresentationOpenSistemas Corporate Presentation
OpenSistemas Corporate Presentation
 
Data Virtualization: Introduction and Business Value (UK)
Data Virtualization: Introduction and Business Value (UK)Data Virtualization: Introduction and Business Value (UK)
Data Virtualization: Introduction and Business Value (UK)
 
Denodo Design Studio: Modeling and Creation of Data Services
Denodo Design Studio: Modeling and Creation of Data ServicesDenodo Design Studio: Modeling and Creation of Data Services
Denodo Design Studio: Modeling and Creation of Data Services
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
 
Self service BI with sql server 2008 R2 and microsoft power pivot short
Self service BI with sql server 2008 R2 and microsoft power pivot shortSelf service BI with sql server 2008 R2 and microsoft power pivot short
Self service BI with sql server 2008 R2 and microsoft power pivot short
 
USQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventUSQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake Event
 
SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview
 
Neelima_Resume
Neelima_ResumeNeelima_Resume
Neelima_Resume
 
Technology Overview
Technology OverviewTechnology Overview
Technology Overview
 
Apache hadoop for windows server and windwos azure
Apache hadoop for windows server and windwos azureApache hadoop for windows server and windwos azure
Apache hadoop for windows server and windwos azure
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
 

More from BlueMetalInc

Field enablement roadshow keynote - Bob Familiar
Field enablement roadshow keynote - Bob FamiliarField enablement roadshow keynote - Bob Familiar
Field enablement roadshow keynote - Bob FamiliarBlueMetalInc
 
Field Enablement Business Drivers - Matt Bienfang
Field Enablement Business Drivers - Matt BienfangField Enablement Business Drivers - Matt Bienfang
Field Enablement Business Drivers - Matt BienfangBlueMetalInc
 
Field enablement roadshow - Real World Solutions - John Pelak
Field enablement roadshow - Real World Solutions - John PelakField enablement roadshow - Real World Solutions - John Pelak
Field enablement roadshow - Real World Solutions - John PelakBlueMetalInc
 
BlueMetal - Our Company Culture in 30 Seconds
BlueMetal - Our Company Culture in 30 SecondsBlueMetal - Our Company Culture in 30 Seconds
BlueMetal - Our Company Culture in 30 SecondsBlueMetalInc
 
Automating Site Provisioning in SharePoint - Presented 7/27/13 at SharePoint ...
Automating Site Provisioning in SharePoint - Presented 7/27/13 at SharePoint ...Automating Site Provisioning in SharePoint - Presented 7/27/13 at SharePoint ...
Automating Site Provisioning in SharePoint - Presented 7/27/13 at SharePoint ...BlueMetalInc
 
Apps 101 - Moving to the SharePoint 2013 App Model - Presented 7/27/13 at Sha...
Apps 101 - Moving to the SharePoint 2013 App Model - Presented 7/27/13 at Sha...Apps 101 - Moving to the SharePoint 2013 App Model - Presented 7/27/13 at Sha...
Apps 101 - Moving to the SharePoint 2013 App Model - Presented 7/27/13 at Sha...BlueMetalInc
 
20130427 What's Your Social IQ?
20130427 What's Your Social IQ?20130427 What's Your Social IQ?
20130427 What's Your Social IQ?BlueMetalInc
 
20130427 - Turbocharge SharePoint 2010 with SharePoint 2013 Search
20130427 - Turbocharge SharePoint 2010 with SharePoint 2013 Search20130427 - Turbocharge SharePoint 2010 with SharePoint 2013 Search
20130427 - Turbocharge SharePoint 2010 with SharePoint 2013 SearchBlueMetalInc
 
Turbo-Charge Collaboration by Automating Site Provisioning in SharePoint 2010
Turbo-Charge Collaboration by Automating Site Provisioning in SharePoint 2010Turbo-Charge Collaboration by Automating Site Provisioning in SharePoint 2010
Turbo-Charge Collaboration by Automating Site Provisioning in SharePoint 2010BlueMetalInc
 
Empowering business users with hybrid solutions
Empowering business users with hybrid solutionsEmpowering business users with hybrid solutions
Empowering business users with hybrid solutionsBlueMetalInc
 

More from BlueMetalInc (10)

Field enablement roadshow keynote - Bob Familiar
Field enablement roadshow keynote - Bob FamiliarField enablement roadshow keynote - Bob Familiar
Field enablement roadshow keynote - Bob Familiar
 
Field Enablement Business Drivers - Matt Bienfang
Field Enablement Business Drivers - Matt BienfangField Enablement Business Drivers - Matt Bienfang
Field Enablement Business Drivers - Matt Bienfang
 
Field enablement roadshow - Real World Solutions - John Pelak
Field enablement roadshow - Real World Solutions - John PelakField enablement roadshow - Real World Solutions - John Pelak
Field enablement roadshow - Real World Solutions - John Pelak
 
BlueMetal - Our Company Culture in 30 Seconds
BlueMetal - Our Company Culture in 30 SecondsBlueMetal - Our Company Culture in 30 Seconds
BlueMetal - Our Company Culture in 30 Seconds
 
Automating Site Provisioning in SharePoint - Presented 7/27/13 at SharePoint ...
Automating Site Provisioning in SharePoint - Presented 7/27/13 at SharePoint ...Automating Site Provisioning in SharePoint - Presented 7/27/13 at SharePoint ...
Automating Site Provisioning in SharePoint - Presented 7/27/13 at SharePoint ...
 
Apps 101 - Moving to the SharePoint 2013 App Model - Presented 7/27/13 at Sha...
Apps 101 - Moving to the SharePoint 2013 App Model - Presented 7/27/13 at Sha...Apps 101 - Moving to the SharePoint 2013 App Model - Presented 7/27/13 at Sha...
Apps 101 - Moving to the SharePoint 2013 App Model - Presented 7/27/13 at Sha...
 
20130427 What's Your Social IQ?
20130427 What's Your Social IQ?20130427 What's Your Social IQ?
20130427 What's Your Social IQ?
 
20130427 - Turbocharge SharePoint 2010 with SharePoint 2013 Search
20130427 - Turbocharge SharePoint 2010 with SharePoint 2013 Search20130427 - Turbocharge SharePoint 2010 with SharePoint 2013 Search
20130427 - Turbocharge SharePoint 2010 with SharePoint 2013 Search
 
Turbo-Charge Collaboration by Automating Site Provisioning in SharePoint 2010
Turbo-Charge Collaboration by Automating Site Provisioning in SharePoint 2010Turbo-Charge Collaboration by Automating Site Provisioning in SharePoint 2010
Turbo-Charge Collaboration by Automating Site Provisioning in SharePoint 2010
 
Empowering business users with hybrid solutions
Empowering business users with hybrid solutionsEmpowering business users with hybrid solutions
Empowering business users with hybrid solutions
 

Recently uploaded

Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 

Recently uploaded (20)

Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 

What is Big Data? The 5 V's and Real World Use Cases

  • 1.
  • 2. What is Big Data? A new generation of technologies and architectures designed to economically extract value from very large volumes of a wide variety of data, by enabling high velocity capture, discovery and/or analysis
  • 3. VELOCITY VARIETY VOLUME + VISUALIZATION VALUE Big Data’s impact can be expressed by The Five V’s
  • 4.  E-Commerce Site fed by outsourced Ad Servers  Ads appear on a wide range of sites with various offers  Massive amount of data is generated by these servers: • Web logs and click stream data from the E-Commerce Site • Ad logs and click stream data from the Ad Servers • Results in relational transactions on the site  Goal: Maximize Traffic Analysis for Business Value • Velocity Demo: Pinpoint activity in real-time & react • Variety Demo: Examine historical trends across sources • Visualization Demo: Enable ad-hoc data analysis for insights Demo Context
  • 5. WEB SERVERS How to identify when Ad clicks results in Site Traffic?  High volume stream of log activity coming in: • Web logs and Ad Server logs  Real-time stream analysis allows for pinpointing data when it happens LOG FILES  Simultaneously join structured and unstructured data in a persistent query  Can be used for A/B testing, Offer improvement, Site Dynamic behavior, or Fraud Detection AD SERVERS Velocity Architecture
  • 7. WEB SERVERS How to do historical analysis on unstructured data? M/R LOG FILES  Ad Servers and Web Servers generate different log files with different formats making them hard to analyze  Map/Reduce processing allows for us to execute a query across variant data formats stored in Hadoop  Hive provides a traditional query interface to Map/Reduce  Correlate and connect high variety data for trend analysis AD SERVERS Variety Architecture
  • 8. Access Azure blob storage via a Hive “view” and aggregate session data CREATE EXTERNAL TABLE logs ( date1 STRING, time1 STRING, action STRING, page_uri STRING, cookie STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' STORED AS TEXTFILE LOCATION 'asv://logs/logs/'; CREATE TABLE log_summary AS SELECT l.cookie ,MAX(regexp_replace(cookie, '[-]', '') % 36) AS geo_hash ,MAX(l.time1) AS time1 ,l.page_uri ,MAX(CASE LOWER(action) WHEN 'click' THEN concat(l.date1, ' ', l.time1) ELSE NULL END) AS click_time ,MIN(CASE LOWER(action) WHEN 'view' THEN concat(l.date1, ' ', l.time1) ELSE NULL END) AS view_time ,MAX(l.date1) AS date1 FROM logs l GROUP BY l.cookie, l.page_uri; Hive HQL Queries
  • 10. Hadoop is an open source framework for building large scale, distributed, data- intensive applications • Hadoop is HDFS, the kernel & M/R • MapReduce brings the code to the data • Open set of tools exist to extend its functional uses and representations Hadoop Ecosystem Overview
  • 11. The "Map" step The "Reduce" step The mappers are responsible for reading the input data and Each reducer executes a function on all values for a given emitting key/value pairs. The input file can be CSV, XML, or any key. The framework ensures that all values for the same format as long as it can be converted into k/v pairs. key are sent to the same reducer. Map/Reduce Distributes Processing of Operations
  • 12. WEB SERVERS How to do ad-hoc data discovery and visualizations? M/R LOG FILES  Ad Servers and Web Servers generate different log files with different formats making them hard to analyze  Map/Reduce processing allows for us to execute a query across variant data formats stored in Hadoop  Hive provides a traditional query interface to Map/Reduce  Correlate and connect high variety data for trend analysis AD SERVERS Visualization Architecture
  • 13. DEMO: Excel & Hive Adapter
  • 14.  Big Data & Analytics Projects are often Additive • New Capabilities layered on top of existing data & apps • Analytics can drive Applications in new ways Visualizations put Big Data in the hands of the Business Summary
  • 15. We are BlueMetal Architects
  • 16. Take the next steps – Imagine, Define, Build
  • 17.  Envisioning & Strategy Briefing: Big Data, Analytics & Collaboration  Envisioning Session: Data is the App – Envisioning the Next Generation, Data Driven Enterprise  Architecture Design Session: Big Data & Analytics  Healthcare / Life Sciences: Strategy Briefing or Architecture Design Session – Big Data Architecture, Cloud & Use Case Driven Analytics and applications, Portal, M-Health and UX design for Providers, Patients, Pharma & Biotechnology  Financial Services: Strategy Briefing or Architecture Design Session – Big Data & Analytics for Banking, Capital Markets, Retail Brokerage or Insurance Take the next steps - our offerings
  • 19. DESIGN Differentiation UX DATA SOCIAL Specialization CODE Foundation Who We Are
  • 20. DESIGN Differentiation Strategy Analysis Creative UX DATA SOCIAL Desktop Analytics Web Content Specialization Mobile Big Data Intranets Web Client Core SQL Collaboration .NET SERVICES On-Premise Foundation Java PPP Cloud Who We Are