The Partner to your Cloud
Transformation Journey
Agenda
1. Introduction - Why do we need to think beyond Data Lakes?
2. Driving automation and insights utilizing AWS data services
3. Best Practices for Data Architecture
4. Implementation Case Studies and Outcomes Delivered by Searce
5. Q&A
Speaker
Bhuvaneshwaran R.
Database Architect
Searce
Driving Automation and Insights utilizing AWS Data Services
Data Generated vs Analyzed
Know your data
Structured, Unstructured, Semi-structured
● CRM
● ERP
● SQL Databases
● Log Files
● Image files
● Calls
● Mobile Data
● IoT Sensors
● Social media data
Batch Streaming
Define the pipeline
Data Life Cycle in AWS Data Platform
● Data Ingestion: get your data into S3 securely. Services: Kinesis, MSK, SFTP, DMS, Snowball, Direct Connect
● Data Catalog: access and search metadata. Services: DynamoDB, Elasticsearch, Glue Catalog
● Process & Analytics: get insights from your data. Services: Glue, EMR, Redshift, Athena, QuickSight
Data Lake or Data Warehouse
Aspect | Data Lake | Data Warehouse
PROCESSING | Schema on read | Schema on write
DATA | Structured, semi-structured, unstructured, raw | Structured and processed
STORAGE | Designed for low-cost storage | Expensive for large data volumes
DATA PROCESSING | Helps with fast ingestion of new data | Time-consuming to introduce new content
USERS | Data scientists, etc. | Business professionals
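The difference between schema on write and schema on read can be illustrated with a minimal Python sketch. It is not tied to any AWS service; the `ORDERS_SCHEMA` columns and the sample records are hypothetical, chosen only for illustration.

```python
import json

# Hypothetical warehouse-style schema: column name -> required Python type.
ORDERS_SCHEMA = {"order_id": int, "amount": float}


def write_with_schema(table, row, schema=ORDERS_SCHEMA):
    """Schema on write: validate the row before it is stored."""
    for column, expected in schema.items():
        if not isinstance(row.get(column), expected):
            raise ValueError(f"{column!r} must be {expected.__name__}")
    table.append({column: row[column] for column in schema})


def read_with_schema(raw_lines, schema=ORDERS_SCHEMA):
    """Schema on read: store anything, apply the schema only when querying."""
    for line in raw_lines:
        record = json.loads(line)
        try:
            yield {column: cast(record[column]) for column, cast in schema.items()}
        except (KeyError, TypeError, ValueError):
            continue  # skip records that do not fit this particular reader


# Warehouse side: bad rows are rejected at load time.
warehouse = []
write_with_schema(warehouse, {"order_id": 1, "amount": 9.5})

# Lake side: raw lines are kept as-is and interpreted at query time.
lake = [
    '{"order_id": "2", "amount": "12.0", "note": "raw strings are fine"}',
    '{"event": "page_view"}',
]
parsed = list(read_with_schema(lake))
```

The lake accepts both raw lines without complaint, which is why ingestion is fast; the cost is that each reader has to decide what to do with records that do not match its schema.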
Transformation
Extract, Transform & Load
Extract, Load and Transform
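The two orderings can be compared in a small sketch that uses Python's built-in sqlite3 module as a stand-in for the target store. The sample rows and table names are hypothetical; a real pipeline would use Glue, EMR, or Redshift rather than SQLite.

```python
import sqlite3

# Hypothetical raw extract: untrimmed names and amounts stored as text.
RAW_ROWS = [("  Alice ", "10.50"), ("BOB", "3"), ("carol", "n/a")]


def clean(name, amount):
    """Transformation used by the ETL path: normalize name, parse amount."""
    try:
        return name.strip().lower(), float(amount)
    except ValueError:
        return None  # unparseable amount, drop the row


def etl(conn):
    """ETL: transform in the pipeline, then load only the cleaned rows."""
    conn.execute("CREATE TABLE sales_etl (name TEXT, amount REAL)")
    cleaned = [row for row in (clean(*r) for r in RAW_ROWS) if row is not None]
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", cleaned)


def elt(conn):
    """ELT: load raw rows first, then transform inside the engine with SQL."""
    conn.execute("CREATE TABLE sales_raw (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO sales_raw VALUES (?, ?)", RAW_ROWS)
    conn.execute(
        """
        CREATE TABLE sales_elt AS
        SELECT LOWER(TRIM(name)) AS name, CAST(amount AS REAL) AS amount
        FROM sales_raw
        WHERE amount GLOB '[0-9]*'
        """
    )


conn = sqlite3.connect(":memory:")
etl(conn)
elt(conn)
etl_rows = conn.execute("SELECT name, amount FROM sales_etl ORDER BY name").fetchall()
elt_rows = conn.execute("SELECT name, amount FROM sales_elt ORDER BY name").fetchall()
```

Both paths end with the same cleaned rows. The ELT path also keeps the untouched `sales_raw` table, so a different transformation can be applied later without extracting the data again.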
AWS Data Lake Infrastructure
1. Highly durable & unlimited storage
2. Support for open file formats
3. Easy integration with other AWS services
4. Secure, compliant & auditable
5. Decoupled storage and compute
Reference Architecture - Building a Data Lake in AWS
ETL for Analytics
● RDS - Source
● Glue - ETL
● S3 - Storage
● Athena - Interactive query service
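Once Glue has written the transformed data to S3, the Athena step of this architecture could be driven from Python roughly as follows. This is a sketch under assumptions: the database `analytics_db`, the table `sales`, and the results bucket are hypothetical names, and the boto3 call needs AWS credentials, so boto3 is imported only inside the function that uses it.

```python
def athena_request(sql, database, output_location):
    """Build the keyword arguments for Athena's StartQueryExecution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(sql, database, output_location):
    """Submit the query to Athena and return its execution id."""
    import boto3  # imported lazily so the builder above works without AWS access

    client = boto3.client("athena")
    response = client.start_query_execution(
        **athena_request(sql, database, output_location)
    )
    return response["QueryExecutionId"]


# Hypothetical names, for illustration only.
request = athena_request(
    "SELECT category, SUM(amount) FROM sales GROUP BY category",
    database="analytics_db",
    output_location="s3://example-athena-results/",
)
```

Athena runs queries asynchronously, so a caller would normally poll the execution id until the query completes before reading the results from the output location.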
Streaming Data Solutions with Amazon Kinesis
Components:
● Kinesis Data Streams
● Kinesis Data Firehose
● Kinesis Data Analytics
● Lambda
● DynamoDB
● SNS
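As an illustration of how Lambda fits between the stream and the downstream services, here is a minimal handler sketch. Kinesis delivers each record to Lambda with its payload base64-encoded under `Records[].kinesis.data`. The JSON payload shape, the `THRESHOLD` value, and the `make_event` helper are assumptions made for this example, and the SNS and DynamoDB calls are left out.

```python
import base64
import json

THRESHOLD = 100.0  # hypothetical alert threshold


def decode_records(event):
    """Yield the JSON payload of every Kinesis record in a Lambda event."""
    for record in event.get("Records", []):
        payload = base64.b64decode(record["kinesis"]["data"])
        yield json.loads(payload)


def handler(event, context=None):
    """Lambda entry point: count readings and collect those above threshold."""
    readings = list(decode_records(event))
    alerts = [r for r in readings if r.get("value", 0) > THRESHOLD]
    # A real deployment might publish `alerts` to SNS and store `readings`
    # in DynamoDB here; both calls are omitted from this sketch.
    return {"processed": len(readings), "alerts": alerts}


def make_event(readings):
    """Build a Kinesis-shaped event for local testing (hypothetical helper)."""
    return {
        "Records": [
            {"kinesis": {"data": base64.b64encode(json.dumps(r).encode()).decode()}}
            for r in readings
        ]
    }


result = handler(make_event([
    {"sensor": "a", "value": 50},
    {"sensor": "b", "value": 150},
]))
```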
Streaming Relational Database Solution - CDC
Components:
● RDS MySQL
● Debezium Connector
● AWS MSK
● S3
● Elasticsearch
● EMR
● Redshift
● Consumer App
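A consumer application in this pipeline has to replay Debezium change events against its own copy of the table. The sketch below assumes the standard Debezium event envelope with `op`, `before`, and `after` fields (optionally wrapped in `payload`) and a primary key column named `id`; the in-memory dictionary stands in for whatever store the consumer actually writes to.

```python
def apply_change(table, event, key="id"):
    """Apply one Debezium change event to an in-memory table keyed by `key`."""
    payload = event.get("payload", event)  # unwrap if the schema envelope is on
    op = payload["op"]
    if op in ("c", "r", "u"):  # create, snapshot read, update
        row = payload["after"]
        table[row[key]] = row
    elif op == "d":  # delete
        table.pop(payload["before"][key], None)
    else:
        raise ValueError(f"unsupported operation: {op!r}")
    return table


# Hypothetical change stream for a `customers` table.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "name": "Asha"}},
    {"op": "c", "before": None, "after": {"id": 2, "name": "Ravi"}},
    {"op": "u", "before": {"id": 1, "name": "Asha"}, "after": {"id": 1, "name": "Asha K"}},
    {"op": "d", "before": {"id": 2, "name": "Ravi"}, "after": None},
]

replica = {}
for change in events:
    apply_change(replica, change)
```

Because every event carries the full row state, replaying the stream in order is enough to keep the replica consistent with the source table.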
Data Lake Lifecycle
1. Extract
2. Transform & Process
3. Data Lake (Storage)
4. Visualization
5. AI/ML
6. Security
Data Governance
“Data governance is the formal orchestration of people, processes, and technology that enables an
organization to leverage data as an enterprise asset.”
Data Governance on AWS:
● De-Identified Data lake
● Data Matching
● Data Transformation
● Data Catalog
● Analytics and Data processing
● Monitoring
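One possible way to de-identify records while still allowing data matching is keyed hashing, sketched below with Python's standard library. This is an illustrative assumption, not the method described in the slides: the `PII_FIELDS` list, the sample record, and the secret are hypothetical, and a real deployment would keep the secret in a managed service such as KMS.

```python
import hashlib
import hmac

PII_FIELDS = ("email", "phone")  # hypothetical list of sensitive columns


def pseudonymize(value, secret):
    """Replace a value with a keyed SHA-256 digest (same input, same token)."""
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()


def de_identify(record, secret, pii_fields=PII_FIELDS):
    """Return a copy of the record with the PII fields replaced by tokens."""
    return {
        field: pseudonymize(str(value), secret) if field in pii_fields else value
        for field, value in record.items()
    }


secret = b"example-secret"  # placeholder only, never hard-code a real key
crm_row = {"email": "a@example.com", "phone": "555-0100", "city": "Pune"}
web_row = {"email": "a@example.com", "page": "/checkout"}

safe_crm = de_identify(crm_row, secret)
safe_web = de_identify(web_row, secret)
```

Because the digest is deterministic for a given secret, the two de-identified rows still share the same `email` token and can be joined, while the original address is not stored in the lake.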
Maintain the Data Catalog
Glue:
● Crawler
● MetaData
● Versioning
● Custom classifiers
Data Governance Reference Architecture
Lake Formation
Lake Formation - Security
Where are you in your Data Journey?
Ecommerce or Retail - Real-time Analytics
● Real-time clickstream data
● Use ML for a recommendation engine
Services:
1. Kinesis
2. SageMaker
3. DynamoDB
Digital Native already on Cloud - Cost Optimization
● Move complex ETL workloads to big data clusters
● Move large volumes of cold data to the data lake
Services:
1. Redshift
2. EMR
3. Glue
4. S3
5. Athena
6. Spectrum
Traditional Enterprise or DNB - DW/DL - Security
● Move your Glue catalog and Athena to Lake Formation
● Control database- and storage-level access with AWS Lake Formation
Services:
1. Lake Formation
2. IAM
3. KMS
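For the security scenario, access control in Lake Formation is expressed as grants on catalog resources. The sketch below builds the arguments for a column-level SELECT grant and passes them to boto3's `grant_permissions` call. The role ARN, database, table, and column names are hypothetical, and the call itself requires AWS credentials, so boto3 is imported only where it is used.

```python
def column_grant(principal_arn, database, table, columns):
    """Build arguments for a column-level SELECT grant in Lake Formation."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


def apply_grant(grant):
    """Send the grant to Lake Formation (needs AWS credentials)."""
    import boto3  # imported lazily so the builder above works offline

    boto3.client("lakeformation").grant_permissions(**grant)


# Hypothetical principal and catalog names, for illustration only.
grant = column_grant(
    "arn:aws:iam::123456789012:role/analyst",
    database="sales_db",
    table="orders",
    columns=["order_id", "amount"],
)
```

Granting only selected columns lets analysts query the table through Athena without seeing columns that were left out of the grant.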
Speaker
Wei Chung Low
Sr. Specialist Partner
Solution Architect
Big Data and Analytics
Amazon Web Services
Best Practices for Data Architecture
Case Studies | AWS | FlowerAura
FlowerAura is an online flower store that delivers the best quality fresh cut flowers in more than 220 cities across India using a strong affiliate network and channel stores.
Industry: E-Commerce
Workload: AWS S3, Redshift, DynamoDB, QuickSight, SageMaker, AWS Glue/Kinesis
Challenge
Needed reliable Data Lake solutions to:
● Collect and process POS as well as website/mobile application data
● Support analytics-based services for a deeper understanding of purchasing behaviors
● Help customers/visitors make better decisions while purchasing
Solution
● Built a Data Lake that collected real-time data from the existing data sources and used AWS Glue to perform ETL on the collected data
● Trained a model on the transformed data using SageMaker, which provided recommendations to the customer based on browsing and purchasing history
Business Impact
● A single source of truth with all data sources in one repository
● The recommendation engine presented users with choices of items based on their selections and the list of available items.
● It led to upsell, higher offtake, greater retention of existing customers, and lower advertising costs.
Case Studies | AWS | Britannia
Britannia Industries Limited is one of the oldest existing Indian food-products corporations.
Industry: FMCG
Workload: S3, Redshift, VPC, EC2, SageMaker
Challenge
● Manual processes for ETL and data consolidation took 3-4 days, and scale was a big bottleneck.
● Fulfillment for 18000+ stores all across India required analyzing the purchase behavior of customers to help Britannia identify the demand-supply pattern and keep up to date with the SKUs.
Solution
● Provisioned infrastructure on AWS: VPC, ETL instances, Redshift, a processing server, and a server to host Tableau.
● Established a site-to-site VPN.
● Initiated a one-time dump of the data from the on-premises SQL Server to S3.
● Authored ETL jobs for loading data from multiple on-premises sources to AWS S3, and helped the Emisha team connect Tableau Server to Redshift.
● Deployed and served an ML model.
Business Impact
The existing manual process of consolidating data for analytics and predictions on customers' buying patterns took 3-4 days. It is now replaced by a real-time dashboard, making tracking and management easy.
Generative Designs: Data meets AI, meets creativity
For the world’s 5th largest watchmaker, Searce created a deep learning model capable of
generating watch designs based on input parameters.
These parameters included:
● Band Color
● Dial Color
● Gender
● Dial Size
● Band Material
Q&A
Please type your questions in the chat window
Thank You

“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 

Delivering Business Insights and Automation Utilizing AWS Data Services

  • 1. The Partner to your Cloud Transformation Journey
  • 2. Agenda
      1. Introduction - Why do we need to think beyond Data Lakes?
      2. Driving automation and insights utilizing AWS data services
      3. Best Practices for Data Architecture
      4. Implementation Case Studies and Outcomes Delivered by Searce
      5. Q&A
  • 3. Speaker: Bhuvaneshwaran R., Database Architect, Searce. Topic: Driving Automation and Insights utilizing AWS Data Services
  • 4. Data Generated vs Analyzed
  • 5. Know your data: Structured, Un-structured, Semi-structured ● CRM ● ERP ● SQL Databases ● Log Files ● Image files ● Calls ● Mobile Data ● IoT Sensors ● Social media data
  • 7. Data Life Cycle in AWS Data Platform
      ● Data Ingestion (Get your data into S3 with secure): Kinesis, SFTP, DMS, Snowball, Direct Connect, MSK
      ● Data Catalog (Access & Search metadata): DynamoDB, Elasticsearch, Glue Catalog
      ● Process & Analytics (Get insights from your data): Glue, EMR, Redshift, Athena, QuickSight
  • 8. Data Lake or Data Warehouse
      ● Processing: schema on read (Data Lake) vs. schema on write (Data Warehouse)
      ● Data: structured, semi-structured, unstructured, raw (Data Lake) vs. structured and processed (Data Warehouse)
      ● Storage: designed for low-cost storage (Data Lake) vs. expensive for large data volumes (Data Warehouse)
      ● Data processing: helps with fast ingestion of new data (Data Lake) vs. time-consuming to introduce new content (Data Warehouse)
      ● Users: data scientists, etc. (Data Lake) vs. business professionals (Data Warehouse)
  • 9. Transformation: Extract, Transform & Load (ETL) vs. Extract, Load and Transform (ELT)
  • 10. AWS Data Lake Infrastructure
      1. Highly durable & unlimited storage
      2. Support for open file formats
      3. Easy integration with other AWS services
      4. Secure, compliant & auditable
      5. Decoupled storage and compute
  • 11. Reference Architecture - Building a Data Lake in AWS
  • 12. ETL for Analytics ● RDS - Source ● Glue - ETL ● S3 - Storage ● Athena - Interactive Query service
  • 13. Streaming Data Solutions with Amazon Kinesis Components: ● Kinesis Data Stream ● Kinesis Firehose ● Kinesis Analytics ● Lambda ● DynamoDB ● SNS
  • 14. Streaming Relational Database Solution - CDC Components: ● RDS MySQL ● Debezium Connector ● AWS MSK ● S3 ● Elasticsearch ● EMR ● Redshift ● Consumer App
  • 15. Data Lake Lifecycle
      1. Extract
      2. Transform & Process
      3. Data Lake (Storage)
      4. Visualization
      5. AI/ML
      6. Security
  • 16. Data Governance
      “Data governance is the formal orchestration of people, processes, and technology that enables an organization to leverage data as an enterprise asset.”
      Data Governance on AWS:
      ● De-identified data lake
      ● Data matching
      ● Data transformation
      ● Data catalog
      ● Analytics and data processing
      ● Monitoring
  • 17. Maintain the Data Catalog. Glue: ● Crawler ● Metadata ● Versioning ● Custom classifiers
  • 20. Lake Formation - Security
  • 21. Where are you in your Data Journey?
      Ecommerce or Retail - Real-time Analytics
      ● Real-time clickstream data
      ● Use ML for a recommendation engine
      Services: 1. Kinesis 2. SageMaker 3. DynamoDB
      Digital Native already on Cloud - Cost optimization
      ● Move complex ETL workloads to Big Data clusters
      ● Move large volumes of cold data to the Data Lake
      Services: 1. Redshift 2. EMR 3. Glue 4. S3 5. Athena 6. Spectrum
      Traditional Enterprise or DNB - DW/DL - Security
      ● Move your Glue catalog and Athena to Lake Formation
      ● Control database- and storage-level access with AWS Lake Formation
      Services: 1. Lake Formation 2. IAM 3. KMS
  • 22. Speaker: Wei Chung Low, Sr. Specialist Partner Solution Architect, Big Data and Analytics, Amazon Web Services. Topic: Best Practices for Data Architecture
  • 23. Case Studies | AWS | FlowerAura
      FlowerAura is an online flower store that delivers the best quality fresh cut flowers in more than 220 cities across India using a strong affiliate network and channel stores.
      Industry: E-Commerce
      Workload: AWS S3, Redshift, DynamoDB, QuickSight, SageMaker, AWS Glue/Kinesis
      Challenge: needed reliable Data Lake solutions to:
      ● Collect and process POS as well as website/mobile application data
      ● Support analytics-based services for a deeper understanding of purchasing behaviors
      ● Help customers/visitors make better decisions while purchasing
      Solution:
      ● Built a Data Lake that collected real-time data from the existing data sources and used AWS Glue to perform ETL on the collected data
      ● Trained a model on the transformed data using SageMaker, which provided recommendations to the customer based on browsing and purchasing history
      Business Impact:
      ● A single source of truth with all data sources in one repository
      ● The recommendation engine presented users with choices of items based on their selections and the list of available items
      ● It led to upsell, higher offtake, greater retention of existing customers, and lower advertising costs
  • 24. Case Studies | AWS | Britannia
      Britannia Industries Limited is one of the oldest existing Indian food-products corporations.
      Industry: FMCG
      Workload: S3, Redshift, VPC, EC2, SageMaker
      Challenge:
      ● Manual processes for ETL and consolidating data took 3-4 days, and scale was a big bottleneck
      ● Fulfillment for 18000+ stores all across India by analyzing the purchase behavior of customers, to help Britannia identify the demand-supply pattern and keep up to date with the SKUs
      Solution:
      ● Provisioned infrastructure on AWS: VPC, ETL instances, Redshift, a processing server, and a server to host Tableau
      ● Established a Site-to-Site VPN
      ● Initiated a one-time dump of the data from the on-premise SQL Server to S3
      ● Authored ETL jobs for loading from multiple on-premise data sources to AWS S3, and helped the Emisha team connect Tableau Server to Redshift
      ● Deployed and served an ML model
      Business Impact:
      ● The existing manual process of consolidating data for analytics and predictions on customers' buying patterns took 3-4 days; it is now replaced by a real-time dashboard, making tracking and management easy
  • 25. Generative Designs: Data meets AI, meets creativity. For the world’s 5th largest watchmaker, Searce created a Deep Learning model capable of generating watch designs based on input parameters. These parameters included: ● Band Color ● Dial Color ● Gender ● Dial Size ● Band Material
  • 26. Q&A: Please type your questions in the chat window
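
The "schema on read" versus "schema on write" contrast from slide 8 (Data Lake or Data Warehouse) can be illustrated with a minimal sketch. This is not code from the talk and it does not call any AWS service: the `Warehouse` and `Lake` classes, the `validate` helper, and the `ORDER_SCHEMA` fields are all hypothetical names chosen for illustration, using only the Python standard library.

```python
import json

# Hypothetical schema for illustration: field name -> required Python type.
ORDER_SCHEMA = {"order_id": int, "amount": float}


def validate(record, schema):
    """Return a typed copy of record, or raise if it does not fit the schema."""
    typed = {}
    for field, field_type in schema.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        typed[field] = field_type(record[field])
    return typed


class Warehouse:
    """Schema on write: every record is validated before it is stored."""

    def __init__(self, schema):
        self.schema = schema
        self.rows = []

    def write(self, record):
        # Bad records are rejected at load time, so reads stay simple.
        self.rows.append(validate(record, self.schema))


class Lake:
    """Schema on read: raw lines are stored as-is and interpreted at query time."""

    def __init__(self):
        self.raw = []

    def write(self, line):
        # Ingestion is cheap and fast: nothing is parsed or checked here.
        self.raw.append(line)

    def read(self, schema):
        rows = []
        for line in self.raw:
            try:
                rows.append(validate(json.loads(line), schema))
            except (ValueError, TypeError):
                continue  # skip records that do not fit this reader's schema
        return rows
```

In this sketch the lake accepts any line, including malformed ones, and each reader decides which schema to apply, which matches the "fast ingestion of new data" row on the slide. The warehouse pays the validation cost up front, which is one reason introducing new content takes longer there.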