Building new data lakes on the cloud, or migrating existing ones to it, has become the de facto approach. With reduced costs and agile infrastructure, it is easier to derive business value on the cloud.
While many choices of architecture, tech stack, and solutions are available on various platforms to fit your use case, it is imperative to consider the key principles that will help you succeed in your cloud data lake journey.
In this webinar, we share the best practices and the key considerations that will help you build a robust data lake architecture on the cloud.
To view the webinar, visit https://bit.ly/2sG8BAp
10. "If you have not developed a cloud-first strategy yet, you are likely falling behind your competitors."
-Gartner
11. What’s in it for me?
Key influencers in building a data lake on cloud
Best practices for building a sustainable data lake on cloud
12. What is a Data Lake?
[Diagram: Database, Logs, XML Data, Spreadsheet, and Text sources feeding a Data Lake]
13. Data lake: Key functional aspects
Ingest | Store | Process | Analyze | Consume
14. Data lake: Governance essentials
Ingress
Data Discovery and Curation: Profiling | Classification | Lineage | Prepare
Data Management: Metadata Catalog | Catalog | MDM | Archive | Quality
Data Privacy, Security and Access Management: Physical | Encryption | Access | Audit
Data Discovery, Reporting and Visualization
15. End-to-end big data lake capability view
[Architecture diagram. Layers: Metadata / Governance Layer, Raw Layer, Processed Layer, SLA-Based Data Vending Layer]
Sources: Streaming | Structured (ODS) / Unstructured | Geospatial / Machine / Time Series | External Social
Components: Landing / Staging | Active Archive | Common Data Model | ODL – Rest / Motion | Master Reference Data | API Data Mart | Sandbox Stores | Post Analytics Store | Search | ODL-End User | Workflow / DQ
Governance: Stewardship / Policies | Lineage Service | Catalog | Audit / Monitoring
Tools shown: QuickSight | Athena
23. Build a solution from an enterprise view
Goal alignment
Budgeting
Chargebacks
ROI assessment
Assess reskill needs
24. Focus on value delivery
Adopt a cloud-first commitment
Have flexibility to change
Use accelerators
25. Build a foundation first
Security should be your top priority
Address key challenges first
Reusable templates
Logging and monitoring
Financial controls
26. Select the right tooling
Data integration
Data catalog
Data quality
Data governance
34. Our Services
Advisory and Consulting: Cloud Enablement and Migration, Big Data Enablement and Migration, Data Lake, DevOps, Usability, Mobility, etc.
Architecture, Design and Engineering: Technology Evaluation and Benchmarking, Solution Design and Architecture, Engineering, Quality Assurance, NFRs and Performance Engineering, etc.
Enterprise Data Management: Data Lake, Data Modelling, Data Migration, Data Visualization, Data Democratization and Governance, etc.
DevOps and Productionization: Capacity Planning, Infra as Code, Infra Provisioning, Automation and Administration, Ops Support, etc.
Data Science and Analytics: NLP and NLG, AI and Deep Learning, Descriptive-Prescriptive-Predictive Analytics, Sentiment Analysis, etc.
User Experience: UX Design and Architecture, Rich Media Design, Mobile App Design, Responsive Design, UX Lab Assessment, etc.
35. Impetus cloud practice competencies
Amazon Web Services
Microsoft Azure
Google Cloud
Salesforce
Advisory, strategy, TCO
Architecture evaluation
Cloud infrastructure realization
Cloud cost optimization
Workload assessment and transformation
Capacity planning
Automation and orchestration
Security and governance
DevOps
Maintenance and administration