Why serve an undercooked Big Data
Solution!! Go for the well-baked one
Introduction
Today, Big Data is the buzzword in the information technology domain. There is a great deal of
discussion, and at the same time confusion, about this revolution called Big Data. In the client’s mind
(most of the time), Big Data can do wonders with their data through its intelligent processing power and
analytical features. Most often, the client is on cloud nine, visualizing a utopian, ideal environment
where every problem gets resolved.
Through this article, I want to deconstruct that belief. I want to share what I have learned about how to
approach a Big Data solution correctly, so that the insight it delivers is in line with the client’s expectations.
The intention of this blog is not to provide a technical solution for Big Data problems, but a process
for approaching a Big Data solution in a smart way.
Understand the Client Requirement
Often we come across a successfully implemented Big Data solution with optimal processing time, jazzy
visualization and analytical capabilities. Yet, more often than not, it doesn’t meet the client’s expectations.
Therefore, we need to understand why a client wants to go for a Big Data solution. Without
this insight, we may not be able to produce the desired output in line with what they anticipate.
Image 1: Client mind mapping
Once we understand the reason behind the client going for a Big Data solution, it’s
time to extrapolate from that study. We have to come up with a blueprint that
highlights our suggestions and recommendations for executing the Big Data solution.
Image 2: The inference from the client’s expectation & outcome
A First Person Account
Let me explain this concept with a real-life example and walk you through a smart way to
execute the solution and present the outcomes.
The table below gives a brief overview of a client’s problem and how they expect it to be resolved.
Client Situation: A leading retailer in the USA wants to enhance its existing Decision
Support System (DSS) application. The DSS application presently serves a
huge user base (50+) on a data volume of about 250 TB. It runs on a
traditional RDBMS and is not able to scale to meet expectations.

Client Requirement: According to the client’s IT Director, if their existing application is
moved to Big Data (by hosting it on the Microsoft Azure cloud), it should be
able to resolve their problem, on both cost and performance.
Image 3: Smart Way Inference
Points to ponder for implementing Big Data Solution
i) Baseline the Solution Coverage
Based on the client situation described above, we need to come up with a solution that meets the
following goals:
• Reduce the cost of data storage
• Provide a stable code base
• Implement a scalable solution
• Deliver a cost-effective application
From the inference above, I have come up with two use cases (scenarios):
Use Case 1
i. Ingest one unit of the data set (1 TB) into the system. Repeat this procedure for 30 units.
ii. Run this use case on 10-, 50- and 100-node clusters.
Use Case 2
i. Read one unit of data through multiple requests. Treat these as supplier requests, whose number can
range from 1 to 40K.
ii. Run this use case on 10-, 50- and 100-node clusters.
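The two use cases amount to a small test matrix, which can be sketched as below. This is only an illustration: the names Run and build_matrix are hypothetical, and the intermediate request level of 1,000 is an assumption, since the text only gives the range 1 to 40K.

```python
from dataclasses import dataclass
from itertools import product

# Cluster sizes named in the use cases above.
CLUSTER_SIZES = (10, 50, 100)


@dataclass(frozen=True)
class Run:
    use_case: str  # "ingest" (Use Case 1) or "read" (Use Case 2)
    nodes: int     # number of nodes in the cluster
    load: int      # 1 TB units ingested, or number of supplier requests


def build_matrix(request_levels=(1, 1_000, 40_000)):
    """Enumerate every planned run.

    request_levels is an assumption: the text only says requests range
    from 1 to 40K, so the middle value is an illustrative sample point.
    """
    ingest = [Run("ingest", nodes, 30) for nodes in CLUSTER_SIZES]
    read = [
        Run("read", nodes, requests)
        for nodes, requests in product(CLUSTER_SIZES, request_levels)
    ]
    return ingest + read


if __name__ == "__main__":
    for run in build_matrix():
        print(run)
```

Listing the runs up front makes it easier to agree the scope of the exercise with the client before any cluster is provisioned.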
ii) Configuring the Environment
This is yet another crucial phase of implementation. Identify the optimal infrastructure to validate the
solution. Once we have identified the vendor (e.g., Microsoft) or an infrastructure provider (e.g.,
Big Decisions or the client’s IT department) to implement the Big Data solution, keep the following
aspects in mind:
• Is the infrastructure stable? Validate the following:
  - CPU and memory
  - I/O
  - Storage (in our case it should hold 250 TB)
• Do you have adequate access rights to execute? Check for:
  - Edge node access to the cluster
  - Root folder access
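Part of this checklist can be automated with a small preflight script. The sketch below is an assumption-laden illustration, not the method used in the POC: it only inspects the machine it runs on (for example an edge node), the function name preflight is hypothetical, and 250 TB is taken as decimal terabytes. Capacity of distributed or cloud storage has to be checked with the provider’s own tooling.

```python
import os
import shutil

# 250 TB from the client situation; decimal terabytes are an assumption.
REQUIRED_BYTES = 250 * 10**12


def preflight(path, required_bytes=REQUIRED_BYTES, min_cpus=1):
    """Run basic local checks for the given folder.

    Returns a dict of named boolean results so a failed check can be
    reported to the infrastructure provider by name.
    """
    usage = shutil.disk_usage(path)  # total, used and free bytes
    return {
        "cpu_ok": (os.cpu_count() or 0) >= min_cpus,
        "storage_ok": usage.total >= required_bytes,
        "read_ok": os.access(path, os.R_OK),
        "write_ok": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    for name, passed in preflight(".").items():
        print(f"{name}: {'pass' if passed else 'FAIL'}")
```

Memory and I/O checks are left out because the standard library has no portable call for them; a monitoring tool or the provider’s console is the more reliable source.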
iii) Identifying the Core Team
No solution can be productive unless we have the right people on it. We need to engage
the right resources to run the solution.
In this sample POC (Proof of Concept), we would need an expert or a COE to support the core team.
There is certainly no time for training, so we have to identify a core team that can get up to
speed in no time.
iv) All Set to Go
Once there is clarity on the objective of the exercise (here, the POC), along with the right resources and
an optimal environment, you can expect the outcome to align with the client’s expectation. It is now time
to draft a plan for implementing the solution.
For instance, you can determine the storage types (Blob, Azure Data Lake Store) and tool stacks
(Spark, Hive, ADLF) for data processing and querying. Create a code base and execute it over the
environments.
NOTE: It is mandatory to complete the steps above before moving on to the next one. All of the
above activities can be executed in parallel.
v) Present the Insights and Recommendations
With clarity on the objectives, it is easier to meet the client’s expectation when implementing
the Big Data solution. The last phase of the exercise captures statistics by running the code on the
configured environments, derives metrics and insights from the execution outcome, and presents them
to the client with recommendations. Refer to the image below for a sample outcome.
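As a rough sketch of how raw run statistics could be turned into metrics for the client, the snippet below computes throughput, speedup and scaling efficiency per cluster size. The function name summarize and the choice of metrics are assumptions made for illustration, and the timings in the usage block are made-up placeholders, not measurements from the POC.

```python
def summarize(timings, units_tb):
    """Derive simple scaling metrics from elapsed times.

    timings maps node count -> elapsed seconds for the same workload.
    units_tb is the amount of data processed in each run, in TB.
    The smallest cluster is used as the baseline.
    """
    if not timings:
        raise ValueError("timings must not be empty")
    if any(seconds <= 0 for seconds in timings.values()):
        raise ValueError("elapsed times must be positive")

    baseline_nodes = min(timings)
    baseline_seconds = timings[baseline_nodes]
    rows = []
    for nodes in sorted(timings):
        seconds = timings[nodes]
        speedup = baseline_seconds / seconds
        rows.append({
            "nodes": nodes,
            "tb_per_hour": units_tb * 3600 / seconds,
            "speedup": speedup,
            # 1.0 means the cluster scales linearly from the baseline.
            "efficiency": speedup / (nodes / baseline_nodes),
        })
    return rows


if __name__ == "__main__":
    # Placeholder timings for illustration only, not real results.
    placeholder = {10: 3600.0, 50: 900.0, 100: 600.0}
    for row in summarize(placeholder, units_tb=30):
        print(row)
```

An efficiency well below 1.0 on the larger clusters is the kind of finding that feeds directly into a cost recommendation, since the extra nodes are paid for but not fully used.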
I would like to conclude with the saying, “Don’t just meet the expectations. Exceed them.” With
this approach, you are bound to exceed the client’s expectations.

A practice to perfect the big data solution
