Big Data Processing
Training, R&D
Powered by Data Cloud Lab
[Big data is a field that treats ways to analyze, systematically extract
information from, or otherwise deal with data sets that are too large
or complex to be dealt with by traditional data-processing
application software. Big data was originally associated with three
key concepts: volume, variety, and velocity.]
Data Set – 1M Data:
1. Healthcare - [Record – 46935]
2. Weather-history - [Record – 4573]
3. World Demography - [Record – 5000]
4. Census Tracts 2010 - [Record -21
5. Animal_Services_Intake_Data - [Record -187594]
6. Average_Daily_Traffic_Counts - [Record -1280]
7. Acciental_Durg_Related_Death - [Record -5106]
8. Retails Store - [Record – 182728]
customer_12435, category_59, Departments_7, orders_68883, products_1345, order_items_99999
9. Popular_Baby_Names - [Record – 46935]
10. SAT__College_Board__2010_School_Level_Results - Total Data [Record -461]
11. Sales_Tax_Rates - [Record -1911]
12. Restaurants [Record -1328]
13. Transportation: 34_drivers, 17076_truck_event_text_partition, 1768_timesheet - [Record – 18878]
14. Acciental_Durg_Related_Death - [Record -5106]
15. Census Tracts 2010 - [Record -216]
16. Employees_Salary - [Record – 824]
17. Customer_transactional_spending - [Record – 60000]
18. Customer_Order - [Record – 1000]
19. Employees_Salary - [Record – 824]
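The totals for items 8 and 13 appear to be the sums of their sub-table row counts. The short Python check below is a sketch under that assumption: it reads the number attached to each sub-table name as a row count, which the slide does not state explicitly, and the dictionary names are illustrative.

```python
# Sub-table row counts as listed for items 8 and 13 above.
# Assumption: the number attached to each name is that table's row count.
retail_store = {
    "customer": 12435,
    "category": 59,
    "Departments": 7,
    "orders": 68883,
    "products": 1345,
    "order_items": 99999,
}
transportation = {
    "drivers": 34,
    "truck_event_text_partition": 17076,
    "timesheet": 1768,
}

retail_total = sum(retail_store.values())
transport_total = sum(transportation.values())

print(retail_total)     # 182728, matches the Retails Store record count
print(transport_total)  # 18878, matches the Transportation record count
```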
(Powered by software: Linux, Hadoop Big Data, Hive & Power BI)
Case Study 01: Healthcare [Record – 46935]
Raw Data (Date, Sex, Diseases, Age) :
12/10/1950,M,Diabetes,78
12/10/1984,F,PCOS,67
12/11/1940,M,Fever,90
12/12/1950,F,Cold,88
12/13/1960,M,Blood Pressure,76
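The four-column layout above can be read into typed records before any analysis. The following Python sketch uses three of the sample rows; the `HealthRecord` class and `parse_record` function are hypothetical names introduced here, and the `MM/DD/YYYY` date format is inferred from values such as `12/13/1960`.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class HealthRecord:
    """One row of the healthcare data set (hypothetical type name)."""
    record_date: date
    sex: str
    disease: str
    age: int


def parse_record(line: str) -> HealthRecord:
    """Parse one 'MM/DD/YYYY,Sex,Disease,Age' line into a HealthRecord."""
    raw_date, sex, disease, age = [field.strip() for field in line.split(",")]
    return HealthRecord(
        record_date=datetime.strptime(raw_date, "%m/%d/%Y").date(),
        sex=sex,
        disease=disease,
        age=int(age),
    )


# Three of the sample rows shown above.
sample_lines = [
    "12/10/1950,M,Diabetes,78",
    "12/10/1984,F,PCOS,67",
    "12/13/1960,M,Blood Pressure,76",
]
records = [parse_record(line) for line in sample_lines]
for record in records:
    print(record)
```

A row whose date does not match the expected format makes `datetime.strptime` raise `ValueError`, which is one simple way to surface malformed input early.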
Result :
Blood Pressure,5215
Cold,5215
Diabetes,5215
Fever,15645
Malaria,5215
PCOS,5215
Swine Flu,5215
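As a sanity check, the per-disease counts should add up to the 46935 records stated for the healthcare data set. A minimal Python sketch (variable names are illustrative) that also derives each disease's share of the total:

```python
# Per-disease counts copied from the result above.
result = {
    "Blood Pressure": 5215,
    "Cold": 5215,
    "Diabetes": 5215,
    "Fever": 15645,
    "Malaria": 5215,
    "PCOS": 5215,
    "Swine Flu": 5215,
}

total = sum(result.values())
print(total)  # 46935, the record count stated for the healthcare data set

# Share of each disease in the whole data set.
shares = {disease: count / total for disease, count in result.items()}
for disease, share in shares.items():
    print(f"{disease}: {share:.1%}")
```

Fever accounts for one third of the records and each of the other six diseases for one ninth.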
Data Visualizations:
Backend data processing by HiveQL command:
select diseases, count(*) from health group by diseases;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hduser_20200125220715_338a065f-f176-4464-b03e-28fb18dc66f5
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Job running in-process (local Hadoop)
2020-01-25 22:07:18,630 Stage-1 map = 100%, reduce = 100%
Ended Job = job_local171670995_0001
Moving data to local directory /home/hduser/Dataset
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 2336322 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 3.617 seconds
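For readers who want to see what the group-by query computes without a Hive installation, the same aggregation can be sketched in plain Python. This is only an illustration on a few of the sample rows, not the pipeline used in the case study; the `count_by_disease` helper is a hypothetical name.

```python
import csv
from collections import Counter
from io import StringIO


def count_by_disease(csv_text: str) -> Counter:
    """Count rows per disease (third column), like GROUP BY diseases with count(*)."""
    reader = csv.reader(StringIO(csv_text))
    return Counter(row[2] for row in reader if row)


# A few of the sample rows from the raw data (Date, Sex, Diseases, Age).
sample_csv = """12/10/1950,M,Diabetes,78
12/10/1984,F,PCOS,67
12/12/1950,F,Cold,88
12/13/1960,M,Blood Pressure,76
"""

counts = count_by_disease(sample_csv)
# Print in the same "disease,count" layout as the result listing.
for disease, n in sorted(counts.items()):
    print(f"{disease},{n}")
```

On the full data set Hive distributes this counting across map and reduce tasks, which is what the job log above reports.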

Data cloud-lab-version-v0012020
