Data science uses statistics, machine learning, and data analysis to extract knowledge and insights from data. It allows companies to make better decisions, perform predictive analysis, and discover patterns. Data science is used across many industries and can be applied anywhere data is available, such as consumer goods, stock markets, logistics, and e-commerce. A data scientist needs expertise in machine learning, statistics, programming, mathematics, and databases to explore and analyze data, find patterns, and make predictions that help businesses.
Congrats! You got your Data Science Job (Rohit Dubey)
Congrats! You got your Data Science Job after completion of this presentation course.
What can you find on this presentation course?
I aim to provide as many resources as possible for learning Data Science. These resources include:
Courses to upskill yourself in analytics and data science
Real-life industry problems released in the form of contests
This slide will help you get:
Jobs – Apply for data science jobs to start or improve your career
DSAT – Assess your data science knowledge using our adaptive test
Tips and tricks related to Data Science, Machine Learning, Business Analytics and Business Intelligence tools
Case studies: Case studies of problems and their analytical solutions
Interviews with Business Analytics & Business Intelligence leaders
Data Science Job ready #DataScienceInterview Question and Answers 2022 | #Dat...Rohit Dubey
How Much Do Data Scientists Make?
The demand and salary for data scientists tend to be higher than most other ITES jobs. Experience is one of the key factors in determining the salary range of a data science professional.
According to Glassdoor, a Data Scientist in the United States earns an annual average of USD 117,212, and the same site reports that Data Scientists in India make a yearly average of ₹1,000,000.
Data Scientist Career Path
Data Science is currently considered one of the most lucrative careers available. Companies across all major industries and sectors need data scientists to help them gain valuable insights from big data. Demand is growing sharply for highly skilled data science professionals who can straddle the business and IT worlds.
The career path to becoming a data scientist isn’t clearly defined, since this is a relatively new profession. People from different backgrounds, such as mathematics, statistics, computer science, or economics, end up in data science.
The major designations for data science professionals are:
Data Analyst
Data Scientist (entry-level)
Associate data scientist
Data Scientist (senior-level)
Product Manager
Lead data scientist
Director/VP/SVP
That was all about Data Scientist Job Description.
Become a Data Scientist Today!
In this write-up, we covered the Data Scientist job description in detail. Whatever your location, there is no shortage of jobs for skilled data scientists. A career in data science is a rewarding journey to embark on, especially in the finance, retail, and e-commerce sectors. Jobs are also available in government departments, universities and research institutes, telecoms, and transport; the list goes on.
This video covers
Introductory Questions
Data Science Introduction
Data Science Technical Interview QnA :
#Excel
#SQL
#Python3
#MachineLearning
#DataAnalyticstechnical Interview
#DataScienceProjects
Several Python libraries offer solid implementations of a range of machine learning algorithms. One of the best known is Scikit-Learn, a package that provides efficient versions of a large number of standard algorithms. Scikit-Learn is characterized by a clean, uniform, and streamlined API, as well as by useful and complete online documentation.
Types of database processing: OLTP vs. Data Warehouses (OLAP)
Data warehouse characteristics: Subject-oriented, Integrated, Time-variant, Non-volatile
Functionalities of a Data Warehouse: Roll-Up (Consolidation), Drill-down, Slicing, Dicing, Pivot
KDD Process, Applications of Data Mining
Introduction to Data Science, Prerequisites (tidyverse), Import Data (readr), Data Tidying (tidyr): pivot_longer(), pivot_wider(), separate(), unite()
Data Transformation (dplyr - Grammar of Manipulation): arrange(), filter(), select(), mutate(), summarise()
Data Visualization (ggplot - Grammar of Graphics): Column Chart, Stacked Column Graph, Bar Graph, Line Graph, Dual Axis Chart, Area Chart, Pie Chart, Heat Map, Scatter Chart, Bubble Chart
Just finished a basic course on data science (highly recommend it if you wish to explore what data science is all about). Here are my takeaways from the course.
In this slide I answer the basic questions about machine learning like:
What is Machine Learning?
What are the types of machine learning?
How to deal with data?
How to test model performance?
If you’re learning data science, you’re probably on the lookout for cool data science projects. Look no further! We have a wide variety of guided projects that’ll get you working with real data in real-world scenarios while also helping you learn and apply new data science skills.
The projects in the list below are also designed to help you get a job! Each project was designed by a data scientist on our content team, and they’re representative examples of the real projects working data analysts and data scientists do every day. They’re designed to guide you through the process while also challenging your skills, and they’re open-ended so that you can put your own twist on each project and use it for your data science portfolio.
You can complete each project right in your browser, or you can download the data set to your computer and work locally! If you work on our site, you’ll also be able to download your code at any time so that you can continue locally, or upload your project to GitHub.
The sky is the limit here and what you decide to look into further is completely up to you and your imagination!
1. Learning by Doing
Learning by doing refers to a theory of education expounded by the American philosopher John Dewey. It is a hands-on approach to learning, meaning students must interact with their environment in order to adapt and learn. This way of learning sharpens your current skills and knowledge and also helps you gain new skills that can only be acquired by doing.
Driving a car is a perfect example. You can read as much as you like about the theory and rules of driving, and this is very important: the better you understand the theory, the better you will do in practice. But you will only become a better driver by applying this knowledge on a real road. In addition, some skills and knowledge can only be gained by actually driving.
Data science is the same as driving. It is very important to have solid theoretical knowledge and to build on it regularly so that you improve while working on a project. However, you should always apply this theoretical knowledge to projects. Doing so will deepen your understanding of the concepts, give you a better view of how they work in real life, and show others that you have strong theoretical knowledge and can put it into practice.
There are different types of guided projects. One of them is a guided project for
Guided projects have many benefits:
They remove the barriers between you and doing projects.
They save you much of the time spent thinking about the project and preparing the data.
They allow you to apply theoretical knowledge without getting distracted by obstacles.
They provide practical tips that can save you effort and time in the future.
A binary tree is a hierarchical data structure in computer science that consists of nodes connected by edges. Each node in a binary tree has at most two children, referred to as the left child and the right child. The topmost node in a binary tree is called the root.
Here are some key terms and concepts associated with binary trees:
Root: The topmost node in the tree, from which all other nodes are descended.
Node: A fundamental unit of a binary tree that contains data and may have zero, one, or two child nodes.
Parent: A node in the tree that has one or more child nodes.
Child: A node directly below a parent node and connected to it by an edge. In a binary tree, a node can have at most two children.
Leaf: A node in the tree that has no children. Leaves can occur at any level below the root, not only on the lowest one.
Subtree: A tree formed by a node and its descendants.
Height: The length of the longest path from the root to a leaf. The height of an empty tree is typically defined as -1.
Depth: The length of the path from the root to a particular node.
Binary trees are commonly used in various applications, such as expression trees, binary search trees, and Huffman coding trees. They provide an efficient way to organize and search data, and their recursive nature makes them well-suited for certain algorithms and data manipulations. Understanding binary trees is fundamental to many aspects of computer science and programming.
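The height and depth definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `Node` class and the `height` and `depth` helpers are hypothetical names introduced here, and the tree is assumed to hold distinct integer values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One node of a binary tree (hypothetical illustration class)."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def height(node: Optional[Node]) -> int:
    """Edges on the longest path down to a leaf; -1 for an empty tree."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def depth(root: Optional[Node], target: int, current: int = 0) -> int:
    """Edges from the root to the node holding `target`; -1 if absent."""
    if root is None:
        return -1
    if root.value == target:
        return current
    found = depth(root.left, target, current + 1)
    if found != -1:
        return found
    return depth(root.right, target, current + 1)


# Tree:      1
#           / \
#          2   3
#         /
#        4
root = Node(1, Node(2, Node(4)), Node(3))
tree_height = height(root)
depth_of_4 = depth(root, 4)
print(tree_height, depth_of_4)
```

Both helpers are recursive, which mirrors the recursive structure of the tree itself: every subtree is again a binary tree.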
1. What is Data Science?
Data Science is a combination of multiple disciplines that
uses statistics, data analysis, and machine learning to
analyze data and to extract knowledge and insights from
it.
2. By using Data Science, companies are able to make:
•Better decisions (should we choose A or B)
•Predictive analysis (what will happen next?)
•Pattern discoveries (find patterns, or maybe hidden
information in the data)
3. Where is Data Science Needed?
• Data Science is used in many industries in the world
today, e.g. banking, consultancy, healthcare, and
manufacturing.
• Examples of where Data Science is needed:
4. Data Science can be applied in nearly
every part of a business where data is
available. Examples are:
• Consumer goods
• Stock markets
• Industry
• Politics
• Logistics companies
• E-commerce
5. How Does a Data Scientist Work?
• A Data Scientist requires expertise in several
fields:
• Machine Learning
• Statistics
• Programming (Python or R)
• Mathematics
• Databases
6. Here is how a Data Scientist works:
1. Ask the right questions - To understand the business problem.
2. Explore and collect data - From databases, web logs, customer
feedback, etc.
3. Extract the data - Transform the data to a standardized format.
4. Clean the data - Remove erroneous values from the data.
5. Find and replace missing values - Check for missing values and
replace them with a suitable value (e.g. an average value).
6. Normalize data - Scale the values to a practical range (e.g. 140 cm is
smaller than 1.8 m. However, the number 140 is larger than 1.8, so
scaling is important).
7. Analyze data, find patterns and make future predictions.
8. Represent the result - Present the result with useful insights in a
way the "company" can understand.
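Steps 5 and 6 can be sketched with Pandas as follows. This is only an illustration under stated assumptions: the column names and values are made up, the mean is used as the "suitable value", and min-max scaling is chosen as one common way to normalize. Other choices, such as the median or standardization, may suit a given data set better.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements; np.nan marks a missing value.
df = pd.DataFrame({
    "height_cm": [140.0, 180.0, np.nan, 160.0],
    "pulse": [80.0, 100.0, 90.0, np.nan],
})

# Step 5: replace each missing value with the mean of its column.
filled = df.fillna(df.mean())

# Step 6: min-max scaling maps every column to the range 0..1.
normalized = (filled - filled.min()) / (filled.max() - filled.min())
print(normalized)
```

After scaling, the columns are directly comparable even though they were measured in different units.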
7. What is Data?
• Data is a collection of information.
• One purpose of Data Science is to structure data,
making it interpretable and easy to work with.
8. Data can be categorized into two groups:
• Structured data
• Unstructured data
11. How to Structure Data?
• We can use an array or a database table to structure or
present data.
• Example of an array:
• [80, 85, 90, 95, 100, 105, 110, 115, 120, 125]
13. Database Table
• A database table is a table with structured data.
• The following table shows a database table with health
data extracted from a sports watch:
15. Variables
• A variable is defined as something that can be
measured or counted.
• Examples can be characters, numbers or time.
• In the example under, we can observe that each column
represents a variable.
17. Data Science & Python
• Python is a programming language widely used by Data
Scientists.
• Python has built-in mathematical functions and
libraries, making it easier to solve mathematical
problems and to perform data analysis.
18. Python Libraries
• Python has libraries with large collections of mathematical
functions and analytical tools.
• In this course, we will use the following libraries:
• Pandas - This library is used for structured data operations, such as
importing CSV files, creating data frames, and preparing data
• Numpy - This is a mathematical library. It has a powerful
N-dimensional array object, linear algebra, Fourier transform, etc.
• Matplotlib - This library is used for visualization of data.
• SciPy - This library has linear algebra modules
19. Data Science - Python DataFrame
• Create a DataFrame with Pandas
• A data frame is a structured representation of data.
import pandas as pd
d = {'col1': [1, 2, 3, 4, 7], 'col2': [4, 5, 6, 9, 5], 'col3':
[7, 8, 12, 1, 11]}
df = pd.DataFrame(data=d)
print(df)
We write pd. in front of DataFrame() to let Python know that we want to call the
DataFrame() function from the Pandas library.
Be aware of the capital D and F in DataFrame!
20. Example 1
Count the number of columns:
count_column = df.shape[1]
print(count_column)
Example 2
Count the number of rows:
count_row = df.shape[0]
print(count_row)
21. Data Science Functions
• This chapter shows three commonly used functions
when working with Data Science: max(), min(), and
mean().
23. The max() function
• The Python max() function is used to find the highest value in an
array.
• Example:
Average_pulse_max = max(80, 85, 90, 95, 100, 105, 110, 115, 120, 125)
print(Average_pulse_max)
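The min() function works the same way as max() and returns the lowest value. A minimal sketch, reusing the pulse values from the max() example; the variable names are chosen here for illustration:

```python
# Same pulse values as in the max() example.
average_pulse = [80, 85, 90, 95, 100, 105, 110, 115, 120, 125]

lowest = min(average_pulse)    # smallest value in the list
highest = max(average_pulse)   # largest value in the list
print(lowest, highest)
```

Both functions accept either a list, as here, or the values as separate arguments.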
24. The mean() function
The NumPy mean() function is used to find the average value of an
array.
Example:
import numpy as np
Calorie_burnage = [240, 250, 260, 270, 280, 290, 300, 310, 320, 330]
Average_calorie_burnage = np.mean(Calorie_burnage)
print(Average_calorie_burnage)
25. Data Science - Data Preparation
• Extract and Read Data With Pandas
import pandas as pd
sample_data = pd.read_csv('music.csv')
print(sample_data)
26. Data Cleaning
import pandas as pd
sample_data = pd.read_csv("music.csv")
X = sample_data.drop(columns=['genre'])
print(X)
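Cleaning usually involves more than dropping a column. The sketch below shows two other common steps, removing rows with missing values and removing duplicate rows. It builds a small data frame in place of music.csv so that it runs on its own; the columns `age` and `gender` and all values are assumptions, not the contents of the real file.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for music.csv; the column names are assumptions.
sample_data = pd.DataFrame({
    "age": [20, 23, np.nan, 23, 31],
    "gender": [1, 1, 0, 1, 0],
    "genre": ["HipHop", "Jazz", "Dance", "Jazz", "Classical"],
})

# Drop rows that contain a missing value, then drop exact duplicate rows.
cleaned = sample_data.dropna().drop_duplicates()

# Remove the 'genre' column from the remaining rows, as on the slide.
X = cleaned.drop(columns=["genre"])
print(X)
```

Whether dropping rows is appropriate depends on the data; replacing missing values is often preferable when the data set is small.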
27. Data Categories
• Data can be split into three main categories:
1.Numerical - Contains numerical values. Can be divided
into two categories:
1.Discrete: Numbers are counted as "whole". Example: You cannot
have trained 2.5 sessions, it is either 2 or 3
2.Continuous: Numbers can be of infinite precision. For example, you
can sleep for 7 hours, 30 minutes and 20 seconds, or about 7.506 hours
2.Categorical - Contains values that cannot be measured up
against each other. Example: A color or a type of training
3.Ordinal - Contains categorical data that can be measured
up against each other. Example: School grades where A is
better than B and so on
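Ordinal data can be represented in Pandas as an ordered categorical, which keeps the ranking between the categories. A minimal sketch, assuming a four-step grade scale from D (lowest) to A (highest); the grades themselves are made up:

```python
import pandas as pd

# Hypothetical school grades; A is the best grade, so the order is D < C < B < A.
grades = pd.Categorical(
    ["B", "A", "C", "A", "D"],
    categories=["D", "C", "B", "A"],
    ordered=True,
)

best = grades.max()            # comparisons are allowed for ordered categories
better_than_c = grades > "C"   # element-wise comparison against one category
print(best, list(better_than_c))
```

Without `ordered=True` the same values would be plain categorical data, and comparisons such as `>` would raise an error.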
28. Data Types
We can use the info() function to list the data types
within our data set:
Ex: sample_data.info()
info() prints its report directly and returns None, so it does not need to be wrapped in print().
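A small self-contained sketch of inspecting data types, using a made-up data frame rather than the course data set. The column names follow the training data described in the notes, but the values are assumptions:

```python
import pandas as pd

# Hypothetical training data; values are made up for illustration.
sample = pd.DataFrame({
    "Duration": [30, 45, 60],
    "Average_Pulse": [80.5, 95.0, 110.2],
})

sample.info()          # prints column names, non-null counts and data types
print(sample.dtypes)   # the data types alone, as a Series
```

The `dtypes` attribute is handy when only the types are needed, for example to pick out the numerical columns.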
29. Analyze the Data
When we have cleaned the data set, we can start
analyzing the data.
We can use the describe() function in Python to
summarize data:
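A minimal sketch of describe(), assuming a small made-up data frame in place of the course data set:

```python
import pandas as pd

# Hypothetical training data; values are made up for illustration.
training = pd.DataFrame({
    "Duration": [30, 45, 60, 45],
    "Average_Pulse": [80, 95, 110, 100],
})

# count, mean, std, min, quartiles and max for each numerical column
summary = training.describe()
print(summary)
```

The result is itself a data frame, so single statistics can be read from it, for example with `summary.loc["mean", "Duration"]`.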
Editor's Notes
Data Science is about data gathering, analysis and decision-making.
Data Science is about finding patterns in data through analysis and making future predictions.
For route planning: To discover the best routes to ship
To foresee delays for flight/ship/train etc. (through predictive analysis)
To create promotional offers
To find the best suited time to deliver goods
To forecast the next year's revenue for a company
To analyze the health benefits of training
To predict who will win elections
A Data Scientist must find patterns within the data. Before he/she can find the patterns, he/she must organize the data in a standard format.
It is common to work with very large data sets in Data Science.
This dataset contains information of a typical training session such as duration, average pulse, calorie burnage etc.
A row is a horizontal representation of data.
A column is a vertical representation of data.
There are 6 columns, meaning that there are 6 variables (Duration, Average_Pulse, Max_Pulse, Calorie_Burnage, Hours_Work, Hours_Sleep).
There are 11 rows, meaning that each variable has 10 observations.
But if there are 11 rows, how come there are only 10 observations?
It is because the first row is the label, meaning that it is the name of the variable.
Example Explained
Import the Pandas library as pd
Define data with columns and rows in a variable named d
Create a data frame using the function pd.DataFrame()
The data frame contains 3 columns and 5 rows
Print the data frame output with the print() function
Why Can We Not Just Count the Rows and Columns Ourselves?
If we work with larger data sets with many columns and rows, counting them by hand becomes confusing and error-prone. If we use the built-in functions in Python correctly, we can be sure that the count is correct.
The data set above consists of 6 variables, each with 10 observations:
Duration - How long did the training session last, in minutes?
Average_Pulse - What was the average pulse during the training session? This is measured in beats per minute
Max_Pulse - What was the maximum pulse during the training session?
Calorie_Burnage - How many calories were burned during the training session?
Hours_Work - How many hours did we work at our job before the training session?
Hours_Sleep - How many hours did we sleep the night before the training session?
We use an underscore (_) to separate the words in a name because Python names cannot contain spaces.
We write np. in front of mean to let Python know that we want to call the mean function from the Numpy library.
1.Before analyzing data, a Data Scientist must extract the data, and make it clean and valuable.
2. Example Explained
Import the Pandas library
Name the data frame as sample_data.
By knowing the type of your data, you will be able to know what technique to use when analyzing it.