Python for Statistical Analysis
AND ITS DIFFERENT PACKAGES
18SE02CE011 : URJA DIYORA
SUBMITTED TO:
DR. JASLEEN KAUR
OUTLINE
• Introduction to Pandas
• Data Wrangling with Pandas
• Plotting and Visualization
• NumPy Basics: Arrays and Vectorized Computation
• Statistical Data Modeling
• Data Loading, Storage, and File Formats
• Packages For Statistical Analysis
Introduction to Pandas
• Importing data
• Series and DataFrame objects
• Indexing, data selection and subsetting
• Hierarchical indexing
• Reading and writing files
• Sorting and ranking
• Missing data
• Data summarization
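
The topics above can be sketched in a few lines. This is only a minimal illustration, assuming a recent pandas and NumPy; the data and the column names (`city`, `temp_c`) are hypothetical and are not taken from the slides.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame(
    {
        "city": ["Pune", "Delhi", "Surat", "Delhi"],
        "temp_c": [31.0, np.nan, 29.5, 35.0],
    }
)

# Series: a single labelled column taken from the DataFrame.
temps = df["temp_c"]

# Data selection and subsetting with a boolean mask.
delhi = df[df["city"] == "Delhi"]

# Missing data: count the gaps, then fill them with the column mean.
n_missing = int(temps.isna().sum())
filled = temps.fillna(temps.mean())

# Sorting: highest temperature first (missing values sort last by default).
ranked = df.sort_values("temp_c", ascending=False)

# Data summarization: count, mean, std, min, quartiles, max.
summary = filled.describe()
print(summary)
```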
Data Wrangling with Pandas
• Date/time types
• Merging and joining DataFrame objects
• Concatenation
• Reshaping DataFrame objects
• Pivoting
• Data transformation
• Permutation and sampling
• Data aggregation and GroupBy operation
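
As a rough sketch of several of these operations, the example below merges, aggregates, pivots, and concatenates two small tables. The tables and their column names (`store_id`, `region`, `amount`) are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical sales records and a lookup table of stores.
sales = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-10"]
        ),
        "store_id": [1, 2, 1, 2],
        "amount": [100.0, 150.0, 120.0, 80.0],
    }
)
stores = pd.DataFrame({"store_id": [1, 2], "region": ["North", "South"]})

# Merging and joining: attach the region to every sale.
merged = sales.merge(stores, on="store_id", how="left")

# Date/time types: derive the month from the datetime column.
merged["month"] = merged["date"].dt.month

# Data aggregation with GroupBy: total amount per region.
totals = merged.groupby("region")["amount"].sum()

# Pivoting: regions as rows, months as columns.
pivot = merged.pivot_table(
    index="region", columns="month", values="amount", aggfunc="sum"
)

# Concatenation: stack two frames on top of each other.
combined = pd.concat([sales, sales], ignore_index=True)

# Permutation and sampling: draw two rows at random (seeded for repeatability).
sampled = sales.sample(n=2, random_state=0)
```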
Plotting and Visualization
• Plotting in Pandas vs Matplotlib
• Bar plots
• Histograms
• Box plots
• Grouped plots
• Scatterplots
• Trellis plots
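
To make the pandas-versus-Matplotlib distinction concrete, the sketch below draws a histogram and a bar plot through the pandas plotting wrappers and a scatterplot through Matplotlib directly. The data are randomly generated and the column names are hypothetical; the `Agg` backend is chosen only so that the example runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: two groups of 50 random points each.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "group": ["a"] * 50 + ["b"] * 50,
        "x": rng.normal(size=100),
        "y": rng.normal(size=100),
    }
)

fig, axes = plt.subplots(1, 3, figsize=(12, 3))

# pandas wrapper: histogram of one column.
df["x"].plot.hist(bins=15, ax=axes[0], title="Histogram")

# pandas wrapper: bar plot of the per-group means.
df.groupby("group")["x"].mean().plot.bar(ax=axes[1], title="Mean of x by group")

# Direct Matplotlib call: scatterplot of x against y.
axes[2].scatter(df["x"], df["y"])
axes[2].set_title("Scatterplot")

fig.tight_layout()
```

Calling `fig.savefig(...)` with a file name of your choice would write the figure to disk.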
Statistical Data Modeling
• Statistical modeling
• Fitting data to probability distributions
• Fitting regression models
• Model selection
• Bootstrapping
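
Three of these ideas can be illustrated with NumPy and `scipy.stats`. This is a sketch on synthetic data, not a recommended modelling workflow; the true parameters (mean 5, standard deviation 2, slope 3, intercept 1) are assumptions chosen so the estimates can be checked by eye.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Fitting data to a probability distribution:
# maximum-likelihood estimates of a normal distribution's parameters.
sample = rng.normal(loc=5.0, scale=2.0, size=500)
mu_hat, sigma_hat = stats.norm.fit(sample)

# Fitting a regression model: simple least-squares line y = slope * x + intercept.
x = np.linspace(0.0, 10.0, 50)
y = 3.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)
fit = stats.linregress(x, y)

# Bootstrapping: resample with replacement to approximate a
# 95% percentile interval for the sample mean.
boot_means = np.array(
    [
        rng.choice(sample, size=sample.size, replace=True).mean()
        for _ in range(1000)
    ]
)
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])

print(f"mu={mu_hat:.2f}, sigma={sigma_hat:.2f}, slope={fit.slope:.2f}")
print(f"95% bootstrap interval for the mean: [{ci_low:.2f}, {ci_high:.2f}]")
```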
Data Loading, Storage, and File Formats
• Indexing: Can treat one or more columns as the index of the returned
DataFrame, and can take column names from the file, from the user, or not
at all.
• Type inference and data conversion: Includes user-defined value
conversions and a custom list of missing-value markers.
• Datetime parsing: Includes the ability to combine date and time
information spread over multiple columns into a single column in the
result.
• Iterating: Support for iterating over chunks of very large files.
• Unclean data issues: Skipping rows or a footer, comments, or other
minor things such as numeric data with thousands separated by commas.
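
Most of these options correspond to keyword arguments of `pandas.read_csv`. The sketch below reads a small in-memory file; the file contents and column names are hypothetical. It assumes the Python parsing engine, which `skipfooter` requires, and it combines the date and time columns after loading rather than during parsing.

```python
import io

import pandas as pd

# Hypothetical messy file: a comment line, quoted numbers with thousands
# separators, a custom missing-value marker, and a footer line.
raw = (
    "# exported by a hypothetical tool\n"
    "id,date,time,amount,status\n"
    '1,2024-01-05,09:30,"1,250",ok\n'
    '2,2024-01-06,14:00,"2,000",missing\n'
    "3,2024-01-07,08:15,300,ok\n"
    "total rows: 3\n"
)

df = pd.read_csv(
    io.StringIO(raw),
    comment="#",            # unclean data: ignore comment lines
    skipfooter=1,           # unclean data: drop the footer line
    engine="python",        # skipfooter is only supported by this engine
    index_col="id",         # indexing: use a column as the index
    thousands=",",          # unclean data: parse "1,250" as 1250
    na_values=["missing"],  # custom missing-value marker
)

# Datetime parsing: combine the separate date and time columns.
df["timestamp"] = pd.to_datetime(df["date"] + " " + df["time"])

# Iterating: process a file in chunks of two rows at a time.
clean = "id,amount\n1,10\n2,20\n3,30\n4,40\n5,50\n"
chunk_totals = [
    int(chunk["amount"].sum())
    for chunk in pd.read_csv(io.StringIO(clean), chunksize=2)
]
```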
Packages For Statistical Analysis
• pandas >= 0.11.1 and its dependencies
• NumPy >= 1.6.1
• matplotlib >= 1.0.0
• pytz
• IPython >= 0.12
• pyzmq
• Tornado
• Optional: statsmodels, xlrd and openpyxl
NumPy Basics: Arrays and Vectorized Computation
• Fast vectorized array operations for data munging and cleaning,
subsetting and filtering, transformation, and any other kinds of
computations
• Common array algorithms like sorting, unique, and set operations
• Efficient descriptive statistics and aggregating/summarizing data
• Data alignment and relational data manipulations for merging and
joining together heterogeneous data sets
• Expressing conditional logic as array expressions instead of loops with
if-elif-else branches
• Group-wise data manipulations (aggregation, transformation, function
application).
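
A few of the array-level points above can be shown with a short NumPy sketch. The array values are arbitrary and chosen only for illustration.

```python
import numpy as np

values = np.array([3.5, -1.0, 7.25, 0.0, -4.5, 7.25])

# Vectorized transformation: no explicit Python loop.
scaled = values * 2.0 + 1.0

# Subsetting and filtering with a boolean mask.
positives = values[values > 0]

# Conditional logic as an array expression (replaces an if-else loop).
clipped = np.where(values < 0, 0.0, values)

# Several conditions at once (replaces an if-elif-else chain).
signs = np.select([values < 0, values == 0], [-1, 0], default=1)

# Common array algorithms: sorting, unique, and set operations.
ordered = np.sort(values)
distinct = np.unique(values)
common = np.intersect1d(values, np.array([0.0, 7.25, 9.0]))

# Descriptive statistics.
mean, std = values.mean(), values.std()
```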
SciPy
SciPy is a collection of packages addressing a number of different standard
problem domains in scientific computing.
• scipy.integrate: numerical integration routines and differential equation
solvers
• scipy.linalg: linear algebra routines and matrix decompositions extending
beyond those provided in numpy.linalg
• scipy.optimize: function optimizers (minimizers) and root finding
algorithms
• scipy.signal: signal processing tools
• scipy.sparse: sparse matrices and sparse linear system solvers
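
The sketch below makes one small call into each of the subpackages listed above. The inputs are toy values chosen for illustration, and each call is just one of many routines the subpackage provides.

```python
import numpy as np
from scipy import integrate, linalg, optimize, signal, sparse
from scipy.sparse.linalg import spsolve

# scipy.integrate: integrate sin(x) from 0 to pi (the exact value is 2).
area, _ = integrate.quad(np.sin, 0.0, np.pi)

# scipy.linalg: LU decomposition, with A == P @ L @ U.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
P, L, U = linalg.lu(A)

# scipy.optimize: minimize (x - 2)^2 and find a root of x^2 - 2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)
root = optimize.brentq(lambda x: x**2 - 2.0, 0.0, 2.0)

# scipy.signal: a median filter removes the single spike.
smoothed = signal.medfilt(np.array([1.0, 9.0, 1.0, 1.0, 1.0]), kernel_size=3)

# scipy.sparse: solve the linear system A x = b with A stored as a sparse matrix.
S = sparse.csr_matrix(A)
x = spsolve(S, np.array([1.0, 2.0]))
```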
