GOVERNMENT ENGINEERING COLLEGE
MAJALI, KARWAR-581345
Department of Computer Science
Engineering
Internship Presentation
On
DATA SCIENCE
Presented by
ADARSH MASEKAR
2GP19CS003
COMPANY PROFILE
• Technologics Global Pvt Ltd is a company based in
Bangalore, India, that provides training and consulting
services in the field of engineering and technology.
• Its courses are designed for students, professionals, and
organizations looking to enhance their skills and knowledge in
these areas.
• In addition to training services, Technologics Global Pvt Ltd
also offers consulting services in areas such as product
design, development, and testing, as well as project
management and implementation.
• To provide high-quality training and education in the field of
engineering and technology, including emerging areas like
Data Science, Artificial Intelligence, and Robotics.
• To help individuals and organizations improve their skills and
knowledge in engineering and technology and stay up to date
with the latest trends and developments.
• To offer customized training and consulting solutions that meet
the specific needs of clients and ensure that their training
objectives are met.
INTRODUCTION TO DATA SCIENCE
• Data science is the in-depth study of large
amounts of data, with the aim of extracting
meaningful insights from raw data.
• Data science is about finding patterns in
data through analysis and using them to make
predictions.
FIELDS IN DATA SCIENCE
Fig 1.1 Fields in data science
By using data science, companies are able to make:
• Better decisions (should we choose A or B?)
• Predictive analyses (what will happen next?)
• Pattern discoveries (finding patterns, or hidden
information, in the data)
Where is Data Science Needed?
Data science is used in many industries
today,
e.g. banking, consultancy, healthcare, and
manufacturing.
Examples of where Data Science is needed
• For route planning: to discover the
best routes to ship
INTRODUCTION TO ANACONDA AND
JUPYTER NOTEBOOK
• Anaconda is a free, open-source platform for writing and
running Python code. It was developed by Continuum Analytics
(continuum.io), a company specializing in Python tooling that has
since been renamed Anaconda, Inc.
Anaconda is a widely used way to
learn and use Python for scientific computing, data
science, and machine learning. It is reported to be used by over thirty
million people worldwide and is available for Windows,
macOS, and Linux.
• Anaconda is a distribution of the Python and R
programming languages for scientific computing and data
science that aims to simplify package management and
deployment. The distribution includes data-science
packages suitable for Windows, Linux, and macOS.
• Anaconda is a convenient platform for
beginners who want to learn Python. It is easy to
install, and you can get started quickly with the
included Jupyter Notebook. Anaconda also
has many features and libraries that you can use
PYTHON BASICS
Basic Operators: arithmetic operators, relational operators,
assignment operators, logical operators, bitwise operators,
identity operators, and so on.
Decision Making: if statement, if-else statement, elif
statement, nested if statement.
Mathematical Functions:
abs(x), max(x1, x2, …), min(x1, x2, …), pow(x, y), and, from the math
module, exp(x), log(x), floor(x), and so on. cmp(x, y) exists only in
Python 2; it was removed in Python 3.
Loops: while loops, using an else clause with a while
loop, for loops, using an else clause with a for loop, and
nested loops. Python has no do-while statement; the usual
substitute is a while True loop that ends with a break.
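The functions and loop forms listed above can be tried in a few lines. The following is a minimal sketch for Python 3, where cmp() and do-while are not available; the helper names cmp, first_even, and count_halvings are hypothetical and were chosen only for this illustration.

```python
import math


def cmp(a, b):
    """Three-way comparison: -1, 0, or 1 (stand-in for Python 2's built-in cmp)."""
    return (a > b) - (a < b)


# Built-in functions and math-module functions from the list above.
values = {
    "abs": abs(-7),
    "max": max(2, 9, 4),
    "min": min(2, 9, 4),
    "pow": pow(2, 5),
    "floor": math.floor(3.7),
    "exp": math.exp(0),
    "log": math.log(1),
}


def first_even(numbers):
    """for ... else: the else branch runs only if the loop never hit break."""
    for n in numbers:
        if n % 2 == 0:
            found = n
            break
    else:
        found = None
    return found


def count_halvings(n):
    """do-while emulation: the body always runs once before the exit test."""
    steps = 0
    while True:
        n //= 2
        steps += 1
        if n <= 1:
            break
    return steps
```

For example, first_even([1, 3, 4, 5]) returns 4, while first_even([1, 3]) falls through to the else branch and returns None.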
PYTHON DATA STRUCTURES
• Python List: A list is an ordered, mutable collection of items that can be of
different data types, such as integers, floats, strings, and even other
lists. Lists are a fundamental data structure in Python and are used to
store and manipulate data.
• Python Tuple: A tuple is similar to a list, but it is an immutable
data structure, which means its contents cannot be changed after
creation. Tuples are often used to store related pieces of information that
belong together.
• Python Dictionary: A dictionary is a collection of key-value
pairs. It is a fundamental data structure used to store and retrieve data in
an associative manner. The keys are unique identifiers that map to their
corresponding values.
• Python Sets: A set is an unordered collection of unique items. It
is a fundamental data structure used to store and manipulate data in a
way that eliminates duplicates. To create a set, list the items, separated by
commas, inside curly braces { } or pass an iterable to the set() function.
Empty braces {} create a dictionary, so an empty set is written set().
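The four structures described above can be compared side by side. This is a small sketch; all variable names and values are made up for illustration.

```python
# List: ordered and mutable, may mix types.
scores = [72, 88, 95]
scores.append(60)

# Tuple: ordered but immutable; item assignment raises TypeError.
point = (3, 4)
try:
    point[0] = 10
    tuple_is_immutable = False
except TypeError:
    tuple_is_immutable = True

# Dictionary: unique keys mapped to values.
capitals = {"India": "New Delhi", "Japan": "Tokyo"}
capitals["France"] = "Paris"

# Set: duplicates are dropped automatically.
unique_tags = {"python", "numpy", "python"}

# An empty set needs set(); {} is an empty dictionary.
empty_set = set()
empty_dict = {}
```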
NUMPY
• NumPy is a Python package for working with
numerical arrays. Arrays are numerical tables or lists with one
or more dimensions.
• Examples are a one-dimensional array of ten integers, a two-dimensional
array of four rows and five columns, or a three-dimensional
array of two layers, three rows, and four columns.
• NumPy allows you to perform mathematical and logical
operations on arrays. You can create arrays of various sizes,
index and modify their elements, and apply element-wise operations such
as addition, subtraction, multiplication, and division.
• NumPy makes it simple to construct, modify, and compute with
arrays. You can use NumPy for linear algebra, Fourier
transforms, matrix operations, and other tasks.
• NumPy is fast and efficient because its core routines are
optimised code written in C or Fortran. Many other
Python libraries for scientific computing and data analysis,
such as pandas, SciPy, Matplotlib, scikit-learn, and
scikit-image, are built on NumPy.
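The array shapes and operations mentioned above can be sketched as follows. The variable names and sample values are assumptions made for this example only.

```python
import numpy as np

# The three example shapes from the text.
one_d = np.arange(10)          # ten integers, 0..9
two_d = np.zeros((4, 5))       # four rows, five columns
three_d = np.ones((2, 3, 4))   # two layers, three rows, four columns

# Element-wise arithmetic on arrays with an explicit data type.
x = np.array([1, 2, 3, 4], dtype=np.float64)
y = np.array([10, 20, 30, 40], dtype=np.float64)
total = x + y
ratio = y / x

# Element-wise logical operation: combine two boolean masks.
mask = (x > 1) & (y < 40)

# Matrix operation: the @ operator performs matrix multiplication.
m = np.array([[1, 2], [3, 4]])
product = m @ m
```

Note that * on arrays multiplies element by element; matrix multiplication uses @ instead.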
FEATURES OF NUMPY
Fig 1.2 Create variable with specified data type
Fig 1.3 Time complexity
Fig 1.4 Space complexity
PANDAS
• Pandas is an open-source Python library that provides
high-performance data manipulation and analysis tools built on its
data structures.
• The name Pandas is derived from "panel data", an
econometrics term for multidimensional data sets.
• Pandas lets us import, modify, and analyse data from a variety
of sources, such as CSV files, Excel spreadsheets, SQL
databases, and others.
• Pandas has two major data structures: Series (1-dimensional)
and DataFrame (2-dimensional). A Series holds a single labelled column
of values, and a DataFrame holds data in labelled rows and columns.
• Pandas integrates well with NumPy and other Python tools for
scientific and data-analysis workloads.
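A short sketch of the two data structures and of importing CSV data follows. The CSV text is made-up sample data, read from an in-memory string rather than a real file so that the example is self-contained.

```python
import io

import pandas as pd

# Series: one-dimensional labelled data (hypothetical values).
visitors = pd.Series([250, 310, 190], index=["Mon", "Tue", "Wed"], name="visitors")

# DataFrame: two-dimensional data, here parsed from CSV text.
csv_text = "team,runs\nA,150\nB,172\nA,181\nB,160\n"
df = pd.read_csv(io.StringIO(csv_text))

# A simple analysis step: average runs per team.
mean_by_team = df.groupby("team")["runs"].mean()
```

With a file on disk, pd.read_csv would take the file path instead of the StringIO object.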
Fig 1.5 How to install pandas
Why Use Pandas
• Pandas allows us to analyze large data sets and draw conclusions
based on statistical theory.
• Pandas can clean messy data sets and make them readable
and relevant.
• Relevant data is very important in data science.
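As an illustration of cleaning a messy data set, the sketch below removes rows with a missing key field, trims stray whitespace, fills a missing number with the column mean, and drops duplicate rows. The table and its column names are invented for this example, and filling with the mean is only one of several possible strategies.

```python
import numpy as np
import pandas as pd

# Hypothetical messy input: stray spaces, a missing city, a missing reading.
raw = pd.DataFrame(
    {
        "city": [" Karwar", "Bangalore ", "Karwar", None],
        "temp_c": [31.0, np.nan, 31.0, 28.5],
    }
)

clean = raw.dropna(subset=["city"]).copy()       # drop rows without a city
clean["city"] = clean["city"].str.strip()        # trim whitespace
clean["temp_c"] = clean["temp_c"].fillna(clean["temp_c"].mean())  # fill gaps
clean = clean.drop_duplicates().reset_index(drop=True)            # de-duplicate
```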
Pandas Prerequisites
• You should have a basic understanding of computer
programming terminology.
• A basic understanding of any programming language
is a plus.
• Pandas relies on much of NumPy's functionality, so it is
suggested that you learn NumPy first.
Pandas Data Structure
• Series
• DataFrame
MATPLOTLIB
• Matplotlib is a Python library used for data visualization and
for plotting 2D and 3D graphs and charts.
• Matplotlib allows users to create a wide range of
visualizations, including line plots, scatter plots, histograms,
bar charts, error bars, and more.
• Matplotlib is built on top of the NumPy library, which allows it
to work with arrays and matrices efficiently.
• It is widely used in data analysis, scientific research, and
visualization applications.
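The three basic plot types (line, bar, and scatter) can be drawn from the same NumPy data. This is a minimal sketch with made-up values; the "Agg" backend line is an assumption that the code runs as a script without a display, and it can be omitted inside a Jupyter Notebook.

```python
import matplotlib

matplotlib.use("Agg")  # off-screen rendering; not needed in a notebook

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 6)
y = x ** 2

# One figure with three side-by-side axes.
fig, (ax_line, ax_bar, ax_scatter) = plt.subplots(1, 3, figsize=(9, 3))

ax_line.plot(x, y, marker="o")
ax_line.set_title("Line plot")

ax_bar.bar(x, y)
ax_bar.set_title("Bar plot")

ax_scatter.scatter(x, y)
ax_scatter.set_title("Scatter plot")

fig.tight_layout()
```

Calling fig.savefig with a file name would write the figure to disk; plt.show() displays it when an interactive backend is in use.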
BASIC PLOTS IN MATPLOTLIB
Fig 1.6 Line plot Fig 1.7 Bar plot Fig 1.8 Scatter plot
SEABORN LIBRARY
• Seaborn is a Python library for statistical data
visualization. It provides
many color palettes and attractive default styles that make
statistical plots in Python easier to create and more appealing.
Categories of Plots in Python's seaborn library
• Distribution Plots
• Relational Plots
• Regression Plots
• Categorical Plots
• Multi-plot grids
• Matrix Plots
SEABORN TYPES
Fig 1.9 scatter plot Fig 1.10 line plot
Fig 1.11 matrix plot Fig 1.12 histogram
OPENCV
• OpenCV is an open-source library for computer vision, machine learning, and
image processing. It is written in C++ and provides Python bindings.
• Computer vision can be used to detect faces, recognise objects, track
motion, and more. Image processing is the manipulation of photos and
videos.
• Image processing can be used, for example, to resize, crop, rotate,
blur, sharpen, and enhance photos and videos.
• OpenCV also includes object detection and recognition features.
For example, it can detect faces in photos or videos and
track them as they move. It can also be used to detect
patterns, shapes, and objects in photos and videos.
• OpenCV is a cross-platform library with which we can
develop real-time computer vision applications. It mainly
focuses on image processing and on video capture and analysis,
including features like face detection and object detection.
Computer Vision
• Computer vision can be defined as a discipline that explains
how to reconstruct, interpret, and understand a 3D scene
from its 2D images. It deals with modeling and replicating
human vision using computer software and hardware.
• Computer vision overlaps significantly with the following
fields:
• Image Processing: focuses on image manipulation.
• Pattern Recognition: explains various techniques to
classify patterns.
• Photogrammetry: is concerned with obtaining accurate
measurements from images.
Computer Vision Vs Image Processing
• Image processing
Image processing deals with image-to-image
transformation. The input and output of image
processing are both images.
• Computer vision
Computer vision is the construction of explicit,
meaningful descriptions of physical objects from
their images. The output of computer vision is a
description or an interpretation of structures in 3D
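The distinction above can be illustrated on a toy image. The sketch below deliberately uses plain NumPy arrays rather than OpenCV calls, and the 8x8 synthetic image is an assumption made for this example: the image-processing steps return new images, while the toy "vision" step returns a description (a bounding box) of the bright object.

```python
import numpy as np

# Synthetic 8x8 grayscale image: black background, bright 3x3 square.
image = np.zeros((8, 8), dtype=np.uint8)
image[2:5, 3:6] = 255

# Image processing: image in, image out.
flipped = image[:, ::-1]      # horizontal flip
cropped = image[1:6, 2:7]     # crop a 5x5 region

# Toy computer vision: image in, description out.
rows, cols = np.nonzero(image > 128)
bounding_box = (
    int(rows.min()),
    int(cols.min()),
    int(rows.max()),
    int(cols.max()),
)  # (top, left, bottom, right) of the bright square
```

OpenCV represents images in Python as NumPy arrays of this kind, so the same slicing works on images it loads.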
FACE MASK DETECTION USING
OPENCV
Fig 1.13 Overview code of face mask detection
Fig 1.14 Face mask detection demo
Features of OpenCV Library
• Read and write images
• Capture and save videos
• Process images (filter, transform)
• Perform feature detection
• Detect specific objects, such as faces, eyes, and cars, in
videos or images
• Analyze video, i.e., estimate the motion in it,
subtract the background, and track objects in it
TKINTER
Tkinter is a standard Python library for creating graphical user
interfaces (GUIs) for desktop applications. It provides a set of
built-in GUI widgets (such as buttons, text boxes, and menus)
and functions to create windows, frames, and other GUI
components. Tkinter is based on the Tk GUI toolkit, which was
originally developed for the Tcl programming language. It is
cross-platform and can be used on Windows, macOS, and
Linux. Tkinter is widely used for creating GUI applications of simple to
medium complexity, and it is popular among
Python developers because of its simplicity and ease of use.
MARKSHEET GUI
Fig 1.15 GUI of mark sheet
PYCHARM GUI
Fig 1.16 GUI of Pycharm using tkinter library
APPLICATIONS OF DATA SCIENCE
• Making better business decisions.
• Measuring performance.
• Providing information for internal
finances.
• Developing better products.
• Increasing efficiency.
• Mitigating risk and fraud.
• Predicting outcomes and trends.
• Improving customer experiences.
ADVANTAGES OF DATA SCIENCE
1. Autocomplete
Autocomplete is a common application of data science:
the user types a few letters or words, and the system
suggests how to complete the rest of the line.
2. In Search Engines
One of the most useful applications of data science is in search
engines. When we want to search for something
on the internet, we mostly use search engines such as Google
and Yahoo. Data science is used to return
search results faster.
3. In Finance
Data science plays a key role in the financial industry,
which constantly faces the problems of fraud and the risk of
losses.
4. In E-Commerce
E-commerce websites such as Amazon and Flipkart
use data science to create a better user
experience through personalized recommendations.
5. Image Recognition
Data science is also used in image
recognition. For example, when we upload a
photo with a friend to Facebook, Facebook
suggests whom to tag in the picture.
This is done with the help of machine learning
and data science.
IRIS FLOWER CLASSIFICATION
Fig 1.17 Overview of code
Fig 1.18 Accuracy using different algorithms
Fig 1.19 Data sets used for classification
IPL DATA ANALYSIS
Fig 1.20 Importing matches.csv file
Fig 1.21 List of winner season wise
Fig 1.22 Toss winning prediction
Fig 1.23 matches.csv dataset
Fig 1.24 deliveries.csv dataset
CONCLUSION
Data science is a rapidly growing field
that has become an essential part of many
industries. It provides a powerful set of tools and
techniques that enable businesses to gain insights
and make data-driven decisions.
The field of data science requires a
strong background in mathematics, statistics, and
computer science, as well as the ability to
communicate complex technical concepts to both
technical and non-technical audiences. With the
continued growth of data and the increasing
demand for data-driven insights, data science is
expected to remain a critical area of focus for
businesses in the years to come.
THANK YOU

More Related Content

Similar to Adarsh_Masekar(2GP19CS003).pptx

MACHINE LEARNING WITH PYTHON PPT.pptx
MACHINE LEARNING WITH PYTHON PPT.pptxMACHINE LEARNING WITH PYTHON PPT.pptx
MACHINE LEARNING WITH PYTHON PPT.pptxSkillUp Online
 
Abhishek Training PPT.pptx
Abhishek Training PPT.pptxAbhishek Training PPT.pptx
Abhishek Training PPT.pptxKashishKashish22
 
Certified Python Business Analyst
Certified Python Business AnalystCertified Python Business Analyst
Certified Python Business AnalystAnkitSingh2134
 
Python libraries for data science
Python libraries for data sciencePython libraries for data science
Python libraries for data sciencenilashri2
 
overview of python programming language.pptx
overview of python programming language.pptxoverview of python programming language.pptx
overview of python programming language.pptxdmsidharth
 
Data science presentation
Data science presentationData science presentation
Data science presentationMSDEVMTL
 
An Overview of Python for Data Analytics
An Overview of Python for Data AnalyticsAn Overview of Python for Data Analytics
An Overview of Python for Data AnalyticsIRJET Journal
 
Machine Learning Techniques in Python Dissertation - Phdassistance
Machine Learning Techniques in Python Dissertation - PhdassistanceMachine Learning Techniques in Python Dissertation - Phdassistance
Machine Learning Techniques in Python Dissertation - PhdassistancePhD Assistance
 
Data Science Tools and Technologies: A Comprehensive Overview
Data Science Tools and Technologies: A Comprehensive OverviewData Science Tools and Technologies: A Comprehensive Overview
Data Science Tools and Technologies: A Comprehensive Overviewsaniakhan8105
 
Breast Cancer Prediction.pdf
Breast Cancer Prediction.pdfBreast Cancer Prediction.pdf
Breast Cancer Prediction.pdfSouravNaga2
 
PPT5: Neuron Introduction
PPT5: Neuron IntroductionPPT5: Neuron Introduction
PPT5: Neuron Introductionakira-ai
 
Top Artificial Intelligence Tools & Frameworks in 2023.pdf
Top Artificial Intelligence Tools & Frameworks in 2023.pdfTop Artificial Intelligence Tools & Frameworks in 2023.pdf
Top Artificial Intelligence Tools & Frameworks in 2023.pdfYamuna5
 
Building Your Dream Machine Learning Team with Python Expertise
Building Your Dream Machine Learning Team with Python ExpertiseBuilding Your Dream Machine Learning Team with Python Expertise
Building Your Dream Machine Learning Team with Python Expertiseriyak40
 

Similar to Adarsh_Masekar(2GP19CS003).pptx (20)

MACHINE LEARNING WITH PYTHON PPT.pptx
MACHINE LEARNING WITH PYTHON PPT.pptxMACHINE LEARNING WITH PYTHON PPT.pptx
MACHINE LEARNING WITH PYTHON PPT.pptx
 
Abhishek Training PPT.pptx
Abhishek Training PPT.pptxAbhishek Training PPT.pptx
Abhishek Training PPT.pptx
 
DataScience_RoadMap_2023.pdf
DataScience_RoadMap_2023.pdfDataScience_RoadMap_2023.pdf
DataScience_RoadMap_2023.pdf
 
Certified Python Business Analyst
Certified Python Business AnalystCertified Python Business Analyst
Certified Python Business Analyst
 
Data science and Machine learning Booklet
Data science and Machine learning BookletData science and Machine learning Booklet
Data science and Machine learning Booklet
 
Python libraries for data science
Python libraries for data sciencePython libraries for data science
Python libraries for data science
 
overview of python programming language.pptx
overview of python programming language.pptxoverview of python programming language.pptx
overview of python programming language.pptx
 
Data science presentation
Data science presentationData science presentation
Data science presentation
 
An Overview of Python for Data Analytics
An Overview of Python for Data AnalyticsAn Overview of Python for Data Analytics
An Overview of Python for Data Analytics
 
Machine Learning Techniques in Python Dissertation - Phdassistance
Machine Learning Techniques in Python Dissertation - PhdassistanceMachine Learning Techniques in Python Dissertation - Phdassistance
Machine Learning Techniques in Python Dissertation - Phdassistance
 
Data Science Tools and Technologies: A Comprehensive Overview
Data Science Tools and Technologies: A Comprehensive OverviewData Science Tools and Technologies: A Comprehensive Overview
Data Science Tools and Technologies: A Comprehensive Overview
 
Session 2
Session 2Session 2
Session 2
 
Python and data analytics
Python and data analyticsPython and data analytics
Python and data analytics
 
Python ml
Python mlPython ml
Python ml
 
Python libraries
Python librariesPython libraries
Python libraries
 
Breast Cancer Prediction.pdf
Breast Cancer Prediction.pdfBreast Cancer Prediction.pdf
Breast Cancer Prediction.pdf
 
PPT5: Neuron Introduction
PPT5: Neuron IntroductionPPT5: Neuron Introduction
PPT5: Neuron Introduction
 
Python Programming
Python ProgrammingPython Programming
Python Programming
 
Top Artificial Intelligence Tools & Frameworks in 2023.pdf
Top Artificial Intelligence Tools & Frameworks in 2023.pdfTop Artificial Intelligence Tools & Frameworks in 2023.pdf
Top Artificial Intelligence Tools & Frameworks in 2023.pdf
 
Building Your Dream Machine Learning Team with Python Expertise
Building Your Dream Machine Learning Team with Python ExpertiseBuilding Your Dream Machine Learning Team with Python Expertise
Building Your Dream Machine Learning Team with Python Expertise
 

Recently uploaded

standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhArpitMalhotra16
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIAlejandraGmez176757
 
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPsWebinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPsCEPTES Software Inc
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单ewymefz
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单nscud
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsalex933524
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单vcaxypu
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单enxupq
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单ewymefz
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单ewymefz
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJames Polillo
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单nscud
 
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflictSupply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflictJack Cole
 
Using PDB Relocation to Move a Single PDB to Another Existing CDB
Using PDB Relocation to Move a Single PDB to Another Existing CDBUsing PDB Relocation to Move a Single PDB to Another Existing CDB
Using PDB Relocation to Move a Single PDB to Another Existing CDBAlireza Kamrani
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .NABLAS株式会社
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单vcaxypu
 
Uber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis ReportUber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis ReportSatyamNeelmani2
 

Recently uploaded (20)

standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
 
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPsWebinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
Slip-and-fall Injuries: Top Workers' Comp Claims
Slip-and-fall Injuries: Top Workers' Comp ClaimsSlip-and-fall Injuries: Top Workers' Comp Claims
Slip-and-fall Injuries: Top Workers' Comp Claims
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization Sample
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflictSupply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
 
Using PDB Relocation to Move a Single PDB to Another Existing CDB
Using PDB Relocation to Move a Single PDB to Another Existing CDBUsing PDB Relocation to Move a Single PDB to Another Existing CDB
Using PDB Relocation to Move a Single PDB to Another Existing CDB
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
 
Uber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis ReportUber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis Report
 

Adarsh_Masekar(2GP19CS003).pptx

  • 1. GOVERNMENT ENGINEERING COLLEGE MAJALI, KARWAR-581345 Department of Computer Science Engineering Internship Presentation On DATA SCIENCE Presented by ADARSH MASEKAR 2GP19CS003
  • 2. COMPANY PROFILE  Technologics Global Pvt Ltd is a company based in Bangalore, India that provides training and consulting services in the field of engineering and technology.  Their courses are designed for students, professionals, and organizations looking to enhance their skills and knowledge in these areas.  In addition to training services, Technologics Global Pvt Ltd also offers consulting services in areas such as product design, development, and testing, as well as project management and implementation.
  • 3.  To provide high-quality training and education in the field of engineering and technology, including emerging areas like Data Science, Artificial Intelligence and Robotics.  To help individuals and organizations improve their skills and knowledge in engineering and technology and stay up-to-date with the latest trends and developments.  To offer customized training and consulting solutions to meet the specific needs of clients and ensure that their training objectives are met. COMPANY PROFILE
  • 4. INTRODUCTION DATA SCIENCE  Data science is a deep study of the massive amount of data, which involves extracting meaningful insights from raw data.  Data Science is about finding patterns in data, through analysis, and make future predictions.
  • 5. FIELDS IN DATA SCIENCE Fig 1.1 Fields in data science
  • 6. By using Data Science, companies are able to make  Better decisions(should we choose A or B)  Predictive analysis (what will happen next?)  Pattern discoveries (find pattern, or maybe hidden information in the data) Where is Data Science Needed? Data Science is used in many industries in the world today, e.g. banking, consultancy, healthcare, and manufacturing. Examples of where Data Science is needed  For route planning: To discover the best routes to ship
  • 7.  Anaconda Python is a free, open-source platform that allows you to write and execute code in the programming language Python. It is by continuum.io, a company that specializes in Python development. The Anaconda platform is the most popular way to learn and use Python for scientific computing, data science, and machine learning. It is used by over thirty million people worldwide and is available for Windows, macOS, and Linux. INTRODUCTION TO ANACONDA AND JUPYTER NOTEBOOK
  • 8.  Anaconda is a distribution of the Python and R programming languages for scientific computing, that aims to simplify package management and deployment. The distribution includes data-science packages suitable for Windows, Linux, and macOS.  Anaconda is an open-source distribution of the Python and R programming languages for data science that aims to simplify package management and deployment.  Anaconda Python is the perfect platform for beginners who want to learn Python. It's easy to install, and you can get started quickly with the included Jupyter Notebook. Plus, Anaconda Python has many features and libraries that you can use
  • 9. PYTHON BASICS Basic Operators: Arithmetic operators, relational operators, Assignment operators, logical operators ,Bitwise operators , Identity operators and so on. Decision Making: IF statement, IF-ELSE statement, ELIF statement, Nested IF statement. Mathematical Functions: abs(x),cmp(x,y),exp(x),log(x),floor(x),max(x1,x2…),min(x1,x2…),p ow(x,y) and so on. Loops: While Loop Statements ,Using Else Statement with While Loop, For Loop Statements, Using Else Statement with For Loop, Nested Loops , Do While Loop.
  • 10. PYTHON DATA STRUCTURES  Python List: In Python, a list is a collection of items that can be of different data types, such as integers, floats, strings, and even other lists. Lists are a fundamental data structure in Python and are used to store and manipulate data.  Python Tuple: Python tuple is similar to a list, but it is an immutable data structure, which means its contents cannot be changed after creation. Tuples are often used to store related pieces of information that belong together.  Python Dictionary: In Python, a dictionary is a collection of key-value pairs. It is a fundamental data structure used to store and retrieve data in an associative manner. The keys are unique identifiers that map to their corresponding values.  Python Sets: In Python set is an unordered collection of unique items. It is a fundamental data structure used to store and manipulate data in a way that eliminates duplicates. To create a set in Python, you can use curly braces { } or the set() function and separate the items with commas.
  • 11. NUMPY  Numpy is a Python package that allows you to interact with numerical arrays. Arrays are numerical tables or lists with one or more dimensions.  A one-dimensional array of ten integers, a two-dimensional array of four rows and five columns, or a three-dimensional array of two layers, three rows, and four columns are examples.  NumPy allows you to perform mathematical and logical operations on arrays. You may make arrays of various sizes, manage their elements, and execute operations on them such as addition, subtraction, multiplication, and division.
  • 12. NUMPY  Numpy makes it simple to construct, modify, and calculate arrays. You may use numpy to do linear algebra, Fourier transformations, matrix operations, and other tasks.  Numpy is extremely quick and efficient due to the usage of optimised code written in C or Fortran. Many additional Python libraries that deal with scientific and data analysis activities, such as pandas, scipy, matplotlib, scikit-learn, and scikit-image, are built on Numpy.
  • 13. FEATURES OF NUMPY Fig 1.2 Create variable with specified data type
  • 14. FEATURES OF NUMPY Fig 1.3 Time complexity
  • 15. FEATURES OF NUMPY Fig 1.4 Space complexity
  • 16. PANDAS • Pandas is an open-source Python Library providing high- performance data manipulation and analysis tool using its powerful data structures. • The name Pandas is derived from the word Panel Data – an Econometrics from Multidimensional data. • Pandas lets us import, modify, and analyse data from a variety of sources, such as CSV files, Excel spreadsheets, SQL databases, and others. • Pandas has two major data structures: Series (1-dimensional) and Data Frame (2-dimensional), which enable us to store and manipulate data in rows and columns, respectively. • Pandas integrates nicely with numpy and other Python tools for scientific and data analytical workloads.
  • 17. PANDAS Fig 1.5 How to install pandas
  • 18. Why Use Pandas
      • Pandas allows us to analyze large data sets and draw conclusions based on statistical theory.
      • Pandas can clean messy data sets and make them readable and relevant.
      • Relevant data is very important in data science.
    Pandas Prerequisites
      • You should have a basic understanding of computer programming terminology.
      • A basic understanding of any programming language is a plus.
      • The pandas library uses much of the functionality of NumPy, so it is suggested that you go through NumPy first.
    Pandas Data Structures
      • Series
      • DataFrame
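To make "cleaning messy data" concrete, here is a small sketch that removes a missing value and a duplicate row before summarising. The team names and scores are invented for illustration.

```python
import pandas as pd

# Hypothetical messy data: one missing score and one repeated row.
df = pd.DataFrame({
    "team": ["A", "B", "A", "B", "B"],
    "score": [150.0, None, 180.0, 170.0, 170.0],
})

# Drop rows with missing values, then drop exact duplicate rows.
clean = df.dropna().drop_duplicates()

# Summarise the cleaned data: mean score per team.
mean_by_team = clean.groupby("team")["score"].mean()
```

Dropping rows is only one possible cleaning strategy; filling missing values with `fillna` is a common alternative when rows cannot be discarded.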
  • 19. MATPLOTLIB
      • Matplotlib is a Python library for data visualization, used to plot 2D and 3D graphs and charts.
      • Matplotlib allows users to create a wide range of visualizations, including line plots, scatter plots, histograms, bar charts, error bars, and more.
      • Matplotlib is built on top of the NumPy library, which allows it to work with arrays and matrices efficiently.
      • It is widely used in data analysis, scientific research, and visualization applications.
  • 20. BASIC PLOTS IN MATPLOTLIB Fig 1.6 Line plot Fig 1.7 Bar plot Fig 1.8 Scatter plot
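The original figures are not included in this transcript. As a hedged sketch of the three basic plot types named in the captions, the following draws a line plot, a bar plot, and a scatter plot from made-up data. The non-interactive "Agg" backend is an assumption so the example runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is required
import matplotlib.pyplot as plt

# Made-up sample data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 1, 8, 5]

fig, (ax_line, ax_bar, ax_scatter) = plt.subplots(1, 3, figsize=(9, 3))

ax_line.plot(x, y, marker="o")
ax_line.set_title("Line plot")

ax_bar.bar(["A", "B", "C"], [3, 7, 5])
ax_bar.set_title("Bar plot")

ax_scatter.scatter(x, y)
ax_scatter.set_title("Scatter plot")

fig.tight_layout()

# Render to an in-memory PNG instead of a file on disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
```

In an interactive session, `plt.show()` would display the figure instead of saving it.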
  • 21. SEABORN LIBRARY
      • Seaborn is a Python library for statistical data visualization. It provides many color palettes and attractive default styles that make statistical plots easier to create and better looking.
    Categories of plots in Python's seaborn library
      • Distribution plots
      • Relational plots
      • Regression plots
      • Categorical plots
      • Multi-plot grids
      • Matrix plots
  • 22. SEABORN TYPES Fig 1.9 scatter plot Fig 1.10 line plot
  • 23. SEABORN TYPES Fig 1.11 matrix plot Fig 1.12 histogram
  • 24. OPENCV
      • OpenCV is an open-source library, usable from Python, for computer vision, machine learning, and image processing.
      • Computer vision can be used to detect faces, recognise objects, track motion, and more. Image processing is the study of how to edit and transform images and videos.
      • Image processing can be used, for example, to resize, crop, rotate, blur, sharpen, and enhance images and videos.
      • OpenCV also includes object detection and identification features. For example, it can recognise faces in images or videos and track them as they move. It can also be used to detect patterns, shapes, and objects in images and videos.
  • 25. OpenCV
      • OpenCV is a cross-platform library for developing real-time computer vision applications. It focuses mainly on image processing, video capture, and analysis, including features such as face detection and object detection.
    Computer Vision
      • Computer vision can be defined as a discipline that explains how to reconstruct, interpret, and understand a 3D scene from its 2D images. It deals with modeling and replicating human vision using computer software and hardware.
      • Computer vision overlaps significantly with the following fields:
          • Image Processing: focuses on image manipulation.
          • Pattern Recognition: explains various techniques to classify patterns.
          • Photogrammetry: is concerned with obtaining accurate measurements from images.
  • 26. Computer Vision Vs Image Processing
      • Image processing: Image processing deals with image-to-image transformation. The input and output of image processing are both images.
      • Computer vision: Computer vision is the construction of explicit, meaningful descriptions of physical objects from their images. The output of computer vision is a description or an interpretation of structures in a 3D scene.
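The distinction can be sketched with plain NumPy arrays standing in for images. This is not OpenCV code and is not taken from the presentation: the function names `threshold` and `bounding_box` are hypothetical, and the tiny synthetic image is an assumption made for illustration.

```python
import numpy as np

# A tiny synthetic grayscale "image": dark background with one bright rectangle.
image = np.zeros((6, 8), dtype=np.uint8)
image[2:5, 3:7] = 200

def threshold(img, level=128):
    """Image processing: image in, image out (a binary black/white image)."""
    return np.where(img >= level, 255, 0).astype(np.uint8)

def bounding_box(mask):
    """Computer vision: image in, description out (where the object is)."""
    rows, cols = np.nonzero(mask)
    return {
        "top": int(rows.min()),
        "left": int(cols.min()),
        "bottom": int(rows.max()),
        "right": int(cols.max()),
    }

binary = threshold(image)           # still an image, same shape as the input
description = bounding_box(binary)  # a description of the scene, not an image
```

The first function returns another image, while the second returns a structured description, which mirrors the input/output contrast drawn on the slide.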
  • 27. FACE MASK DETECTION USING OPENCV Fig 1.13 Overview of face mask detection code
  • 28. Fig 1.14 Face mask detection demo
  • 29. Features of OpenCV Library
      • Read and write images.
      • Capture and save videos.
      • Process images (filter, transform).
      • Perform feature detection.
      • Detect specific objects, such as faces, eyes, and cars, in videos or images.
      • Analyze video, i.e., estimate the motion in it, subtract the background, and track objects in it.
  • 30. TKINTER Tkinter is a standard Python library for creating graphical user interfaces (GUIs) for desktop applications. It provides a set of built-in GUI widgets (such as buttons, text boxes, and menus) and functions to create windows, frames, and other GUI components. Tkinter is based on the Tk GUI toolkit, which was originally developed for the Tcl programming language. It is cross-platform and can be used on Windows, macOS, and Linux. Tkinter is widely used for creating simple to medium complexity GUI applications, and it is particularly popular among Python developers because of its simplicity and ease of use.
  • 31. MARKSHEET GUI Fig 1.15 GUI of mark sheet
  • 32. PYCHARM GUI Fig 1.16 GUI of PyCharm using tkinter library
  • 33. APPLICATIONS OF DATA SCIENCE
      • Making better business decisions.
      • Measuring performance.
      • Providing information to internal finances.
      • Developing better products.
      • Increasing efficiency.
      • Mitigating risk and fraud.
      • Predicting outcomes and trends.
      • Improving customer experiences.
  • 34. ADVANTAGES OF DATA SCIENCE
      1. Autocomplete: Autocomplete is an important application of data science. The user types a few letters or words, and the system suggests how to complete the line.
      2. In Search Engines: One of the most useful applications of data science is in search engines. When we want to find something on the internet, we mostly use search engines such as Google and Yahoo, which use data science to return results faster.
      3. In Finance: Data science plays a key role in the financial industry, which constantly faces fraud and the risk of losses.
  • 35. ADVANTAGES OF DATA SCIENCE
      4. In E-Commerce: E-commerce websites such as Amazon and Flipkart use data science to provide a better user experience through personalized recommendations.
      5. Image Recognition: Data science is also used in image recognition. For example, when we upload a photo with a friend on Facebook, Facebook suggests tags for the people in the picture. This is done with the help of machine learning and data science.
  • 36. IRIS FLOWER CLASSIFICATION Fig 1.17 Overview of code
  • 37. IRIS FLOWER CLASSIFICATION Fig 1.18 Accuracy using different algorithms
  • 38. IRIS FLOWER CLASSIFICATION Fig 1.19 Data sets used for classification
  • 39. IPL DATA ANALYSIS Fig 1.20 Importing matches.csv file
  • 40. IPL DATA ANALYSIS Fig 1.21 List of winner season wise
  • 41. IPL DATA ANALYSIS Fig 1.22 Toss winning prediction
  • 42. IPL DATA ANALYSIS Fig 1.23 matches.csv dataset
  • 43. IPL DATA ANALYSIS Fig 1.24 deliveries.csv dataset
  • 44. CONCLUSION Data science is a rapidly growing field that has become an essential part of many industries. It provides a powerful set of tools and techniques that enable businesses to gain insights and make data-driven decisions. The field of data science requires a strong background in mathematics, statistics, and computer science, as well as the ability to communicate complex technical concepts to both technical and non-technical audiences. With the continued growth of data and the increasing demand for data-driven insights, data science is expected to remain a critical area of focus for businesses in the years to come.