1. GOVERNMENT ENGINEERING COLLEGE
MAJALI, KARWAR-581345
Department of Computer Science
Engineering
Internship Presentation
On
DATA SCIENCE
Presented by
ADARSH MASEKAR
2GP19CS003
2. COMPANY PROFILE
Technologics Global Pvt Ltd is a company based in
Bangalore, India that provides training and consulting
services in the field of engineering and technology.
Their courses are designed for students, professionals, and
organizations looking to enhance their skills and knowledge in
these areas.
In addition to training services, Technologics Global Pvt Ltd
also offers consulting services in areas such as product
design, development, and testing, as well as project
management and implementation.
3. COMPANY PROFILE
To provide high-quality training and education in the field of
engineering and technology, including emerging areas like
Data Science, Artificial Intelligence and Robotics.
To help individuals and organizations improve their skills and
knowledge in engineering and technology and stay up-to-date
with the latest trends and developments.
To offer customized training and consulting solutions to meet
the specific needs of clients and ensure that their training
objectives are met.
4. INTRODUCTION TO DATA SCIENCE
Data science is the in-depth study of massive
amounts of data, which involves extracting
meaningful insights from raw data.
Data science is about finding patterns in
data through analysis and making future
predictions.
6. By using Data Science, companies are able to make
Better decisions (should we choose A or B?)
Predictive analyses (what will happen next?)
Pattern discoveries (find patterns, or hidden
information, in the data)
Where is Data Science Needed?
Data Science is used in many industries in the world
today,
e.g. banking, consultancy, healthcare, and
manufacturing.
Examples of where Data Science is needed
For route planning: To discover the
best routes to ship
7. INTRODUCTION TO ANACONDA AND
JUPYTER NOTEBOOK
Anaconda Python is a free, open-source platform that
allows you to write and execute code in the
Python programming language. It was created by Continuum
Analytics (continuum.io), now Anaconda, Inc., a company that
specializes in Python development.
The Anaconda platform is one of the most popular ways to
learn and use Python for scientific computing, data
science, and machine learning. It is used by over thirty
million people worldwide and is available for Windows,
macOS, and Linux.
8. Anaconda is an open-source distribution of the Python and R
programming languages for scientific computing and data
science that aims to simplify package management and
deployment. The distribution includes data-science
packages suitable for Windows, Linux, and macOS.
Anaconda Python is the perfect platform for
beginners who want to learn Python. It's easy to
install, and you can get started quickly with the
included Jupyter Notebook. Plus, Anaconda Python
has many features and libraries that you can use
9. PYTHON BASICS
Basic Operators: arithmetic operators, relational operators,
assignment operators, logical operators, bitwise operators,
identity operators and so on.
Decision Making: IF statement, IF-ELSE statement, ELIF
statement, nested IF statement.
Mathematical Functions:
abs(x), max(x1, x2, …), min(x1, x2, …) and pow(x, y) are built in;
exp(x), log(x) and floor(x) come from the math module.
cmp(x, y) exists only in Python 2 and was removed in Python 3.
Loops: while loop statements, using an else statement with a while
loop, for loop statements, using an else statement with a for loop,
nested loops. Python has no built-in do-while loop; it is usually
emulated with "while True" and a break.
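A minimal sketch of the constructs listed on this slide. The names (grade, first_even, count_down) and the grade thresholds are illustrative assumptions, not taken from the internship material.

```python
import math

# Arithmetic, relational, and logical operators
a, b = 7, 3
quotient, remainder = a // b, a % b      # floor division and modulo
is_between = 0 < b < a and a != b        # chained comparison plus a logical operator

# Decision making with if / elif / else (thresholds are made up for illustration)
def grade(marks):
    if marks >= 75:
        return "distinction"
    elif marks >= 40:
        return "pass"
    else:
        return "fail"

# Built-in and math-module functions
rounded_down = math.floor(3.7)
power = pow(2, 5)

# for loop with an else clause: the else runs only if the loop never hit break
def first_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            found = n
            break
    else:
        found = None
    return found

# Python has no do-while statement; a common stand-in is `while True` with a break
def count_down(start):
    steps = []
    while True:
        steps.append(start)   # the body always runs at least once
        start -= 1
        if start <= 0:
            break
    return steps
```

Calling count_down(0) still records one step, which is the behaviour a do-while loop would give.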
10. PYTHON DATA STRUCTURES
Python List: In Python, a list is a collection of items that can be of
different data types, such as integers, floats, strings, and even other
lists. Lists are a fundamental data structure in Python and are used to
store and manipulate data.
Python Tuple: A Python tuple is similar to a list, but it is an immutable
data structure, which means its contents cannot be changed after
creation. Tuples are often used to store related pieces of information that
belong together.
Python Dictionary: In Python, a dictionary is a collection of key-value
pairs. It is a fundamental data structure used to store and retrieve data in
an associative manner. The keys are unique identifiers that map to their
corresponding values.
Python Sets: In Python, a set is an unordered collection of unique items. It
is a fundamental data structure used to store and manipulate data in a
way that eliminates duplicates. To create a set in Python, you can use
curly braces { } with the items separated by commas, or the set()
function. Empty braces { } create a dictionary, so an empty set must be
created with set().
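The four structures above can be sketched side by side as follows. The sample values, including the student record, are hypothetical and serve only as illustration.

```python
# A list is ordered and mutable, and may mix data types
items = [1, 2.5, "three", [4, 5]]
items.append(6)

# A tuple is ordered but immutable: item assignment raises TypeError
point = (10, 20)
try:
    point[0] = 99
    tuple_changed = True
except TypeError:
    tuple_changed = False

# A dictionary maps unique keys to values
student = {"name": "Asha", "branch": "CS"}   # hypothetical record
student["semester"] = 8
branch = student["branch"]

# A set keeps only unique items; {} alone creates an empty dict, not a set
unique_numbers = {1, 2, 2, 3, 3, 3}
empty_set = set()
```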
11. NUMPY
NumPy is a Python package for working with
numerical arrays. Arrays are numerical tables or lists with one
or more dimensions.
Examples are a one-dimensional array of ten integers, a
two-dimensional array of four rows and five columns, and a
three-dimensional array of two layers, three rows, and four columns.
NumPy allows you to perform mathematical and logical
operations on arrays. You can create arrays of various sizes,
manage their elements, and perform element-wise operations on
them such as addition, subtraction, multiplication, and division.
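As a rough illustration of the array shapes and element-wise operations described here (the variable names and values are arbitrary):

```python
import numpy as np

one_d = np.arange(10)            # one-dimensional array of ten integers
two_d = np.zeros((4, 5))         # four rows, five columns
three_d = np.ones((2, 3, 4))     # two layers, three rows, four columns

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 2.0, 2.0, 2.0])

total = x + y
difference = x - y
product = x * y      # element-wise, not matrix multiplication
ratio = x / y
mask = x > 2         # logical operation producing a boolean array
```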
12. NUMPY
NumPy makes it simple to construct, modify, and compute with
arrays. You can use NumPy for linear algebra, Fourier
transforms, matrix operations, and other tasks.
NumPy is fast and efficient because it relies on
optimised code written in C or Fortran. Many other
Python libraries for scientific computing and data analysis,
such as pandas, SciPy, Matplotlib, scikit-learn, and
scikit-image, are built on NumPy.
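A small sketch of the linear algebra and Fourier transform facilities mentioned on this slide. The matrix, vector, and signal are arbitrary examples chosen so the results are easy to check by hand.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

solution = np.linalg.solve(A, b)   # solves A @ solution = b
matrix_product = A @ A             # matrix multiplication
determinant = np.linalg.det(A)

# Fourier transform of a constant signal: everything lands in the zero-frequency bin
signal = np.ones(8)
spectrum = np.fft.fft(signal)
```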
16. PANDAS
• Pandas is an open-source Python library that provides
high-performance data manipulation and analysis tools built on its
powerful data structures.
• The name Pandas is derived from "panel data", an
econometrics term for multidimensional data sets.
• Pandas lets us import, modify, and analyse data from a variety
of sources, such as CSV files, Excel spreadsheets, SQL
databases, and others.
• Pandas has two major data structures: Series (1-dimensional)
and DataFrame (2-dimensional). A Series holds a single labelled
column of values, while a DataFrame stores data in rows and columns.
• Pandas integrates well with NumPy and other Python tools for
scientific and data-analysis workloads.
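A minimal sketch of the two data structures and of the import, modify, analyse flow. The CSV text is written inline, and its column names and values are invented for illustration, so the example runs without any file on disk.

```python
import io
import pandas as pd

# A Series is a one-dimensional labelled array
marks = pd.Series([72, 85, 90], index=["maths", "physics", "python"])

# A DataFrame is a two-dimensional table of rows and columns
csv_text = "name,city,score\nAsha,Karwar,81\nRavi,Bangalore,67\nMeena,Karwar,92\n"
df = pd.read_csv(io.StringIO(csv_text))            # import

df["passed"] = df["score"] >= 70                   # modify: add a derived column
mean_by_city = df.groupby("city")["score"].mean()  # analyse: aggregate per group
```

With a real data set, pd.read_csv would be given a file path instead of the in-memory text.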
18. Why Use Pandas
Pandas allows us to analyze large data sets and draw conclusions
based on statistical theory.
Pandas can clean messy data sets, and make them readable
and relevant.
Relevant data is very important in data science.
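To make the data-cleaning claim concrete, here is a hedged sketch on a small invented table. The column names, the values, and the choice to fill missing ages with the mean are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# A deliberately messy table: stray spaces, inconsistent case,
# a repeated row, and a missing value
raw = pd.DataFrame({
    "name": [" asha", "Ravi ", "Ravi ", "meena"],
    "age": [21, 25, 25, np.nan],
})

clean = raw.copy()
clean["name"] = clean["name"].str.strip().str.title()    # normalise text
clean = clean.drop_duplicates()                           # remove the repeated row
clean["age"] = clean["age"].fillna(clean["age"].mean())   # fill the gap with the mean age
clean = clean.reset_index(drop=True)
```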
Pandas Prerequisites
You should have a basic understanding of Computer
Programming terminologies.
A basic understanding of any of the programming languages
is a plus.
The Pandas library builds on much of NumPy's functionality, so it is
suggested that you go through NumPy first.
Pandas Data Structure
Series
DataFrame
19. MATPLOTLIB
Matplotlib is a Python library used for data visualization and
plotting 2D and 3D graphs and charts.
Matplotlib allows users to create a wide range of
visualizations including line plots, scatter plots, histograms,
bar charts, error bars, and more.
Matplotlib is built on top of the NumPy library, which allows it
to work with arrays and matrices efficiently.
It is widely used in data analysis, scientific research, and
visualization applications.
20. BASIC PLOTS IN MATPLOTLIB
Fig 1.6 Line plot Fig 1.7 Bar plot Fig 1.8 Scatter plot
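The three basic plot types named in the figure captions can be produced roughly as follows. The data is arbitrary, and the non-interactive "Agg" backend is chosen here only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")            # non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 6)
y = x ** 2

fig, (ax_line, ax_bar, ax_scatter) = plt.subplots(1, 3, figsize=(12, 3))

ax_line.plot(x, y)
ax_line.set_title("Line plot")

ax_bar.bar(x, y)
ax_bar.set_title("Bar plot")

ax_scatter.scatter(x, y)
ax_scatter.set_title("Scatter plot")

fig.tight_layout()
# fig.savefig("basic_plots.png")  # hypothetical output file name
```

In a Jupyter notebook the backend line can be dropped and the figure is shown inline.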
21. SEABORN LIBRARY
Seaborn is a Python library for statistical data
visualization. It provides many color palettes and
attractive default styles that make statistical plots
in Python easier to create and better looking.
Categories of Plots in Python's seaborn library
Distribution Plots
Relational Plots
Regression Plots
Categorical Plots
Multi-plot grids
Matrix Plots
24. OPENCV
OpenCV is an open-source library, usable from Python, for computer vision,
machine learning, and image processing.
Computer vision may be used to detect faces, recognise objects, track
motion, and more. Image processing is the study of how to edit and
transform photos and videos.
Image processing, for example, can be used to resize, crop, rotate,
blur, sharpen, and enhance photos and videos.
OpenCV also includes object detection and identification features.
OpenCV, for example, can recognise faces in photos or videos and
track them as they move. OpenCV may also be used to detect
patterns, shapes, and objects in photos and videos.
25. OpenCV
OpenCV is a cross-platform library with which we can
develop real-time computer vision applications. It mainly
focuses on image processing and on video capture and analysis,
including features like face detection and object detection.
Computer Vision
Computer Vision can be defined as a discipline that explains
how to reconstruct, interpret, and understand a 3D scene
from its 2D images. It deals with modeling and replicating
human vision using computer software and hardware.
Computer Vision overlaps significantly with the following
fields −
Image Processing − It focuses on image manipulation.
Pattern Recognition − It explains various techniques to
classify patterns.
Photogrammetry − It is concerned with obtaining accurate
measurements from images.
26. Computer Vision Vs Image Processing
Image Processing
Image processing deals with image-to-image
transformation. The input and output of image
processing are both images.
Computer vision
Computer vision is the construction of explicit,
meaningful descriptions of physical objects from
their image. The output of computer vision is a
description or an interpretation of structures in 3D
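The distinction on this slide can be sketched with two toy functions. This example deliberately uses plain NumPy rather than OpenCV, and both function names are hypothetical: the first maps an image to an image, the second maps an image to a description.

```python
import numpy as np

def box_blur(image):
    """Image processing: image in, image out (3x3 mean filter, edges padded)."""
    padded = np.pad(image.astype(float), 1, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w]
    return out / 9.0

def describe_bright_region(image, threshold=128):
    """Computer vision (toy version): image in, description out."""
    ys, xs = np.nonzero(image > threshold)
    if ys.size == 0:
        return None
    return {"top": int(ys.min()), "left": int(xs.min()),
            "bottom": int(ys.max()), "right": int(xs.max())}

# Grayscale test image with one bright rectangle
image = np.zeros((8, 8), dtype=np.uint8)
image[2:5, 3:6] = 255

blurred = box_blur(image)                    # still an 8x8 image
description = describe_bright_region(image)  # a bounding box, not an image
```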
27. FACE MASK DETECTION USING
OPENCV
Fig 1.13 Overview code of face mask detection
29. Features of OpenCV Library
Read and write images
Capture and save videos
Process images (filter, transform)
Perform feature detection
Detect specific objects such as faces, eyes, cars, in
the videos or images.
Analyze the video, i.e., estimate the motion in it,
subtract the background, and track objects in it.
30. TKINTER
Tkinter is a standard Python library for creating graphical user
interfaces (GUIs) for desktop applications. It provides a set of
built-in GUI widgets (such as buttons, text boxes, and menus)
and functions to create windows, frames, and other GUI
components. Tkinter is based on the Tk GUI toolkit, which was
originally developed for the Tcl programming language. It is
cross-platform and can be used on Windows, macOS, and
Linux. Tkinter is widely used for creating simple to medium
complexity GUI applications, and it is particularly popular among
Python developers because of its simplicity and ease of use.
32. PYCHARM GUI
Fig 1.16 GUI of PyCharm using tkinter library
33. APPLICATIONS OF DATA SCIENCE
Making better business decisions.
Measuring performance.
Providing information for internal finance.
Developing better products.
Increasing efficiency.
Mitigating risk and fraud.
Predicting outcomes and trends.
Improving customer experiences.
34. ADVANTAGES OF DATA SCIENCE
1. Autocomplete
Autocomplete is an important application of data science:
the user types just a few letters or words, and the rest of
the line is completed automatically.
2. In Search Engines
One of the most useful applications of data science is search
engines. When we want to search for something
on the internet, we mostly use search engines like Google
and Yahoo. Data science is used to return search results
faster.
3. In Finance
Data science plays a key role in the financial industry,
which constantly faces fraud and the risk of
losses.
35. ADVANTAGES OF DATA SCIENCE
4. In E-Commerce
E-commerce websites like Amazon, Flipkart, etc.
use data science to provide a better user
experience with personalized recommendations.
5. Image Recognition
Data science is also used in image
recognition. For example, when we upload a
photo with a friend on Facebook, Facebook
suggests tags for the people in the picture.
This is done with the help of machine learning
and data science.
44. CONCLUSION
Data science is a rapidly growing field
that has become an essential part of many
industries. It provides a powerful set of tools and
techniques that enable businesses to gain insights
and make data-driven decisions.
The field of data science requires a
strong background in mathematics, statistics, and
computer science, as well as the ability to
communicate complex technical concepts to both
technical and non-technical audiences. With the
continued growth of data and the increasing
demand for data-driven insights, data science is
expected to remain a critical area of focus for
businesses in the years to come.