EXPLORATORY ANALYTICS IN
PYTHON
Disclaimer for Course Material
• These course materials are for educational purposes only and shall not constitute
professional advice in any form to anyone
• These course materials have been designed as an integral part of the course
presentation and are intended solely for the benefit of delegates attending the
respective course(s). These course materials do not necessarily stand on their own and
are not intended to be relied upon for giving specific professional advice
• Best endeavours are used to ensure that these course materials are up-to-date when
recorded/printed. However, given the nature of the subject, professional advice should
be taken before taking any specific step in relation to any matter
• Nothing said or done by EY or its course presenters can be relied upon as professional
advice by anyone viewing/attending the eLearning(s)/course(s) or anyone else
viewing/reading these materials. Any comments made by any course presenter do
not constitute professional advice and must not be relied upon as professional advice
by anyone viewing/attending the eLearning(s)/course(s) or anyone else
• All title, intellectual property and copyrights and other rights in these materials are
owned by EY or its suppliers. All such rights are reserved and remain vested in EY or
its suppliers, and are not transferred in any way
• No part of these course materials may be reproduced in any form, in whole or in part,
for any purpose without the prior permission in writing of EY. No part of these course
materials shall be stored in any electronic knowledge-base, or data retrieval system
without the prior permission in writing of EY
C O N T E N T S
Introduction
Section 1: Introduction to Python Programming
1. Overview of Programming languages
• Machine Language
• Assembly Language
2. History of Python
3. Installing Anaconda
Section 2: Python Programming
1. Writing a Python program
2. Python character set and core data types
3. List, Tuple, Dictionary and Sets
Section 3: Operator and Expressions
1. Arithmetic operators
2. Operator precedence and associativity
3. Bitwise operators
4. Compound Assignment Operator
5. Mini Project: GST Calculator
Section 4: Decision statement
Part 1: Decision making statements
1. The IF statement
2. The IF-ELSE statement
3. NESTED IF statement
4. Multiway IF-ELIF-ELSE statement
Part 2: Expressions
1. Conditional Expression
2. Mini Project: Finding days in a month
Section 5: The LOOP statement
1. The WHILE loop
2. The RANGE function
3. The FOR loop
4. Nested loops and BREAK statement
5. The CONTINUE statement
6. Mini Project: Generate prime numbers using Charles Babbage Function
Section 6: String and character
1. Comment and DOC Strings
2. Strings using Python
3. Mini Project: Generate prime numbers using Charles Babbage Function
Section 7: Functions using Python
1. Syntax and basics of a function
2. Use of functions
3. Parameters and arguments in a function
4. The local and global scope of a variable
5. The RETURN statement and Recursive function
6. Mini Project
Section 8: Data analysis with Python libraries
Part 1: Python Libraries introduction
1. How to load Python libraries
2. Pandas Overview and purpose
Part 2: Reading data using Pandas
1. Reading CSV files
2. Reading Excel files
3. Reading JSON files
4. Reading SQL databases
Part 3: Data frames and exercise
1. Exploring Data frames
2. Exercise and project on Pandas
Part 4: Some Python Libraries
1. NumPy
2. Matplotlib
3. Seaborn
Exercise and Projects
Interview Q&A
I N T R O D U C T I O N
➢ In data science and artificial intelligence, Python is widely regarded as the main
language for carrying out the important tasks in these areas. Python is one of the most
popular languages of the 21st century. It was created by Guido van Rossum and first
released in 1991.
➢ Python is a versatile, high-level language that can be applied to a very wide range
of problems.
➢ Whether you want to create web applications, handle big data, solve complex math
problems, work with databases, or build workflows, Python has tools for it.
In this book, we will move from basic programming fundamentals to advanced Python.
Python libraries are also included in the book to make you fully versed in the language.
By the end of the book you will be able to program in Python with the basics set right
and a clear idea of how to structure your logic.
SECTION 1: GETTING STARTED
INTRODUCTION TO PYTHON PROGRAMMING
Key Objective
• Overview of Programming Language
a) Machine Language
b) Assembly Language
• History of Python
• Pre Read : Installing Anaconda
Overview of Programming Languages
Programming languages are sets of rules and instructions that are used to create software
programs, applications, and operating systems. There are various types of programming
languages, and each one serves a specific purpose.
a) Machine Language:
Machine language is the lowest-level programming language and is also known as the binary
language. It consists of instructions that can be directly executed by a computer's central
processing unit (CPU). These instructions are written in the form of binary code, which consists
of 0s and 1s.
b) Assembly Language:
Assembly language is a low-level programming language that uses symbolic instructions
instead of binary code. It is also known as Assembly or Assembler. Assembly language is one
step higher than machine language and is specific to a particular computer architecture. It is
easier to read and write than machine language, and carefully written assembly programs
can be faster and more efficient than those written in higher-level programming languages.
c) High-level Language:
High-level programming languages are languages that are designed to be easier to read, write,
and understand than low-level programming languages such as machine language and
assembly language. High-level languages are often used for software development and
programming, as they provide a simpler and more abstract way of thinking about programming
concepts. Some popular high-level languages include Python, Java, C#, Ruby, and JavaScript.
History Of Python
Python is a high-level, general-purpose programming language that was created in the
late 1980s by Guido van Rossum, a Dutch programmer. Here is a brief history of Python:
o In 1989, Guido van Rossum began working on a new programming language that
he called "Python." He was working at the National Research Institute for
Mathematics and Computer Science in the Netherlands at the time.
o The first version of Python, version 0.9.0, was released in February 1991. It
already included classes, exception handling, functions, and core data types such
as lists, dictionaries, and strings.
o In 1994, Python 1.0 was released. This version added functional programming
tools such as lambda, map, filter, and reduce.
o Python 2.0 was released in 2000. This version added many new features,
including list comprehensions, a garbage collector for cycles, and support for
Unicode.
o Python 3.0 was released in 2008. This version was a major revision of the
language, with many backwards-incompatible changes. The main goal of Python
3.0 was to clean up the language and remove some of the inconsistencies and
cruft that had accumulated over the years.
o Since the release of Python 3.0, the Python community has migrated the
ecosystem to the new version. Python 2.7, the last version of the 2.x series,
reached end of life on January 1, 2020, although some legacy applications still
rely on it.
o Python has become one of the most popular programming languages in the world,
used for web development, data analysis, artificial intelligence, scientific
computing, and more. It has a large and active community of developers who
contribute to the language and its ecosystem of libraries and tools.
• Pre Read : Installing Anaconda
Installing Anaconda on Windows
Anaconda distribution of Python is recommended for this course.
This section details the installation of the Anaconda distribution of Python on Windows 10.
Anaconda comes bundled with several hundred packages pre-installed
including NumPy, Matplotlib and SymPy.
Go to the following link: Anaconda.com/downloads
The Anaconda Downloads Page will look something
like this:
SECTION 2: PYTHON PROGRAMMING
BASIC OF PYTHON PROGRAMMING
Key Objective
• Writing our 1ST Python Program
• Python Character Set
• Python Core Data Types
a) Integer, Float, Complex Number, Boolean, String Type
• List, Tuple, Dictionary, Sets
WRITING OUR 1ST PYTHON PROGRAM
Follow these steps:
1. Open Anaconda Navigator and launch Jupyter Notebook.
2. In Jupyter Notebook, click on the "New" button in the top right corner and select
"Python 3" to create a new Python notebook.
3. In the first cell of the notebook, type the following code: print("Hello, World!")
4. Click on the "Run" button in the toolbar or press "Shift + Enter" to execute the code in
the cell.
5. You should see the output "Hello, World!" displayed below the cell.
Congratulations, you've just written your first Python program!
Note: Python is an interpreted language, which means you can run code line-by-line in a
notebook like Jupyter. You can add more cells to your notebook and experiment with
different Python commands and syntax.
PYTHON CHARACTER SET
The Python Character Set refers to the set of characters that can be used in Python code.
Python supports a wide range of characters, including:
•Uppercase and lowercase letters (A-Z, a-z)
•Digits (0-9)
•Special characters (such as $, #, %, &, *, @, etc.)
•Whitespace characters (such as space, tab, newline, etc.)
PYTHON CORE DATA TYPES
Python has several built-in core data types, which are fundamental to the language and used
extensively in programming. The following are the core data types in Python:
Integer: An integer is a whole number without a decimal point. In Python, integers can be
positive or negative, and can be of any size (up to the available memory of the system). Integers
are represented using the int type.
Example: x = 5
Float: A float is a number with a decimal point. In Python, floating-point numbers are
represented using the float type.
Example: y = 3.14
Complex Number: A complex number is a number with a real and imaginary part. In
Python, complex numbers are represented using the complex type. The real and
imaginary parts are separated by a + sign, and the imaginary part is suffixed with j.
Example: z = 2 + 3j
Boolean: A boolean is a binary value that represents either true or false. In Python, the
bool type is used to represent boolean values. The two possible values are True and
False.
Example: a = True
String: A string is a sequence of characters enclosed within quotes, either single quotes
('...') or double quotes ("..."). In Python, strings are represented using the str type.
Example: s = "Hello, World!"
These core data types provide the building blocks for writing programs in Python and are used
extensively in most Python applications.
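As a quick check of the types described above, the following minimal sketch reuses the example values from the text and asks Python which type it assigned to each one. The loop and the use of type() are illustrative additions, not part of the original slides.

```python
# Example values taken from the text above
x = 5                  # int
y = 3.14               # float
z = 2 + 3j             # complex
a = True               # bool
s = "Hello, World!"    # str

for value in (x, y, z, a, s):
    # type(value).__name__ gives the name of the type Python uses for the value
    print(value, "->", type(value).__name__)
```

Running this in a Jupyter cell prints one line per value with its type name.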
LIST, TUPLE, DICTIONARY, SETS WITH EXAMPLE
▪ List: A list is a mutable sequence of elements enclosed in square brackets [ ]. Each
element in a list is separated by a comma. Here's an example:
▪ Tuple: A tuple is an immutable sequence of elements enclosed in parentheses ( ). Each
element in a tuple is separated by a comma. Here's an example:
▪ Dictionary: A dictionary is a collection of key-value pairs enclosed in curly braces { }.
Each key-value pair is separated by a colon, and the keys are unique. Here's an
example:
▪ Set: A set is an unordered collection of unique elements enclosed in curly braces { }.
Here's an example:
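The examples for these four collection types were shown as images in the original slides. A minimal sketch of each, with made-up values and variable names chosen only for illustration, might look like this:

```python
# List: mutable, ordered, written with square brackets
fruits = ["apple", "banana", "cherry"]
fruits.append("mango")              # lists can be changed in place

# Tuple: immutable, ordered, written with parentheses
point = (3, 4)

# Dictionary: key-value pairs, keys are unique
ages = {"Asha": 30, "Ravi": 25}
ages["Meera"] = 28                  # add a new key-value pair

# Set: unordered, duplicates are dropped automatically
unique_numbers = {1, 2, 2, 3, 3, 3}

print(fruits)
print(point[0], point[1])
print(ages["Ravi"])
print(len(unique_numbers))
```

Trying `point[0] = 10` would raise a TypeError, which is what "immutable" means in practice.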
SECTION 3: OPERATOR AND EXPRESSIONS
INTRODUCTION TO OPERATORS AND
EXPRESSIONS
Operators and expressions are fundamental concepts in computer programming, including
Python. An operator is a symbol or keyword that performs an operation on one or more
operands. An expression is a combination of values, variables, operators, and function calls
that are evaluated to produce a result.
▪ ARITHMETIC OPERATORS
a) UNARY OPERATOR
b) BINARY OPERATOR
Arithmetic operators are used in Python to perform basic arithmetic operations such as
addition, subtraction, multiplication, division, modulus, and exponentiation. There are two
types of arithmetic operators based on the number of operands they take - unary and binary
operators.
A unary operator takes only one operand, whereas a binary operator takes two operands.
Unary Operators:
• The unary plus operator (+) is used to indicate that a value is positive, although it is
optional because numbers are assumed to be positive by default.
• The unary minus operator (-) is used to negate a value and make it negative.
Example:
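A minimal sketch of the two unary operators, using an illustrative value of 5:

```python
a = 5
b = +a    # unary plus leaves the value unchanged
c = -a    # unary minus negates the value
print(b, c)
```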
Binary Operators:
• The addition operator (+) is used to add two operands.
• The subtraction operator (-) is used to subtract one operand from another.
• The multiplication operator (*) is used to multiply two operands.
• The division operator (/) is used to divide one operand by another.
• The modulus operator (%) is used to get the remainder of a division operation.
• The exponentiation operator (**) is used to raise one operand to the power of another.
Example:
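The binary operators listed above can be tried out with two illustrative operands, 7 and 2 (the values are an assumption, since the original example was an image):

```python
a = 7
b = 2
total = a + b        # addition
difference = a - b   # subtraction
product = a * b      # multiplication
quotient = a / b     # true division always returns a float
remainder = a % b    # modulus
power = a ** b       # exponentiation
print(total, difference, product, quotient, remainder, power)
```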
These are some of the basic arithmetic operators in Python, and they are essential to
performing mathematical calculations in programs.
• OPERATOR PRECEDENCE AND ASSOCIATIVITY
a) EXAMPLE OF OPERATOR PRECEDENCE
b) ASSOCIATIVITY
a) Operator precedence refers to the order in which operators are evaluated in an
expression. When there are multiple operators in an expression, the operator with
higher precedence is evaluated first. Here's an example: x = 3 + 4 * 5
In this expression, the multiplication operator (*) has a higher precedence than the addition
operator (+). So the expression is evaluated like this:
First, 4 * 5 is evaluated to give 20. Then, 3 + 20 is evaluated to give 23. Finally, the result 23
is assigned to the variable x.
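The walkthrough above (4 * 5 first, then 3 + 20) corresponds to the expression 3 + 4 * 5. A short sketch that also shows how parentheses change the order; the second variable y is an illustrative addition:

```python
x = 3 + 4 * 5      # multiplication first: 3 + 20
y = (3 + 4) * 5    # parentheses override precedence: 7 * 5
print(x, y)
```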
b) Associativity refers to the order in which operators of the same precedence are
evaluated in an expression. Some operators are left-associative, meaning they are
evaluated from left to right. Others are right-associative, meaning they are evaluated
from right to left. Here's an example: x = 10 - 5 - 3
In this expression, the subtraction operator (-) has left-associativity. So the expression is
evaluated like this:
First, 10 - 5 is evaluated to give 5. Then, 5 - 3 is evaluated to give 2. Finally, the result 2 is
assigned to the variable x.
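The walkthrough above matches the expression 10 - 5 - 3. The sketch below repeats it and, as an illustrative addition, shows the exponentiation operator, which is right-associative in Python:

```python
x = 10 - 5 - 3     # left-associative: (10 - 5) - 3
y = 10 - (5 - 3)   # forcing the other grouping gives a different result
z = 2 ** 3 ** 2    # ** is right-associative: 2 ** (3 ** 2)
print(x, y, z)
```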
• BITWISE OPERATOR
AND Operator
OR Operator
XOR Operator
Right Shift
Left Shift
Bitwise operators are used in computer programming to manipulate the individual bits of
binary numbers. Here are the definitions of the five most common bitwise operators:
a) AND operator (&):
Explanation: The & operator performs a bitwise AND operation on the binary representations
of a and b, and stores the result in result. In this case, 5 in binary is 101 and 3 in binary is
011, so the bitwise AND of the two is 001, which is equal to 1 in decimal.
b) OR operator (|):
Explanation: The | operator performs a bitwise OR operation on the binary representations of
a and b, and stores the result in result. In this case, 5 in binary is 101 and 3 in binary is 011,
so the bitwise OR of the two is 111, which is equal to 7 in decimal.
c) XOR operator (^):
Explanation: The ^ operator performs a bitwise XOR operation on the binary representations
of a and b, and stores the result in result. In this case, 5 in binary is 101 and 3 in binary is
011, so the bitwise XOR of the two is 110, which is equal to 6 in decimal.
d) Right shift (>>):
Explanation: The >> operator performs a right shift operation on the binary representation of
a, shifting the bits two positions to the right, and stores the result in result. In this case, 16 in
binary is 10000, and shifting the bits two positions to the right gives 00100, which is equal to
4 in decimal.
e) Left shift (<<):
Explanation: The << operator performs a left shift operation on the binary representation of
a, shifting the bits two positions to the left, and stores the result in result. In this case, 4 in
binary is 00100, and shifting the bits two positions to the left gives 10000, which is equal to
16 in decimal.
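The five explanations above refer to code that was shown as images. A sketch consistent with the values they mention (a = 5 and b = 3 for AND, OR and XOR; 16 and 4 for the shifts) could be written as follows. The result variable names are assumptions made for this example.

```python
a = 5   # binary 101
b = 3   # binary 011

and_result = a & b        # 101 & 011 = 001
or_result = a | b         # 101 | 011 = 111
xor_result = a ^ b        # 101 ^ 011 = 110
right_result = 16 >> 2    # 10000 shifted right by two places = 00100
left_result = 4 << 2      # 00100 shifted left by two places = 10000

print(and_result, or_result, xor_result, right_result, left_result)
# bin() shows the binary form of each result
print(bin(and_result), bin(or_result), bin(xor_result))
```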
THE COMPOUND ASSIGNMENT OPERATOR
In Python, the compound assignment operators are used to perform an arithmetic operation
and assign the result to the same variable in a single statement.
Here are some examples:
1. Addition and assignment:
2. Subtraction and assignment:
3. Multiplication and assignment:
4. Division and assignment:
5. Modulo and assignment:
6. Exponentiation and assignment:
7. Floor division and assignment:
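The seven operators in the list above can be chained on one variable to see their effect. The starting value of 10 and the operands are illustrative; the comment on each line gives the value of x after that statement.

```python
x = 10
x += 5     # addition and assignment: 15
x -= 3     # subtraction and assignment: 12
x *= 2     # multiplication and assignment: 24
x /= 4     # division and assignment: 6.0 (division produces a float)
x %= 4     # modulo and assignment: 2.0
x **= 3    # exponentiation and assignment: 8.0
x //= 3    # floor division and assignment: 2.0
print(x)
```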
Mini Project:- GST Calculator
What is GST?
GST stands for Goods and Services Tax, which is a value-added tax levied on the sale of goods
and services in many countries around the world. GST is a comprehensive, multistage,
destination-based tax that is levied on every value addition in the supply chain. It is a single
tax that replaced multiple indirect taxes like excise duty, service tax, VAT, etc., in India.
Problem Statement
We all buy various goods from a store. Along with the price of the goods we wish to buy, we
also have to pay an additional tax, which is calculated as a specific percentage of the total
price of the goods. This is called GST on the products.
Model of GST Using an Example
The GST has two components: one levied by the central government (referred to as central
GST or CGST), and one levied by the state government (referred to as state GST or SGST).
The rates for central GST and state GST are given as follows:
Types of Tax    Tax Rate
CGST            9%
SGST            9%
Example
Invoice of a Product
Particulars                  Amount
Cost of Production           5000
Add: CGST @9%                450
Add: SGST @9%                450
Total Cost of Production     5900
Formula to Calculate Total Cost
Total Cost = Cost of Production + (CGST Tax Rate × Cost of Production) + (SGST Tax Rate × Cost of Production)
Algorithm
Step 1: Read Cost of Production
Step 2: Input the CGST Tax rate
Step 3: Input the SGST tax rate
Step 4: Calculate and print the total cost of production.
Program and Outputs
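The original program was shown as an image. The sketch below is one possible version that follows the four algorithm steps and reproduces the invoice example (cost 5000, CGST 9%, SGST 9%). The function name total_cost is hypothetical, and fixed values are used in place of keyboard input so the result is repeatable.

```python
def total_cost(cost, cgst_rate, sgst_rate):
    """Return (cgst, sgst, total) for a cost and two tax rates given in percent."""
    cgst = cost * cgst_rate / 100   # central GST amount
    sgst = cost * sgst_rate / 100   # state GST amount
    return cgst, sgst, cost + cgst + sgst

# Steps 1-3: in an interactive version these would come from float(input(...))
cost_of_production = 5000
cgst_rate = 9
sgst_rate = 9

# Step 4: calculate and print the total cost of production
cgst, sgst, total = total_cost(cost_of_production, cgst_rate, sgst_rate)
print("CGST:", cgst)
print("SGST:", sgst)
print("Total cost of production:", total)
```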
SECTION 4: DECISION STATEMENT
SETS OF DECISION STATEMENT
• DECISION MAKING STATEMENT
A. THE IF STATEMENT
B. THE IF-ELSE STATEMENT
C. NESTED IF STATEMENTS
D. MULTI-WAY IF-ELIF-ELSE STATEMENT
The IF Statement:
The if statement is used to execute a block of code only if a certain condition is true. If the
condition is false, the code inside the “if block” is skipped. The syntax for the “if statement” in
Python is as follows:
Example
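The general form is the keyword if, a condition, a colon, and then an indented block. A minimal sketch with an illustrative age check (the variable names and the threshold of 18 are assumptions for this example):

```python
age = 20                      # illustrative value
message = "No message"

if age >= 18:
    # this block runs only when the condition is True
    message = "You are eligible to vote"

print(message)
```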
The IF-ELSE Statement:
The if-else statement is used to execute a block of code if the condition is true and another
block of code if the condition is false. The syntax for the if-else statement in Python is as follows:
Example:
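The if-else form adds an else: line followed by a second indented block. A minimal sketch that classifies an illustrative number as even or odd:

```python
number = 7   # illustrative value

if number % 2 == 0:
    parity = "even"   # runs when the condition is True
else:
    parity = "odd"    # runs when the condition is False

print(number, "is", parity)
```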
Nested IF Statements:
Nested if statements are if statements inside other if statements. They are used when more
than one condition needs to be checked. The syntax for “nested if” statements in Python is as
follows:
Example:
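In a nested if, the inner if sits inside the indented block of the outer one, so it is only reached when the outer condition is true. A sketch with an illustrative number:

```python
number = 12   # illustrative value

if number > 0:
    # the inner check is reached only for positive numbers
    if number % 2 == 0:
        description = "positive and even"
    else:
        description = "positive and odd"
else:
    description = "zero or negative"

print(description)
```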
Multi-way IF-ELIF-ELSE Statement:
The if-elif-else statement is used when there are more than two conditions to be checked. The
elif keyword is used for additional conditions to be checked. The syntax for the if-elif-else
statement in Python is as follows:
Example:
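With if-elif-else, the conditions are tested from top to bottom and only the first matching block runs. The sketch below maps a score to a letter grade; the function name grade and the cut-off values are made up for illustration.

```python
def grade(marks):
    """Map a score out of 100 to a letter grade (illustrative cut-offs)."""
    if marks >= 90:
        return "A"
    elif marks >= 75:
        return "B"
    elif marks >= 50:
        return "C"
    else:
        return "F"

print(grade(82))
```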
CONDITIONAL EXPRESSION
Conditional expressions, also known as ternary operators, are a shorthand way to write an
if-else statement in a single line. The syntax for conditional expressions in Python is as follows:
value_if_true if condition else value_if_false
The condition is evaluated first, and if it is True, the expression returns the value_if_true. If the
condition is False, the expression returns the value_if_false.
Example:
In the above example, the if-else statement is written as a conditional expression. If x is greater
than y, the expression returns the string "x is greater than y", otherwise it returns the string "y
is greater than or equal to x".
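A sketch of the example described above, using the two strings quoted in the text. The values of x and y are illustrative.

```python
x = 10
y = 20

# value_if_true if condition else value_if_false
result = "x is greater than y" if x > y else "y is greater than or equal to x"
print(result)
```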
Mini Project:-Finding the Number of Days in a Month
This mini project makes use of programming features such as if and elif statements. It helps
a programmer find the number of days in a month.
Hint: If the entered month is 2, then read the corresponding year. To know the number of
days in month 2, check whether the entered year is a leap year. If it is a leap year, then
num_days = 29; otherwise num_days = 28.
Leap year: A year is a leap year if it is divisible by 4 but not by 100, or if it is divisible by 400.
Algorithm:
Step 1: Prompt the user for the month.
Step 2: Check if the entered month is 2, i.e. February. If so, go to step 3; otherwise go to
step 4.
Step 3: Read the year and check whether it is a leap year. If it is a leap year, store
num_days=29; otherwise store num_days=28.
Step 4: If the entered month is one of (1,3,5,7,8,10,12), store num_days=31. If the entered
month is one of (4,6,9,11), store num_days=30. If the entered month is outside the range
1 to 12, display the message “invalid month”.
Step 5: If the input is valid, display the message “there are N number of days in the
month M”.
Program and output
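The original program was an image. One possible version is sketched below, using the calendar values 31 days for months 1, 3, 5, 7, 8, 10, 12 and 30 days for months 4, 6, 9, 11. The function names is_leap_year and days_in_month are hypothetical, and the month and year are passed as arguments instead of being read with input() so that the example is repeatable.

```python
def is_leap_year(year):
    """Divisible by 4 but not by 100, or divisible by 400."""
    return (year % 4 == 0 and year % 100 != 0) or year % 400 == 0

def days_in_month(month, year):
    """Return the number of days in the month, or None for an invalid month."""
    if month == 2:
        return 29 if is_leap_year(year) else 28
    elif month in (1, 3, 5, 7, 8, 10, 12):
        return 31
    elif month in (4, 6, 9, 11):
        return 30
    else:
        return None

for month, year in [(2, 2024), (2, 1900), (10, 2023), (13, 2023)]:
    num_days = days_in_month(month, year)
    if num_days is None:
        print("invalid month")
    else:
        print("there are", num_days, "number of days in the month", month)
```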
SECTION 5: LOOP STATEMENT
LOOP CONTROL STATEMENT
• THE WHILE LOOP
a) DETAIL OF WHILE LOOP
b) SOME MORE PROGRAM ON WHILE LOOP
DETAIL OF WHILE LOOP:
The while loop is a control flow statement that allows you to execute a block of code repeatedly
as long as a specified condition is true. The general syntax of a while loop in Python is:
The condition is a boolean expression that is evaluated at the beginning of each iteration of the
loop. If the condition is True, the code block is executed. This process repeats until the condition
is False.
Example
In this example, num starts at 1, and the loop continues as long as num is less than or equal to
5. Inside the loop, we print the value of num and then increment it by 1 using the += operator.
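The general shape is while condition: followed by an indented block. The example described above (num starting at 1 and running while num is at most 5) can be sketched as:

```python
num = 1
while num <= 5:      # the condition is checked before every iteration
    print(num)
    num += 1         # without this line the loop would never end
```

After the loop finishes, num holds 6, which is the first value that makes the condition false.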
Some More Program on while Loop
Program 1: Printing Even Numbers
Program 2: Calculating Factorial
Program 3: Guessing Game
Program 4: Summing numbers from 1 to 100
Program 5: Simulating rolling a die until a certain number is rolled
Program 6: Reversing a string using a while loop
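The six programs were shown as images in the original slides. The compact sketches below are one possible reading of each title, not the original code. All values are illustrative; the guessing game uses a fixed list of guesses in place of input(), and the die uses a seeded random generator so the run is repeatable.

```python
import random

# Program 1: even numbers from 2 to 10
evens = []
n = 2
while n <= 10:
    evens.append(n)
    n += 2
print(evens)

# Program 2: factorial of 5
factorial = 1
i = 5
while i > 1:
    factorial *= i
    i -= 1
print(factorial)

# Program 3: guessing game (scripted guesses stand in for input())
secret = 7
guesses = [3, 9, 7]
attempts = 0
guess = None
while guess != secret:
    guess = guesses[attempts]
    attempts += 1
print("Guessed in", attempts, "attempts")

# Program 4: sum of the numbers from 1 to 100
total = 0
k = 1
while k <= 100:
    total += k
    k += 1
print(total)

# Program 5: roll a die until a 6 comes up
rng = random.Random(0)     # seeded so every run behaves the same
rolls = 0
roll = 0
while roll != 6:
    roll = rng.randint(1, 6)
    rolls += 1
print("Rolled a 6 after", rolls, "rolls")

# Program 6: reverse a string one character at a time
text = "Python"
reversed_text = ""
index = len(text) - 1
while index >= 0:
    reversed_text += text[index]
    index -= 1
print(reversed_text)
```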
• THE RANGE () FUNCTION
a) EXAMPLE OF RANGE () FUNCTION
The range() function in Python is used to generate a sequence of numbers. It takes three
arguments: start, stop, and step. The start argument is optional and defaults to 0, while the
step argument is also optional and defaults to 1. The stop argument is required and specifies
the upper limit of the sequence, but this upper limit is not included in the sequence.
Example 1:Creating a list of odd numbers from 1 to 20 using the range() function.
Explanation: The range(1, 21, 2) function creates a sequence of odd numbers from 1 up to,
but not including, 21, with a step of 2. The list() function then converts this sequence into a list,
which is assigned to the variable odd_numbers. Finally, the print() function is used to display
the contents of the odd_numbers list.
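The code that this explanation describes can be reconstructed directly from it:

```python
# range(start, stop, step): start at 1, stop before 21, step by 2
odd_numbers = list(range(1, 21, 2))
print(odd_numbers)
```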
• THE FOR LOOP
a) DETAILS OF FOR LOOP
b) SOME MORE PROGRAM ON FOR LOOP
The for loop is a control flow statement in Python that allows you to iterate over a sequence
of elements, such as a list, tuple, string, or range. The general syntax for a for loop in Python
is as follows:
In this syntax, variable is a temporary variable that takes on the value of each element in the
sequence on each iteration of the loop. The code block indented under the for statement is
executed once for each element in the sequence.
Here are some more examples of using the for loop in Python:
Example 1:Printing the elements of a list using a for loop.
Explanation: The fruits list contains four elements, which are iterated over in the for loop. On
each iteration, the current element is assigned to the fruit variable, which is then printed using
the print() function.
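The general shape is for variable in sequence: followed by an indented block. A sketch of the example described above; the four fruit names are assumptions, since the original list was shown as an image.

```python
fruits = ["apple", "banana", "cherry", "mango"]   # four illustrative elements

for fruit in fruits:
    # fruit takes the value of each element in turn
    print(fruit)
```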
Example 2:Printing a pattern of asterisks using nested for loops.
Explanation: This example uses nested for loops to print a pattern of asterisks. The outer for
loop iterates through the numbers from 0 to 4, while the inner for loop iterates through the
numbers from 0 to the current value of i (which increases by 1 on each iteration of the outer
loop). The print() function is used to print a single asterisk on each iteration of the inner loop,
and the end parameter is used to prevent each asterisk from being printed on a new line. Finally,
the outer print() function is used to print a new line after each row of asterisks.
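A sketch matching this explanation is shown below. The counter total_stars is an addition made only so the result is easy to check; it is not part of the pattern itself.

```python
total_stars = 0
for i in range(5):             # outer loop: rows 0 to 4
    for j in range(i + 1):     # inner loop: i + 1 asterisks in row i
        print("*", end="")     # end="" keeps the asterisks on the same line
        total_stars += 1
    print()                    # move to a new line after each row
```

The output is a triangle whose rows contain 1, 2, 3, 4 and 5 asterisks.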
• NESTED LOOPS
a) SOME MORE PROGRAMS ON FOR LOOPS
Nested loops are loops that are contained within other loops. They are used when you need to
iterate over a sequence of elements multiple times, or when you need to iterate over a
sequence of sequences (such as a list of lists).
Example 1:Multiplication table using nested loops.
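One way to sketch this example: the outer loop picks the table (1 to 3 here) and the inner loop runs through the multipliers (1 to 5 here). The ranges and the list named table are illustrative choices.

```python
table = []
for i in range(1, 4):          # tables for 1, 2 and 3
    row = []
    for j in range(1, 6):      # multipliers 1 to 5
        row.append(i * j)
    table.append(row)
    print(i, "times table:", row)
```

The result is a list of lists, which is also an example of the "sequence of sequences" mentioned above.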
THE BREAK STATEMENT
In Python, the break statement is used to exit a loop prematurely, before the loop has
completed all iterations. The break statement is usually placed inside a conditional statement,
and when executed, it causes the loop to terminate immediately and execution continues with
the next statement after the loop.
Here is an example of using the break statement in Python:
In this example, we are trying to find the first prime number in a given range of numbers. We
use two nested loops: the outer loop iterates over the range of numbers, and the inner loop
checks whether each number is prime or not. If a factor is found for a number, the inner loop is
terminated prematurely using the break statement. If a prime number is found, the outer loop
is also terminated using another break statement. If no prime number is found, a message is
printed to indicate that.
Note that in the inner loop, there is an else block that is executed only if the inner loop
completes all iterations without encountering a break statement. This else block is not executed
if the loop is terminated prematurely using a break statement.
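The example described above was an image. A sketch that follows the same description (two nested loops, a break in the inner loop, a for-else, and a second break in the outer loop) is given below. The range 24 to 30 and the variable names are assumptions, and the range is assumed to start at 2 or higher.

```python
start, end = 24, 30      # illustrative range, assumed to start at 2 or higher
first_prime = None

for n in range(start, end + 1):
    for d in range(2, n):
        if n % d == 0:
            break                # a factor was found, so n is not prime
    else:
        # runs only if the inner loop finished without hitting break
        first_prime = n
        break                    # stop the outer loop at the first prime

if first_prime is None:
    print("No prime number found in the range")
else:
    print("First prime number:", first_prime)
```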
The break statement is a very useful tool in programming, as it allows us to control the flow of
the program and exit loops when certain conditions are met. However, it should be used
judiciously, as overusing it can make the code harder to read and debug.
THE CONTINUE STATEMENT
In Python, the continue statement is used to skip the current iteration of a loop and move on
to the next iteration, without executing any of the remaining statements in the loop for the
current iteration. The continue statement is usually placed inside a conditional statement, and
when executed, it causes the loop to skip the current iteration and move on to the next iteration.
Here is an example of using the continue statement in Python:
In this example, we are trying to print only the odd numbers in a given range of numbers. We
use a for loop to iterate over the range of numbers, and a conditional statement to check
whether each number is odd or even. If the number is even, the continue statement is executed,
causing the loop to skip the remaining statements for the current iteration and move on to the
next iteration. If the number is odd, the print statement is executed and the loop continues with
the next iteration.
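A sketch of the example described above, assuming the range 1 to 10. The list named odds is an illustrative addition that records what was printed.

```python
odds = []
for n in range(1, 11):
    if n % 2 == 0:
        continue         # even number: skip the rest of this iteration
    odds.append(n)
    print(n)
```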
The continue statement is a very useful tool in programming, as it allows us to skip certain
iterations of a loop and focus on the ones that are relevant. It can be used to simplify code and
improve performance in some cases. However, like the break statement, it should be used
judiciously, as overusing it can make the code harder to read and debug.
Mini Project:-Generate Prime Numbers using Charles
Babbage Function
Program Statement:
Write a Python function that generates prime numbers using a sieve algorithm (this course
calls it the Charles Babbage function; the classic version is known as the Sieve of
Eratosthenes). The function should take an integer as input and return a list of all prime
numbers up to that integer.
Program
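The original program was shown as an image. The sketch below is one possible version that follows this mini project's algorithm: keep a list of candidates starting from 2, move the first candidate to the list of primes, remove its multiples, and repeat until no candidates remain. The function name generate_primes is hypothetical.

```python
def generate_primes(limit):
    """Return a list of all prime numbers up to and including limit."""
    numbers = list(range(2, limit + 1))   # candidates from 2 to limit
    primes = []                           # primes found so far
    while numbers:                        # repeat until no candidates are left
        first = numbers[0]
        primes.append(first)              # the first remaining number is prime
        # keep only the numbers that are not multiples of first
        numbers = [n for n in numbers if n % first != 0]
    return primes

print(generate_primes(30))
```

For a limit below 2 the candidate list is empty, so the function returns an empty list.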
Algorithm:
Step 1: Create a list of numbers from 2 to the given integer.
Step 2: Initialize an empty list to store prime numbers.
Step 3: While the list of numbers is not empty, take the first number in the list and append
it to the list of prime numbers.
Step 4: Remove all multiples of the first number from the list of numbers.
Step 5: Repeat steps 3 and 4 until the list of numbers is empty. Return the list of prime
numbers.
SECTION 6: STRING AND CHARACTER
STRING AND CHARACTER
COMMENT AND DOC STRINGS
In Python, comments and docstrings are used to provide information about the code. They
are not executed as part of the program, but they help to explain the code and make it easier
to understand and maintain.
Comments:
Comments are used to provide short explanations or annotations to the code. Comments in
Python start with a hash (#) symbol and continue until the end of the line. Python ignores
everything on the line after the hash symbol.
Docstrings:
Docstrings are used to provide documentation for classes, functions, modules, or methods.
Docstrings are written in triple quotes (""") and can span multiple lines. They are typically used
to describe the purpose, usage, and behavior of the code.
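A short sketch showing both a comment and a docstring in the same piece of code. The function area_of_circle and its rounded value of pi are made up for illustration.

```python
# This is a single-line comment; Python ignores it when running the program.

def area_of_circle(radius):
    """Return the area of a circle with the given radius.

    The docstring is stored with the function and can be read
    with help(area_of_circle) or area_of_circle.__doc__.
    """
    pi = 3.14159  # a comment can also follow a statement on the same line
    return pi * radius ** 2

print(area_of_circle(2))
print(area_of_circle.__doc__)
```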
DIVING DEEP INTO STRINGS USING PYTHON
In Python, strings are used to represent textual data. They are enclosed in quotes, either single
quotes ('...') or double quotes ("..."). Here are some of the most common operations and
concepts related to strings in Python:
String Concatenation:
Strings can be concatenated using the '+' operator. Two string literals placed next to each
other are also joined automatically.
Example:
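A minimal sketch of concatenation with illustrative strings. Note that placing strings side by side only works for literals written directly in the code, not for variables.

```python
first = "Hello"
second = "World"

greeting = first + ", " + second + "!"   # joining with the + operator
adjacent = "Hello, " "World!"            # adjacent string literals are joined

print(greeting)
print(adjacent)
```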
String Indexing:
You can access individual characters of a string by indexing. In Python, string indexes start
from 0. You can also use negative indexing to access characters from the end of the string.
Example:
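A minimal sketch of positive and negative indexing on an illustrative string:

```python
s = "Python"

first_char = s[0]        # "P": indexes start at 0
last_char = s[5]         # "n": the last index is len(s) - 1
from_end = s[-1]         # "n": -1 is the last character
second_last = s[-2]      # "o"

print(first_char, last_char, from_end, second_last)
```

An index outside the string, such as s[6] here, raises an IndexError.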
String Slicing:
You can extract a part of a string by slicing it. A slice is specified by two indices separated by a
colon. The first index is included in the slice, but the second index is not.
Example:
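A minimal sketch of slicing on an illustrative string. Leaving out an index means "from the start" or "to the end".

```python
s = "Hello, World!"

hello = s[0:5]       # characters 0 to 4: "Hello"
same_hello = s[:5]   # a missing first index means start of the string
world = s[7:]        # a missing second index means end of the string: "World!"
middle = s[-6:-1]    # negative indices also work: "World"

print(hello, same_hello, world, middle)
```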
String Formatting:
String formatting allows you to embed values in a string. There are several ways to format
strings in Python, but the most common is to use placeholders that are replaced by values using
the format() method.
Example:
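A minimal sketch of the format() method with illustrative values. The second line uses named placeholders and a format specifier that rounds to one decimal place.

```python
name = "Asha"
age = 30

# {} placeholders are filled in order by the arguments of format()
message = "My name is {} and I am {} years old.".format(name, age)

# placeholders can be named, and :.1f keeps one digit after the decimal point
report = "{student} scored {score:.1f}".format(student=name, score=91.456)

print(message)
print(report)
```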
String Methods:
Python provides many built-in methods for working with strings, such as upper(), lower(), split(),
strip(), replace(), and many others. Because strings are immutable, these methods do not
change the original string. They return a new value instead (usually a new string; split()
returns a list of strings).
Example:
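A minimal sketch of the methods named above, applied to an illustrative string with extra spaces at both ends:

```python
text = "  Hello, World!  "

stripped = text.strip()                         # removes leading and trailing whitespace
upper = stripped.upper()                        # all uppercase
lower = stripped.lower()                        # all lowercase
replaced = stripped.replace("World", "Python")  # swaps one substring for another
words = stripped.split(", ")                    # splits into a list of strings

print(stripped, upper, lower, replaced, words)
print(repr(text))                               # the original string is unchanged
```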
These are just a few examples of the many operations and concepts related to strings in Python.
By mastering these basics, you'll be well on your way to becoming proficient with strings in
Python.
SECTION 7: FUNCTION
FUNCTION USING PYTHON
SYNTAX AND BASICS OF A FUNCTION
In Python, a function is a block of code that performs a specific task. Functions are defined
using the def keyword, followed by the function name, a set of parentheses, and a colon. The
body of the function is indented and contains the code that performs the task.
Here is the syntax for defining a function in Python:
Example:
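A minimal sketch consistent with the description that follows (the docstring is an addition for illustration):

```python
def add_numbers(a, b):
    """Return the sum of a and b."""
    return a + b


# Call the function and store the returned value in a variable.
result = add_numbers(2, 3)
print(result)
```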
In this example, we define a function called add_numbers() that takes two parameters a and b.
The function then returns the sum of a and b.
To call the function and store the result in a variable, we use result = add_numbers(2, 3). The
value of result is then printed to the console, which outputs 5.
This is just a basic example of how to define and use functions in Python. Functions are a
powerful tool for organizing and reusing code, and they can be used to simplify complex tasks
and make your code more readable and maintainable.
USE OF FUNCTIONS
Functions are an important aspect of programming in Python, as they allow you to reuse code
and organize it into modular pieces. Here are a few examples of how functions can be used in
Python:
Addition Function:
In this example, we define a function called add_numbers() that takes two parameters a and b.
The function then returns the sum of a and b.
Multiplication Function:
In this example, we define a function called multiply_numbers() that takes two parameters a
and b. The function then returns the product of a and b.
String Reversal Function:
In this example, we define a function called reverse_string() that takes one parameter s, which
is a string. The function then returns the reverse of the string using slicing.
List Sum Function:
In this example, we define a function called sum_list() that takes one parameter lst, which is a
list of numbers. The function then returns the sum of all the elements in the list using the built-
in sum() function.
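The four functions described above can be sketched as follows; the sample arguments in the calls are illustrative assumptions:

```python
def add_numbers(a, b):
    return a + b


def multiply_numbers(a, b):
    return a * b


def reverse_string(s):
    # A slice with step -1 walks the string from the end to the start.
    return s[::-1]


def sum_list(lst):
    # The built-in sum() adds up all elements of the list.
    return sum(lst)


print(add_numbers(2, 3))
print(multiply_numbers(4, 5))
print(reverse_string("hello"))
print(sum_list([1, 2, 3, 4]))
```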
PARAMETERS AND ARGUMENTS IN A FUNCTION
In Python, parameters are the names listed in a function's definition, and arguments are the
values passed to the function when it is called. Here are three ways in which parameters and
arguments can be used in a function:
Positional Arguments:
Positional arguments are the most common type of argument in Python functions. They are
values passed to a function in a specific order, and are assigned to the function parameters in
the same order. Here is an example:
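A minimal sketch of a positional call; the function body and the returned text format are illustrative assumptions:

```python
def greet(name, message):
    return f"{message}, {name}!"


# "Alice" is assigned to name and "Hello" to message, purely by position.
greeting = greet("Alice", "Hello")
print(greeting)
```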
In this example, the function greet() takes in two positional arguments: name and message.
When we call the function, we pass in the values "Alice" and "Hello". These values are
assigned to the function parameters in the same order, so name gets assigned "Alice" and
message gets assigned "Hello".
Keyword Arguments:
Keyword arguments are used to pass values to a function using their parameter names. This
allows you to pass the values in any order, as long as you specify which parameter they are
meant to be assigned to. Here is an example:
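A minimal sketch of a keyword call; the function body is an illustrative assumption:

```python
def greet(name, message):
    return f"{message}, {name}!"


# The arguments are matched by parameter name, so their order does not matter.
greeting = greet(message="Hello", name="Alice")
print(greeting)
```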
In this example, we use keyword arguments to pass the values "Hello" and "Alice" to the
greet() function. By specifying the parameter names, we can pass the values in any order we
like.
Parameters with Default Values:
In some cases, you may want to give a parameter in a function a default value, so that it can
be omitted when the function is called. Here is an example:
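A minimal sketch of a default value; the function body and the overriding value "Good morning" are illustrative assumptions:

```python
def greet(name, message="Hello"):
    return f"{message}, {name}!"


default_greeting = greet("Alice")                 # message falls back to "Hello"
custom_greeting = greet("Alice", "Good morning")  # the default is overridden

print(default_greeting)
print(custom_greeting)
```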
In this example, we have given the message parameter a default value of "Hello". When we
call the greet() function with just "Alice", the default value is used for message. However, we
can still override the default value by passing in a different value for message when we call
the function.
These are some examples of how you can use parameters and arguments in Python functions.
By using these tools, you can make your functions more flexible and reusable, and save
yourself time and effort in the process.
THE LOCAL AND GLOBAL SCOPE OF VARIABLES
In Python, variables can have either local or global scope. The scope of a variable refers to
the parts of the program where that variable can be accessed.
Local Variables:
Local variables are variables that are declared inside a function. They are only accessible
within the function in which they are defined. Here is an example:
In this example, the variable x is defined inside the function my_function(). This means that it
is only accessible within the function. If we try to print the value of x outside the function, we
will get an error:
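A minimal sketch of this behaviour; the value 10 is an illustrative assumption, and the try/except is added only so that the script keeps running and can display the NameError:

```python
def my_function():
    x = 10  # x is local: it exists only while my_function() runs
    return x


inside_value = my_function()
print(inside_value)

try:
    print(x)  # x is not defined at module level
except NameError as error:
    error_message = str(error)
    print("NameError:", error_message)
```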
Global Variables:
Global variables are variables that are declared outside of any function, and can be accessed
from anywhere in the program. Here is an example:
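A minimal sketch of reading a global variable from inside a function; the value and the doubling are illustrative assumptions:

```python
x = 10  # global variable, defined outside any function


def my_function():
    # Reading a global variable requires no special keyword.
    return x * 2


doubled = my_function()
print(doubled)
```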
In this example, the variable x is defined outside of any function, which makes it a global
variable. This means that it can be accessed from anywhere in the program, including inside
the function my_function().
However, if you try to modify the value of a global variable inside a function, you will need to
use the global keyword to indicate that you want to modify the global variable, not create a
new local variable with the same name. Here is an example:
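A minimal sketch of the global keyword; the values 10 and 20 are illustrative assumptions:

```python
x = 10  # global variable


def my_function():
    global x  # assignments to x below now target the global variable
    x = 20


my_function()
print(x)
```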
In this example, we use the global keyword to indicate that we want to modify the global
variable x inside the function my_function(). Without this keyword, Python would create a new
local variable with the same name, which would not affect the global variable. By using the
global keyword, we can modify the global variable and see the changes outside the function
as well.
These are some examples of how local and global variables work in Python. By understanding
how variable scope works, you can write more flexible and reusable code, and avoid errors
caused by variable name clashes.
THE RETURN STATEMENT
In Python, the return statement is used to return a value from a function. When a function is
called, it may perform some operations and produce a result. This result can be returned to
the caller using the return statement.
Here is an example:
In this example, the add_numbers function takes two arguments x and y, adds them together,
and stores the result in a variable called result. The function then returns the value of result
using the return statement.
When we call the function with the arguments 3 and 4, it returns the value 7, which is stored
in the variable total (a name that avoids shadowing the built-in sum() function). We can then
print the value of total to the console.
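A minimal sketch of the function described above. The returned value is stored in a variable named total here, rather than sum, so that the built-in sum() function is not shadowed:

```python
def add_numbers(x, y):
    result = x + y
    return result  # hands the value back to the caller


total = add_numbers(3, 4)
print(total)
```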
RECURSIVE FUNCTION
A recursive function in Python is a function that calls itself during its execution. This can be
useful when we need to perform the same task repeatedly, with slightly different inputs each
time. Recursive functions can be a powerful tool for solving complex problems, but they can
also be tricky to implement correctly.
Here's an example of a simple recursive function in Python that calculates the factorial of a
number:
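A minimal sketch matching the description that follows; it assumes n is a non-negative integer and does not guard against negative input:

```python
def factorial(n):
    if n == 0:
        return 1                     # base case
    return n * factorial(n - 1)      # recursive case


print(factorial(4))
```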
In this example, the factorial function takes a single argument n. If n is equal to 0, the
function returns 1, which is the base case. Otherwise, the function multiplies n by the result
of calling factorial with n-1 as the argument. This is the recursive case.
When we call factorial(4), the function checks if n is equal to 0. Since it's not, the function
multiplies n (which is 4) by the result of calling factorial(3). To calculate factorial(3), the
function again checks if n is equal to 0 (it's not), and multiplies n (which is 3) by the result of
calling factorial(2). This process continues until we reach the base case of factorial(0), at
which point the function returns 1. The final result is the product of all the numbers from n
down to 1, which is 4 * 3 * 2 * 1 = 24.
Recursive functions can be used to solve a wide variety of problems, but they can also be
computationally expensive if not implemented carefully. It's important to make sure that a
recursive function will eventually reach the base case and terminate, and to avoid
unnecessary recursive calls that could cause the function to run for a long time or run out of
memory.
Mini Project: Calculation of Compound Interest and
Yearly Analysis of Interest and Principal Amount
Problem statement:
Write a Python program to calculate the compound interest based on the user's input of
principal amount, annual interest rate, number of years, and the number of times the interest
is compounded per year. The program should also generate a yearly analysis of the interest
and principal amount for each year of the investment.
Algorithm:
• Get input from the user for principal amount, annual interest rate, number of years
and the number of times the interest is compounded per year.
• Calculate the compound interest using the formula:
A = P (1 + r/n)^(n*t)
where A is the amount after t years, P is the principal amount, r is the annual interest
rate expressed as a decimal (for example, 0.05 for 5%), n is the number of times the
interest is compounded per year, and t is the number of years.
• Print the amount A and the compound interest, A - P.
• Generate a yearly analysis of the interest and principal amount for each year of the
investment by using a loop to iterate over the years k = 1, ..., t, and calculate the
closing amount and the interest earned in each year using the formulas:
A_k = P (1 + r/n)^(n*k)
interest_k = A_k - A_(k-1), with A_0 = P
principal at the start of year k = A_(k-1)
• Print the yearly analysis for each year.
Program:
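One possible sketch of such a program follows. The function names and sample inputs are illustrative assumptions; the inputs are hard-coded so the script is reproducible, whereas an interactive version would read them with input(). The interest for each year is taken as the closing amount minus the opening amount of that year, which is an assumption about how the yearly analysis is meant to work.

```python
def compound_amount(principal, annual_rate, times_per_year, years):
    """Return A = P * (1 + r/n) ** (n * t), with the rate given as a decimal."""
    return principal * (1 + annual_rate / times_per_year) ** (times_per_year * years)


def yearly_analysis(principal, annual_rate, times_per_year, years):
    """Return one (year, opening, interest, closing) tuple per year."""
    rows = []
    opening = principal
    for year in range(1, years + 1):
        closing = compound_amount(principal, annual_rate, times_per_year, year)
        rows.append((year, opening, closing - opening, closing))
        opening = closing  # this year's closing amount opens the next year
    return rows


# Illustrative inputs; e.g. principal = float(input("Principal: ")) would read them instead.
principal = 1000.0
annual_rate = 0.05   # 5% per year, expressed as a decimal
times_per_year = 12
years = 3

amount = compound_amount(principal, annual_rate, times_per_year, years)
print(f"Amount after {years} years: {amount:.2f}")
print(f"Compound interest: {amount - principal:.2f}")

for year, opening, interest, closing in yearly_analysis(
    principal, annual_rate, times_per_year, years
):
    print(f"Year {year}: opening {opening:.2f}, interest {interest:.2f}, closing {closing:.2f}")
```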
SECTION 8: DATA ANALYSIS
DATA ANALYSIS WITH PYTHON LIBRARIES
Python has a variety of libraries for data analysis, each with its own strengths and
weaknesses. Some of the most commonly used libraries, each covered later in this section,
are Pandas, NumPy, Matplotlib, and Seaborn.
These libraries are all open source and have active communities of developers who contribute
new features and bug fixes. By combining these libraries, Python provides a powerful platform
for data analysis and machine learning.
How To Load Python Libraries
To load Python libraries, you can use the import statement followed by the name of the library
you want to use. The general syntax is `import library_name`.
For example, to load the NumPy library, you would write `import numpy`.
If you want to use a shorthand name for the library, you can use the as keyword followed by an
abbreviation of your choice: `import library_name as alias`.
For example, to load the Pandas library with the abbreviation pd, you would write
`import pandas as pd`.
Once you have loaded a library, you can use its functions and objects in your code by prefixing
them with the library name or abbreviation. For example, to use the randint() function from
NumPy's random module, with NumPy imported as np, you would write
`np.random.randint(low, high)`.
Note that you need to specify the library name or abbreviation when calling a function or object
from a library that you have loaded.
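The import forms described above can be tried out with a short script; the range passed to randint() and the sample Series are illustrative assumptions, and NumPy and Pandas are assumed to be installed:

```python
import numpy             # plain import: names are reached as numpy.<name>
import numpy as np       # aliased import: the same module, reachable as np.<name>
import pandas as pd

# randint() lives in NumPy's random module; the upper bound is exclusive.
random_value = np.random.randint(1, 10)
print(random_value)

# Objects from Pandas are reached through the pd prefix.
series = pd.Series([1, 2, 3])
print(series.sum())
```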
Pandas Overview
Pandas is a Python library for data manipulation and analysis. It provides data structures for
working with labeled and relational data, as well as tools for data cleaning, merging, and
aggregation. Here is an overview of some of the key features of Pandas:
Here is an example of how to use Pandas to read a CSV file and perform some basic data
analysis:
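A minimal sketch of such an analysis. The data, column names, and the in-memory io.StringIO buffer (standing in for a path to a real CSV file) are illustrative assumptions:

```python
import io

import pandas as pd

csv_text = """product,region,units,price
Pen,North,10,1.5
Pen,South,4,1.5
Book,North,3,12.0
Book,South,5,12.0
"""

# io.StringIO stands in for a file path such as "sales.csv" (hypothetical).
df = pd.read_csv(io.StringIO(csv_text))

df["revenue"] = df["units"] * df["price"]   # derive a new column
print(df.head())                            # first rows of the table
print(df.describe())                        # summary statistics of numeric columns

revenue_by_product = df.groupby("product")["revenue"].sum()
print(revenue_by_product)
```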
Pandas Purpose
Pandas is a Python library that provides data manipulation and analysis tools for working with
structured data, such as tabular or time-series data. Its main purpose is to make it easy to
work with data in Python by providing data structures and functions that simplify common
data analysis tasks.
Pandas provides two main data structures: Series and DataFrame. A Series is a
one-dimensional array-like object that can hold any data type, such as integers, floats, strings, and
more. A DataFrame is a two-dimensional table-like structure that consists of rows and
columns, where each column can have a different data type.
Pandas has a wide range of functions for data cleaning, data transformation, and data
analysis, such as filtering, grouping, merging, joining, reshaping, and more. It also has built-in
support for handling missing data, time-series data, and working with different file formats,
such as CSV, Excel, SQL, and more.
Overall, Pandas is widely used in data analysis, scientific research, finance, and other fields
where data manipulation and analysis are required.
Reading Data Using Pandas
Pandas is a popular open-source data manipulation library for Python. It is commonly used to
read, manipulate and analyze data. In this tutorial, we will cover how to read data using
Pandas.
First, you will need to install Pandas. You can install it using the pip command
`pip install pandas`.
Once you have installed Pandas, you can import it into your Python script or Jupyter
Notebook by running `import pandas as pd`.
Now let's look at the different ways to read data using Pandas:
Reading CSV Files
CSV (Comma Separated Values) is a commonly used format for data storage and exchange.
You can use the read_csv() function to read a CSV file, for example
`df = pd.read_csv("file.csv")`.
This will read the CSV file named "file.csv" and create a Pandas DataFrame object named
"df".
Reading Excel Files
Excel files can also be read using Pandas. You can use the read_excel() function to read an
Excel file, for example `df = pd.read_excel("file.xlsx")`.
This will read the Excel file named "file.xlsx" and create a Pandas DataFrame object named
"df".
Reading JSON Files
JSON (JavaScript Object Notation) is a lightweight data interchange format. Pandas can also
read JSON files using the read_json() function, for example `df = pd.read_json("file.json")`.
This will read the JSON file named "file.json" and create a Pandas DataFrame object named
"df".
Reading SQL Databases
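Pandas can also read data from SQL databases. You can use the read_sql() function, which
takes a SQL query and an open database connection, to read data from a SQL database.
For example, given a connection conn to the database "database.db",
`df = pd.read_sql("SELECT * FROM table_name", conn)` will read all the data from the
"table_name" table and create a Pandas DataFrame object named "df".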
These are some of the ways you can read data using Pandas. Once you have read the data
into a DataFrame object, you can use various Pandas functions to manipulate and analyze the
data.
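The readers above can be tried end to end with a self-contained sketch. It first writes small sample files into a temporary folder and then reads them back; the sample data is an illustrative assumption, and the database is assumed to be SQLite, accessed through the standard-library sqlite3 module. read_excel() follows the same pattern but needs an Excel engine such as openpyxl to be installed, so it is left out here.

```python
import sqlite3
import tempfile
from pathlib import Path

import pandas as pd

sample = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})

with tempfile.TemporaryDirectory() as folder:
    csv_path = Path(folder) / "file.csv"
    json_path = Path(folder) / "file.json"
    db_path = Path(folder) / "database.db"

    # Create the sample inputs that the readers will load.
    sample.to_csv(csv_path, index=False)
    sample.to_json(json_path, orient="records")
    conn = sqlite3.connect(db_path)
    sample.to_sql("table_name", conn, index=False)

    # Read each source back into its own DataFrame.
    df_csv = pd.read_csv(csv_path)
    df_json = pd.read_json(json_path)
    df_sql = pd.read_sql("SELECT * FROM table_name", conn)
    conn.close()

print(df_csv)
print(df_json)
print(df_sql)
```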
Exploring Data Frames:
➢ A data frame is a two-dimensional labeled data structure in pandas that can hold data
of different types (numeric, string, Boolean, etc.) in columns.
➢ To explore a data frame, you can use methods such as head(), tail(), info(), describe(),
shape, columns, dtypes, and isnull().
➢ These methods help to understand the size of the data, column names, data types,
missing values, and other relevant information.
These are some basic operations you can perform on data frames in Python using Pandas.
Pandas provides many more powerful functions and operations for working with data frames
that you can explore further.
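The exploration methods and attributes listed above can be sketched on a small, made-up data frame (the column names and values are illustrative assumptions):

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dan"],
    "age": [28, 35, None, 41],
    "city": ["Pune", "Delhi", "Pune", None],
})

print(df.head(2))            # first two rows
print(df.tail(2))            # last two rows
df.info()                    # column names, non-null counts, and dtypes
print(df.describe())         # summary statistics for numeric columns
print(df.shape)              # (rows, columns); an attribute, not a method
print(df.columns.tolist())   # column names
print(df.dtypes)             # data type of each column

missing_per_column = df.isnull().sum()   # number of missing values per column
print(missing_per_column)
```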
Pandas Based Questions
1. What is a pandas DataFrame and how is it different from a pandas Series?
2. How do you read a CSV file into a pandas DataFrame?
3. How do you drop columns and rows from a pandas DataFrame?
4. How do you merge two pandas DataFrames?
5. How do you calculate summary statistics (such as mean, median, and standard
deviation) for a pandas DataFrame?
Pandas Based Projects
Build a movie recommendation system using pandas: Load a movie dataset into a pandas
data frame, clean and preprocess the data, and then use pandas to build a
recommendation system that suggests movies based on user preferences (e.g., genre,
rating). You can use techniques such as collaborative filtering or content-based filtering.
Analyze a sales dataset using pandas: Load a sales dataset into a pandas data frame,
clean and preprocess the data (e.g., remove duplicates, handle missing values), and then
use pandas to explore and analyze the data (e.g., calculate total sales, average sales by
product, visualize sales trends).
NumPy
NumPy is a Python library for numerical computing that provides support for large, multi-
dimensional arrays and matrices, along with a large collection of high-level mathematical
functions to operate on these arrays. Here's an overview of some of the most commonly used
NumPy functions along with examples:
Creating Arrays:
numpy.array(): Create a NumPy array from a Python list.
numpy.zeros(): Create an array of zeros with a specified shape.
numpy.ones(): Create an array of ones with a specified shape.
numpy.random.rand(): Create an array of random numbers with a specified shape.
Array Operations:
numpy.shape(): Get the shape of an array.
numpy.reshape(): Reshape an array.
numpy.transpose(): Transpose an array.
numpy.concatenate(): Concatenate two or more arrays.
Mathematical Functions:
numpy.sum(): Calculate the sum of an array.
numpy.mean(): Calculate the mean of an array.
numpy.std(): Calculate the standard deviation of an array.
numpy.max(): Find the maximum value in an array.
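The functions listed above can be sketched together in one short script; the array contents and shapes are illustrative assumptions:

```python
import numpy as np

# Creating arrays
a = np.array([1, 2, 3, 4, 5, 6])
zeros = np.zeros((2, 3))
ones = np.ones((2, 3))
random_values = np.random.rand(2, 3)   # uniform random numbers in [0, 1)

# Array operations
matrix = np.reshape(a, (2, 3))                        # 2 rows, 3 columns
print(np.shape(matrix))
transposed = np.transpose(matrix)                     # 3 rows, 2 columns
stacked = np.concatenate([matrix, matrix], axis=0)    # 4 rows, 3 columns

# Mathematical functions
total = np.sum(a)
mean = np.mean(a)
std = np.std(a)      # population standard deviation (ddof=0) by default
largest = np.max(a)
print(total, mean, std, largest)
```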
NumPy Based Questions
1. What is NumPy, and why is it used in Python?
2. How do you create a NumPy array?
3. How do you reshape a NumPy array?
4. How do you perform element-wise multiplication of two NumPy arrays?
5. How do you find the maximum and minimum values in a NumPy array?
Matplotlib
Matplotlib is a Python library for creating visualizations, such as line plots, scatter plots, bar
charts, histograms, and more. It provides a wide range of tools for customizing plots and
adding annotations, and supports both static and interactive visualizations. Here's an
overview of some of the key features and functions of Matplotlib:
Figures and Subplots: Matplotlib uses a Figure object to represent the entire window
or page that the plot is drawn on. Within a Figure, one or more Subplots can be
created to display different plots.
Plotting Functions: Matplotlib provides a variety of functions for creating different
types of plots. Some of the most commonly used functions include plot(), scatter(),
bar(), hist(), pie(), and imshow(), among others.
Customization: Matplotlib offers a wide range of options for customizing plots,
including changing line styles, colors, markers, fonts, labels, titles, legends, and
more. It also allows for adding annotations, such as text, arrows, and shapes, to the
plot.
Saving and Exporting: Matplotlib supports saving plots in a variety of formats, such
as PNG, PDF, SVG, and more.
Here are some examples of Matplotlib functions:
plot(): The plot() function is used to create a line plot. It takes x and y
arrays as arguments, and can also take optional parameters for
customizing the plot, such as color, linestyle, and marker.
scatter(): The scatter() function is used to create a scatter plot. It takes x and y
arrays as arguments, and can also take optional parameters for customizing the
plot, such as color, marker size (s), and alpha.
bar(): The bar() function is used to create a bar chart. It takes x and y arrays as
arguments, and can also take optional parameters for customizing the plot, such as
color, width, and align.
hist(): The hist() function is used to create a histogram. It takes an array of values as
an argument, and can also take optional parameters for customizing the plot, such as
bins, range, and density.
pie(): The pie() function is used to create a pie chart. It takes an array of values and,
optionally, labels as arguments, and can also take optional parameters for customizing
the plot, such as colors, explode, and startangle.
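The five plotting functions above can be sketched on one figure. The data values are illustrative assumptions, the non-interactive Agg backend is selected so the script runs without a display, and the figure is saved to an in-memory buffer rather than to a named file:

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no window is opened
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [2, 4, 1, 3]
categories = ["A", "B", "C", "D"]

# One Figure holding a 2 x 3 grid of subplots.
fig, axes = plt.subplots(2, 3, figsize=(12, 6))

axes[0, 0].plot(x, y, color="tab:blue", linestyle="--", marker="o")
axes[0, 0].set_title("plot()")

axes[0, 1].scatter(x, y, color="tab:red", s=60, alpha=0.7)
axes[0, 1].set_title("scatter()")

axes[0, 2].bar(categories, y, color="tab:green", width=0.6, align="center")
axes[0, 2].set_title("bar()")

axes[1, 0].hist([1, 2, 2, 3, 3, 3, 4, 4, 5], bins=5, range=(1, 5), density=False)
axes[1, 0].set_title("hist()")

axes[1, 1].pie(y, labels=categories, explode=(0.1, 0, 0, 0), startangle=90)
axes[1, 1].set_title("pie()")

axes[1, 2].axis("off")  # unused cell of the grid

fig.tight_layout()

# Save as PNG; a file name such as "plots.png" could be passed instead of the buffer.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```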
Seaborn
Seaborn is a Python library for creating statistical visualizations. It is built on top of Matplotlib
and provides a higher-level interface for creating complex and informative plots. Seaborn
includes a range of statistical plotting functions, such as regression plots, distribution plots,
categorical plots, and more. Here's an overview of some of the key features and functions of
Seaborn:
Data Visualization: Seaborn provides functions for creating a
variety of visualizations, such as scatter plots, line plots, bar
plots, histograms, and many others.
Statistical Estimation: Seaborn plotting functions can compute and
display statistical summaries as part of a plot, such as regression
fits and estimates of a distribution.
Styling: Seaborn provides a range of options for customizing
plots, including color palettes, themes, and grid styles.
Integration with Pandas: Seaborn integrates well with Pandas,
making it easy to work with data in data frames.
Here are some examples of Seaborn functions:
scatterplot(): The scatterplot() function is used to create a scatter plot. It takes x
and y variables as arguments, and can also take optional parameters for
customizing the plot, such as hue, size, and style.
lineplot(): The lineplot() function is used to create a line plot. It takes x and y
variables as arguments, and can also take optional parameters for customizing the
plot, such as hue, style, and markers.
histplot(): The histplot() function is used to create a histogram. It takes a variable
as an argument, and can also take optional parameters for customizing the plot,
such as bins, kde, and stat.
boxplot(): The boxplot() function is used to create a box plot. It takes a variable as
an argument, and can also take optional parameters for customizing the plot, such
as hue, order, and width.
catplot(): The catplot() function is used to create a categorical plot. It takes x and y
variables as arguments, and can also take optional parameters for customizing the
plot, such as kind, hue, and col.
These are just a few examples of the many functions available in Seaborn. For more
information and examples, please refer to the Seaborn documentation.
Questions
1. What is Python, and what are its key features?
2. What is a Python library, and how is it useful?
3. What is Pandas, and what are its key features?
4. What is NumPy, and what are its key features?
5. What is Matplotlib, and what are its key features?
6. What is Seaborn, and what are its key features?
7. How do you import a library in Python?
8. What are the different data structures in Python?
9. What is the difference between a list and a tuple in Python?
10. What is a DataFrame in Pandas?
11. How do you read data from a CSV file in Pandas?
12. How do you handle missing data in Pandas?
13. How do you perform data aggregation in Pandas?
14. What is a NumPy array, and how is it different from a list in Python?
15. How do you create a NumPy array?
16. What is a vectorized operation in NumPy?
17. How do you perform element-wise operations in NumPy?
18. What is the difference between a 1D, 2D, and 3D array in NumPy?
19. What is broadcasting in NumPy?
20. How do you create a histogram in Matplotlib?
21. What is a scatter plot in Matplotlib?
22. How do you create a scatter plot in Matplotlib?
23. What is a line plot in Matplotlib?
24. How do you create a line plot in Matplotlib?
25. What is a bar plot in Matplotlib?
26. How do you create a bar plot in Matplotlib?
27. What is a pie chart in Matplotlib?
28. How do you create a pie chart in Matplotlib?
29. What is a box plot in Seaborn?
30. How do you create a box plot in Seaborn?
Solution
1. Python is a high-level, interpreted programming language that is used for a wide range
of purposes, including web development, data analysis, machine learning, and more.
Its key features include easy-to-learn syntax, a large standard library, and support for
multiple programming paradigms such as object-oriented, procedural, and functional
programming.
2. A Python library is a collection of pre-written code that can be used to perform specific
tasks. Libraries provide a way to avoid writing code from scratch and allow
programmers to build on the work of others. They can be used for a wide range of
purposes, such as data analysis, web development, machine learning, and more.
3. Pandas is a popular open-source data analysis library for Python. It provides data
structures for efficiently storing and manipulating large datasets, as well as a wide
range of tools for working with data, including data cleaning, transformation, and
analysis. Key features of Pandas include powerful indexing and selection capabilities,
tools for merging and joining datasets, and support for time-series data.
4. NumPy is a Python library for working with numerical data. It provides a powerful
array data structure, as well as a wide range of tools for performing mathematical
operations on arrays, such as linear algebra, Fourier analysis, and more. Key features
of NumPy include efficient handling of large arrays, broadcasting for performing
operations on arrays of different shapes, and vectorized operations for improved
performance.
5. Matplotlib is a popular data visualization library for Python. It provides a wide range of
tools for creating visualizations, including line plots, scatter plots, bar charts, and
more. Key features of Matplotlib include support for a wide range of customization
options, the ability to create complex visualizations, and the ability to output
visualizations in a variety of formats.
6. Seaborn is a data visualization library for Python that is built on top of Matplotlib. It
provides a high-level interface for creating complex visualizations with fewer lines of
code, as well as a wide range of built-in styles and color palettes. Key features of
Seaborn include support for creating complex visualizations such as heatmaps, violin
plots, and more, and the ability to easily customize visualizations.
7. To import a library in Python, you can use the import statement, followed by the name
of the library. For example, to import the NumPy library, you would use
`import numpy as np`.
The as np part creates an alias for the NumPy library, so that you can refer to it as np
in your code.
8. The different data structures in Python include lists, tuples, sets, and dictionaries.
9. The main difference between a list and a tuple in Python is that lists are mutable,
meaning that their contents can be changed after they are created, while tuples are
immutable, meaning that their contents cannot be changed after they are created.
10. A DataFrame in Pandas is a two-dimensional table-like data structure with labeled
rows and columns, similar to a spreadsheet. It provides powerful tools for working with
structured data, including the ability to filter, sort, and manipulate data in a variety of
ways.
11. To read data from a CSV file in Pandas, you can use the read_csv() function. For
example, `df = pd.read_csv("data.csv")` reads a CSV file called data.csv and stores it
in a Pandas DataFrame called df.
12. To handle missing data in Pandas, we can use the .isna() function to identify missing
values in a DataFrame or Series, and then use the .fillna() function to replace the
missing values with a specified value or strategy. We can also use the .dropna()
function to remove rows or columns that contain missing values.
13. In Pandas, we can perform data aggregation using the .groupby() function, which
groups data based on one or more columns, and then applies an aggregation function
to each group to compute a summary statistic.
14. A NumPy array is a data structure that represents a multi-dimensional, homogeneous
array of values. It is different from a list in Python because it is more efficient for
numerical computations, supports vectorized operations, and has a fixed size.
15. To create a NumPy array, we can use the np.array() function and pass a list or tuple of
values as an argument. We can also create arrays with a specific shape and data type
using functions such as np.zeros(), np.ones(), np.arange(), and np.random.rand().
16. A vectorized operation in NumPy is an operation that applies to an entire array or a
subset of an array, rather than operating on individual elements. Vectorized
operations are much faster and more efficient than performing operations on
individual elements in a loop.
17. To perform element-wise operations in NumPy, we can use arithmetic operators such
as +, -, *, and / or functions such as np.add(), np.subtract(), np.multiply(), and
np.divide().
18. A 1D array in NumPy represents a single sequence of values, while a 2D array
represents a matrix with rows and columns, and a 3D array represents a cube with
multiple layers, rows, and columns.
19. Broadcasting in NumPy is a feature that allows arrays with different but compatible
shapes to be used in arithmetic operations. When such arrays are used in an operation,
NumPy automatically broadcasts the smaller array to match the shape of the larger
array; if the shapes are not compatible, NumPy raises a ValueError.
20. To create a histogram in Matplotlib, we can use the plt.hist() function and pass a list or
array of values as an argument. We can also specify the number of bins, the range of
values, and other parameters to customize the histogram.
21. A scatter plot in Matplotlib is a visualization that displays the relationship between two
variables by plotting individual data points as points on a 2D coordinate system.
22. To create a scatter plot in Matplotlib, we can use the plt.scatter() function and pass
arrays of x and y values as arguments. We can also customize the appearance of the
scatter plot by specifying the color, size, and shape of the points.
23. A line plot in Matplotlib is a visualization that displays the relationship between two
variables by connecting individual data points with a line.
24. To create a line plot in Matplotlib, we can use the plt.plot() function and pass arrays of
x and y values as arguments. We can also customize the appearance of the line plot by
specifying the color, style, and width of the line.
25. A bar plot in Matplotlib is a visualization that displays the relationship between a
categorical variable and a numerical variable by displaying the values as bars.
26. To create a bar plot in Matplotlib, we can use the plt.bar() function and pass arrays of
x and y values as arguments. We can also customize the appearance of the bar plot by
specifying the color, width, and orientation of the bars.
27. A pie chart in Matplotlib is a visualization that displays the relative sizes of different
categories as slices of a pie.
28. To create a pie chart in Matplotlib, you can follow these steps:
• Import the Matplotlib library
• Create a figure and an axis object using the subplots method
• Define the data that you want to visualize in the pie chart
• Call the pie method on the axis object and pass the data as a parameter
• Optionally, you can add a title and legend to the chart
Here is an example code snippet that demonstrates how to create a simple pie chart:
This will create a pie chart with four slices labeled A, B, C, and D.
29. A box plot is a type of chart that is used to display the distribution of a dataset. It
shows the median, quartiles, and outliers of the data. In Seaborn, a box plot can be
created using the boxplot function.
30. To create a box plot in Seaborn, you can follow these steps:
• Import the Seaborn library
• Load the data that you want to visualize
• Create a figure and an axis object using the subplots method
• Call the Seaborn boxplot function, passing the data and, through the ax parameter,
the axis object
• Optionally, you can customize the appearance of the chart by setting various
parameters
Here is an example code snippet that demonstrates how to create a simple box plot in
Seaborn:
This will create a box plot of the data stored in the my_data.csv file. You can
customize the appearance of the chart by setting various parameters, such as the
color palette, whisker length, and outliers.
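The steps in answers 28 and 30 can be sketched as follows. The slice sizes, the "value" column, and its numbers are illustrative assumptions; a small in-memory data frame stands in for the my_data.csv file, and the Agg backend is used so no display is needed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no window is opened
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Answer 28: a pie chart with four slices labelled A, B, C, and D.
sizes = [40, 30, 20, 10]
labels = ["A", "B", "C", "D"]
fig_pie, ax_pie = plt.subplots()
ax_pie.pie(sizes, labels=labels)
ax_pie.set_title("Simple pie chart")

# Answer 30: a box plot. This data frame stands in for pd.read_csv("my_data.csv").
data = pd.DataFrame({"value": [3, 5, 7, 8, 9, 11, 12, 30]})
fig_box, ax_box = plt.subplots()
sns.boxplot(data=data, y="value", ax=ax_box)
ax_box.set_title("Simple box plot")
```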
10 Projects
1. Analysis of Customer Reviews: Collect customer reviews from an e-commerce
website and analyze them using Pandas and Matplotlib to identify common
themes and issues.
2. Sales Forecasting: Use time-series analysis techniques from the Pandas library
to forecast future sales and visualize the results using Matplotlib.
3. Analysis of Social Media Data: Collect data from social media platforms like
Twitter or Facebook and analyze it using Pandas and Matplotlib to identify
trends and patterns.
4. Visualization of Geographic Data: Use the GeoPandas library to visualize
geographic data like maps, population density, or election results.
5. Analysis of Web Traffic: Collect data on website traffic and user behavior using
Pandas and visualize the results using Matplotlib or Seaborn to identify trends
and patterns.
6. Image Processing: Use the OpenCV library and NumPy to perform image
processing tasks like image filtering, edge detection, and object detection.
7. Network Analysis: Use the NetworkX library to analyze complex networks like
social networks, transportation networks, or power grids.
8. Visualization of Scientific Data: Use the Matplotlib library to visualize scientific
data like astronomical observations, climate data, or genetic data.
9. Text Mining: Use natural language processing techniques from the NLTK library
to analyze and classify text data like news articles, academic papers, or social
media posts.
10. Analysis of Financial Data: Collect and analyze financial data like stock prices,
exchange rates, or economic indicators using Pandas and visualize the results
using Matplotlib.
Interview Questions
1. What is Exploratory Data Analysis (EDA)?
Answer: Exploratory Data Analysis (EDA) is the process of analyzing and summarizing data
sets in order to gain insights into the data. It involves using statistical and visual methods to
understand the underlying patterns and relationships within the data.
2. What are the steps involved in EDA?
Answer: The steps involved in EDA are:
- Data collection and loading
- Data cleaning and preparation
- Descriptive statistics and data visualization
- Correlation and regression analysis
- Hypothesis testing and statistical inference
3. What are some commonly used Python libraries for EDA?
Answer: Some commonly used Python libraries for EDA are:
- Pandas: for data manipulation and analysis
- Matplotlib: for data visualization
- Seaborn: for statistical data visualization
- NumPy: for numerical computing
- Scikit-learn: for machine learning
4. How do you load data into Python for EDA?
Answer: Data can be loaded into Python for EDA using various methods such as:
- Reading from a CSV file using pandas read_csv() method
- Reading from an Excel file using pandas read_excel() method
- Reading from a database using pandas read_sql() method
- Reading from a JSON file using pandas read_json() method
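The loading methods listed above can be tried without any files on disk. This is a small sketch that reads CSV and JSON text from in-memory buffers; the sample records and the variable names are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice you would pass a file path to read_csv
csv_text = "name,age\nAsha,31\nBen,27\n"
df_csv = pd.read_csv(io.StringIO(csv_text))

# Hypothetical JSON content with the same records
json_text = '[{"name": "Asha", "age": 31}, {"name": "Ben", "age": 27}]'
df_json = pd.read_json(io.StringIO(json_text))

print(df_csv)
print(df_json)
```

`read_excel` and `read_sql` follow the same pattern but need an Excel file or a database connection, so they are left out of this sketch.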
5. How do you check the shape of a dataset in Python?
Answer: To check the shape of a dataset in Python, you can use the shape attribute of a
pandas DataFrame. For example: `df.shape` will return a tuple containing the number of rows
and columns in the DataFrame.
6. How do you check the data types of the columns in a dataset?
Answer: To check the data types of the columns in a dataset, you can use the dtypes attribute
of a pandas DataFrame. For example: `df.dtypes` will return the data types of all the columns
in the DataFrame.
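A short sketch of both checks on a made-up DataFrame (the column names and values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical data set with one text, one float, and one integer column
df = pd.DataFrame({
    "product": ["pen", "book", "lamp"],
    "price": [1.5, 12.0, 30.25],
    "units_sold": [100, 40, 7],
})

n_rows, n_cols = df.shape  # shape is a (rows, columns) tuple
print(n_rows, n_cols)
print(df.dtypes)           # one data type per column
```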
7. How do you handle missing values in a dataset during EDA?
Answer: There are various methods to handle missing values in a dataset during EDA, such as:
- Removing the rows or columns containing missing values
- Imputing the missing values with a fixed value such as mean or median
- Imputing the missing values using statistical models such as regression or K-nearest
neighbors (KNN) algorithm
8. How do you visualize the distribution of a numerical variable in Python?
Answer: To visualize the distribution of a numerical variable in Python, you can use various
methods such as:
- Histogram using Matplotlib or Seaborn
- Density plot using Seaborn
- Box plot using Seaborn
9. How do you visualize the relationship between two numerical variables in Python?
Answer: To visualize the relationship between two numerical variables in Python, you can use
various methods such as:
- Scatter plot using Matplotlib or Seaborn
- Line plot using Matplotlib or Seaborn
- Heatmap using Seaborn
10. What is a histogram?
Answer: A histogram is a graphical representation of the distribution of a numerical variable.
It shows the frequencies of different ranges or bins of values.
11. How do you create a histogram in Python?
Answer: You can use the `hist` method of pandas or the `hist` function of matplotlib to draw a
histogram in Python. The `histogram` function of numpy computes the bin counts and bin edges
without plotting them.
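To make the idea of bins concrete, here is a minimal sketch that computes histogram counts with NumPy and prints them as text. The sample values and the choice of four bins over the range 1 to 9 are assumptions for illustration.

```python
import numpy as np

values = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5, 9])

# Four equal-width bins covering the range 1 to 9
counts, bin_edges = np.histogram(values, bins=4, range=(1, 9))

# Print a simple text histogram: one row per bin
for count, left, right in zip(counts, bin_edges[:-1], bin_edges[1:]):
    print(f"{left:.0f} to {right:.0f}: {'#' * count}")
```

Each bin includes its left edge and excludes its right edge, except the last bin, which includes both.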
12. What is a boxplot?
Answer: A boxplot is a graphical representation of the distribution of a numerical variable
based on its quartiles. It shows the median, quartiles, and outliers of the variable.
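The numbers behind a boxplot can be computed directly. This sketch assumes the common convention that points more than 1.5 times the interquartile range beyond the quartiles are drawn as outliers; the sample values are made up.

```python
import numpy as np

values = np.array([2, 4, 4, 5, 6, 7, 8, 9, 30])

# Quartiles define the box; the median is the line inside it
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1

# Assumed 1.5 * IQR rule for the outlier fences
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = values[(values < lower_fence) | (values > upper_fence)]

print(q1, median, q3, outliers)
```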
13. How do you create a boxplot in Python?
Answer: You can use the `boxplot` function of pandas or the `boxplot` function of matplotlib
to create a boxplot in Python.
14. What is a scatter plot?
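Answer: A scatter plot is a graphical representation of the relationship between two numerical
variables. It shows how the values of one variable change with the other, which can reveal an
association but does not by itself show that one variable affects the other.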
15. How do you create a scatter plot in Python?
Answer: You can use the `scatter` function of matplotlib to create a scatter plot in Python.
16. What is a heatmap?
Answer: A heatmap is a graphical representation of the distribution of a numerical variable
across two categorical variables. It shows the intensity of the variable using color scales.
17. How do you create a heatmap in Python?
Answer: You can use the `heatmap` function of seaborn to create a heatmap in Python.
18. What is a correlation matrix?
Answer: A correlation matrix is a matrix that shows the correlation coefficients between
multiple variables. It helps to identify the strength and direction of the relationships between
variables.
19. How do you create a correlation matrix in Python?
Answer: You can use the `corr` method of a pandas DataFrame to compute a correlation matrix,
and the `heatmap` function of seaborn to visualize it.
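A minimal sketch of computing a correlation matrix with pandas; the columns `hours`, `score`, and `errors` and their values are hypothetical.

```python
import pandas as pd

# Hypothetical study data: score rises with hours, errors fall with hours
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5],
    "score": [52, 60, 71, 80, 92],
    "errors": [9, 7, 6, 4, 1],
})

corr = df.corr()  # Pearson correlation by default
print(corr.round(2))
```

The result is a square table with 1.0 on the diagonal. Values near +1 or -1 indicate a strong positive or negative linear relationship.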
20. How do you detect missing values in Python?
Answer: You can use the `isnull` function of pandas to detect missing values in Python.
21. How do you handle missing values in Python?
Answer: You can handle missing values by dropping rows or columns with missing values or
by imputing missing values with mean, median, or mode.
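Detection, dropping, and simple imputation can be sketched together. The DataFrame below is made up, and the choice of median for the numeric column and mode for the text column is one reasonable option, not the only one.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column
df = pd.DataFrame({
    "age": [25, np.nan, 31, 40],
    "city": ["Pune", "Delhi", None, "Pune"],
})

# Detect: count missing values per column
missing_per_column = df.isnull().sum()

# Option 1: drop every row that contains a missing value
dropped = df.dropna()

# Option 2: impute with the median (numeric) and the mode (categorical)
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].median())
filled["city"] = filled["city"].fillna(filled["city"].mode()[0])

print(missing_per_column)
print(filled)
```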
22. What is data normalization?
Answer: Data normalization is the process of transforming data into a standard scale to
eliminate the impact of different units, ranges, and distributions of variables.
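Two common ways to put values on a standard scale are min-max scaling and z-score standardization. The sketch below shows both with NumPy on made-up values; which one to use depends on the data and the method applied afterwards.

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

# Min-max scaling: rescale to the range 0 to 1
min_max = (x - x.min()) / (x.max() - x.min())

# Z-score standardization: mean 0 and standard deviation 1
z_score = (x - x.mean()) / x.std()

print(min_max)
print(z_score)
```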
23. What is Pandas?
Answer: Pandas is a Python library that is used for data manipulation and analysis. It provides
data structures and functions for working with structured data.
24. What are the two main data structures provided by Pandas?
Answer: The two main data structures provided by Pandas are Series and DataFrame.
25. What is a Series in Pandas?
Answer: A Series is a one-dimensional array-like object that can hold any data type, such as
integers, floating-point numbers, strings, and Python objects.
26. What is a DataFrame in Pandas?
Answer: A DataFrame is a two-dimensional tabular data structure with rows and columns. It is
similar to a spreadsheet or SQL table.
27. How do you create a Series in Pandas?
Answer: You can create a Series in Pandas by passing a list or array of values to the
`pd.Series` function.
28. How do you create a DataFrame in Pandas?
Answer: You can create a DataFrame in Pandas by passing a dictionary of lists or arrays to
the `pd.DataFrame` function.
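A small sketch of creating both structures; the labels, names, and values are hypothetical.

```python
import pandas as pd

# A Series from a list, with optional index labels and a name
s = pd.Series([10, 20, 30], index=["a", "b", "c"], name="score")

# A DataFrame from a dictionary of lists: each key becomes a column
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen"],
    "age": [31, 27, 45],
})

print(s)
print(df)
```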
29. How do you select columns from a DataFrame in Pandas?
Answer: You can select columns from a DataFrame in Pandas by using the column name as an
index, such as `df['column_name']`.
30. How do you select rows from a DataFrame in Pandas?
Answer: You can select rows from a DataFrame in Pandas by using the label-based `loc` or
position-based `iloc` indexer, such as `df.loc[index]` or `df.iloc[row_number]`.
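Column and row selection can be sketched on a small DataFrame. The index labels `a`, `b`, `c` and the column names are assumptions chosen so that label-based and position-based selection are easy to tell apart.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Asha", "Ben", "Chen"], "score": [88, 92, 79]},
    index=["a", "b", "c"],
)

names = df["name"]           # select one column by name
row_by_label = df.loc["b"]   # select a row by its index label
row_by_position = df.iloc[0] # select a row by its integer position

# loc also accepts a boolean condition and a list of columns
high = df.loc[df["score"] > 80, ["name", "score"]]
print(high)
```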
31. How do you rename columns in a DataFrame in Pandas?
Answer: You can rename columns in a DataFrame in Pandas by using the `rename` function
and passing a dictionary of old and new column names, such as
`df.rename(columns={'old_name': 'new_name'})`.
32. How do you drop columns from a DataFrame in Pandas?
Answer: You can drop columns from a DataFrame in Pandas by using the `drop` function and
passing the column name or index, such as `df.drop('column_name', axis=1)`.
33. How do you add columns to a DataFrame in Pandas?
Answer: You can add columns to a DataFrame in Pandas by assigning a new column as a
Series or a list, such as `df['new_column'] = [1, 2, 3]`. A list must have the same length as
the number of rows in the DataFrame.
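Renaming, dropping, and adding columns can be chained in a short sketch. The column names here are hypothetical. Note that `rename` and `drop` return a new DataFrame by default and leave the original unchanged.

```python
import pandas as pd

df = pd.DataFrame({"old_name": [1, 2, 3], "unused": [0, 0, 0]})

# rename returns a new DataFrame; df itself is not modified
renamed = df.rename(columns={"old_name": "new_name"})

# drop with axis=1 removes a column, again returning a new DataFrame
trimmed = renamed.drop("unused", axis=1)

# Assigning to a new key adds a column in place
trimmed["doubled"] = trimmed["new_name"] * 2
print(trimmed)
```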
34. How do you group data in a DataFrame in Pandas?
Answer: You can group data in a DataFrame in Pandas by using the `groupby` function and
passing the column name or names to group by, such as `df.groupby('column_name')`.
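Calling `groupby` on its own only defines the groups; an aggregation such as `sum` or `mean` produces the result. A minimal sketch with made-up sales data:

```python
import pandas as pd

df = pd.DataFrame({
    "region": ["North", "South", "North", "South", "East"],
    "sales": [100, 150, 200, 50, 75],
})

# Total sales per region
totals = df.groupby("region")["sales"].sum()

# Several aggregations at once
summary = df.groupby("region")["sales"].agg(["mean", "count"])

print(totals)
print(summary)
```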
35. How do you calculate descriptive statistics of a DataFrame in Pandas?
Answer: You can calculate descriptive statistics of a DataFrame in Pandas by using the
`describe` function, such as `df.describe()`.
36. What is NumPy?
Answer: NumPy is a Python library used for numerical computations in scientific computing. It
provides support for large, multi-dimensional arrays and matrices, along with a large
collection of high-level mathematical functions to operate on these arrays.
37. How do you create a NumPy array?
Answer: NumPy arrays can be created in several ways, including by converting a list or tuple
to an array using the `array` function or by using functions like `zeros` and `ones` and the
functions in the `random` module.
38. What is the difference between a list and a NumPy array?
Answer: Lists in Python are dynamic and can contain elements of different data types, while
NumPy arrays are homogeneous and fixed in size. NumPy arrays are also faster and more
memory-efficient than lists when working with large datasets.
39. How do you access elements of a NumPy array?
Answer: Elements of a NumPy array can be accessed using indexing or slicing. For example,
`arr[0]` would access the first element of the array, and `arr[1:3]` would access the
elements at indices 1 and 2 (the second and third elements).
40. How do you find the shape and size of a NumPy array?
Answer: The `shape` attribute of a NumPy array returns a tuple containing the dimensions of
the array, while the `size` attribute returns the total number of elements in the array.
41. How do you reshape a NumPy array?
Answer: The `reshape` function can be used to change the shape of a NumPy array. For
example, `arr.reshape(2,3)` would reshape a 1D array with 6 elements into a 2D array with 2
rows and 3 columns.
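Indexing, slicing, `shape`, `size`, and `reshape` can be seen together in one short sketch; the array contents are arbitrary.

```python
import numpy as np

arr = np.arange(10, 16)   # six elements: 10, 11, 12, 13, 14, 15

first = arr[0]            # element at index 0
middle = arr[1:3]         # elements at indices 1 and 2

grid = arr.reshape(2, 3)  # 2 rows and 3 columns, same six elements
print(grid.shape)         # dimensions as a tuple
print(grid.size)          # total number of elements
print(grid)
```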
42. How do you create a copy of a NumPy array?
Answer: The `copy` function can be used to create a copy of a NumPy array. For example,
`arr_copy = arr.copy()` would create a new copy of `arr`.
43. How do you concatenate two NumPy arrays?
Answer: The `concatenate` function can be used to concatenate two NumPy arrays. For
example, `np.concatenate((arr1, arr2), axis=0)` would concatenate `arr1` and `arr2` along
the rows.
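A brief sketch of concatenation along each axis of a 2D array. The arrays are made up; the point is that all dimensions except the one being joined must match.

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])  # shape (2, 2)
arr2 = np.array([[5, 6]])          # shape (1, 2)
arr3 = np.array([[7], [8]])        # shape (2, 1)

# axis=0 appends rows: the number of columns must match
rows = np.concatenate((arr1, arr2), axis=0)

# axis=1 appends columns: the number of rows must match
cols = np.concatenate((arr1, arr3), axis=1)

print(rows)
print(cols)
```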
44. What is broadcasting in NumPy?
Answer: Broadcasting is a feature in NumPy that allows arrays with different but compatible
shapes to be used in arithmetic operations. The smaller array is virtually stretched to match
the shape of the larger array, without copying its data, allowing the operation to be
performed element-wise.
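Broadcasting is easiest to see with a small matrix combined with a row and with a column. The values are arbitrary. Shapes are compatible when, compared from the last dimension backwards, each pair of sizes is equal or one of them is 1.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
row = np.array([10, 20, 30])               # shape (3,)
col = np.array([[100], [200]])             # shape (2, 1)

added_row = matrix + row  # row is applied to every row of matrix
added_col = matrix + col  # col is applied to every column of matrix

print(added_row)
print(added_col)

# Shapes (2, 3) and (2,) are not compatible, so this raises ValueError
try:
    matrix + np.array([1, 2])
except ValueError as exc:
    print("Incompatible shapes:", exc)
```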
45. What is the difference between a shallow copy and a deep copy of a NumPy array?
Answer: A shallow copy of a NumPy array, also called a view (for example `arr.view()`),
creates a new array object that shares the same data as the original array, so a change made
through one is visible in the other. A deep copy (for example `arr.copy()`), on the other
hand, creates a new array object with a separate copy of the data.
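The difference shows up as soon as the original array is modified. A minimal sketch, using `view` for the shallow copy and `copy` for the deep copy:

```python
import numpy as np

original = np.array([1, 2, 3])
shallow = original.view()  # new array object, same underlying data
deep = original.copy()     # new array object, separate data

original[0] = 99

print(shallow[0])  # follows the change made to original
print(deep[0])     # keeps the old value
print(np.shares_memory(original, shallow))
print(np.shares_memory(original, deep))
```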
46. How do you find the maximum and minimum values of a NumPy array?
Answer: The `max` and `min` functions can be used to find the maximum and minimum
values of a NumPy array. For example, `arr.max()` would return the maximum value in `arr`.
47. How do you find the sum and product of a NumPy array?
Answer: The `sum` and `prod` functions can be used to find the sum and product of the
elements in a NumPy array, respectively. For example, `arr.sum()` would return the sum of
the elements in `arr`.
48. How do you compute the mean, median, and standard deviation of a NumPy array?
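Answer: The `mean`, `median`, and `std` functions of NumPy can be used to compute the mean,
median, and standard deviation of an array. For example, `np.mean(arr)`, `np.median(arr)`,
and `np.std(arr)`.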
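The aggregate functions from the last few questions can be tried on one small array. The values are chosen for illustration so that the results come out as round numbers.

```python
import numpy as np

arr = np.array([2, 4, 4, 4, 5, 5, 7, 9])

largest = arr.max()
smallest = arr.min()
total = arr.sum()
product = arr.prod()

mean = np.mean(arr)
median = np.median(arr)
std = np.std(arr)  # population standard deviation (ddof=0) by default

print(largest, smallest, total, product)
print(mean, median, std)
```

To get the sample standard deviation instead, pass `ddof=1` to `np.std`. This is worth remembering because the `std` method in pandas uses `ddof=1` by default.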
49. How do you pivot a DataFrame in Pandas?
Answer: You can pivot a DataFrame in Pandas by using the `pivot` function and passing the
row, column, and value names, such as `df.pivot(index='row_name',
columns='column_name', values='value_name')`.
50. How do you melt a DataFrame in Pandas?
Answer: You can melt a DataFrame in Pandas by using the `melt` function and passing the id
variables and value variables, such as `df.melt(id_vars=['id_var'], value_vars=['value_var'])`.
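Pivoting and melting are inverse reshaping steps, which a round trip makes visible. The sketch below uses hypothetical temperature readings; the column names `date`, `city`, and `temp` are assumptions.

```python
import pandas as pd

# Long format: one row per (date, city) observation
long_df = pd.DataFrame({
    "date": ["Mon", "Mon", "Tue", "Tue"],
    "city": ["Pune", "Delhi", "Pune", "Delhi"],
    "temp": [30, 35, 31, 36],
})

# pivot: one row per date, one column per city
wide = long_df.pivot(index="date", columns="city", values="temp")

# melt: back to long format, one row per (date, city) pair
back = wide.reset_index().melt(
    id_vars=["date"],
    value_vars=["Delhi", "Pune"],
    var_name="city",
    value_name="temp",
)

print(wide)
print(back)
```

`pivot` raises an error if an index and column pair occurs more than once; in that case `pivot_table` with an aggregation function is the usual alternative.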
Exploratory Analytics in Python provided by EY.pdf

  • 2. Disclaimer for Course Material • These course materials are for educational purposes only and shall not constitute professional advice in any form to anyone • These course materials have been designed as an integral part of the course presentation and are intended solely for the benefit of delegates attending the respective course(s). These course materials do not necessarily stand on their own and are not intended to be relied upon for giving specific professional advice • Best endeavours are used to ensure that these course materials are up-to-date when recorded/printed. However, given the nature of the subject, professional advice should be taken before taking any specific step in relation to any matter • Nothing said or done by EY or its course presenters can be relied upon as professional advice by anyone viewing/attending the eLearning(s)/course(s) or anyone else viewing/reading these materials. Any comments made by any course presenter does not constitute professional advice and must not be relied upon as professional advice by anyone viewing/attending the eLearning(s)/course(s) or anyone else • All title, intellectual property and copyrights and other rights in these materials are owned by EY or its suppliers. All such rights are reserved and remained vested in EY or its suppliers, and are not transferred in any way • No part of these course materials may be reproduced in any form, in whole or in part, for any purpose without the prior permission in writing of EY. No part of these course materials shall be stored in any electronic knowledge-base, or data retrieval system without the prior permission in writing of EY
  • 3. CO N T E N T S Introduction ..................................................................................................................5 Section 1: Introduction to Python Programming.............................................................6-8 1. Overview of Programming languages • Machine Language • Assembly L 2. History of Python 3. Installing Anaconda Section 2: Python Programming .................................................................................9-12 1. Writing a Python program 2. Python character set and core data types 3. List,Tuple,Dictionary and sets Section 3: Operator and Expressions.........................................................................13-21 1. Arithmetic operators 2.Operational precedence and associativity 3.BitWise operator 4. Compound Assignment Operator 5. Mini Project: GST Calculator Section 4: Decision statement ..................................................................................22-27 Part 1: Decision making statements 1. The IF statement 2. The ELSE IF statement 3. NESTED IF statement 4. Multiway IF-ELIF-ELSE statement Part 2: Expressions 1. Conditional Expression 2. Mini Project: Finding days in a month
  • 4. Section 5: The LOOP statement................................................................................28-36 1. The WHILE loop 2. The RANGE function 3. The FOR loop 4. Nested loops and BREAK statement 5. The CONTINUE statement 6. Mini Project: Generate prime numbers using Charles Babbage Function Section 6: String and character ................................................................................37-40 1. Comment and DOC Strings 2. Strings using Python 3. Mini Project: Generate prime numbers using Charles Babbage Function Section 7: Functions using Python ............................................................................41-51 1. Syntax and basics of a function 2. Use of functions 3. Parameters and arguments in a function 4. the local and global scope of a variable 5. The RETURN statement and Recursive function 6. Mini Project Section8 : Data analysis with Python libraries………………………………………………………………52-72 Part 1: Python Libraries introduction 1. How to load Python libraries 2. Panda Overview and purpose Part 2: Reading data using Pandas 1. Reading CSV files 2. Reading excel files 3. Reading JSON files 4. Reading SQL databases
  • 5. Part 3: Data frames and exercise 1. Exploring Data frames 2. Exercise and project on Pandas Part 4: Some Python Libraries 1. NumPy 2. Matplotlib 3. Seaborn Exercise and Projects…………………………………………………………………………………..73-82 Interview Q&A……………………………………..…………………………………………..………..83-91
  • 6. I N T R O D U C T I O N ➢ Talking about Data science and Artificial Intelligence, we all have heard of Python as the main language responsible for carrying out all the important tasks in these areas. Python is the most popular language of 21st century that was created by Guido Van Rossum and came in consideration in 1991 when it was released. ➢ Python is a remarkable and super advanced language for almost every problem that is not addressed by most of the computer languages these days. ➢ Whether you want to create web applications or it is about handling big data and complex math problems to database problems and creating workflows, Python has it all. In this book, we will learn everything that can be addressed through Python. We will learn from basic programming fundamentals to advanced Python. Python libraries are also included in the book to make you full versed with the language. By the end of the book you will be able to program in Python with all the basics set right and knowing what to do with the logics.
  • 7. SECTION 1: GETTING STARTED INTRODUCTION TO PYTHON PROGRAMMING Key Objective • Overview of Programming Language a) Machine Language b) Assembly Language • History of Python • Pre Read : Installing Anaconda
  • 8. Overview of Programming Languages Programming languages are sets of rules and instructions that are used to create software programs, applications, and operating systems. There are various types of programming languages, and each one serves a specific purpose. a) Machine Language: Machine language is the lowest-level programming language and is also known as the binary language. It consists of instructions that can be directly executed by a computer's central processing unit (CPU). These instructions are written in the form of binary code, which consists of 0s and 1s. b) Assembly Language: Assembly language is a low-level programming language that uses symbolic instructions instead of binary code. It is also known as Assembly or Assembler. Assembly language is one step higher than machine language and is specific to a particular computer architecture. It is easier to read and write than machine language, and programs written in assembly language are usually faster and more efficient than those written in higher-level programming languages. c) High-level Language: High-level programming languages are languages that are designed to be easier to read, write, and understand than low-level programming languages such as machine language and assembly language. High-level languages are often used for software development and programming, as they provide a simpler and more abstract way of thinking about programming concepts. Some popular high-level languages include Python, Java, C#, Ruby, and JavaScript.
  • 9. History Of Python Python is a high-level, general-purpose programming language that was created in the late 1980s by Guido van Rossum, a Dutch programmer. Here is a brief history of Python: o In 1989, Guido van Rossum began working on a new programming language that he called "Python." He was working at the National Research Institute for Mathematics and Computer Science (CWI) in the Netherlands at the time. o The first version of Python, version 0.9.0, was released in February 1991. It was a small, simple language. o In 1994, Python 1.0 was released. This version included functional programming tools such as lambda, map, filter, and reduce. o Python 2.0 was released in 2000. This version added many new features, including list comprehensions, a garbage collector for reference cycles, and support for Unicode. o Python 3.0 was released in 2008. This version was a major revision of the language, with many backwards-incompatible changes. The main goal of Python 3.0 was to clean up the language and remove some of the inconsistencies and cruft that had accumulated over the years. o After the release of Python 3.0, the Python community spent more than a decade migrating the ecosystem to the new version. Python 2.7, the last version of the 2.x series, reached end of life on January 1, 2020, and current libraries target Python 3. o Python has become one of the most popular programming languages in the world, used for web development, data analysis, artificial intelligence, scientific computing, and more. It has a large and active community of developers who contribute to the language and its ecosystem of libraries and tools.
  • 10. Pre-Read: Installing Anaconda. Installing Anaconda on Windows: The Anaconda distribution of Python is recommended for this course. This section details the installation of the Anaconda distribution of Python on Windows 10. Anaconda comes bundled with about 600 packages pre-installed, including NumPy, Matplotlib and SymPy. Go to the following link: Anaconda.com/downloads The Anaconda Downloads Page will look something like this:
  • 11. SECTION 2: PYTHON PROGRAMMING BASICS OF PYTHON PROGRAMMING Key Objective • Writing our First Python Program • Python Character Set • Python Core Data Types a) Integer, Float, Complex Number, Boolean, String Type • List, Tuple, Dictionary, Sets
  • 12. WRITING OUR 1ST PYTHON PROGRAM Follow these steps: 1. Open Anaconda Navigator and launch Jupyter Notebook. 2. In Jupyter Notebook, click on the "New" button in the top right corner and select "Python 3" to create a new Python notebook. 3. In the first cell of the notebook, type the following code: 4. Click on the "Run" button in the toolbar or press "Shift + Enter" to execute the code in the cell. 5. You should see the output "Hello, World!" displayed below the cell. Congratulations, you've just written your first Python program! Note: Python is an interpreted language, which means you can run code line-by-line in a notebook like Jupyter. You can add more cells to your notebook and experiment with different Python commands and syntax.
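The code cell referred to in step 3 is not preserved in this transcript. Judging by the output described in step 5, it was most likely a single print call. A minimal sketch (the variable name greeting is an illustrative assumption):

```python
# A first program: store a greeting and print it to the cell output.
greeting = "Hello, World!"   # hypothetical variable name, used for illustration
print(greeting)
```

Typing print("Hello, World!") directly, without the intermediate variable, behaves the same way.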
  • 13. PYTHON CHARACTER SET The Python Character Set refers to the set of characters that can be used in Python code. Python supports a wide range of characters, including: • Uppercase and lowercase letters (A-Z, a-z) • Digits (0-9) • Special characters (such as $, #, %, &, *, @, etc.) • Whitespace characters (such as space, tab, newline, etc.) PYTHON CORE DATA TYPES Python has several built-in core data types, which are fundamental to the language and used extensively in programming. The following are the core data types in Python: Integer: An integer is a whole number without a decimal point. In Python, integers can be positive or negative, and can be of any size (up to the available memory of the system). Integers are represented using the int type. Example: x = 5 Float: A float is a number with a decimal point. In Python, floating-point numbers are represented using the float type. Example: y = 3.14 Complex Number: A complex number is a number with a real and an imaginary part. In Python, complex numbers are represented using the complex type. The real and imaginary parts are separated by a + sign, and the imaginary part is suffixed with a j. Example: z = 2 + 3j Boolean: A boolean is a binary value that represents either true or false. In Python, the bool type is used to represent boolean values. The two possible values are True and False. Example: a = True
  • 14. String: A string is a sequence of characters enclosed within quotes, either single quotes ('...') or double quotes ("..."). In Python, strings are represented using the str type. Example: s = "Hello, World!" These core data types provide the building blocks for writing programs in Python and are used extensively in most Python applications.
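As a quick check of the core data types, the example values from the slides can be passed to the built-in type() function. This is a small sketch that reuses the slide's variable names:

```python
# The example values from the slides, one per core data type.
x = 5                 # int
y = 3.14              # float
z = 2 + 3j            # complex: real part 2, imaginary part 3
a = True              # bool
s = "Hello, World!"   # str

# Print each value next to the name of its type.
for value in (x, y, z, a, s):
    print(value, "->", type(value).__name__)
```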
  • 15. LIST, TUPLE, DICTIONARY, SETS WITH EXAMPLE ▪ List: A list is a mutable sequence of elements enclosed in square brackets [ ]. The elements in a list are separated by commas. Here's an example: ▪ Tuple: A tuple is an immutable sequence of elements enclosed in parentheses ( ). The elements in a tuple are separated by commas. Here's an example: ▪ Dictionary: A dictionary is a collection of key-value pairs enclosed in curly braces { }. Each key is separated from its value by a colon, the pairs are separated by commas, and the keys are unique. Here's an example: ▪ Set: A set is an unordered collection of unique elements enclosed in curly braces { }. Here's an example:
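The examples for these four collection types are missing from the transcript. The sketch below is one possible illustration; all names and values are hypothetical.

```python
# List: mutable, so elements can be added or changed.
fruits = ["apple", "banana", "cherry"]
fruits.append("mango")

# Tuple: immutable, so assigning to an element raises TypeError.
point = (3, 4)
try:
    point[0] = 10
    tuple_is_immutable = False
except TypeError:
    tuple_is_immutable = True

# Dictionary: unique keys mapped to values.
ages = {"Alice": 30, "Bob": 25}
ages["Carol"] = 41          # add a new key-value pair

# Set: duplicates are discarded automatically.
unique_numbers = {1, 2, 2, 3}

print(fruits, point, ages, unique_numbers, tuple_is_immutable)
```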
  • 16. SECTION 3: OPERATORS AND EXPRESSIONS INTRODUCTION TO OPERATORS AND EXPRESSIONS Operators and expressions are fundamental concepts in computer programming, including Python. An operator is a symbol or keyword that performs an operation on one or more operands. An expression is a combination of values, variables, operators, and function calls that is evaluated to produce a result. ▪ ARITHMETIC OPERATORS a) UNARY OPERATOR b) BINARY OPERATOR
  • 17. Arithmetic operators are used in Python to perform basic arithmetic operations such as addition, subtraction, multiplication, division, modulus, and exponentiation. There are two types of arithmetic operators based on the number of operands they take - unary and binary operators. A unary operator takes only one operand, whereas a binary operator takes two operands. Unary Operators: • The unary plus operator (+) is used to indicate that a value is positive, although it is optional because numbers are assumed to be positive by default. • The unary minus operator (-) is used to negate a value and make it negative. Example: Binary Operators: • The addition operator (+) is used to add two operands. • The subtraction operator (-) is used to subtract one operand from another. • The multiplication operator (*) is used to multiply two operands. • The division operator (/) is used to divide one operand by another. • The modulus operator (%) is used to get the remainder of a division operation. • The exponentiation operator (**) is used to raise one operand to the power of another.
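The unary and binary operators listed above can be tried on a pair of sample operands. The values 7 and 2 are arbitrary choices for illustration.

```python
a, b = 7, 2

# Unary operators act on a single operand.
positive = +a        # 7
negative = -a        # -7

# Binary operators act on two operands.
addition = a + b         # 9
subtraction = a - b      # 5
multiplication = a * b   # 14
division = a / b         # 3.5 (the / operator always returns a float)
modulus = a % b          # 1
power = a ** b           # 49

print(positive, negative, addition, subtraction,
      multiplication, division, modulus, power)
```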
  • 18. Example: These are some of the basic arithmetic operators in Python, and they are essential to performing mathematical calculations in programs. • OPERATOR PRECEDENCE AND ASSOCIATIVITY a) EXAMPLE OF OPERATOR PRECEDENCE b) ASSOCIATIVITY a) Operator precedence refers to the order in which operators are evaluated in an expression. When there are multiple operators in an expression, the operator with higher precedence is evaluated first. Here's an example: x = 3 + 4 * 5 In this expression, the multiplication operator (*) has a higher precedence than the addition operator (+). So the expression is evaluated like this:
  • 19. First, 4 * 5 is evaluated to give 20. Then, 3 + 20 is evaluated to give 23. Finally, the result 23 is assigned to the variable x. b) Associativity refers to the order in which operators of the same precedence are evaluated in an expression. Some operators are left-associative, meaning they are evaluated from left to right. Others are right-associative, meaning they are evaluated from right to left. Here's an example: x = 10 - 5 - 3 In this expression, the subtraction operator (-) has left-associativity. So the expression is evaluated like this: First, 10 - 5 is evaluated to give 5. Then, 5 - 3 is evaluated to give 2. Finally, the result 2 is assigned to the variable x. • BITWISE OPERATORS AND Operator, OR Operator, XOR Operator, Right Shift Operator, Left Shift Operator
  • 20. Bitwise operators are used in computer programming to manipulate the individual bits of binary numbers. Here are the definitions of the five most common bitwise operators: a) AND operator (&): Example: a = 5, b = 3, result = a & b. Explanation: The & operator performs a bitwise AND operation on the binary representations of a and b, and stores the result in result. In this case, 5 in binary is 101 and 3 in binary is 011, so the bitwise AND of the two is 001, which is equal to 1 in decimal. b) OR operator (|): Example: a = 5, b = 3, result = a | b. Explanation: The | operator performs a bitwise OR operation on the binary representations of a and b, and stores the result in result. In this case, 5 in binary is 101 and 3 in binary is 011, so the bitwise OR of the two is 111, which is equal to 7 in decimal. c) XOR operator (^): Example: a = 5, b = 3, result = a ^ b. Explanation: The ^ operator performs a bitwise XOR operation on the binary representations of a and b, and stores the result in result. In this case, 5 in binary is 101 and 3 in binary is 011, so the bitwise XOR of the two is 110, which is equal to 6 in decimal.
  • 21. d) Right shift (>>): Example: a = 16, result = a >> 2. Explanation: The >> operator performs a right shift operation on the binary representation of a, shifting the bits two positions to the right, and stores the result in result. In this case, 16 in binary is 10000, and shifting the bits two positions to the right gives 00100, which is equal to 4 in decimal. e) Left shift (<<): Example: a = 4, result = a << 2. Explanation: The << operator performs a left shift operation on the binary representation of a, shifting the bits two positions to the left, and stores the result in result. In this case, 4 in binary is 00100, and shifting the bits two positions to the left gives 10000, which is equal to 16 in decimal.
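The five bitwise explanations can be verified with the operand values they mention (5 and 3 for AND, OR and XOR; 16 and 4 for the shifts). In this sketch the result variable names are illustrative, and format() is used only to display the bit patterns.

```python
a, b = 5, 3   # 101 and 011 in binary

and_result = a & b      # 001 -> 1
or_result = a | b       # 111 -> 7
xor_result = a ^ b      # 110 -> 6
right_shift = 16 >> 2   # 10000 -> 00100 -> 4
left_shift = 4 << 2     # 00100 -> 10000 -> 16

# Show each result in decimal and as a zero-padded 5-bit binary string.
for name, value in [("AND", and_result), ("OR", or_result), ("XOR", xor_result),
                    ("right shift", right_shift), ("left shift", left_shift)]:
    print(name, value, format(value, "05b"))
```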
  • 22. THE COMPOUND ASSIGNMENT OPERATORS In Python, the compound assignment operators are used to perform an arithmetic operation and assign the result to the same variable in a single statement. Here are some examples: 1. Addition and assignment (+=): 2. Subtraction and assignment (-=): 3. Multiplication and assignment (*=): 4. Division and assignment (/=):
  • 23. 5. Modulo and assignment (%=): 6. Exponentiation and assignment (**=): 7. Floor division and assignment (//=):
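The seven compound assignment operators can be chained on one variable to see each step. The starting value 10 and the operands are arbitrary choices for this sketch.

```python
x = 10
x += 5     # addition and assignment:        15
x -= 3     # subtraction and assignment:     12
x *= 2     # multiplication and assignment:  24
x /= 4     # division and assignment:        6.0 (true division gives a float)
x %= 4     # modulo and assignment:          2.0
x **= 3    # exponentiation and assignment:  8.0
x //= 3    # floor division and assignment:  2.0
print(x)
```

Each statement is shorthand for the longer form, for example x += 5 is equivalent to x = x + 5.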
  • 24. Mini Project: GST Calculator What is GST? GST stands for Goods and Services Tax, which is a value-added tax levied on the sale of goods and services in many countries around the world. GST is a comprehensive, multistage, destination-based tax that is levied on every value addition in the supply chain. In India, it is a single tax that replaced multiple indirect taxes such as excise duty, service tax, and VAT. Problem Statement We all buy various goods from a store. Along with the price of the goods we wish to buy, we also have to pay an additional tax, which is calculated as a specific percentage of the total price of the goods. This is called GST on the products. Model of GST Using an Example The GST has two components: one levied by the central government (referred to as central GST or CGST), and one levied by the state government (referred to as state GST or SGST). The rates for central GST and state GST are as follows:
      Type of Tax    Tax Rate
      CGST           9%
      SGST           9%
    Example Invoice of a Product
  • 25.
      Particulars                 Amount
      Cost of Production          5000
      Add: CGST @9%               450
      Add: SGST @9%               450
      Total Cost of Production    5900
    Formula to Calculate Total Cost: Total Cost = Cost of Production + (CGST on Product) + (SGST on Product), where each tax is the cost of production multiplied by its tax rate. Algorithm Step 1: Read the cost of production. Step 2: Input the CGST tax rate. Step 3: Input the SGST tax rate. Step 4: Calculate and print the total cost of production. Program and Outputs
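The program for this mini project is not preserved in the transcript. The following is a minimal sketch of the four algorithm steps. It assumes the tax rates are entered as percentages. The function name total_cost is hypothetical, and the inputs are hard-coded so the sketch runs non-interactively (in a notebook they could be read with input()).

```python
def total_cost(cost, cgst_rate, sgst_rate):
    """Return cost plus CGST and SGST, with rates given in percent."""
    cgst = cost * cgst_rate / 100
    sgst = cost * sgst_rate / 100
    return cost + cgst + sgst

# Steps 1-3: values from the example invoice (could come from input()).
cost_of_production = 5000
cgst_rate = 9
sgst_rate = 9

# Step 4: calculate and print the total cost of production.
total = total_cost(cost_of_production, cgst_rate, sgst_rate)
print("Total cost of production:", total)
```

With the invoice values above, the result matches the invoice total of 5900.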
  • 26. SECTION 4: DECISION STATEMENT SETS OF DECISION STATEMENT • DECISION MAKING STATEMENT A. THE IF STATEMENT B. THE IF-ELSE STATEMENT C. NESTED IF STATEMENTS D. MULTI-WAY IF-ELIF-ELSE STATEMENT The IF Statement: The if statement is used to execute a block of code only if a certain condition is true. If the condition is false, the code inside the “if block” is skipped. The syntax for the “if statement” in Python is as follows: Example The IF-ELSE Statement:
  • 27. The if-else statement is used to execute a block of code if the condition is true and another block of code if the condition is false. The syntax for the if-else statement in Python is as follows: Example: Nested IF Statements: Nested if statements are if statements inside other if statements. They are used when more than one condition needs to be checked. The syntax for “nested if” statements in Python is as follows:
  • 28. Example: Multi-way IF-ELIF-ELSE Statement: The if-elif-else statement is used when there are more than two conditions to be checked. The elif keyword is used for additional conditions to be checked. The syntax for the if-elif-else statement in Python is as follows: Example:
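The syntax and example screenshots for the four decision statements are missing from the transcript. One possible sketch of each form follows; the variables, thresholds and function name are illustrative assumptions.

```python
age = 20
messages = []

# if: the block runs only when the condition is true.
if age >= 18:
    messages.append("adult")

# if-else: exactly one of the two blocks runs.
if age % 2 == 0:
    messages.append("even age")
else:
    messages.append("odd age")

# Nested if: the inner test is checked only when the outer test passes.
if age > 0:
    if age < 13:
        messages.append("child")
    else:
        messages.append("not a child")

# if-elif-else: conditions are tried in order; the first true one wins.
def grade(score):
    if score >= 90:
        return "A"
    elif score >= 75:
        return "B"
    elif score >= 60:
        return "C"
    else:
        return "F"

print(messages, grade(82))
```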
  • 29. CONDITIONAL EXPRESSION Conditional expressions, also known as ternary operators, are a shorthand way to write an if-else statement in a single line. The syntax for conditional expressions in Python is as follows: value_if_true if condition else value_if_false. The condition is evaluated first, and if it is True, the expression returns value_if_true. If the condition is False, the expression returns value_if_false. Example: In the above example, the if-else statement is written as a conditional expression. If x is greater than y, the expression returns the string "x is greater than y", otherwise it returns the string "y is greater than or equal to x".
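The example described on this slide can be reconstructed from its explanation. The values assigned to x and y are assumptions, since the original code is not preserved.

```python
x, y = 10, 5   # sample values (assumed)

# One-line equivalent of an if-else that picks between two strings.
result = "x is greater than y" if x > y else "y is greater than or equal to x"
print(result)
```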
  • 30. Mini Project: Finding the Number of Days in a Month This mini project makes use of programming features such as if and elif statements. It helps a programmer find the number of days in a month. Hint: If the entered month is 2, then read the corresponding year. To find the number of days in month 2, check whether the entered year is a leap year. If it is a leap year then num_days = 29, otherwise num_days = 28. Leap year: A year is a leap year if it is divisible by 4 but not by 100, or if it is divisible by 400. Algorithm: Step 1: Prompt the user for the month. Step 2: Check if the entered month is 2, i.e. February. If so, go to step 3, else go to step 4. Step 3: If the entered month is 2, check whether the year is a leap year. If it is a leap year then store num_days = 29, else num_days = 28. Step 4: If the entered month is one of (1, 3, 5, 7, 8, 10, 12) then store num_days = 31. If the entered month is one of (4, 6, 9, 11) then store num_days = 30. If the entered month is outside the range 1 to 12 then display the message "invalid month". Step 5: If the input is valid then display the message "there are N number of days in the month M". Program and output
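The program screenshot for this mini project is missing. The sketch below follows the algorithm using the standard calendar, in which months 1, 3, 5, 7, 8, 10 and 12 have 31 days and months 4, 6, 9 and 11 have 30. The function names are hypothetical, and the month and year are passed as arguments rather than read with input() so the sketch runs non-interactively.

```python
def is_leap_year(year):
    """Divisible by 4 but not by 100, or divisible by 400."""
    return (year % 4 == 0 and year % 100 != 0) or year % 400 == 0

def days_in_month(month, year):
    """Return the number of days, or None for an invalid month."""
    if month == 2:
        return 29 if is_leap_year(year) else 28
    elif month in (1, 3, 5, 7, 8, 10, 12):
        return 31
    elif month in (4, 6, 9, 11):
        return 30
    else:
        return None

month, year = 2, 2024   # sample input (assumed)
num_days = days_in_month(month, year)
if num_days is None:
    print("invalid month")
else:
    print(f"there are {num_days} number of days in the month {month}")
```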
  • 32. SECTION 5: LOOP STATEMENT LOOP CONTROL STATEMENT • THE WHILE LOOP a) DETAIL OF WHILE LOOP b) SOME MORE PROGRAM ON WHILE LOOP DETAIL OF WHILE LOOP: The while loop is a control flow statement that allows you to execute a block of code repeatedly as long as a specified condition is true. The general syntax of a while loop in Python is: The condition is a boolean expression that is evaluated at the beginning of each iteration of the loop. If the condition is True, the code block is executed. This process repeats until the condition is False. Example In this example, num starts at 1, and the loop continues as long as num is less than or equal to 5. Inside the loop, we print the value of num and then increment it by 1 using the += operator.
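The while-loop example described above (num starting at 1 and running while num is less than or equal to 5) can be sketched as follows. The original code is not preserved, so this is a reconstruction from the explanation.

```python
# General form:  while condition:  <indented code block>
num = 1
while num <= 5:      # checked before every iteration
    print(num)       # prints 1, 2, 3, 4, 5 on separate lines
    num += 1         # without this increment the loop would never end
```

After the loop finishes, num holds 6, the first value for which the condition is False.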
  • 33. Some More Program on while Loop Program 1: Printing Even Number Program 2: Calculating Factorial Program 3: Guessing Game
  • 34. Program 4: Summing numbers from 1 to 100: Program 5: Simulating rolling a die until a certain number is rolled: Program 6:Reversing a string using a while loop:
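Only the titles of these while-loop programs survive in the transcript. As a sketch, here are possible versions of four of them. Limits and sample values are assumptions, and the guessing game and die-rolling programs are left out because they depend on user input and random numbers.

```python
# Program 1: even numbers from 2 to 10.
evens = []
n = 2
while n <= 10:
    evens.append(n)
    n += 2

# Program 2: factorial of 5.
factorial = 1
i = 5
while i > 1:
    factorial *= i
    i -= 1

# Program 4: sum of the numbers from 1 to 100.
total = 0
k = 1
while k <= 100:
    total += k
    k += 1

# Program 6: reverse a string by walking it from the last index to the first.
text = "Python"
reversed_text = ""
index = len(text) - 1
while index >= 0:
    reversed_text += text[index]
    index -= 1

print(evens, factorial, total, reversed_text)
```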
  • 35. • THE range() FUNCTION a) EXAMPLE OF range() FUNCTION The range() function in Python is used to generate a sequence of numbers. It takes up to three arguments: start, stop, and step. The start argument is optional and defaults to 0, and the step argument is also optional and defaults to 1. The stop argument is required and specifies the upper limit of the sequence, but this upper limit is not included in the sequence. Example 1: Creating a list of odd numbers from 1 to 20 using the range() function. Explanation: The range(1, 21, 2) function creates a sequence of odd numbers from 1 up to, but not including, 21, with a step of 2. The list() function then converts this sequence into a list, which is assigned to the variable odd_numbers. Finally, the print() function is used to display the contents of the odd_numbers list. • THE FOR LOOP a) DETAILS OF FOR LOOP b) SOME MORE PROGRAMS ON FOR LOOP The for loop is a control flow statement in Python that allows you to iterate over a sequence of elements, such as a list, tuple, string, or range. The general syntax for a for loop in Python is as follows:
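The range() example and the general shape of a for loop can be sketched as below. The odd_numbers line follows the explanation on the slide; the other names are illustrative.

```python
# Example 1: odd numbers from 1 to 20 (21 is the excluded upper limit).
odd_numbers = list(range(1, 21, 2))
print(odd_numbers)

# Defaults: start is 0 and step is 1 when they are omitted.
first_five = list(range(5))        # [0, 1, 2, 3, 4]
two_to_five = list(range(2, 6))    # [2, 3, 4, 5]

# General form:  for variable in sequence:  <indented code block>
squares = []
for number in range(1, 4):
    squares.append(number * number)
print(first_five, two_to_five, squares)
```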
  • 36. In this syntax, variable is a temporary variable that takes on the value of each element in the sequence on each iteration of the loop. The code block indented under the for statement is executed once for each element in the sequence. Here are some more examples of using the for loop in Python: Example 1:Printing the elements of a list using a for loop. Explanation: The fruits list contains four elements, which are iterated over in the for loop. On each iteration, the current element is assigned to the fruit variable, which is then printed using the print() function. Example 2:Printing a pattern of asterisks using nested for loops.
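The two for-loop examples on this slide are described but not shown. A sketch under stated assumptions follows. The four fruit names are hypothetical, and the inner loop uses range(i + 1), so that row i prints i + 1 asterisks.

```python
# Example 1: print each element of a four-element list.
fruits = ["apple", "banana", "cherry", "mango"]
for fruit in fruits:
    print(fruit)

# Example 2: a triangle of asterisks built with nested for loops.
for i in range(5):                # rows 0 to 4
    for j in range(i + 1):        # row i gets i + 1 asterisks
        print("*", end="")        # end="" keeps the asterisks on one line
    print()                       # move to the next line after each row
```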
  • 37. Explanation: This example uses nested for loops to print a pattern of asterisks. The outer for loop iterates through the numbers from 0 to 4, while the inner for loop iterates through the numbers from 0 to the current value of i (which increases by 1 on each iteration of the outer loop). The print() function is used to print a single asterisk on each iteration of the inner loop, and the end parameter is used to prevent each asterisk from being printed on a new line. Finally, the outer print() function is used to print a new line after each row of asterisks. • NESTED LOOPS a) SOME MORE PROGRAMS ON FOR LOOPS Nested loops are loops that are contained within other loops. They are used when you need to iterate over a sequence of elements multiple times, or when you need to iterate over a sequence of sequences (such as a list of lists). Example 1:Multiplication table using nested loops.
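The multiplication table example is not preserved. A minimal sketch with nested loops follows, with the table limited to 3 x 3 to keep the output short. The table is also stored as a list of lists, the kind of "sequence of sequences" mentioned above.

```python
table = []
for i in range(1, 4):            # outer loop: one row per value of i
    row = []
    for j in range(1, 4):        # inner loop: one column per value of j
        row.append(i * j)
    table.append(row)

# Iterate over the list of lists to print the table row by row.
for row in table:
    for value in row:
        print(value, end="\t")
    print()
```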
  • 38. THE BREAK STATEMENT In Python, the break statement is used to exit a loop prematurely, before the loop has completed all iterations. The break statement is usually placed inside a conditional statement, and when executed, it causes the loop to terminate immediately and execution continues with the next statement after the loop. Here is an example of using the break statement in Python: In this example, we are trying to find the first prime number in a given range of numbers. We use two nested loops: the outer loop iterates over the range of numbers, and the inner loop checks whether each number is prime or not. If a factor is found for a number, the inner loop is terminated prematurely using the break statement. If a prime number is found, the outer loop is also terminated using another break statement. If no prime number is found, a message is printed to indicate that. Note that in the inner loop, there is an else block that is executed only if the inner loop completes all iterations without encountering a break statement. This else block is not executed if the loop is terminated prematurely using a break statement.
  • 39. The break statement is a very useful tool in programming, as it allows us to control the flow of the program and exit loops when certain conditions are met. However, it should be used judiciously, as overusing it can make the code harder to read and debug.
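The break example described on the previous slide (finding the first prime in a range with nested loops and a loop else block) is missing from the transcript. One way it might look is sketched here; the function name and the sample range are assumptions.

```python
def first_prime(start, stop):
    """Return the first prime in range(start, stop), or None if there is none."""
    found = None
    for number in range(max(start, 2), stop):
        for divisor in range(2, number):
            if number % divisor == 0:
                break          # a factor exists, so leave the inner loop early
        else:
            # Reached only when the inner loop finished without a break,
            # which means no factor was found and number is prime.
            found = number
            break              # this break exits the outer loop
    return found

result = first_prime(24, 30)
if result is None:
    print("No prime number found")
else:
    print("First prime:", result)
```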
  • 40. THE CONTINUE STATEMENT In Python, the continue statement is used to skip the current iteration of a loop and move on to the next iteration, without executing any of the remaining statements in the loop for the current iteration. The continue statement is usually placed inside a conditional statement, and when executed, it causes the loop to skip the current iteration and move on to the next iteration. Here is an example of using the continue statement in Python: In this example, we are trying to print only the odd numbers in a given range of numbers. We use a for loop to iterate over the range of numbers, and a conditional statement to check whether each number is odd or even. If the number is even, the continue statement is executed, causing the loop to skip the remaining statements for the current iteration and move on to the next iteration. If the number is odd, the print statement is executed and the loop continues with the next iteration. The continue statement is a very useful tool in programming, as it allows us to skip certain iterations of a loop and focus on the ones that are relevant. It can be used to simplify code and improve performance in some cases. However, like the break statement, it should be used judiciously, as overusing it can make the code harder to read and debug.
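The continue example described above (printing only the odd numbers in a range) can be sketched as follows. The range 1 to 10 is an assumed sample, and the list is kept only so the result can be inspected afterwards.

```python
odd_numbers = []
for number in range(1, 11):
    if number % 2 == 0:
        continue               # even: skip the rest of this iteration
    print(number)              # reached only for odd numbers
    odd_numbers.append(number)
```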
  • 41. Mini Project: Generate Prime Numbers using Charles Babbage Function Program Statement: Write a Python function that generates prime numbers using Charles Babbage's sieve algorithm. The function should take an integer as input and return a list of all prime numbers up to that integer. Algorithm: Step 1: Create a list of numbers from 2 to the given integer. Step 2: Initialize an empty list to store prime numbers. Step 3: While the list of numbers is not empty, take the first number in the list and append it to the list of prime numbers. Step 4: Remove all multiples of the first number from the list of numbers. Step 5: Repeat steps 3 and 4 until the list of numbers is empty. Step 6: Return the list of prime numbers. Program SECTION 6: STRING AND CHARACTER
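The program for the prime-number mini project is not preserved. The algorithm steps it lists (take the first remaining number as a prime, then remove its multiples) follow the same idea as what is usually called the sieve of Eratosthenes. A minimal sketch follows; the function name generate_primes is hypothetical.

```python
def generate_primes(limit):
    """Return all primes up to and including limit by repeated filtering."""
    numbers = list(range(2, limit + 1))   # candidate numbers
    primes = []                           # primes found so far
    while numbers:
        first = numbers[0]                # the smallest remaining number is prime
        primes.append(first)
        # Remove the number itself and all of its multiples.
        numbers = [n for n in numbers if n % first != 0]
    return primes

print(generate_primes(30))
```

Rebuilding the list on every pass keeps the code close to the written steps. An array-based sieve would be faster for large limits.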
  • 42. STRING AND CHARACTER COMMENT AND DOC STRINGS In Python, comments and docstrings are used to provide information about the code. They do not change what the program does, but they help to explain the code and make it easier to understand and maintain. Comments: Comments are used to provide short explanations or annotations to the code. Comments in Python start with a hash (#) symbol and continue until the end of the line. Python ignores everything on the line after the hash symbol. Docstrings: Docstrings are used to provide documentation for modules, classes, functions, and methods. A docstring is a string literal, conventionally written in triple quotes ("""), placed as the first statement of the object it documents, and it can span multiple lines. Docstrings are typically used to describe the purpose, usage, and behavior of the code.
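A short sketch showing a comment and a docstring side by side. The function area_of_circle is a hypothetical example, not taken from the slides.

```python
import math

# A comment: everything after the hash symbol on this line is ignored.
def area_of_circle(radius):
    """Return the area of a circle with the given radius.

    Unlike a comment, this docstring is stored on the function and can be
    read through area_of_circle.__doc__ or help(area_of_circle).
    """
    return math.pi * radius ** 2   # an inline comment after a statement

print(area_of_circle(1))
print(area_of_circle.__doc__)
```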
  • 44. DIVING DEEP INTO STRINGS USING PYTHON In Python, strings are used to represent textual data. They are enclosed in quotes, either single quotes ('...') or double quotes ("..."). Here are some of the most common operations and concepts related to strings in Python: String Concatenation: Strings can be concatenated using the '+' operator. Two string literals can also be joined by simply placing them next to each other. Example: String Indexing: You can access individual characters of a string by indexing. In Python, string indexes start from 0. You can also use negative indexing to access characters from the end of the string. Example: String Slicing: You can extract a part of a string by slicing it. A slice is specified by two indices separated by a colon. The first index is included in the slice, but the second index is not.
  • 45. Example: String Formatting: String formatting allows you to embed values in a string. There are several ways to format strings in Python. A common one is to use placeholders that are replaced by values using the format() method. Example: String Methods: Python provides many built-in methods for working with strings, such as upper(), lower(), split(), strip(), replace(), and many others. These methods are called on a string object and return a new value without changing the original string, because strings are immutable. Most return a new string, while split() returns a list of strings. Example:
  • 46. These are just a few examples of the many operations and concepts related to strings in Python. By mastering these basics, you'll be well on your way to becoming proficient with strings in Python.
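The string examples on the preceding slides are missing from the transcript. One possible sketch of each operation follows; all sample strings are hypothetical.

```python
first, second = "Hello", "World"

# Concatenation with + and with adjacent string literals.
greeting = first + ", " + second + "!"    # "Hello, World!"
language = "Py" "thon"                    # "Python"

# Indexing: positions start at 0; negative indexes count from the end.
first_char = greeting[0]                  # "H"
last_char = greeting[-1]                  # "!"

# Slicing: the start index is included, the stop index is excluded.
hello = greeting[0:5]                     # "Hello"
world = greeting[7:]                      # "World!"

# Formatting with placeholders and the format() method.
sentence = "{} is {} years old".format("Alice", 30)

# Methods return new values; the original string is left unchanged.
cleaned = "  Data Science  ".strip().upper()   # "DATA SCIENCE"
parts = "a,b,c".split(",")                     # ["a", "b", "c"]
replaced = first.replace("l", "L")             # "HeLLo"

print(greeting, language, first_char, last_char, hello, world)
print(sentence, cleaned, parts, replaced)
```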
  • 47. SECTION 7: FUNCTION FUNCTION USING PYTHON SYNTAX AND BASICS OF A FUNCTION In Python, a function is a block of code that performs a specific task. Functions are defined using the def keyword, followed by the function name, a set of parentheses, and a colon. The body of the function is indented and contains the code that performs the task. Here is the syntax for defining a function in Python: Example: In this example, we define a function called add_numbers() that takes two parameters a and b. The function then returns the sum of a and b. To call the function and store the result in a variable, we use result = add_numbers(2, 3). The value of result is then printed to the console, which outputs 5.
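The function described above can be reconstructed from its explanation: a function add_numbers() with parameters a and b, called as add_numbers(2, 3). A sketch follows; the docstring is an addition for illustration.

```python
# General form:
#   def function_name(parameters):
#       <indented body>
def add_numbers(a, b):
    """Return the sum of a and b."""
    return a + b

result = add_numbers(2, 3)
print(result)   # 5
```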
  • 48. These are just basic examples of how to define and use functions in Python. Functions are a powerful tool for organizing and reusing code, and they can be used to simplify complex tasks and make your code more readable and maintainable.
  • 49. USE OF FUNCTIONS Functions are an important aspect of programming in Python, as they allow you to reuse code and organize it into modular pieces. Here are a few examples of how functions can be used in Python: Addition Function: In this example, we define a function called add_numbers() that takes two parameters a and b. The function then returns the sum of a and b. Multiplication Function: In this example, we define a function called multiply_numbers() that takes two parameters a and b. The function then returns the product of a and b.
  • 50. String Reversal Function: In this example, we define a function called reverse_string() that takes one parameter s, which is a string. The function then returns the reverse of the string using slicing. List Sum Function: In this example, we define a function called sum_list() that takes one parameter lst, which is a list of numbers. The function then returns the sum of all the elements in the list using the built- in sum() function. • PARAMETERS AND ARGUMENTS IN A FUNCTION
  • 51. In Python, parameters are the names listed in a function definition, and arguments are the values passed to the function when it is called. Here are three ways parameters and arguments can be used in a function: positional arguments, keyword arguments, and parameters with default values. Positional Arguments: Positional arguments are the most common type of argument in Python functions. They are values passed to a function in a specific order, and are assigned to the function parameters in the same order. Here is an example: In this example, the function greet() takes in two positional arguments: name and message. When we call the function, we pass in the values "Alice" and "Hello". These values are assigned to the function parameters in the same order, so name gets assigned "Alice" and message gets assigned "Hello". Keyword Arguments:
  • 52. Keyword arguments are used to pass values to a function using their parameter names. This allows you to pass the values in any order, as long as you specify which parameter they are meant to be assigned to. Here is an example: In this example, we use keyword arguments to pass the values "Hello" and "Alice" to the greet() function. By specifying the parameter names, we can pass the values in any order we like. Parameters with Default Values: In some cases, you may want to give a parameter in a function a default value, so that it can be omitted when the function is called. Here is an example: In this example, we have given the message parameter a default value of "Hello". When we call the greet() function with just "Alice", the default value is used for message. However, we can still override the default value by passing in a different value for message when we call the function. These are some examples of how you can use parameters and arguments in Python functions. By using these tools, you can make your functions more flexible and reusable, and save yourself time and effort in the process.
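The greet() examples are described but not shown. The sketch below uses a single definition with a default value for message to illustrate all three calling styles. The return format and the override value "Welcome" are assumptions.

```python
def greet(name, message="Hello"):
    """Build a greeting; message falls back to "Hello" when omitted."""
    return f"{message}, {name}!"

# Positional arguments: matched to parameters by order.
positional = greet("Alice", "Hello")

# Keyword arguments: matched by name, so the order does not matter.
keyword = greet(message="Hello", name="Alice")

# Default value: message is omitted, so "Hello" is used.
default = greet("Alice")

# Overriding the default with a different value.
override = greet("Alice", message="Welcome")

print(positional, keyword, default, override)
```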
  • 53. THE LOCAL AND GLOBAL SCOPE OF VARIABLES In Python, variables can have either local or global scope. The scope of a variable refers to the parts of the program where that variable can be accessed. Local Variables: Local variables are variables that are declared inside a function. They are only accessible within the function in which they are defined. Here is an example: In this example, the variable x is defined inside the function my_function(). This means that it is only accessible within the function. If we try to print the value of x outside the function, we will get an error (a NameError): Global Variables: Global variables are variables that are declared outside of any function, and can be accessed from anywhere in the program. Here is an example:
  • 54. In this example, the variable x is defined outside of any function, which makes it a global variable. This means that it can be accessed from anywhere in the program, including inside the function my_function(). However, if you try to modify the value of a global variable inside a function, you will need to use the global keyword to indicate that you want to modify the global variable, not create a new local variable with the same name. Here is an example: In this example, we use the global keyword to indicate that we want to modify the global variable x inside the function my_function(). Without this keyword, Python would create a new local variable with the same name, which would not affect the global variable. By using the global keyword, we can modify the global variable and see the changes outside the function as well. These are some examples of how local and global variables work in Python. By understanding how variable scope works, you can write more flexible and reusable code, and avoid errors caused by variable name clashes.
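The scope examples are not preserved in the transcript. The sketch below shows the behaviours discussed on these two slides. It uses x for the global variable as on the slides; the function names and values are assumptions.

```python
x = 10                      # global variable

def read_global():
    bonus = 5               # local variable, visible only in this function
    return x + bonus        # reading a global needs no special keyword

def shadow_global():
    x = 99                  # creates a new local x; the global x is untouched
    return x

def update_global():
    global x                # refer to the global x instead of a new local
    x = 20

read_result = read_global()        # 15
shadow_result = shadow_global()    # 99
after_shadow = x                   # still 10
update_global()
after_update = x                   # now 20

# A local variable cannot be used outside its function.
try:
    print(bonus)
    local_visible_outside = True
except NameError:
    local_visible_outside = False

print(read_result, shadow_result, after_shadow, after_update)
```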
  • 55. THE RETURN STATEMENT In Python, the return statement is used to return a value from a function. When a function is called, it may perform some operations and produce a result. This result can be returned to the caller using the return statement. Here is an example: In this example, the add_numbers function takes two arguments x and y, adds them together, and stores the result in a variable called result. The function then returns the value of result using the return statement. When we call the function with the arguments 3 and 4, it returns the value 7, which is stored in the variable sum. We can then print the value of sum to the console.
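The return example can be reconstructed from the description above. One deliberate change: the slide stores the returned value in a variable called sum, and this sketch uses total instead, so that Python's built-in sum() function is not shadowed.

```python
def add_numbers(x, y):
    result = x + y     # compute the value inside the function
    return result      # hand it back to the caller

total = add_numbers(3, 4)   # the slide names this variable sum
print(total)                # 7
```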
  • 56. RECURSIVE FUNCTION A recursive function in Python is a function that calls itself during its execution. This can be useful when we need to perform the same task repeatedly, with slightly different inputs each time. Recursive functions can be a powerful tool for solving complex problems, but they can also be tricky to implement correctly. Here's an example of a simple recursive function in Python that calculates the factorial of a number: In this example, the factorial function takes a single argument n. If n is equal to 0, the function returns 1, which is the base case. Otherwise, the function multiplies n by the result of calling factorial with n-1 as the argument. This is the recursive case. When we call factorial(4), the function checks if n is equal to 0. Since it's not, the function multiplies n (which is 4) by the result of calling factorial(3). To calculate factorial(3), the function again checks if n is equal to 0 (it's not), and multiplies n (which is 3) by the result of calling factorial(2). This process continues until we reach the base case of factorial(0), at which point the function returns 1. The final result is the product of all the numbers from n down to 1, which is 4 * 3 * 2 * 1 = 24. Recursive functions can be used to solve a wide variety of problems, but they can also be computationally expensive if not implemented carefully. It's important to make sure that a recursive function will eventually reach the base case and terminate, and to avoid unnecessary recursive calls that could cause the function to run for a long time or run out of memory.
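The recursive factorial described above can be sketched as follows. The code is a reconstruction from the explanation and assumes n is a non-negative integer.

```python
def factorial(n):
    """Return n! for a non-negative integer n."""
    if n == 0:
        return 1                     # base case: stops the recursion
    return n * factorial(n - 1)      # recursive case: a smaller problem each call

print(factorial(4))   # 4 * 3 * 2 * 1 = 24
```

A negative n would never reach the base case. Python stops such runaway recursion with a RecursionError once the call stack grows too deep.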
  • 57. Mini Project: Calculation of Compound Interest and Yearly Analysis of Interest and Principal Amount Problem statement: Write a Python program to calculate the compound interest based on the user's input of principal amount, annual interest rate, number of years, and the number of times the interest is compounded per year. The program should also generate a yearly analysis of the interest and principal amount for each year of the investment. Algorithm: • Get input from the user for principal amount, annual interest rate, number of years and the number of times the interest is compounded per year. • Calculate the compound interest using the formula: A = P (1 + r/n)^(n*t) where A is the amount after t years, P is the principal amount, r is the annual interest rate (as a decimal), n is the number of times the interest is compounded per year, and t is the number of years. • Print the amount and the compound interest (A - P). • Generate a yearly analysis of the interest and principal amount for each year of the investment by using a loop to iterate over the number of years, and calculate the interest and principal amount for each year using the formula: interest = (P * r/n) principal = (A - interest) • Print the yearly analysis for each year. Program:
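One possible sketch of such a program is shown below. It is not the original course program. The function names and the sample values are hypothetical, the inputs are hard-coded rather than read with input() so that the sketch runs unattended, and the yearly analysis is interpreted here as the difference between consecutive year-end balances, which is an assumption about what the analysis step intends.

```python
def compound_amount(principal, rate, years, times_per_year):
    """Return A = P * (1 + r/n) ** (n * t)."""
    return principal * (1 + rate / times_per_year) ** (times_per_year * years)


def yearly_analysis(principal, rate, years, times_per_year):
    """Return one (year, opening, interest, closing) tuple per year."""
    rows = []
    opening = principal
    for year in range(1, years + 1):
        # Balance at the end of this year, from the same formula
        closing = compound_amount(principal, rate, year, times_per_year)
        rows.append((year, opening, closing - opening, closing))
        opening = closing
    return rows


# Hypothetical sample values; a real program could read them with input().
P, r, t, n = 1000.0, 0.05, 3, 1

amount = compound_amount(P, r, t, n)
print(f"Amount: {amount:.2f}  Compound interest: {amount - P:.2f}")
for year, opening, interest, closing in yearly_analysis(P, r, t, n):
    print(f"Year {year}: opening={opening:.2f} "
          f"interest={interest:.2f} closing={closing:.2f}")
```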
  • 58.
  • 59. SECTION 8: DATA ANALYSIS DATA ANALYSIS WITH PYTHON LIBRARIES
  • 60. Python has a variety of libraries for data analysis, each with its own strengths and weaknesses. Here are some of the most commonly used libraries:
  • 61. These libraries are all open source and have active communities of developers who contribute new features and bug fixes. By combining these libraries, Python provides a powerful platform for data analysis and machine learning.
  • 62. How To Load Python Libraries To load Python libraries, you can use the import statement followed by the name of the library you want to use. Here's a general syntax: For example, to load the NumPy library, you would write: If you want to use a shorthand name for the library, you can use the as keyword followed by an abbreviation of your choice: For example, to load the Pandas library with the abbreviation pd, you would write: Once you have loaded a library, you can use its functions and objects in your code by prefixing them with the library name or abbreviation. For example, to use the randint() function from NumPy's random module, you would write: Note that you need to specify the library name or abbreviation when calling a function or object from a library that you have loaded.
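The import forms described above can be sketched as follows, assuming NumPy and Pandas are installed. Note that randint() lives in NumPy's random module, so the call is written np.random.randint(); the sample arguments are illustrative only.

```python
# General syntax:            import library_name
# With a shorthand name:     import library_name as abbreviation
import numpy as np
import pandas as pd

# Call a function through the abbreviation: a random integer from 1 to 9
value = np.random.randint(1, 10)
print(value)

# Use an object from Pandas through its abbreviation
series = pd.Series([1, 2, 3])
print(series)
```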
  • 63. Pandas Overview Pandas is a Python library for data manipulation and analysis. It provides data structures for working with labeled and relational data, as well as tools for data cleaning, merging, and aggregation. Here is an overview of some of the key features of Pandas:
  • 64. Here is an example of how to use Pandas to read a CSV file and perform some basic data analysis:
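A minimal sketch of such an analysis follows. The data is hypothetical and is supplied through an in-memory text buffer, an assumption made so the sketch runs without a file on disk; with a real file, the path would be passed to read_csv() instead.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk
csv_text = """name,department,salary
Asha,Sales,52000
Ben,Sales,48000
Chen,IT,61000
Dina,IT,59000
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df.head())        # first rows of the table
print(df.describe())    # summary statistics for numeric columns

# Average salary per department
mean_salary = df.groupby("department")["salary"].mean()
print(mean_salary)
```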
  • 65. Pandas Purpose Pandas is a Python library that provides data manipulation and analysis tools for working with structured data, such as tabular or time-series data. Its main purpose is to make it easy to work with data in Python by providing data structures and functions that simplify common data analysis tasks. Pandas provides two main data structures: Series and DataFrame. A Series is a one-dimensional array-like object that can hold any data type, such as integers, floats, strings, and more. A DataFrame is a two-dimensional table-like structure that consists of rows and columns, where each column can have a different data type. Pandas has a wide range of functions for data cleaning, data transformation, and data analysis, such as filtering, grouping, merging, joining, reshaping, and more. It also has built-in support for handling missing data, time-series data, and working with different file formats, such as CSV, Excel, SQL, and more. Overall, Pandas is widely used in data analysis, scientific research, finance, and other fields where data manipulation and analysis are required.
  • 66. Reading Data Using Pandas Pandas is a popular open-source data manipulation library for Python. It is commonly used to read, manipulate and analyze data. In this tutorial, we will cover how to read data using Pandas. First, you will need to install Pandas. You can install it using the pip command: Once you have installed Pandas, you can import it into your Python script or Jupyter Notebook by running: Now let's look at the different ways to read data using Pandas: Reading CSV Files CSV (Comma Separated Values) is a commonly used format for data storage and exchange. You can use the read_csv() function to read CSV files as follows: This will read the CSV file named "file.csv" and create a Pandas DataFrame object named "df".
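The install, import, and read_csv() steps above can be sketched as follows. The file contents are hypothetical, and the sketch writes a small "file.csv" into a temporary directory first, an assumption made only so that it can run on its own.

```python
# Install once from a shell:  pip install pandas
import os
import tempfile

import pandas as pd

# Create a small stand-in for "file.csv" so the sketch is self-contained
path = os.path.join(tempfile.mkdtemp(), "file.csv")
with open(path, "w") as f:
    f.write("id,score\n1,90\n2,85\n")

# Read the CSV file into a DataFrame named df
df = pd.read_csv(path)
print(df)
```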
  • 67. Reading Excel Files Excel files can also be read using Pandas. You can use the read_excel() function to read an Excel file. This will read the Excel file named "file.xlsx" and create a Pandas DataFrame object named "df". Reading JSON Files JSON (JavaScript Object Notation) is a lightweight data interchange format. Pandas can also read JSON files using the read_json() function. This will read the JSON file named "file.json" and create a Pandas DataFrame object named "df". Reading SQL Databases Pandas can also read data from SQL databases. You can use the read_sql() function to read data from a SQL database.
  • 68. This will read all the data from the "table_name" table in the SQL database named "database.db" and create a Pandas DataFrame object named "df". These are some of the ways you can read data using Pandas. Once you have read the data into a DataFrame object, you can use various Pandas functions to manipulate and analyze the data.
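A sketch of the JSON and SQL readers follows. It uses an in-memory JSON buffer and an in-memory SQLite database instead of "file.json" and "database.db", and the table contents are hypothetical; these are assumptions made so the sketch runs without external files.

```python
import io
import sqlite3

import pandas as pd

# JSON: read_json() accepts a path or a file-like object
json_text = '[{"id": 1, "score": 90}, {"id": 2, "score": 85}]'
json_df = pd.read_json(io.StringIO(json_text))
print(json_df)

# SQL: build a tiny SQLite table, then read it with read_sql()
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (id INTEGER, score INTEGER)")
conn.executemany("INSERT INTO table_name VALUES (?, ?)", [(1, 90), (2, 85)])
conn.commit()

sql_df = pd.read_sql("SELECT * FROM table_name", conn)
conn.close()
print(sql_df)

# Excel: pd.read_excel("file.xlsx") works the same way, but needs an
# Excel engine such as openpyxl to be installed.
```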
  • 69. Exploring Data Frames: ➢ Data frames are a two-dimensional labeled data structure in pandas that can hold data of different types (numeric, string, Boolean, etc.) in columns. ➢ To explore a data frame, you can use methods such as head(), tail(), info(), describe(), and isnull(), and attributes such as shape, columns, and dtypes. ➢ These help to understand the size of the data, column names, data types, missing values, and other relevant information. These are some basic operations you can perform on data frames in Python using Pandas. Pandas provides many more powerful functions and operations for working with data frames that you can explore further.
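The exploration methods listed above can be tried on a small data frame. The sketch below builds one from hypothetical values, including one missing entry so that isnull() has something to report.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing temperature
df = pd.DataFrame({
    "city": ["Pune", "Delhi", "Mumbai", "Delhi"],
    "temp_c": [31.0, np.nan, 29.5, 35.0],
    "rainy": [False, True, True, False],
})

print(df.head(2))          # first two rows
print(df.tail(2))          # last two rows
df.info()                  # column names, non-null counts, dtypes
print(df.describe())       # summary statistics for numeric columns
print(df.shape)            # (rows, columns)
print(list(df.columns))    # column names
print(df.dtypes)           # data type of each column

# Count missing values per column
missing = df.isnull().sum()
print(missing)
```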
  • 70. Pandas Based Questions 1. What is a pandas data frame and how is it different from a pandas Series? 2. How do you read a CSV file into a pandas data frame? 3. How do you drop columns and rows from a pandas data frame? 4. How do you merge two pandas data frames? 5. How do you calculate summary statistics (such as mean, median, and standard deviation) for a pandas data frame? Pandas Based Projects Build a movie recommendation system using pandas: Load a movie dataset into a pandas data frame, clean and preprocess the data, and then use pandas to build a recommendation system that suggests movies based on user preferences (e.g., genre, rating). You can use techniques such as collaborative filtering or content-based filtering. Analyze a sales dataset using pandas: Load a sales dataset into a pandas data frame, clean and preprocess the data (e.g., remove duplicates, handle missing values), and then use pandas to explore and analyze the data (e.g., calculate total sales, average sales by product, visualize sales trends).
  • 71.
  • 72. NumPy NumPy is a Python library for numerical computing that provides support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. Here's an overview of some of the most commonly used NumPy functions along with examples: Creating Arrays: numpy.array(): Create a NumPy array from a Python list. numpy.zeros(): Create an array of zeros with a specified shape. numpy.ones(): Create an array of ones with a specified shape. numpy.random.rand(): Create an array of random numbers with a specified shape.
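A short sketch of the four array-creation functions listed above; the shapes and values are arbitrary choices for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])        # array from a Python list
z = np.zeros((2, 3))           # 2x3 array filled with 0.0
o = np.ones((2, 2))            # 2x2 array filled with 1.0
r = np.random.rand(3, 2)       # 3x2 array of random floats in [0, 1)

print(a)
print(z)
print(o)
print(r)
```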
  • 73. Array Operations: numpy.shape(): Get the shape of an array. numpy.reshape(): Reshape an array. numpy.transpose(): Transpose an array. numpy.concatenate(): Concatenate two or more arrays.
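The array operations above can be sketched on a small example array; the sizes are arbitrary.

```python
import numpy as np

a = np.arange(6)                        # [0 1 2 3 4 5]
print(np.shape(a))                      # shape of the 1D array

b = np.reshape(a, (2, 3))               # 2 rows, 3 columns
t = np.transpose(b)                     # 3 rows, 2 columns
c = np.concatenate((b, b), axis=0)      # stack b on top of itself

print(b)
print(t)
print(c)
```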
  • 74. Mathematical Functions: numpy.sum(): Calculate the sum of an array. numpy.mean(): Calculate the mean of an array. numpy.std(): Calculate the standard deviation of an array. numpy.max(): Find the maximum value in an array. NumPy Based Questions
  • 75. 1. What is NumPy, and why is it used in Python? 2. How do you create a NumPy array? 3. How do you reshape a NumPy array? 4. How do you perform element-wise multiplication of two NumPy arrays? 5. How do you find the maximum and minimum values in a NumPy array? Matplotlib Matplotlib is a Python library for creating visualizations, such as line plots, scatter plots, bar charts, histograms, and more. It provides a wide range of tools for customizing plots and adding annotations, and supports both static and interactive visualizations. Here's an overview of some of the key features of Matplotlib: Figures and Subplots: Matplotlib uses a Figure object to represent the entire window or page that the plot is drawn on. Within a Figure, one or more Subplots can be created to display different plots. Plotting Functions: Matplotlib provides a variety of functions for creating different types of plots. Some of the most commonly used functions include plot(), scatter(), bar(), hist(), pie(), and imshow(), among others. Customization: Matplotlib offers a wide range of options for customizing plots, including changing line styles, colors, markers, fonts, labels, titles, legends, and more. It also allows for adding annotations, such as text, arrows, and shapes, to the plot. Saving and Exporting: Matplotlib supports saving plots in a variety of formats, such as PNG, PDF, SVG, and more. Here are some examples of Matplotlib functions: plot(): The plot() function is used to create a line plot. It takes x and y arrays as arguments, and can also take optional parameters for customizing the plot, such as color, linestyle, and marker.
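A minimal line plot sketch using plot() with the color, linestyle, and marker parameters mentioned above. The data values and output file name are hypothetical, and the non-interactive Agg backend is selected as an assumption so the sketch runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no window is opened
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [1, 4, 9, 16]

fig, ax = plt.subplots()
lines = ax.plot(x, y, color="green", linestyle="--", marker="o")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Line plot")

# Save the figure as a PNG in the system's temporary directory
fig.savefig(os.path.join(tempfile.gettempdir(), "line_plot.png"))
```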
  • 76. scatter(): The scatter() function is used to create a scatter plot. It takes x and y arrays as arguments, and can also take optional parameters for customizing the plot, such as color, size, and alpha.
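A minimal scatter plot sketch. The data is hypothetical; in Matplotlib the marker size is passed through the s parameter, and the Agg backend is an assumption so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [2.0, 3.5, 1.5, 4.0]

fig, ax = plt.subplots()
# color sets the marker color, s the marker size, alpha the transparency
points = ax.scatter(x, y, color="red", s=40, alpha=0.6)
ax.set_title("Scatter plot")
```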
  • 77. bar(): The bar() function is used to create a bar chart. It takes x and y arrays as arguments, and can also take optional parameters for customizing the plot, such as color, width, and align.
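A minimal bar chart sketch with the color, width, and align parameters named above. The categories and values are hypothetical, and the Agg backend is an assumption so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

categories = ["A", "B", "C"]
values = [5, 3, 8]

fig, ax = plt.subplots()
bars = ax.bar(categories, values, color="steelblue", width=0.6, align="center")
ax.set_ylabel("Count")
ax.set_title("Bar chart")
```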
  • 78. hist(): The hist() function is used to create a histogram. It takes an array of values as an argument, and can also take optional parameters for customizing the plot, such as bins, range, and density.
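A minimal histogram sketch using the bins, range, and density parameters. The values are hypothetical, and the Agg backend is an assumption so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

values = [1, 2, 2, 3, 3, 3, 5, 7]

fig, ax = plt.subplots()
# Four equal-width bins covering 0 to 8; density=False keeps raw counts
counts, edges, patches = ax.hist(values, bins=4, range=(0, 8), density=False)
ax.set_title("Histogram")

print(counts)   # number of values in each bin
print(edges)    # bin boundaries
```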
  • 79. pie(): The pie() function is used to create a pie chart. It takes an array of values as its argument, with slice names passed through the labels parameter, and can also take optional parameters for customizing the plot, such as colors, explode, and startangle.
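A minimal pie chart sketch using the labels, explode, and startangle parameters. The slice sizes and labels are hypothetical, and the Agg backend is an assumption so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

sizes = [40, 30, 20, 10]
labels = ["A", "B", "C", "D"]

fig, ax = plt.subplots()
result = ax.pie(
    sizes,
    labels=labels,
    explode=(0.1, 0, 0, 0),   # pull the first slice out slightly
    startangle=90,            # start drawing from the top
)
wedges = result[0]            # the slice patches
ax.set_title("Pie chart")
```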
  • 80. Seaborn Seaborn is a Python library for creating statistical visualizations. It is built on top of Matplotlib and provides a higher-level interface for creating complex and informative plots. Seaborn includes a range of statistical plotting functions, such as regression plots, distribution plots, categorical plots, and more. Here's an overview of some of the key features and functions of Seaborn: Data Visualization: Seaborn provides functions for creating a variety of visualizations, such as scatter plots, line plots, bar plots, histograms, and many others. Statistical Estimation: Seaborn's plotting functions can compute and display statistical summaries as part of a plot, such as means with confidence intervals, regression fits, and kernel density estimates. It is a visualization library rather than a hypothesis-testing library. Styling: Seaborn provides a range of options for customizing plots, including color palettes, themes, and grid styles. Integration with Pandas: Seaborn integrates well with Pandas, making it easy to work with data in data frames.
  • 81. Here are some examples of Seaborn functions: scatterplot(): The scatterplot() function is used to create a scatter plot. It takes x and y variables as arguments, and can also take optional parameters for customizing the plot, such as hue, size, and style. lineplot(): The lineplot() function is used to create a line plot. It takes x and y variables as arguments, and can also take optional parameters for customizing the plot, such as hue, style, and markers.
  • 82. histplot(): The histplot() function is used to create a histogram. It takes a variable as an argument, and can also take optional parameters for customizing the plot, such as bins, kde, and stat.
  • 83. boxplot(): The boxplot() function is used to create a box plot. It takes a variable as an argument, and can also take optional parameters for customizing the plot, such as hue, order, and width. catplot(): The catplot() function is used to create a categorical plot. It takes x and y variables as arguments, and can also take optional parameters for customizing the plot, such as kind, hue, and col.
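A sketch of boxplot() and catplot() on a small data frame. The column names and values are hypothetical, and the Agg backend is an assumption so the sketch runs without a display. Note that catplot() creates its own figure and returns a FacetGrid rather than a single axis.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B", "A", "A", "B", "B"],
    "day": ["Mon"] * 6 + ["Tue"] * 4,
    "value": [1, 2, 3, 4, 5, 6, 2, 4, 3, 7],
})

# Box plot: order fixes the category order, width sets the box width
fig, ax = plt.subplots()
sns.boxplot(data=df, x="group", y="value", order=["B", "A"], width=0.5, ax=ax)

# Categorical plot: kind chooses the plot type, col makes one panel per day
grid = sns.catplot(data=df, x="group", y="value", kind="box", col="day")
```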
  • 84. These are just a few examples of the many functions available in Seaborn. For more information and examples, please refer to the Seaborn documentation. Questions 1. What is Python, and what are its key features? 2. What is a Python library, and how is it useful? 3. What is Pandas, and what are its key features? 4. What is NumPy, and what are its key features? 5. What is Matplotlib, and what are its key features? 6. What is Seaborn, and what are its key features? 7. How do you import a library in Python? 8. What are the different data structures in Python? 9. What is the difference between a list and a tuple in Python?
  • 85. 10.What is a Data Frame in Pandas? 11.How do you read data from a CSV file in Pandas? 12.How do you handle missing data in Pandas? 13.How do you perform data aggregation in Pandas? 14.What is a NumPy array, and how is it different from a list in Python? 15.How do you create a NumPy array? 16.What is a vectorized operation in NumPy? 17.How do you perform element-wise operations in NumPy? 18.What is the difference between a 1D, 2D, and 3D array in NumPy? 19.What is broadcasting in NumPy? 20.How do you create a histogram in Matplotlib? 21.What is a scatter plot in Matplotlib? 22.How do you create a scatter plot in Matplotlib? 23.What is a line plot in Matplotlib? 24.How do you create a line plot in Matplotlib? 25.What is a bar plot in Matplotlib? 26.How do you create a bar plot in Matplotlib? 27.What is a pie chart in Matplotlib? 28.How do you create a pie chart in Matplotlib?
  • 86. 29.What is a box plot in Seaborn? 30.How do you create a box plot in Seaborn?
  • 87. Solution 1. Python is a high-level, interpreted programming language that is used for a wide range of purposes, including web development, data analysis, machine learning, and more. Its key features include easy-to-learn syntax, a large standard library, and support for multiple programming paradigms such as object-oriented, procedural, and functional programming. 2. A Python library is a collection of pre-written code that can be used to perform specific tasks. Libraries provide a way to avoid writing code from scratch and allow programmers to build on the work of others. They can be used for a wide range of purposes, such as data analysis, web development, machine learning, and more. 3. Pandas is a popular open-source data analysis library for Python. It provides data structures for efficiently storing and manipulating large datasets, as well as a wide range of tools for working with data, including data cleaning, transformation, and analysis. Key features of Pandas include powerful indexing and selection capabilities, tools for merging and joining datasets, and support for time-series data. 4. NumPy is a Python library for working with numerical data. It provides a powerful array data structure, as well as a wide range of tools for performing mathematical operations on arrays, such as linear algebra, Fourier analysis, and more. Key features of NumPy include efficient handling of large arrays, broadcasting for performing operations on arrays of different shapes, and vectorized operations for improved performance. 5. Matplotlib is a popular data visualization library for Python. It provides a wide range of tools for creating visualizations, including line plots, scatter plots, bar charts, and more. Key features of Matplotlib include support for a wide range of customization options, the ability to create complex visualizations, and the ability to output visualizations in a variety of formats.
  • 88. 6. Seaborn is a data visualization library for Python that is built on top of Matplotlib. It provides a high-level interface for creating complex visualizations with fewer lines of code, as well as a wide range of built-in styles and color palettes. Key features of Seaborn include support for creating complex visualizations such as heatmaps, violin plots, and more, and the ability to easily customize visualizations. 7. To import a library in Python, you can use the import statement, followed by the name of the library. For example, to import the NumPy library, you would use the following code: `import numpy as np`. The `as np` part creates an alias for the NumPy library, so that you can refer to it as np in your code. 8. The built-in data structures in Python include lists, tuples, sets, and dictionaries. 9. The main difference between a list and a tuple in Python is that lists are mutable, meaning that their contents can be changed after they are created, while tuples are immutable, meaning that their contents cannot be changed after they are created. 10. A DataFrame in Pandas is a two-dimensional table-like data structure with labeled rows and columns, similar to a spreadsheet. It provides powerful tools for working with structured data, including the ability to filter, sort, and manipulate data in a variety of ways. 11. To read data from a CSV file in Pandas, you can use the read_csv() function. For example, the following code reads a CSV file called data.csv and stores it in a Pandas DataFrame called df: `df = pd.read_csv('data.csv')`.
  • 89. 12.To handle missing data in Pandas, we can use the .isna() function to identify missing values in a DataFrame or Series, and then use the .fillna() function to replace the missing values with a specified value or strategy. We can also use the .dropna() function to remove rows or columns that contain missing values. 13.In Pandas, we can perform data aggregation using the .groupby() function, which groups data based on one or more columns, and then applies an aggregation function to each group to compute a summary statistic. 14.A NumPy array is a data structure that represents a multi-dimensional, homogeneous array of values. It is different from a list in Python because it is more efficient for numerical computations, supports vectorized operations, and has a fixed size. 15.To create a NumPy array, we can use the np.array() function and pass a list or tuple of values as an argument. We can also create arrays with a specific shape and data type using functions such as np.zeros(), np.ones(), np.arange(), and np.random.rand(). 16.A vectorized operation in NumPy is an operation that applies to an entire array or a subset of an array, rather than operating on individual elements. Vectorized operations are much faster and more efficient than performing operations on individual elements in a loop. 17.To perform element-wise operations in NumPy, we can use arithmetic operators such as +, -, *, and / or functions such as np.add(), np.subtract(), np.multiply(), and np.divide(). 18.A 1D array in NumPy represents a single sequence of values, while a 2D array represents a matrix with rows and columns, and a 3D array represents a cube with multiple layers, rows, and columns.
  • 90. 19.Broadcasting in NumPy is a feature that allows arrays with different shapes to be used in arithmetic operations. When arrays with different shapes are used in an operation, NumPy automatically broadcasts the smaller array to match the shape of the larger array. 20.To create a histogram in Matplotlib, we can use the plt.hist() function and pass a list or array of values as an argument. We can also specify the number of bins, the range of values, and other parameters to customize the histogram. 21.A scatter plot in Matplotlib is a visualization that displays the relationship between two variables by plotting individual data points as points on a 2D coordinate system. 22.To create a scatter plot in Matplotlib, we can use the plt.scatter() function and pass arrays of x and y values as arguments. We can also customize the appearance of the scatter plot by specifying the color, size, and shape of the points. 23.A line plot in Matplotlib is a visualization that displays the relationship between two variables by connecting individual data points with a line. 24.To create a line plot in Matplotlib, we can use the plt.plot() function and pass arrays of x and y values as arguments. We can also customize the appearance of the line plot by specifying the color, style, and width of the line. 25.A bar plot in Matplotlib is a visualization that displays the relationship between a categorical variable and a numerical variable by displaying the values as bars. 26.To create a bar plot in Matplotlib, we can use the plt.bar() function and pass arrays of x and y values as arguments. We can also customize the appearance of the bar plot by specifying the color, width, and orientation of the bars. 27.A pie chart in Matplotlib is a visualization that displays the relative sizes of different categories as slices of a pie. 28.To create a pie chart in Matplotlib, you can follow these steps:
  • 91. • Import the Matplotlib library • Create a figure and an axis object using the subplots method • Define the data that you want to visualize in the pie chart • Call the pie method on the axis object and pass the data as a parameter • Optionally, you can add a title and legend to the chart Here is an example code snippet that demonstrates how to create a simple pie chart: This will create a pie chart with four slices labeled A, B, C, and D. 29.A box plot is a type of chart that is used to display the distribution of a dataset. It shows the median, quartiles, and outliers of the data. In Seaborn, a box plot can be created using the boxplot function. 30.To create a box plot in Seaborn, you can follow these steps: • Import the Seaborn library
  • 92. • Load the data that you want to visualize • Create a figure and an axis object using the subplots method • Call Seaborn's boxplot function, passing the data and, optionally, the axis object through the ax parameter • Optionally, you can customize the appearance of the chart by setting various parameters Here is an example code snippet that demonstrates how to create a simple box plot in Seaborn: This will create a box plot of the data stored in the my_data.csv file. You can customize the appearance of the chart by setting various parameters, such as the color palette, whisker length, and outlier markers.
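The box plot steps in answer 30 can be sketched as follows. A small hypothetical data frame stands in for the contents of my_data.csv, and the column names are assumptions; with a real file, the data would be loaded with pd.read_csv("my_data.csv"). The Agg backend is selected so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Step 2: load the data (hypothetical stand-in for my_data.csv)
df = pd.DataFrame({
    "category": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "value": [1, 2, 3, 10, 4, 5, 6, 7],
})

# Step 3: create a figure and an axis object
fig, ax = plt.subplots()

# Step 4: draw the box plot on that axis
sns.boxplot(data=df, x="category", y="value", ax=ax)

# Step 5: optional customization
ax.set_title("Value by category")
```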
  • 93. 10 Projects 1. Analysis of Customer Reviews: Collect customer reviews from an e-commerce website and analyze them using Pandas and Matplotlib to identify common themes and issues. 2. Sales Forecasting: Use time-series analysis techniques from the Pandas library to forecast future sales and visualize the results using Matplotlib. 3. Analysis of Social Media Data: Collect data from social media platforms like Twitter or Facebook and analyze it using Pandas and Matplotlib to identify trends and patterns. 4. Visualization of Geographic Data: Use the GeoPandas library to visualize geographic data like maps, population density, or election results. 5. Analysis of Web Traffic: Collect data on website traffic and user behavior using Pandas and visualize the results using Matplotlib or Seaborn to identify trends and patterns. 6. Image Processing: Use the OpenCV library and NumPy to perform image processing tasks like image filtering, edge detection, and object detection. 7. Network Analysis: Use the NetworkX library to analyze complex networks like social networks, transportation networks, or power grids.
  • 94. 8. Visualization of Scientific Data: Use the Matplotlib library to visualize scientific data like astronomical observations, climate data, or genetic data. 9. Text Mining: Use natural language processing techniques from the NLTK library to analyze and classify text data like news articles, academic papers, or social media posts. 10. Analysis of Financial Data: Collect and analyze financial data like stock prices, exchange rates, or economic indicators using Pandas and visualize the results using Matplotlib.
  • 95.
  • 96. Interview Question 1. What is Exploratory Data Analysis (EDA)? Answer: Exploratory Data Analysis (EDA) is the process of analyzing and summarizing data sets in order to gain insights into the data. It involves using statistical and visual methods to understand the underlying patterns and relationships within the data. 2. What are the steps involved in EDA? Answer: The steps involved in EDA are: - Data collection and loading - Data cleaning and preparation - Descriptive statistics and data visualization - Correlation and regression analysis - Hypothesis testing and statistical inference 3. What are some commonly used Python libraries for EDA? Answer: Some commonly used Python libraries for EDA are: - Pandas: for data manipulation and analysis - Matplotlib: for data visualization - Seaborn: for statistical data visualization - NumPy: for numerical computing - Scikit-learn: for machine learning 4. How do you load data into Python for EDA? Answer: Data can be loaded into Python for EDA using various methods such as: - Reading from a CSV file using pandas read_csv() method - Reading from an Excel file using pandas read_excel() method - Reading from a database using pandas read_sql() method - Reading from a JSON file using pandas read_json() method 5. How do you check the shape of a dataset in Python?
  • 97. Answer: To check the shape of a dataset in Python, you can use the shape attribute of a pandas DataFrame. For example: `df.shape` will return the number of rows and columns in the DataFrame. 6. How do you check the data types of the columns in a dataset? Answer: To check the data types of the columns in a dataset, you can use the dtypes attribute of a pandas DataFrame. For example: `df.dtypes` will return the data types of all the columns in the DataFrame. 7. How do you handle missing values in a dataset during EDA? Answer: There are various methods to handle missing values in a dataset during EDA, such as: - Removing the rows or columns containing missing values - Imputing the missing values with a fixed value such as mean or median - Imputing the missing values using statistical models such as regression or K-nearest neighbors (KNN) algorithm 8. How do you visualize the distribution of a numerical variable in Python? Answer: To visualize the distribution of a numerical variable in Python, you can use various methods such as: - Histogram using Matplotlib or Seaborn - Density plot using Seaborn - Box plot using Seaborn 9. How do you visualize the relationship between two numerical variables in Python? Answer: To visualize the relationship between two numerical variables in Python, you can use various methods such as: - Scatter plot using Matplotlib or Seaborn - Line plot using Matplotlib or Seaborn - Heatmap using Seaborn
  • 98. 10. What is a histogram? Answer: A histogram is a graphical representation of the distribution of a numerical variable. It shows the frequencies of different ranges or bins of values. 11. How do you create a histogram in Python? Answer: You can use the `hist` method of pandas or the `hist` function of matplotlib to draw a histogram in Python. The `histogram` function of numpy computes the bin counts and edges without drawing anything. 12. What is a boxplot? Answer: A boxplot is a graphical representation of the distribution of a numerical variable based on its quartiles. It shows the median, quartiles, and outliers of the variable. 13. How do you create a boxplot in Python? Answer: You can use the `boxplot` method of pandas or the `boxplot` function of matplotlib to create a boxplot in Python. 14. What is a scatter plot? Answer: A scatter plot is a graphical representation of the relationship between two numerical variables. It shows how one variable varies with the other. 15. How do you create a scatter plot in Python? Answer: You can use the `scatter` function of matplotlib to create a scatter plot in Python. 16. What is a heatmap? Answer: A heatmap is a graphical representation of the values of a numerical variable across two categorical variables. It shows the intensity of the variable using color scales. 17. How do you create a heatmap in Python? Answer: You can use the `heatmap` function of seaborn to create a heatmap in Python.
  • 99. 18. What is a correlation matrix? Answer: A correlation matrix is a matrix that shows the correlation coefficients between multiple variables. It helps to identify the strength and direction of the relationships between variables. 19. How do you create a correlation matrix in Python? Answer: You can use the `corr` method of a pandas DataFrame to compute a correlation matrix, and the `heatmap` function of seaborn to visualize it. 20. How do you detect missing values in Python? Answer: You can use the `isnull` (or `isna`) method of pandas to detect missing values in Python. 21. How do you handle missing values in Python? Answer: You can handle missing values by dropping rows or columns with missing values or by imputing missing values with mean, median, or mode. 22. What is data normalization? Answer: Data normalization is the process of transforming data into a standard scale to eliminate the impact of different units and ranges of variables. 23. What is Pandas? Answer: Pandas is a Python library that is used for data manipulation and analysis. It provides data structures and functions for working with structured data. 24. What are the two main data structures provided by Pandas? Answer: The two main data structures provided by Pandas are Series and DataFrame. 25. What is a Series in Pandas? Answer: A Series is a one-dimensional array-like object that can hold any data type, such as integers, floating-point numbers, strings, and Python objects.
  • 100. 26. What is a DataFrame in Pandas? Answer: A DataFrame is a two-dimensional tabular data structure with rows and columns. It is similar to a spreadsheet or SQL table. 27. How do you create a Series in Pandas? Answer: You can create a Series in Pandas by passing a list or array of values to the `pd.Series` constructor. 28. How do you create a DataFrame in Pandas? Answer: You can create a DataFrame in Pandas by passing a dictionary of lists or arrays to the `pd.DataFrame` constructor. 29. How do you select columns from a DataFrame in Pandas? Answer: You can select columns from a DataFrame in Pandas by using the column name as an index, such as `df['column_name']`. 30. How do you select rows from a DataFrame in Pandas? Answer: You can select rows from a DataFrame in Pandas by using the `loc` indexer (by label) or the `iloc` indexer (by integer position), such as `df.loc[index]` or `df.iloc[row_number]`. 31. How do you rename columns in a DataFrame in Pandas? Answer: You can rename columns in a DataFrame in Pandas by using the `rename` method and passing a dictionary of old and new column names, such as `df.rename(columns={'old_name': 'new_name'})`. 32. How do you drop columns from a DataFrame in Pandas? Answer: You can drop columns from a DataFrame in Pandas by using the `drop` method and passing the column name, such as `df.drop('column_name', axis=1)`. 33. How do you add columns to a DataFrame in Pandas?
  • 101. Answer: You can add columns to a DataFrame in Pandas by assigning a new column as a Series or a list, such as `df['new_column'] = [1, 2, 3]`. 34. How do you group data in a DataFrame in Pandas? Answer: You can group data in a DataFrame in Pandas by using the `groupby` method and passing the column name or names to group by, such as `df.groupby('column_name')`. 35. How do you calculate descriptive statistics of a DataFrame in Pandas? Answer: You can calculate descriptive statistics of a DataFrame in Pandas by using the `describe` method, such as `df.describe()`. 36. What is NumPy? Answer: NumPy is a Python library used for numerical computations in scientific computing. It provides support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. 37. How do you create a NumPy array? Answer: NumPy arrays can be created in several ways, including by converting a list or tuple to an array using the `array` function or by using functions like `zeros`, `ones`, and `random.rand`. 38. What is the difference between a list and a NumPy array? Answer: Lists in Python are dynamic and can contain elements of different data types, while NumPy arrays are homogeneous and fixed in size. NumPy arrays are also faster and more memory-efficient than lists when working with large datasets. 39. How do you access elements of a NumPy array? Answer: Elements of a NumPy array can be accessed using indexing or slicing. For example, `arr[0]` would access the first element of the array, and `arr[1:3]` would access the elements at indices 1 and 2.
  • 102. 40. How do you find the shape and size of a NumPy array?
    Answer: The `shape` attribute of a NumPy array returns a tuple containing the dimensions of the array, while the `size` attribute returns the total number of elements in the array.
    41. How do you reshape a NumPy array?
    Answer: The `reshape` method can be used to change the shape of a NumPy array, provided the total number of elements stays the same. For example, `arr.reshape(2, 3)` reshapes a 1D array with 6 elements into a 2D array with 2 rows and 3 columns.
    42. How do you create a copy of a NumPy array?
    Answer: The `copy` method can be used to create a copy of a NumPy array. For example, `arr_copy = arr.copy()` creates a new, independent copy of `arr`.
    43. How do you concatenate two NumPy arrays?
    Answer: The `np.concatenate` function can be used to join two NumPy arrays. For example, `np.concatenate((arr1, arr2), axis=0)` joins `arr1` and `arr2` along the first axis, which stacks the rows of `arr2` below those of `arr1`.
    44. What is broadcasting in NumPy?
    Answer: Broadcasting is a feature in NumPy that allows arrays with different but compatible shapes to be used in arithmetic operations. The smaller array is broadcast to match the shape of the larger array, allowing the operation to be performed element-wise.
    45. What is the difference between a shallow copy and a deep copy of a NumPy array?
    Answer: A shallow copy of a NumPy array (a view, as returned by `arr.view()`) creates a new array object that shares the same data as the original array, so a change made through one is visible in the other. A deep copy (as returned by `arr.copy()`), on the other hand, creates a new array object with a separate copy of the data.
    46. How do you find the maximum and minimum values of a NumPy array?
    Answer: The `max` and `min` methods can be used to find the maximum and minimum values of a NumPy array. For example, `arr.max()` returns the maximum value in `arr`.
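The NumPy answers above (questions 40 to 46) can be checked with a short script. This is a minimal sketch, assuming NumPy is installed; the variable names and values are hypothetical.

```python
import numpy as np

arr = np.arange(6)                     # 1D array holding 0, 1, 2, 3, 4, 5
shape, size = arr.shape, arr.size      # Q40: (6,) and 6

grid = arr.reshape(2, 3)               # Q41: 2 rows, 3 columns
stacked = np.concatenate((grid, grid), axis=0)   # Q43: rows appended, shape (4, 3)

# Q44: the 1D array of length 3 is broadcast across both rows of grid.
shifted = grid + np.array([10, 20, 30])

largest, smallest = arr.max(), arr.min()   # Q46

# Q42 and Q45: a view shares data with arr, a copy does not.
view = arr.view()
deep = arr.copy()
arr[0] = 99
view_sees_change = view[0] == 99       # True: same underlying data
copy_sees_change = deep[0] == 99       # False: independent data
```

The last lines show the practical difference between the two kinds of copy: writing to `arr` changes what the view reports but leaves the deep copy untouched.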
  • 103. 47. How do you find the sum and product of a NumPy array?
    Answer: The `sum` and `prod` methods can be used to find the sum and product of the elements in a NumPy array, respectively. For example, `arr.sum()` returns the sum of the elements in `arr`.
    48. How do you compute the mean, median, and standard deviation of a NumPy array?
    Answer: The `mean` and `std` methods return the mean and standard deviation, as in `arr.mean()` and `arr.std()`. The median is computed with the `np.median` function, as in `np.median(arr)`.
    49. How do you pivot a DataFrame in Pandas?
    Answer: You can pivot a DataFrame with the `pivot` method by passing the row, column, and value names, such as `df.pivot(index='row_name', columns='column_name', values='value_name')`.
    50. How do you melt a DataFrame in Pandas?
    Answer: You can melt a DataFrame with the `melt` method by passing the id variables and value variables, such as `df.melt(id_vars=['id_var'], value_vars=['value_var'])`.
    Rough