HTML files
Python can read and process HTML with the Beautiful Soup library. Using this library, we can search for HTML tags and extract specific data, such as the title of a page or the list of headers in the page.
Install Beautiful Soup
Use the Anaconda package manager to install the package and its dependencies.
conda install beautifulsoup4
Reading the HTML file
In the example below, we request a URL and load the response into the Python environment. We then pass 'html.parser' as the parser argument so that the entire HTML document is parsed. Finally, we print the first few characters of the formatted page.
# Python 3: urllib2 was split into urllib.request and urllib.error
from urllib.request import urlopen
from bs4 import BeautifulSoup

# Fetch the HTML file
response = urlopen('https://avanthimca.ac.in/')
html_doc = response.read()
# Parse the HTML file
soup = BeautifulSoup(html_doc, 'html.parser')
# Format the parsed HTML file
strhtm = soup.prettify()
# Print the first 225 characters
print(strhtm[:225])
Extracting Tag Value
• We can extract the value of the first instance of a tag using the following code.
from urllib.request import urlopen
from bs4 import BeautifulSoup

response = urlopen('https://www.osmania.ac.in/examination-results0.php')
html_doc = response.read()
soup = BeautifulSoup(html_doc, 'html.parser')
print(soup.title)         # the whole <title> tag
print(soup.title.string)  # only the text inside <title>
# soup.a and soup.b are None if the page has no such tag
print(soup.a.string)      # text of the first <a> tag
print(soup.b.string)      # text of the first <b> tag
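The attribute shortcuts above assume that the tag exists on the page. As a minimal sketch, assuming Beautiful Soup is installed and using a made-up HTML string instead of a live page, the following shows one way to handle a missing tag:

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, used so the example runs without a network call
sample_html = """
<html>
<head><title>Sample Page</title></head>
<body>
<a href="https://example.com/one">First link</a>
<a href="https://example.com/two">Second link</a>
</body>
</html>
"""

soup = BeautifulSoup(sample_html, 'html.parser')

title_text = soup.title.string   # text inside the first <title> tag
first_link = soup.a              # first <a> tag in the document
first_bold = soup.find('b')      # None, because the sample has no <b> tag

print(title_text)
print(first_link.string, first_link.get('href'))

# Guard against a missing tag before reading .string
bold_text = first_bold.string if first_bold is not None else None
print(bold_text)
```

Checking for None first avoids the AttributeError that soup.b.string would raise on a page without a <b> tag.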
Extracting All Tags
• We can extract the values of all instances of a tag using the following code.
from urllib.request import urlopen
from bs4 import BeautifulSoup

response = urlopen('http://tutorialspoint.com/python/python_overview.htm')
html_doc = response.read()
soup = BeautifulSoup(html_doc, 'html.parser')
# find_all() returns every matching tag, not just the first one
for x in soup.find_all('b'):
    print(x.string)
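When installing a third-party package is not an option, the same idea can be sketched with the standard library's html.parser module. This is a swapped-in alternative, not part of Beautiful Soup; the class name BoldTextCollector and the sample HTML are made up for illustration.

```python
from html.parser import HTMLParser


class BoldTextCollector(HTMLParser):
    """Collects the text found inside every <b> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many <b> tags are currently open
        self.values = []    # text of each top-level <b> element

    def handle_starttag(self, tag, attrs):
        if tag == 'b':
            self.depth += 1
            if self.depth == 1:
                self.values.append('')

    def handle_endtag(self, tag):
        if tag == 'b' and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep text only while we are inside a <b> element
        if self.depth > 0:
            self.values[-1] += data


sample_html = "<p><b>Python</b> is <b>easy</b> to learn.</p>"
collector = BoldTextCollector()
collector.feed(sample_html)
collector.close()
for value in collector.values:
    print(value)
```

The event-driven parser needs more code than find_all('b'), which is one reason Beautiful Soup is convenient for this kind of task.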
Creating an HTML file in Python
• We will store HTML tags in a multi-line Python string and save the contents to a new file. The file is saved with a .html extension rather than a .txt extension.
f = open('GFG.html', 'w')

# the HTML code which will go in the file GFG.html
html_template = """<html>
<head>
<title>Title</title>
</head>
<body>
<h2>Welcome To GFG</h2>

<p>Default code has been loaded into the Editor.</p>

</body>
</html>
"""

# writing the code into the file
f.write(html_template)

# close the file
f.close()
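A variation on the same idea, sketched under a few assumptions: the file name report.html, the helper build_page and the sample values are hypothetical. It uses a with block so that the file is closed automatically, and html.escape so that text containing characters such as < or & does not break the markup.

```python
import html
import os
import tempfile


def build_page(title, message):
    """Return a small HTML document with the given title and message."""
    return (
        "<html>\n"
        "<head>\n"
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n"
        "<body>\n"
        f"<p>{html.escape(message)}</p>\n"
        "</body>\n"
        "</html>\n"
    )


page = build_page("Demo", "Use <b> for bold & <i> for italics")

# Write into a temporary directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "report.html")
    with open(path, "w", encoding="utf-8") as f:
        f.write(page)
    with open(path, encoding="utf-8") as f:
        saved = f.read()

print(saved)
```

Escaping matters only when part of the page comes from variables; a fixed template like the one above can be written as is.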
Viewing the HTML source file
• To display the contents of the HTML file as Python output, we will use the codecs library, which opens files with a given encoding. Its open() function takes an encoding parameter. In Python 2, the built-in open() function had no such parameter, which made it difficult to view files that are UTF-8 rather than ASCII. In Python 3, the built-in open() also accepts an encoding argument, so either function can be used.
# import module
import codecs

# to open/create a new HTML file in write mode
f = open('GFG.html', 'w')

# the HTML code which will go in the file GFG.html
html_template = """
<html>
<head></head>
<body>
<p>Hello World! </p>

</body>
</html>
"""

# writing the code into the file
f.write(html_template)

# close the file
f.close()

# viewing HTML files
# the code below creates a
# codecs.StreamReaderWriter object
file = codecs.open("GFG.html", 'r', "utf-8")

# using the .read method to view the HTML
# code from our object
print(file.read())

# close the file once it has been read
file.close()
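In Python 3 the built-in open() function accepts an encoding argument as well, so the codecs module is optional for this task. A minimal sketch, assuming a hypothetical file name and sample text that contains non-ASCII characters:

```python
import os
import tempfile

html_text = "<html><body><p>Café, naïve, 日本語</p></body></html>"

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "utf8_demo.html")  # hypothetical file name

    # Write and read with an explicit encoding instead of the platform default
    with open(path, "w", encoding="utf-8") as f:
        f.write(html_text)
    with open(path, "r", encoding="utf-8") as f:
        content = f.read()

    # The same file read as raw bytes, without decoding
    with open(path, "rb") as f:
        raw = f.read()

# ascii() keeps the output printable on consoles that are not UTF-8
print(ascii(content))
print(len(content), "characters,", len(raw), "bytes")
```

The byte count is larger than the character count because UTF-8 stores each non-ASCII character in more than one byte.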
Viewing the HTML web file
• In Python, the webbrowser module provides a high-level interface for displaying web-based documents to users.
• The webbrowser module can launch a browser in a platform-independent manner, as shown below:
# import module
import webbrowser

# open the HTML file in the default browser
webbrowser.open('GFG.html')
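Browsers and platforms do not always handle a bare relative file name the same way. One common approach, sketched here as an assumption rather than a requirement of the module, is to convert the path to an absolute file:// URI first. The helper names html_file_uri and open_in_browser are hypothetical.

```python
import os
import webbrowser
from pathlib import Path


def html_file_uri(filename):
    """Return an absolute file:// URI for a local file name."""
    return Path(os.path.abspath(filename)).as_uri()


def open_in_browser(filename):
    """Open a local HTML file in a new browser tab; returns True on success."""
    return webbrowser.open_new_tab(html_file_uri(filename))


uri = html_file_uri('GFG.html')
print(uri)
```

Calling open_in_browser('GFG.html') would then launch the default browser. It is left uncalled here so that the sketch also runs on a machine without a display.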
Pattern Matching with the glob Module
• With the Python glob module, we can find all path names that match a pattern we specify.
• The pattern follows the rules of the Unix shell, using wildcards such as *, ? and [ ]. These are shell-style wildcards, not regular expressions.
• The matching path names are returned in arbitrary order.
• A pattern can only match files that exist at the given location, because the module works by walking through the list of files on the local disk.
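To make the wildcard rules concrete, here is a small sketch that builds a temporary directory with hypothetical file names and tries the three basic wildcards. The results are sorted because glob does not guarantee any order.

```python
import glob
import os
import tempfile

names = ["a1.py", "a2.py", "b1.py", "notes.txt"]  # hypothetical file names

with tempfile.TemporaryDirectory() as folder:
    for name in names:
        open(os.path.join(folder, name), "w").close()

    def match(pattern):
        """Return the sorted base names matching pattern inside folder."""
        paths = glob.glob(os.path.join(folder, pattern))
        return sorted(os.path.basename(p) for p in paths)

    star = match("*.py")         # * matches any run of characters
    question = match("?1.py")    # ? matches exactly one character
    bracket = match("[ab]2.py")  # [ab] matches one character from the set

print(star)
print(question)
print(bracket)
```

Only a2.py matches the last pattern, since no file named b2.py was created.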
Glob Module Functions
• We will now look at the functions of the glob module and how they work inside a Python program.
• We will also see how these functions help with pattern matching.
• The glob module provides the following functions for file name pattern matching:
• iglob()
• glob()
• escape()
• 1. iglob() Function: The iglob() function yields the matching path names one at a time, in arbitrary order.
• Calling iglob() gives us an iterator that we can use to list the files under a given directory.
• The iterator yields the same values as glob(), but without storing all of the file names in memory at once.
• Syntax: The iglob() function of the glob module is called as follows:
iglob(pathname, *, recursive=False)
# Import glob module in the program
import glob as gb
# Initialize a variable
inVar = gb.iglob("*.py") # Set pattern in iglob() function
# Print the class type of the variable
print(type(inVar))
# Print the names of all files that matched the pattern
print("List of all the files in the directory having extension .py: ")
for py in inVar:
    print(py)
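Because iglob() returns an iterator, we can stop early without building the full list. A minimal sketch, with hypothetical file names in a temporary directory, that takes only the first two matches:

```python
import glob
import itertools
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    for i in range(5):
        # Hypothetical files: script0.py ... script4.py
        open(os.path.join(folder, f"script{i}.py"), "w").close()

    matches = glob.iglob(os.path.join(folder, "*.py"))

    # An iterator returns itself from iter(); a list would not
    is_iterator = iter(matches) is matches

    # Consume only the first two results and ignore the rest
    first_two = [os.path.basename(p) for p in itertools.islice(matches, 2)]

print(is_iterator)
print(len(first_two))
```

Which two names are returned depends on the order in which the operating system lists the directory, so the sketch prints only the count.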
glob() Function:
• With the glob() function, we can also get the list of files that match a specific pattern, which we pass to the function.
• glob() returns a list of strings; each string is a path name that matches the pattern we passed.
• The list contains the same values that iglob() would yield, but glob() builds and stores the whole list in memory before returning it.
glob(pathname, *, recursive=False)
# Import glob module in the program
import glob as gb
# Initialize a variable
genVar = gb.glob("*.py") # Set pattern in glob() function
# Print the names of all files that matched the pattern
print("List of all the files in the directory having extension .py: ")
for py in genVar:
    print(py)
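The recursive parameter in the syntax line only has an effect when the pattern contains **. A sketch with a hypothetical directory tree, comparing a plain pattern with a recursive one:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Hypothetical layout: top.py, pkg/inner.py, pkg/sub/deep.py
    os.makedirs(os.path.join(folder, "pkg", "sub"))
    for relative in ["top.py", "pkg/inner.py", "pkg/sub/deep.py"]:
        open(os.path.join(folder, *relative.split("/")), "w").close()

    # Only the top-level directory is searched
    flat = glob.glob(os.path.join(folder, "*.py"))

    # ** with recursive=True also descends into subdirectories
    deep = glob.glob(os.path.join(folder, "**", "*.py"), recursive=True)

    flat_names = sorted(os.path.basename(p) for p in flat)
    deep_names = sorted(os.path.basename(p) for p in deep)

print(type(flat).__name__)
print(flat_names)
print(deep_names)
```

On a large directory tree a recursive search can take a long time, which is a good reason to prefer iglob() there.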
escape() Function:
• The escape() function escapes the special characters ('?', '*' and '[') in the path name passed to it.
• It is handy for locating files whose names contain these characters.
• The escaped pattern matches those characters literally instead of treating them as wildcards.
escape(pathname)
# Import glob module in the program
import glob as gb
# Characters to look for in the file names
charSeq = "-_#"
print("Following is the list of file names that contain a character from the sequence: ")
# Using nested for loops to get the file names
for splChar in charSeq:
    # Pathname for the glob() function
    # (escape() leaves '-', '_' and '#' unchanged; it only rewrites '?', '*' and '[')
    escSet = "*" + gb.escape(splChar) + "*" + ".py"
    # Print the file names found with the glob() function
    for py in gb.glob(escSet):
        print(py)
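The characters in the example above are not wildcards, so escape() returns them unchanged. Its effect shows when a file name contains a real wildcard character such as [. A minimal sketch with two hypothetical file names in a temporary directory:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Hypothetical file names: one contains literal square brackets
    for name in ["data[1].txt", "data1.txt"]:
        open(os.path.join(folder, name), "w").close()

    # Unescaped: [1] is read as a character set, so this matches data1.txt
    plain_pattern = os.path.join(folder, "data[1].txt")
    plain = [os.path.basename(p) for p in glob.glob(plain_pattern)]

    # Escaped: the brackets are matched literally
    escaped_pattern = os.path.join(folder, glob.escape("data[1].txt"))
    escaped = [os.path.basename(p) for p in glob.glob(escaped_pattern)]

print(glob.escape("data[1].txt"))
print(plain)
print(escaped)
```

Without escape(), the file whose name really contains the brackets is never found, which is easy to miss when file names come from user input.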

HTML files in python.pptx

  • 1. HTML files Library known as beautifulsoup. Using this library, we can search for the values of html tags and get specific data like title of the page and the list of headers in the page. Install Beautifulsoup Use the Anaconda package manager to install the required package and its dependent packages. conda install Beaustifulsoap Reading the HTML file In the below example we make a request to an url to be loaded into the python environment. Then use the html parser parameter to read the entire html file. Next, we print first few lines of the html page.
  • 2. • import urllib2 • from BeautifulSoup4 import BeautifulSoup • # Fetch the html file • response = urllib2.urlopen(https://avanthimca.ac.in/') • html_doc = response.read() • # Parse the html file • soup = BeautifulSoup(html_doc, 'html.parser') • # Format the parsed html file • strhtm = soup.prettify() • # Print the first few characters • print (strhtm[:225])
  • 3. Extracting Tag Value • We can extract tag value from the first instance of the tag using the following code. • import urllib2 • from bs4 import BeautifulSoup • response = urllib2.urlopen(https://www.osmania.ac.in/examination-results0.php') • html_doc = response.read() • soup = BeautifulSoup(html_doc, 'html.parser') • print (soup.title) • print(soup.title.string) • print(soup.a.string) • print(soup.b.string)
  • 4. Extracting All Tags • We can extract tag value from all the instances of a tag using the following code. • import urllib2 • from bs4 import BeautifulSoup • response = urllib2.urlopen('http://tutorialspoint.com/python/python_overview.htm') • html_doc = response.read() • soup = BeautifulSoup(html_doc, 'html.parser') • for x in soup.find_all('b'): print(x.string)
  • 5. • Creating an HTML file in python • We will be storing HTML tags in a multi-line Python string and saving the contents to a new file. This file will be saved with a .html extension rather than a .txt extension.
  • 6. • f = open('GFG.html', 'w') • • # the html code which will go in the file GFG.html • html_template = """<html> • <head> • <title>Title</title> • </head> • <body> • <h2>Welcome To GFG</h2> • • <p>Default code has been loaded into the Editor.</p> • • </body> • </html> • """ • • # writing the code into the file • f.write(html_template) • • # close the file • f.close()
  • 7. Viewing the HTML source file • To display the HTML file as Python output, we use the codecs library. This library opens files that have a certain encoding, which we pass through its encoding parameter. In Python 2 the built-in open() function had no parameter for the file encoding, which made it difficult to view files that are UTF-8 rather than ASCII. In Python 3, open() also accepts an encoding argument, so codecs.open() is mainly needed for older code.
  • 8. • # import module • import codecs • • # to open/create a new html file in the write mode • f = open('GFG.html', 'w') • • # the html code which will go in the file GFG.html • html_template = """ • <html> • <head></head> • <body> • <p>Hello World! </p> • • </body> • </html> • """ • • # writing the code into the file
  • 9. • f.write(html_template) • • # close the file • f.close() • • # viewing html files • # below code creates a • # codecs.StreamReaderWriter object • file = codecs.open("GFG.html", 'r', "utf-8") • • # using .read method to view the html • # code from our object • print(file.read())
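As a rough comparison of the two ways to read an encoded file, the sketch below writes a small UTF-8 file and reads it back with both the built-in open() and codecs.open(). The non-ASCII sample text and the temporary directory are assumptions made for the demonstration.

```python
import codecs
import os
import tempfile

# Non-ASCII characters make encoding mistakes visible (sample text is made up)
html_template = "<html><body><p>Café, naïve, 日本語</p></body></html>"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "GFG.html")

    with open(path, "w", encoding="utf-8") as f:
        f.write(html_template)

    # Python 3 built-in open() with an explicit encoding
    with open(path, "r", encoding="utf-8") as f:
        text_builtin = f.read()

    # codecs.open(), as used on the slides
    with codecs.open(path, "r", "utf-8") as f:
        text_codecs = f.read()

print(text_builtin == text_codecs)
```

Both calls decode the same bytes, so on Python 3 the choice between them is mostly a matter of compatibility with older code.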
  • 10. Viewing the HTML web file • In Python, the webbrowser module provides a high-level interface for displaying web-based documents to users. It can launch a browser in a platform-independent manner, as shown below: • # import module • import webbrowser • • # open html file • webbrowser.open('GFG.html')
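Passing a bare file name to webbrowser.open() works on many systems, but a file:// URI is the more portable form. The sketch below builds that URI from a local path; the helper names html_file_uri and open_html are hypothetical, and the browser is only launched when open_html() is called.

```python
import webbrowser
from pathlib import Path


def html_file_uri(filename):
    """Return a file:// URI for filename, relative to the current directory."""
    return Path(filename).absolute().as_uri()


def open_html(filename):
    """Open a local HTML file in the default browser; True means a browser was launched."""
    return webbrowser.open(html_file_uri(filename))


uri = html_file_uri("GFG.html")
print(uri)
```

Calling open_html('GFG.html') after the file has been written should display the page in the default browser.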
  • 11. Regular Expressions with GLOB module • With the Python glob module, we can find all the path names that match a specific pattern defined by us. • The pattern follows the rules of the Unix shell: * matches any number of characters, ? matches a single character, and [ ] matches one character from a set or range. These are shell-style wildcards rather than full regular expressions. • The matching path names are returned in arbitrary order. • Because the module walks through the files at some location on the local disk, the pattern must be a valid path specification, either relative or absolute.
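The three wildcard rules can be tried out on a throwaway directory. This is a small sketch: the file names and the match() helper are invented for the demonstration, and results are sorted because glob returns them in arbitrary order.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create a few empty files to match against (names are made up)
    for name in ["a1.py", "a2.py", "ab.py", "b1.py", "notes.txt"]:
        open(os.path.join(tmp_dir, name), "w").close()

    def match(pattern):
        """Return the sorted base names in tmp_dir that match pattern."""
        paths = glob.glob(os.path.join(tmp_dir, pattern))
        return sorted(os.path.basename(p) for p in paths)

    star = match("*.py")            # * matches any run of characters
    question = match("a?.py")       # ? matches exactly one character
    char_class = match("[ab]1.py")  # [ab] matches one of the listed characters

print(star)
print(question)
print(char_class)
```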
  • 12. Glob Module Functions • We now look at the main functions of the glob module and how each one works inside a Python program. • We will also see how these functions help with the pattern matching task. • The glob module provides the following functions for filename pattern matching: • iglob() • glob() • escape()
  • 13. • 1. iglob() Function: The iglob() function of the glob module yields the matching file names one at a time, in arbitrary order. • Calling iglob() returns an iterator that behaves like a Python generator, and we can loop over it to list the files under a given directory. • The iterator produces each file name on demand, so the file names are never all stored in memory simultaneously. • Syntax: Following is the syntax for using the iglob() function of the glob module inside a Python program: • iglob(pathname, *, recursive=False)
  • 14. • # Import glob module in the program • import glob as gb • # Initialize a variable • inVar = gb.iglob("*.py") # Set Pattern in iglob() function • # Returning class type of variable • print(type(inVar)) • # Printing list of names of all files that matched the pattern • print("List of the all the files in the directory having extension .py: ") • for py in inVar: • print(py)
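To make the on-demand behaviour visible, the sketch below pulls one result with next(), consumes the rest, and then shows that the iterator is exhausted. The temporary directory and file names are assumptions for the demonstration.

```python
import glob
import os
import tempfile
from collections.abc import Iterator

with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ["one.py", "two.py", "readme.md"]:
        open(os.path.join(tmp_dir, name), "w").close()

    matches = glob.iglob(os.path.join(tmp_dir, "*.py"))
    is_iterator = isinstance(matches, Iterator)   # an iterator, not a list
    is_list = isinstance(matches, list)

    first = os.path.basename(next(matches))            # pull a single result
    rest = [os.path.basename(p) for p in matches]      # consume the remainder
    leftover = list(matches)                           # nothing is left now

print(is_iterator, is_list)
print(sorted([first] + rest))
print(leftover)
```

Because the iterator can only be consumed once, wrap the call in list() or use glob() when the results are needed more than once.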
  • 15. glob() Function: • With the glob() function, we can also get the list of files that match a specific pattern (we define that pattern inside the function). • The pathname argument must be a string containing a path specification, either relative or absolute. • glob() returns the same values that iglob() yields, but as a list, so all of the file names are stored in memory at once. • glob(pathname, *, recursive=False)
  • 16. • # Import glob module in the program • import glob as gb • # Initialize a variable • genVar = gb.glob("*.py") # Set Pattern in glob() function • # Printing list of names of all files that matched the pattern • print("List of the all the files in the directory having extension .py: ") • for py in genVar: • print(py)
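The recursive parameter in the signature changes how the ** pattern behaves: with recursive=True it matches any number of directory levels, including none. A minimal sketch on an invented directory tree:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Build a small tree: top.py, pkg/mod.py, pkg/sub/deep.py (names are made up)
    os.makedirs(os.path.join(tmp_dir, "pkg", "sub"))
    for rel in ["top.py",
                os.path.join("pkg", "mod.py"),
                os.path.join("pkg", "sub", "deep.py")]:
        open(os.path.join(tmp_dir, rel), "w").close()

    # Non-recursive: only files directly inside tmp_dir
    flat = glob.glob(os.path.join(tmp_dir, "*.py"))
    # Recursive: ** also descends into subdirectories
    deep = glob.glob(os.path.join(tmp_dir, "**", "*.py"), recursive=True)

    returns_list = isinstance(flat, list)
    flat_names = sorted(os.path.basename(p) for p in flat)
    deep_names = sorted(os.path.basename(p) for p in deep)

print(returns_list)
print(flat_names)
print(deep_names)
```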
  • 17. escape() Function: • The escape() function escapes the special characters in the path name that we pass to it. • The special characters *, ? and [ are each wrapped in square brackets so that they match themselves instead of acting as wildcards. • This is very handy for locating files that have those characters in their file names, because the escaped string matches them as an arbitrary literal string. • escape(pathname)
  • 18. • # Import glob module in the program • import glob as gb • # Initialize a variable with the characters to look for • charSeq = "-_#" • print("Following is the list of filenames that match the special character sequence of escape function: ") • # Using nested for loop to get the filenames • for splChar in charSeq: • # Pathname for the glob() function • # (escape() only changes *, ? and [, so -, _ and # pass through unchanged) • escSet = "*" + gb.escape(splChar) + "*" + ".py" • # Printing list of filenames with glob() function • for py in (gb.glob(escSet)): • print(py)
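The effect of escape() is easiest to see with a file name that contains a real wildcard character. In the sketch below, the file names and the temporary directory are assumptions; the unescaped pattern treats [1] as a character set, while the escaped pattern matches the brackets literally.

```python
import glob
import os
import tempfile

escaped_name = glob.escape("report[1].txt")   # "[" is wrapped as "[[]"
escaped_star = glob.escape("*.py")            # "*" is wrapped as "[*]"

with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ["report[1].txt", "report1.txt"]:
        open(os.path.join(tmp_dir, name), "w").close()

    # Unescaped: [1] is a character set, so this matches report1.txt
    plain_hits = glob.glob(os.path.join(tmp_dir, "report[1].txt"))
    # Escaped: the brackets are literal, so this matches report[1].txt
    literal_hits = glob.glob(os.path.join(tmp_dir, escaped_name))

    plain_names = [os.path.basename(p) for p in plain_hits]
    literal_names = [os.path.basename(p) for p in literal_hits]

print(escaped_name, escaped_star)
print(plain_names)
print(literal_names)
```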