UiPath Studio Web workshop series - Day 6

Hands-On with
UiPath Studio Web
Practical Task Demonstrations

Vajrang Billlakurthi
Transformation Leader
@Vajrang IT Services Pvt Ltd
Swathi Nelakurthi
Associate Automation Developer
@Vajrang IT Services Pvt Ltd

1. PDF Automation Overview
• Overview:
- The "PDF Automation Overview" project provides a detailed exploration of PDF automation functionalities in UiPath
Studio Web.
- It encompasses a variety of PDF-related activities, including downloading, text extraction, merging, page range
extraction, image extraction, password protection, and file uploading to Orchestrator Storage.
• Variables Used:
- SamplePDF (Type: File): Variable for the downloaded Sample PDF file.
- ScannedPDF (Type: File): Variable for the downloaded Scanned PDF file.
- SamplePDFText (Type: Text): Variable to store text extracted from SamplePDF.
- ScannedPDFText (Type: Text): Variable to store text extracted from ScannedPDF.
- PDFPageCount (Type: Int32): Variable to store the page count of a PDF.
- MergedPDF (Type: File): Variable to store the merged PDF file.
- ExtractedPDFPageRange (Type: File): Variable to store pages extracted from a PDF based on a specified range.
- ExtractedPDFImages (Type: IEnumerable<ILocalResource>): Variable to store the list of extracted images from a PDF.
- PasswordProtectedPDF (Type: File): Variable to store the password-protected PDF file.

Workflow Overview
• Create storage buckets to manage PDF files:
- PDF Inputs: Store input PDF files.
- PDF Outputs: Store output files generated during PDF automation.
• Upload files to respective buckets:
- Upload Sample.pdf to the PDF Inputs bucket.
- Upload Scanned.pdf to the PDF Inputs bucket.
• Activities:
- Utilize Orchestrator to create and manage storage buckets.
- Use Upload new File to upload files to the designated buckets.

• Text Extraction from Native PDF
• Extract text from Sample PDF:
- Utilize the "Extract PDF text" activity to extract text from the provided PDF file.
- Store the extracted text in a variable named "SamplePDFText".
• Write extracted text from Sample PDF to storage:
- Use the "Write Storage Text" activity to save the extracted text to a file.
- File Path*: Specify a file path with a ".txt" extension, such as "SamplePdf.txt".
• If the file already exists, the extracted text is written to it. If not, a new text file is created in the
specified storage bucket and the extracted text is written into it.
• Activities:
- Extract PDF text: This activity is used to extract text from the PDF file.
- Write Storage Text: This activity writes the extracted text to a file in the specified storage bucket.

• Text Extraction from Scanned PDF using OCR
• Extract text from Scanned PDF:
- Use the "Extract PDF text" activity to retrieve text from the scanned PDF file named "Scanned.pdf".
- Toggle on the option to apply OCR to extract text from images within the PDF.
- Store the OCR-extracted text in a variable named "ScannedPDFText".
• Write extracted text to storage:
- Use the "Write Storage Text" activity to save the OCR-extracted text to a file.
- File Path*: Specify a file path with a ".txt" extension, such as "ScannedPdf.txt".
• If the file already exists, the extracted text is written to it. If not, a new text file is created in the
specified storage bucket and the OCR-extracted text is written into it.
• Activities:
- Extract PDF text: This activity retrieves text from the PDF file, including text from images using OCR.
- Apply OCR: This option is toggled on to extract text from images within the PDF.
- Write Storage Text: This activity saves the OCR-extracted text to a file in the specified storage bucket.

• PDF Page Count
• Get total number of pages in PDF:
- Utilize the "Get PDF Page Count" activity to retrieve the total page count of the PDF.
• PDF file *: Specify the PDF file from which you want to obtain the page count (e.g., SamplePDF).
• The page count will be stored in the autogenerated variable "Page Count".
• Log PDF Page Count
- Use the "Log Message" activity to log the total page count.
• Message: "Total Pages in "+SamplePDFfile.FullName+" is "+pageCount.ToString
• Activities:
- Get PDF Page Count: This activity retrieves the total number of pages in the specified PDF document and stores it
in a variable.
- Log Message: Logs the total page count for reference or debugging purposes, indicating the PDF file name and its
total page count.
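The log message on this slide is built by plain string concatenation. As a rough illustration outside Studio Web, the same message can be sketched in Python. The function name `page_count_message`, the use of the bare file name, and the page count of 3 are assumptions made for this example; they are not taken from the workshop project.

```python
from pathlib import Path


def page_count_message(pdf_path: str, page_count: int) -> str:
    """Build the log text the same way the slide's expression does."""
    # Assumption: only the file name is shown, not the full path.
    file_name = Path(pdf_path).name
    # The integer is converted to text before concatenation,
    # just as the slide's expression calls ToString on the count.
    return "Total Pages in " + file_name + " is " + str(page_count)


# Hypothetical values for illustration only.
print(page_count_message("Sample.pdf", 3))
```

In the workflow itself, the equivalent expression goes into the Message field of the "Log Message" activity.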

• Image Extraction from PDF
• Extract images from PDF:
- Use the "Extract PDF Images" activity to extract images from the PDF.
• The activity autogenerates an output variable for the extracted images, but storing the result in a new,
descriptively named variable, "ExtractedPDFImages", makes the workflow easier to follow.
• Upload Each Extracted Image to PDF Outputs bucket
- Utilize a "For Each" activity to iterate through each extracted image.
• The autogenerated variable "CurrentItem" holds the current image being processed.
- Inside the For Each loop:
• Use the "Upload Storage File" activity to upload each image file to the "PDF Outputs" bucket.
• Activities:
- Extract PDF Images: Extracts images from the PDF document.
- For Each: Iterates through each image file in the variable that stores the extracted images.
- Upload Storage File: Uploads each extracted image to the specified storage bucket.
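The loop on this slide follows a common pattern: take a collection, visit one item at a time, and upload each item. The Python sketch below models that pattern only. The storage bucket is represented by an in-memory dictionary, each image by a (name, bytes) pair, and `upload_images` is a hypothetical helper; none of these are UiPath APIs.

```python
def upload_images(images: list[tuple[str, bytes]], bucket: dict[str, bytes]) -> list[str]:
    """Store every image in the bucket and return the uploaded paths."""
    uploaded = []
    for current_item in images:  # plays the role of the For Each "CurrentItem"
        name, data = current_item
        bucket[name] = data  # stand-in for the "Upload Storage File" step
        uploaded.append(name)
    return uploaded


# Hypothetical sample data: two tiny placeholder images.
extracted_pdf_images = [("image_1.png", b"first"), ("image_2.png", b"second")]
pdf_outputs: dict[str, bytes] = {}

print(upload_images(extracted_pdf_images, pdf_outputs))
```

Each pass through the loop handles exactly one image, which is why the upload activity sits inside the For Each body rather than after it.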

• Page Range Extraction
• Extract specific page range from PDF:
- Utilize the "Extract PDF Page Range" activity to generate a new PDF with specified page ranges.
• Provide the original PDF file from which the new PDF will be generated.
• Specify the page range to extract (e.g., "1,3-5" to extract pages 1, 3, 4, and 5).
• The activity autogenerates an output variable for the newly exported PDF, but storing the result in a new,
descriptively named variable, "ExtractedPDFPageRange", makes the workflow easier to follow.
• Upload Newly Generated PDF:
- Upload the newly generated PDF to the "PDF Outputs" bucket.
• Specify the file to be uploaded as the "ExtractedPDFPageRange" variable.
• Define the path where you want to upload the file in the storage bucket as "PDFByRange.pdf".
• Activities:
- Extract PDF Page Range: This activity extracts a specified page range from the PDF and creates a new PDF
file.
- Upload Storage File: This activity uploads the generated PDF to the specified storage bucket.
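To make the page range notation concrete, the following Python sketch expands a string such as "1,3-5" into individual page numbers. It illustrates the notation only; `parse_page_range` is a hypothetical helper, and how the activity itself parses the string internally is not covered in this workshop.

```python
def parse_page_range(spec: str) -> list[int]:
    """Expand a range string such as "1,3-5" into a list of page numbers."""
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            # "3-5" means every page from 3 through 5, inclusive.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.extend(range(start, end + 1))
        else:
            pages.append(int(part))
    return pages


print(parse_page_range("1,3-5"))  # [1, 3, 4, 5]
```

Commas separate independent selections and a hyphen marks an inclusive span, so "1,3-5" selects four pages in total.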

• Password Protected PDF File
• Create password protected PDF:
- Utilize the "Set PDF Password" activity to encrypt Sample.pdf with a password.
• Provide the original PDF file from which the new password-protected PDF will be generated.
• Under "Show Additional Options", enter the password in the "New open password" field (e.g., 123456).
• The activity autogenerates an output variable for the newly generated PDF, but storing the result in a new,
descriptively named variable, "PasswordProtectedPDF", makes the workflow easier to follow.
• Upload the password protected PDF:
- Use the "Upload Storage File" activity to upload the password-protected PDF to the "PDF Outputs" bucket.
• Specify the file to be uploaded as the "PasswordProtectedPDF" variable.
• Define the path where you want to upload the file in the storage bucket as "PasswordProtectedPDF.pdf".
• Activities:
- Set PDF Password: This activity encrypts the specified PDF file with a password.
- Upload Storage File: This activity uploads the generated PDF to the specified storage bucket.

• Merge PDF Files
• Merge Multiple PDF Files:
- Utilize the "Merge PDF" activity to merge the "Sample PDF" and "Scanned PDF" into a single PDF.
- In the connection builder, add the PDF files to be merged into the new PDF.
• The activity autogenerates an output variable for the newly generated PDF, but storing the result in a new,
descriptively named variable, "MergedPDF", makes the workflow easier to follow.
• Upload the merged PDF:
- Use the "Upload Storage File" activity to upload the merged PDF to the "PDF Outputs" bucket.
• Specify the file to be uploaded as the "MergedPDF" variable.
• Define the path where you want to upload the file in the storage bucket as "MergedPDF.pdf".
• Activities:
- Merge PDF files: This activity merges multiple PDF files into a single PDF document.
- Upload Storage File: This activity uploads the merged PDF to the specified storage bucket.

• Summary and Conclusion
• Summary:
- PDF automation in UiPath Studio Web enables efficient handling of PDF files.
- Various activities facilitate text extraction, image extraction, page manipulation, and security
enhancements.
- Orchestrator integration simplifies storage and management of PDF files.
• Conclusion:
- With PDF automation capabilities, UiPath Studio Web empowers users to streamline PDF processing
tasks.
- Explore the range of PDF activities to enhance document management workflows and increase
productivity.

Studio Web Tenant Access
https://forms.office.com/r/pMKwTRDkkw
