| Web Scraping and Automation With Outsystems
No API? No Problem! Let the Robot Do Your Work!
Web Scraping and Automation With OutSystems
Miguel Antunes
OutSystems MVP - Tech Lead | Do iT Lean
@ miguel.antunes@doitlean.com
in /antunes-miguel
we ♥ APIs,
but… we don’t always have them
Pulling data straight out of HTML – otherwise known as web scraping.
Any content that can be viewed on a webpage can be scraped.
But… Why Should You Scrape?
No Rate-Limiting
Anonymous Access
The Data’s Already in Your Face
Let’s Get to Scraping
No matter what language you’re into, there’s a great scraping library for your project:
● Python: BeautifulSoup or Scrapy
● Ruby: Upton, Wombat, or Nokogiri
● Node: Scraperjs or X-ray
● Go: Scrape
● Java: Jaunt
+ Text and HTML Processing
Leonardo Fernandes
Head of Delivery OutSystems, MVP | Phoenix Services
Extract information from plain text data with regular expressions, or from HTML with CSS selectors.
Manipulate HTML documents with ease, and sanitize user input against HTML injection.
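To make two of these ideas concrete, here is a minimal Python sketch of regular-expression extraction and of escaping user input against HTML injection. It is not the API of the component described above; the sample text, the e-mail pattern, and the function names are assumptions chosen only for illustration, and everything comes from the Python standard library.

```python
import html
import re

# Hypothetical plain-text input, made up for this sketch.
TEXT = "Order 42 shipped to ana@example.com and rui@example.org on 2020-03-15."

# A deliberately simple e-mail-like pattern (an assumption, not a full validator).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")


def extract_emails(text):
    """Return every e-mail-like token found in plain text."""
    return EMAIL_RE.findall(text)


def sanitize(user_input):
    """Escape HTML metacharacters so the input renders as text, not markup."""
    return html.escape(user_input, quote=True)


emails = extract_emails(TEXT)
safe = sanitize('<script>alert("x")</script>')

print(emails)
print(safe)
```

Escaping on output, as `sanitize` does, leaves the stored value untouched and only changes how it is rendered.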
The Plan
● Pinpoint your target: a simple HTML website
● Design your scraping scheme
● Run & let the magic happen
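The three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the page content, the field names, and the `SchemeScraper` class are hypothetical, and the sketch matches elements by tag and class with the standard-library `html.parser` rather than with CSS selectors.

```python
from html.parser import HTMLParser

# Step 1: the target. A hypothetical static page, inlined so the sketch runs offline.
PAGE = """
<html><body>
  <h1 class="title">Fresh Tomatoes</h1>
  <span class="price">2.49</span>
  <span class="origin">Portugal</span>
</body></html>
"""

# Step 2: the scraping scheme. Field name -> (tag, class) to look for.
SCHEME = {
    "name": ("h1", "title"),
    "price": ("span", "price"),
    "origin": ("span", "origin"),
}


class SchemeScraper(HTMLParser):
    """Collects the text of the first element matching each scheme entry.

    Simplification: elements of the same tag nested inside a match are not handled.
    """

    def __init__(self, scheme):
        super().__init__()
        self.scheme = scheme
        self.result = {}
        self._current = None  # field whose text is currently being captured

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for field, (want_tag, want_class) in self.scheme.items():
            if tag == want_tag and want_class in classes and field not in self.result:
                self._current = field
                self.result[field] = ""
                break

    def handle_data(self, data):
        if self._current is not None:
            self.result[self._current] += data

    def handle_endtag(self, tag):
        if self._current is not None and tag == self.scheme[self._current][0]:
            self.result[self._current] = self.result[self._current].strip()
            self._current = None


# Step 3: run it.
def scrape(page, scheme):
    parser = SchemeScraper(scheme)
    parser.feed(page)
    return parser.result


record = scrape(PAGE, SCHEME)
print(record)
```

Keeping the scheme as plain data means a layout change on the target site only requires editing `SCHEME`, not the parser.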
Hands-on!
What about Enterprise usage?
You may ask...
Frankort & Koning needs
● Check whether a product or producer is certified
● Do that multiple times per day, multiple times per product
GLOBALG.A.P. problems
● No API available
● All checks need to be done manually
How does it work…
You want to know which farm produced your product?
● On the packaging of several products, you can find a 13-digit GLOBALG.A.P. Number (GGN). This number identifies the producer or producer group that has farmed your product.
● As a consumer, you can use it to verify in the GLOBALG.A.P. Database whether the product is from a certified producer.
● Retailers also use this number for business-to-business traceability to ensure that products, especially fresh fruit and vegetables, come from a certified origin and that the production is safe and sustainable.
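Before any lookup is automated, it can help to reject inputs that cannot be a GGN at all. The sketch below checks only the one property stated above, that a GGN has 13 digits. The function name and the sample values are made up for illustration, and passing this check says nothing about whether a producer is actually certified.

```python
import re

# Exactly 13 ASCII digits, as described for the GGN above.
GGN_RE = re.compile(r"[0-9]{13}")


def looks_like_ggn(value):
    """Return True if value is exactly 13 digits once spaces are removed.

    This is a format check only; it does not query any database.
    """
    return GGN_RE.fullmatch(value.replace(" ", "")) is not None


# Made-up sample values, not real GGNs.
samples = ["1234567890123", "1234 5678 90123", "12345", "12345678901AB"]
checks = [looks_like_ggn(s) for s in samples]
print(checks)
```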
OutSystems + Selenium + Chrome
● Automate user interactions
● Extract HTML
● Parse HTML as before
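The talk drives Selenium from OutSystems; the same three steps can be sketched with Selenium's Python bindings, which are swapped in here purely for illustration. The sketch assumes the `selenium` package (version 4 or later) and Chrome are installed. The field name `ggn`, the `status` class, and the page layout are hypothetical placeholders, not the real site's markup.

```python
import re

# Hypothetical markup: the result page is assumed to show <td class="status">...</td>.
STATUS_RE = re.compile(r'<td class="status">\s*([^<]+?)\s*</td>')


def fetch_result_html(search_url, ggn):
    """Automate the user interaction in headless Chrome and return the page HTML."""
    # Imported lazily so the parsing half of this sketch works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")  # no visible browser window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(search_url)
        box = driver.find_element(By.NAME, "ggn")  # hypothetical input name
        box.send_keys(ggn)
        box.send_keys(Keys.ENTER)  # submit the search as a user would
        # Wait up to 10 seconds for the (hypothetical) result cell to appear.
        WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.CLASS_NAME, "status"))
        )
        return driver.page_source  # extract the rendered HTML
    finally:
        driver.quit()


def parse_status(page_html):
    """Parse the extracted HTML; return the status text or None if absent."""
    match = STATUS_RE.search(page_html)
    return match.group(1) if match else None


# Offline check of the parsing step with made-up HTML.
sample = '<table><tr><td class="status"> Valid </td></tr></table>'
status = parse_status(sample)
print(status)
```

Separating `fetch_result_html` from `parse_status` keeps the browser automation, which is slow and brittle, apart from the parsing, which can be tested on saved HTML.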
Let’s see it in action...
700+ Producers
160+ Products
900+ Certificates
~15h Manually*
~2h Automatically
*estimating that each certificate would take 1 minute to check manually
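The manual estimate follows directly from the footnote's assumption. A small worked calculation, using the rounded figures from this slide (900 certificates, 1 minute each, about 2 hours automated), shows where the ~15h comes from and what the gain is; the variable names are only illustrative.

```python
certificates = 900        # "900+" on the slide, rounded down
minutes_per_check = 1     # the footnote's assumption for a manual check
automated_hours = 2       # "~2h" on the slide

manual_hours = certificates * minutes_per_check / 60
speedup = manual_hours / automated_hours
hours_saved = manual_hours - automated_hours

print(manual_hours, speedup, hours_saved)
```

With these inputs the manual run comes to 15 hours, so the automated run is about 7.5 times faster and saves roughly 13 hours per full pass.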
Thank You!
@ miguel.antunes@doitlean.com
in /antunes-miguel

Using IESVE for Room Loads Analysis - Australia & New Zealand
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
 
Accelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with PlatformlessAccelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with Platformless
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
 
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdfEnhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
 
Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
 

No API? No Problem! Let the Robot Do Your Work! Web Scraping and Automation With Outsystems

  • 1. No API? No Problem! Let the Robot Do Your Work! Web Scraping and Automation With Outsystems
  • 2. Miguel Antunes, OutSystems MVP - Tech Lead | Do iT Lean, miguel.antunes@doitlean.com, /antunes-miguel
  • 3. we ♥ APIs, but… we don’t always have them
  • 4. Pulling data straight out of HTML – otherwise known as web scraping.
  • 5. Any content that can be viewed on a webpage can be scraped.
  • 6. but… Why You Should Scrape?
  • 7. [no slide text]
  • 8. No Rate-Limiting
  • 9. Anonymous Access
  • 10. The Data’s Already in Your Face
  • 11. Let’s Get to Scraping
  • 12. No matter what language you’re into, there’s a great scraping library for your project: ● BeautifulSoup or Scrapy, Python ● Upton or Wombat or Nokogiri, Ruby ● Scraperjs or X-ray, Node ● Scrape, Go ● Jaunt, Java
  • 13. + Text and HTML Processing
  • 14. Leonardo Fernandes, Head of Delivery OutSystems, MVP | Phoenix Services
  • 15. Extract information from plain text data with regular expressions, or from HTML with CSS selectors. Manipulate HTML documents with ease, and sanitize user input against HTML injection.
  • 16. The Plan ● Pinpoint your target: a simple HTML website ● Design your scraping scheme ● Run & let the magic operate
  • 17. Hands-on!
  • 18. What about Enterprise usage? You may ask...
  • 19. [no slide text]
  • 20. Frankort & Koning needs ● Check whether products/producers are certified ● Do that multiple times per day, multiple times per product GLOBALG.A.P. problems ● No API available ● All the checks need to be done manually
  • 21. How does it work… You want to know which farm produced your product? ● On the packaging of several products, you can find a 13-digit GLOBALG.A.P. Number (GGN). This number identifies the producer or producer group that has farmed your product. ● As a consumer, you can use it to verify in the GLOBALG.A.P. Database whether the product is from a certified producer or not. ● Retailers also use this number for business-to-business traceability to ensure that products – especially fresh fruit and vegetables – come from a certified origin and that the production is safe and sustainable.
  • 22. [no slide text]
  • 23. OutSystems + Selenium + Chrome ● Automate user interactions ● Extract HTML ● Parse HTML as before
  • 24. Let’s see it in action...
  • 25. 700+ Producers, 160+ Products, 900+ Certificates. ~15h manually* vs. ~2h automatically. *Estimating that each certificate would take 1 minute to check manually
  • 26. Thank You! miguel.antunes@doitlean.com, /antunes-miguel

Editor's Notes

  1. Thank you all for being here. Let me also thank OutSystems for letting me be here on stage to talk about a topic that I really like: Web Scraping and Automation with OutSystems.
  2. APIs are like the One Ring of programming, enabling you to pull info and perform actions across different services. I doubt Slack would be nearly as popular as it is without all those cool API integrations. Now, considering how popular APIs are these days, it’s frustrating to run into a service or site without one. But it’s actually quite common. Netflix shut down its API years ago. My bank doesn’t have one. Most news sources don’t either. Bottom line: many apps and data aren’t designed for programmatic access. But don’t let that discourage you from building your next big thing. If you need to collect data or perform an action on the web without access to an API, there are a couple of ways you can hack it.
  3. If a website provides a way for a visitor’s browser to download content and render that content in a structured way, then almost by definition, that content can be accessed programmatically. In this presentation, I’ll show you how. Over the past few years, I’ve scraped dozens of websites – from cinema blogs and modeling agencies to cooking recipe sites and undocumented JSON endpoints that I found by inspecting network traffic in my browser, you name it. There are some tricks that site owners will use to thwart this type of access – which we’ll dive into later – but they almost all have simple workarounds.
  4. Let me share with you some good reasons why web scraping is a good thing. Of course, the first one is when we don’t have an API.
  5. Site owners generally care way more about maintaining their public-facing visitor website than they do about their structured data feeds. We’ve seen it very publicly with Twitter clamping down on their developer ecosystem, and I’ve seen it multiple times in my projects where APIs change or feeds move without warning. Sometimes it’s deliberate, but most of the time these sorts of problems happen because no one at the organization really cares about or maintains the structured data. If it goes offline or gets horribly mangled, no one really notices. On the other hand, if the website goes down or is having issues, that’s more of an in-your-face, drop-everything-until-this-is-fixed kind of problem, and it gets dealt with quickly.
  6. Another thing to think about is that the concept of rate-limiting is virtually non-existent for public websites. Aside from the occasional captchas on sign-up pages, most businesses generally don’t build a lot of defenses against automated access. I’ve scraped a single site for over 4 hours at a time and not seen any issues. Unless you’re making a high number of concurrent requests, you probably won’t be viewed as a DDoS attack; you’ll just show up as a super-avid visitor in the logs, in case anyone’s looking.
  7. There are also fewer ways for the website’s administrators to track your behavior, which can be useful if you want to gather data more privately. With APIs, you often have to register to get a key and then send along that key with every request. But with simple HTTP requests, you’re basically anonymous besides your IP address and cookies, which can be easily spoofed.
  8. Web scraping is also universally available, as I mentioned earlier. You don’t have to wait for a site to come up with an API or even contact anyone at the organization to ask for it. Just spend some time browsing the site until you find the data you need and figure out some basic access patterns – which we’ll talk about next.
  9. OutSystems has some libraries too; they are in the Forge, ready to be downloaded and used. I personally like Text and HTML Processing.
  10. It was created by Leonardo; kudos to him.
  11. It is probably a horrible idea to try parsing the HTML of the page as a long string, right? (Although there are times I’ve needed to fall back on that.) A good library will read in the HTML that you pull in using some HTTP request and turn it into an object that you can iterate over to your heart’s content, similar to a JSON object. And this component does just that. The key to web scraping is figuring out how to identify the exact elements you’re looking for. This could be by looking for element types (divs, list items), particular ids or classes, or by doing regex / XPath searches.
  12. So the first thing you’re going to need to do is fetch the data. You’ll need to start by finding your “endpoints” – the URL or URLs that return the data you need. If you know you need your information organized in a certain way – or only need a specific subset of it – you can browse through the site using its navigation. Pay attention to the URLs and how they change as you click between sections and drill down into sub-sections. The other option for getting started is to go straight to the site’s search functionality. Try typing in a few different terms and, again, pay attention to the URL and how it changes depending on what you search for. You’ll probably see a GET parameter like q= that always changes based on your search term. Try removing other unnecessary GET parameters from the URL until you’re left with only the ones you need to load your data. Make sure that there’s always a leading ? to start the query string and an & between each key/value pair.
  13. I’m going to share with you a real scenario where we used web scraping to overcome the problem of not having an API to interact with a third-party system.
  14. Frankort & Koning is an international organisation that trades in fruit and vegetables. Let me try to simplify their business process. They buy from the producers and sell to the markets. And since we’re talking about fruits and vegetables, they’re really worried about the freshness of the products they trade. And this is an area with a lot of regulations; for example, they can only sell products that come from certified producers.
  15. GLOBALG.A.P. is the organization that certifies the producers. Frankort needed a way to know whether each product that comes into the warehouse was certified for that specific producer. The problem is that GLOBALG.A.P. doesn't have an API to cross-check that information; everything is done manually.
  16. And how can we do that cross-check? There’s a 13-digit code on the packaging of the products, like the traditional barcode that we are used to seeing. With this code we can check the GLOBALG.A.P. database for the product certification.
  17. This is how we can check the certification for a GGN on the GLOBALG.A.P. database. After inserting the producer’s GGN, it shows us what this producer is certified to produce and sell; in this case, cucumbers. Now imagine this: Frankort receives thousands of products daily, and checking all of them manually would make someone jump off a bridge for sure!
  18. What we did was combine OutSystems with Selenium, and Selenium with Google Chrome. This way we can automate user actions, extract the HTML that results from that interaction, and parse it as we did before.
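
The steps described in notes 11, 12, 16 and 18 can be sketched in Python. This is only an illustration of the general flow, not the OutSystems implementation shown in the talk: the URL, the `q` parameter, the CSS selectors and the `td class="product"` markup are all hypothetical placeholders, since the real page structure is not given here. The Selenium step assumes the third-party `selenium` package and a local Chrome install, and is imported lazily so the parsing helpers work without them.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlencode

# A GGN is described in the talk as a 13-digit number.
GGN_PATTERN = re.compile(r"\b\d{13}\b")


def extract_ggns(text):
    """Return every standalone 13-digit number found in free text."""
    return GGN_PATTERN.findall(text)


def build_search_url(base_url, term):
    """Build a search URL with a leading ? and a single q= parameter (assumed name)."""
    return f"{base_url}?{urlencode({'q': term})}"


class ProductCellParser(HTMLParser):
    """Collect the text of every <td class="product"> cell (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self._in_cell = False
        self.products = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "td" and "product" in classes:
            self._in_cell = True
            self.products.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.products[-1] += data


def parse_products(html):
    """Parse HTML into a list of product names instead of searching it as one long string."""
    parser = ProductCellParser()
    parser.feed(html)
    parser.close()
    return [p.strip() for p in parser.products if p.strip()]


def fetch_rendered_html(url, ggn, input_selector, result_selector, timeout=10):
    """Type a GGN into a search form in headless Chrome and return the resulting HTML.

    Both selectors are placeholders the caller must supply for the real page.
    """
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        field = driver.find_element(By.CSS_SELECTOR, input_selector)
        field.send_keys(ggn)
        field.submit()
        # Wait until the result element is present before reading the page.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, result_selector))
        )
        return driver.page_source
    finally:
        driver.quit()
```

A call such as `parse_products(fetch_rendered_html(url, ggn, "input[name=q]", "td.product"))` would chain the browser step and the parsing step, with the selectors replaced by whatever the target page actually uses.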