Web scraping for non programmers 
ITNIG | 25th September 2014 
@algonpaje - www.quadrigram.com
Goal: Introduce non-programmers to APIs and
scraping concepts (in a simple way)
How? Using a few modules of a visual programming
language called “Quadrigram”
> Quadrigram is software designed to make the
practice of data analysis and data visualization more universal
> It is designed to gather, shape, and share data
> It lets you prototype and share ideas rapidly, and
produce compelling solutions with data in the form of interactive
visualizations, animations or dashboards
> The Quadrigram approach to data analysis and visualization is
based on a visual programming language composed of around 500
modules
Example 1: Getting financial information in real time 
> Data source: http://finance.yahoo.com/ 
[Figure: stock ticker input box]
> Base URL: http://finance.yahoo.com/q?s=TEF.MC&ql=1/ 
1.- http://finance.yahoo.com/q?s= 
2.- ticker (TEF.MC) 
3.- &ql=1/ 
1 + 2 + 3 = Base URL
1.- Building the base URL using Quadrigram
1.1.- Module “Text” (String): “http://finance.yahoo.com/q?s=”
1.2.- Module “Text Entry Box”: Input the stock ticker (e.g. TEF.MC)
1.3.- Module “Text” (String): “&ql=1/”
1.4.- Module “Addition of 5 objects”: concatenates 1.1, 1.2 and 1.3
Result = “http://finance.yahoo.com/q?s=TEF.MC&ql=1/”
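For readers curious how the same concatenation looks in code, here is a minimal Python sketch. It only illustrates the idea and says nothing about how Quadrigram works internally; the function name `build_quote_url` is made up for this example.

```python
# Hypothetical helper mirroring modules 1.1 to 1.4: three text pieces joined in order.
PREFIX = "http://finance.yahoo.com/q?s="   # 1.1, fixed text
SUFFIX = "&ql=1/"                          # 1.3, fixed text


def build_quote_url(ticker):
    """Concatenate prefix + ticker + suffix into one URL string."""
    return PREFIX + ticker + SUFFIX


print(build_quote_url("TEF.MC"))
# http://finance.yahoo.com/q?s=TEF.MC&ql=1/
```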
2.- Querying data
2.1.- Connect the output of “Addition of 5 objects”
(“http://finance.yahoo.com/q?s=TEF.MC&ql=1/”) to the “Query HTTP GET” module
2.2.- Connect a “Periodic Pulse” module to “Query HTTP GET” to
query the data every “X” seconds
The result is the page's HTML code, ready to be scraped
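The query step and the periodic pulse can be sketched in Python as below. This is an assumption-laden illustration, not Quadrigram's implementation: `http_get` and `poll` are hypothetical names, and the sketch stops after a fixed number of queries so that it always ends.

```python
import time
import urllib.request


def http_get(url):
    """Fetch a URL and return the body as text (the "Query HTTP GET" step)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def poll(url, interval_seconds, times, fetch=http_get, sleep=time.sleep):
    """Call fetch(url) `times` times, waiting `interval_seconds` between calls."""
    pages = []
    for i in range(times):
        pages.append(fetch(url))
        if i < times - 1:          # no need to wait after the last query
            sleep(interval_seconds)
    return pages
```

For instance, `poll("http://finance.yahoo.com/q?s=TEF.MC&ql=1/", 5, 3)` would request the page three times, five seconds apart. The `fetch` and `sleep` parameters exist only so the loop can be tried without a network connection.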
3.- Scraping data
3.1.- Analyse the HTML code and look for a “left - content - right” pattern.
In this case, the pattern we are looking for is:
left = <span id="yfs_l84_tef.mc">
content = stock price (real time while the market is open)
right = </span>
3.2.- Use the “Scrape Text” module to extract data
“Scrape Text” inlets:
source text = HTML code (output of “Query HTTP GET”)
start sequence = <span id="yfs_l84_tef.mc">
end sequence = </span>
3.3.- Extract the stock price using the “Extract Object from List” module
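The “left - content - right” idea can be sketched in a few lines of Python. The function `scrape_text` is a hypothetical stand-in for the “Scrape Text” module, and the HTML fragment and price are invented for the example; the real page contains much more markup.

```python
def scrape_text(source, start, end):
    """Return every piece of text found between `start` and the next `end`."""
    found = []
    position = 0
    while True:
        left = source.find(start, position)
        if left == -1:
            break
        content_from = left + len(start)
        right = source.find(end, content_from)
        if right == -1:
            break
        found.append(source[content_from:right])
        position = right + len(end)
    return found


# Made-up HTML fragment; the price value is invented for the example.
html = '<div><span id="yfs_l84_tef.mc">12.34</span> EUR</div>'
matches = scrape_text(html, '<span id="yfs_l84_tef.mc">', "</span>")
price = matches[0]   # the "Extract Object from List" step
print(price)
```

The function returns a list because a pattern can match several times in one page, which is why a separate step picks a single item out of the list.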
Example 2: Build a network of similarities using “The
Echo Nest” API
> Data source: http://developer.echonest.com/raw_tutorials/artist_api/raw_artist_02.html
> Base URL: http://developer.echonest.com/api/v4/artist/similar?api_key=J1OPQ9MJ8G8FC19FH&name=strokes
1.- http://developer.echonest.com/api/v4/artist/similar?api_key=J1OPQ9MJ8G8FC19FH&name=
2.- artist's name (“strokes”)
1 + 2 = Base URL
1.- Building the base URL using Quadrigram
1.1.- Module “Text” (String): “http://developer.echonest.com/api/v4/artist/similar?api_key=J1OPQ9MJ8G8FC19FH&name=”
1.2.- Module “Text Entry Box”: Input the artist's name (e.g. strokes)
1.3.- Module “Addition of 5 objects”: concatenates 1.1 and 1.2
Result = “http://developer.echonest.com/api/v4/artist/similar?api_key=J1OPQ9MJ8G8FC19FH&name=strokes”
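A Python sketch of the same URL building follows. Plain concatenation is enough for a one-word name such as “strokes”; the percent-encoding shown here is an addition that is not on the slides, included because names with spaces are not valid in a URL as typed. `build_similar_url` and `YOUR_API_KEY` are placeholders for the example.

```python
from urllib.parse import quote

# Fixed part of the request, as on the slide. "YOUR_API_KEY" is a placeholder.
BASE = "http://developer.echonest.com/api/v4/artist/similar?api_key=YOUR_API_KEY&name="


def build_similar_url(artist):
    """Append the artist's name, percent-encoding spaces and other special characters."""
    return BASE + quote(artist)


print(build_similar_url("strokes"))
print(build_similar_url("the strokes"))   # the space becomes %20
```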
2.- Querying data
2.1.- Connect the output of “Addition of 5 objects”
(“http://developer.echonest.com/api/v4/artist/similar?api_key=J1OPQ9MJ8G8FC19FH&name=strokes”)
to the “Query HTTP GET” module
The result is the API's response text (JSON rather than HTML in this case)
3.- Scraping data
3.1.- Use the “Scrape Text” module to extract data
“Scrape Text” inlets:
source text = response text (output of “Query HTTP GET”)
start sequence = "name": "
end sequence = "},
The result is the list of artists similar to the one we queried
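The sketch below applies the same start and end sequences to a made-up response in Python, and compares the result with parsing the text as JSON. The response shape, names and ids are assumptions for illustration; the real API output may differ.

```python
import json

# Made-up response in the shape the slides imply; names and ids are invented.
response_text = (
    '{"response": {"artists": ['
    '{"id": "A1", "name": "Artist One"}, '
    '{"id": "A2", "name": "Artist Two"}, '
    '{"id": "A3", "name": "Artist Three"}]}}'
)

START, END = '"name": "', '"},'

# Delimiter-based scrape: keep the text between each START and the next END.
scraped = [chunk.split(END)[0] for chunk in response_text.split(START)[1:] if END in chunk]

# Alternative: parse the JSON and read the "name" field directly.
parsed = [artist["name"] for artist in json.loads(response_text)["response"]["artists"]]

print(scraped)
print(parsed)
```

In this sample the last entry is followed by `"}]` rather than `"},`, so a strict delimiter match returns two names while the JSON parse returns all three. Whether that matters in practice depends on the actual response and on how the scraping tool treats a missing end sequence.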
4.- Build a network of similarities
4.1.- Use the “Length of List” module to count how many similar
artists there are
4.2.- Use the “Create List with repeated Object” module to create a
list that repeats “strokes” once per similar artist
4.3.- Create a Pair Table using the “Create Custom Data Structure”
module
4.4.- Convert the Pair Table to a Network using the “Convert PairTable
to Network” module
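Steps 4.1 to 4.4 can be sketched in Python as follows. The slides do not describe how Quadrigram stores a Pair Table or a Network, so the list of pairs and the adjacency dictionary used here are assumptions, as is treating the links as undirected. The artist names are invented.

```python
query = "strokes"
# Made-up result of the scraping step; the names are invented.
similar = ["Artist One", "Artist Two", "Artist Three"]

count = len(similar)                       # 4.1 count the similar artists
sources = [query] * count                  # 4.2 repeat the query name once per artist
pair_table = list(zip(sources, similar))   # 4.3 one (source, target) row per artist

# 4.4 turn the pairs into a network, stored here as an adjacency dictionary
network = {}
for source, target in pair_table:
    network.setdefault(source, set()).add(target)
    network.setdefault(target, set()).add(source)   # assuming undirected links

print(pair_table)
print(sorted(network[query]))
```

The repeated list in step 4.2 exists only so that every similar artist has a partner on the same row, which is what makes each row of the pair table one link in the network.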
More information: www.quadrigram.com 
Thank you!!! 

Web Scraping for Non Programmers

  • 1. Web scraping for non programmers ITNIG | 25th September 2014 @algonpaje - www.quadrigram.com
  • 2. Goal: Introduce non-programmers to APIs and scraping concepts, in a simple way.
  • 3.
  • 4. How? Using a few modules of a visual programming language called “Quadrigram”
  • 5. > Quadrigram is software designed to make the practice of data analysis and data visualization more universal
       > It is designed to gather, shape, and share data
       > It lets users prototype and share ideas rapidly, and produce compelling data solutions in the form of interactive visualizations, animations, or dashboards
       > The Quadrigram approach to data analysis and visualization is based on a visual programming language composed of around 500 modules
  • 6. Example 1: Getting financial information in real time
  • 7. > Data source: http://finance.yahoo.com/ (stock ticker input box)
  • 8. > Base URL: http://finance.yahoo.com/q?s=TEF.MC&ql=1/
       1.- http://finance.yahoo.com/q?s=
       2.- ticker (TEF.MC)
       3.- &ql=1/
       1 + 2 + 3 = Base URL
  • 9. 1.- Building the base URL using Quadrigram
       1.1.- Module “Text” (String): “http://finance.yahoo.com/q?s=”
       1.2.- Module “Text Entry Box”: input the stock ticker (e.g. TEF.MC)
       1.3.- Module “Text” (String): “&ql=1/”
       1.4.- Module “Addition of 5 objects”, concatenating pieces 1, 2 and 3
       … result = “http://finance.yahoo.com/q?s=TEF.MC&ql=1/”
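For readers who want to see the same idea in code, the concatenation in step 1 can be sketched in Python. This is only an illustration of the "1 + 2 + 3" idea, not how Quadrigram works internally; the function name `build_quote_url` is made up for this example, and the URL pieces are copied from the slide.

```python
# Pieces 1 and 3 are the fixed text modules from the slide.
BASE = "http://finance.yahoo.com/q?s="
SUFFIX = "&ql=1/"


def build_quote_url(ticker: str) -> str:
    """Concatenate prefix + ticker + suffix (hypothetical helper name)."""
    return BASE + ticker + SUFFIX


# Piece 2 is whatever the user types into the text entry box.
print(build_quote_url("TEF.MC"))
```

Changing the ticker string is the equivalent of typing a new value into the “Text Entry Box” module.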
  • 10. 2.- Querying data
        2.1.- Connect the output of “Addition of 5 objects” (“http://finance.yahoo.com/q?s=TEF.MC&ql=1/”) to the module “Query HTTP GET”
        2.2.- Connect a “Periodic Pulse” module to “Query HTTP GET” to query the data every “X” seconds
        … and so we get our HTML code, ready to be scraped
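A rough Python counterpart of step 2, assuming only the standard library: `http_get` plays the role of “Query HTTP GET” and `poll` plays the role of “Periodic Pulse”. Both names are hypothetical, and the page served at the slide's URL may have changed since 2014, so treat this as a sketch of the mechanism rather than a working stock feed.

```python
import time
import urllib.request
from typing import Callable, Iterator


def http_get(url: str, timeout: float = 10.0) -> str:
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def poll(url: str, interval_seconds: float, pulses: int,
         fetch: Callable[[str], str] = http_get) -> Iterator[str]:
    """Yield one response per pulse, waiting interval_seconds between pulses."""
    for i in range(pulses):
        if i > 0:
            time.sleep(interval_seconds)
        yield fetch(url)


# Example use (not run here because it needs network access):
# for html in poll("http://finance.yahoo.com/q?s=TEF.MC&ql=1/", 30, 3):
#     print(len(html))
```

The `fetch` parameter exists so the polling loop can be tried with a stand-in function instead of a real network request.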
  • 11. 3.- Scraping data
        3.1.- Analyse the code and look for a “left - content - right” pattern. In this case, the pattern we are looking for is:
        left = <span id="yfs_l84_tef.mc">
        content = stock price (real time while the market is open)
        right = </span>
  • 12. 3.- Scraping data
  • 13. 3.- Scraping data
        3.2.- Use the “Scrape Text” module to extract the data. “Scrape Text” inlets:
        source text = HTML code (output of “Query HTTP GET”)
        start sequence = <span id="yfs_l84_tef.mc">
        end sequence = </span>
        3.3.- Extract the stock price using the “Extract Object from List” module
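The “left - content - right” idea in step 3 can be sketched in Python with plain string searching. The function name `scrape_text` mirrors the module name but is our own; returning a list of every match is an assumption based on the slide's use of “Extract Object from List”. The HTML snippet and the price in it are made up for illustration.

```python
from typing import List


def scrape_text(source: str, start: str, end: str) -> List[str]:
    """Return every piece of text found between a start and an end sequence."""
    results = []
    pos = 0
    while True:
        left = source.find(start, pos)
        if left == -1:
            break
        content_start = left + len(start)
        right = source.find(end, content_start)
        if right == -1:
            break
        results.append(source[content_start:right])
        pos = right + len(end)
    return results


# Made-up HTML fragment; the price is not real data.
SAMPLE_HTML = '<div><span id="yfs_l84_tef.mc">12.34</span></div>'

matches = scrape_text(SAMPLE_HTML, '<span id="yfs_l84_tef.mc">', '</span>')
price = matches[0]  # step 3.3: take the first object from the list
print(price)
```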
  • 15. Example 2: Build a network of similarities using “The Echonest” API
  • 17. > Base URL: http://developer.echonest.com/api/v4/artist/similar?api_key=J1OPQ9MJ8G8FC19FH&name=strokes
        1.- http://developer.echonest.com/api/v4/artist/similar?api_key=J1OPQ9MJ8G8FC19FH&name=
        2.- artist's name (“strokes”)
        1 + 2 = Base URL
  • 18. 1.- Building the base URL using Quadrigram
        1.1.- Module “Text” (String): “http://developer.echonest.com/api/v4/artist/similar?api_key=J1OPQ9MJ8G8FC19FH&name=”
        1.2.- Module “Text Entry Box”: input the artist's name (e.g. strokes)
        1.3.- Module “Addition of 5 objects”, concatenating 1 and 2
        … result = “http://developer.echonest.com/api/v4/artist/similar?api_key=J1OPQ9MJ8G8FC19FH&name=strokes”
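In code, API URLs like this one are often built from named parameters rather than by gluing strings together. The sketch below uses Python's `urllib.parse.urlencode` instead of the plain concatenation shown on the slide, which also takes care of names containing spaces. The helper name `build_similar_url` and the placeholder key `YOUR_API_KEY` are assumptions for illustration, and the service itself may no longer be reachable.

```python
from urllib.parse import urlencode

# Endpoint copied from the slide.
ENDPOINT = "http://developer.echonest.com/api/v4/artist/similar"


def build_similar_url(api_key: str, name: str) -> str:
    """Build the query URL from named parameters (hypothetical helper)."""
    return ENDPOINT + "?" + urlencode({"api_key": api_key, "name": name})


# YOUR_API_KEY is a placeholder, not a real key.
print(build_similar_url("YOUR_API_KEY", "strokes"))
print(build_similar_url("YOUR_API_KEY", "the strokes"))
```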
  • 19. 2.- Querying data
        2.1.- Connect the output of “Addition of 5 objects” (“http://developer.echonest.com/api/v4/artist/similar?api_key=J1OPQ9MJ8G8FC19FH&name=strokes”) to the module “Query HTTP GET”
        … and so we get the response text (JSON rather than HTML)
  • 20. 3.- Scraping data
        3.2.- Use the “Scrape Text” module to extract the data. “Scrape Text” inlets:
        source text = response text (output of “Query HTTP GET”)
        start sequence = "name": "
        end sequence = "},
        … and we obtain the list of artists similar to the name we queried
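To make the start/end sequences concrete, here is a small Python sketch that applies them to a made-up response, next to an alternative that parses the text as JSON. The sample text, the artist names, and the `response` / `artists` layout are assumptions for illustration, not a recorded API reply; the function names are hypothetical.

```python
import json
import re
from typing import List

# Made-up sample; a real response may be laid out differently.
SAMPLE = (
    '{"response": {"artists": ['
    '{"id": "X1", "name": "Artist One"}, '
    '{"id": "X2", "name": "Artist Two"}, '
    '{"id": "X3", "name": "Artist Three"}]}}'
)


def names_by_pattern(text: str, start: str = '"name": "',
                     end: str = '"},') -> List[str]:
    """Extract everything between the slide's start and end sequences."""
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    return re.findall(pattern, text, flags=re.DOTALL)


def names_by_json(text: str) -> List[str]:
    """Alternative: parse the text as JSON and read the name fields."""
    data = json.loads(text)
    return [artist["name"] for artist in data["response"]["artists"]]


print(names_by_pattern(SAMPLE))
print(names_by_json(SAMPLE))
```

In this made-up sample the last entry is followed by `"}]` rather than `"},`, so the pattern skips it while the JSON version returns all three names. Whether that happens with a real response depends on its layout.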
  • 21. 4.- Build a network of similarities
        4.1.- Use the “Length of List” module to count how many similar artists there are
        4.2.- Use the “Create List with repeated Object” module to create as many copies of “strokes” as there are similar artists
        4.3.- Create a Pair Table using the “Create Custom Data Structure” module
        4.4.- Convert the Pair Table to a Network using the “Convert PairTable to Network” module
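Steps 4.1 to 4.4 can be sketched in Python as follows. The pair table is modelled as a list of (source, target) tuples and the network as a dictionary of neighbour sets; treating the links as undirected is an assumption, and the function names and artist names are made up for illustration.

```python
from typing import Dict, List, Set, Tuple


def build_pair_table(query: str, similar: List[str]) -> List[Tuple[str, str]]:
    """Pair the queried name with each similar artist."""
    count = len(similar)               # step 4.1: length of the list
    sources = [query] * count          # step 4.2: list with a repeated object
    return list(zip(sources, similar))  # step 4.3: the pair table


def pairs_to_network(pairs: List[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Step 4.4: turn pairs into a node -> neighbours mapping (undirected)."""
    network: Dict[str, Set[str]] = {}
    for a, b in pairs:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network


pairs = build_pair_table("strokes", ["Artist One", "Artist Two"])
network = pairs_to_network(pairs)
print(pairs)
print(network)
```

Running the same two functions for several query names and merging the pair tables would grow the star around one artist into a larger network.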
  • 23. More information: www.quadrigram.com
  • 24. Thank you!!!