SlideShare a Scribd company logo
1 of 51
Download to read offline
Access Open Data 
with Open Source 
Software Tools 
Sammy Fung 
sammy@sammy.hk
Sammy Fung 
● Developer 
● Founder, JobFOL 
● President of Open Source Hong Kong
Creating 
values to us 
and community
Open Data
Open Data 
● Discoverable 
– Available and Searchable on Internet. 
● Structured 
– Open and Machine-readable Format. 
● Unconditional 
– Legal Framework allows to reproduce an repurpose 
the data.
Open Source
Open Source 
● Software Development Model 
● Free Software (1985) 
– Free = Freedom 
– Run the program (Freedom 0) 
– Study the source code and change it (Freedom 1) 
– Redistribute copies (Freeom 2) 
– Distribute your modified version in same license (Freedom 
3) 
● Open Source (1998)
Open Source Web Application 
Software Stack 
● LAMP 
– Linux (1991): Operating System 
– Apache (1995): Web Server 
– MySQL (1995): Database Server 
– PHP (1995): Server-side Scripting Language 
● Other Alternatives: 
– LNMP: Replacing Apache with Nginx 
– Another M of LAMP: MariaDB, MongoDB
Python 
● Programming Language 
– Since 1991 
– Widely used general purpose 
– High-level 
– Open Source 
● Another P of LAMP
My Open Data related Projects 
● TV Timetable of Live Football Matches (2004) 
● Weather Information (2006) 
● Public Transportation Information (2006) 
● LegCo Vote Information (2013) 
● Air Quality Information (2014) 
● Restaurant Information (2014)
TCTrack 
● Plot a map of typhoon path of different observation 
agencies 
● Google Map API 
– First Typhoon Map in HK using Google API 
– Sammy.HK TCTrack → Weather Underground → Hong Kong 
Observatory 
● Twitter API 
– Posting typhoon updates from any potential formation of 
tropcial cyclone in Northwest Pacific Ocean. 
● Data Sources: HKO, JTWC.
Interview by MetroPop in 2009
Open Data on 
Hong Kong 
Restaurant & 
Food Licenses
Licensed Restaurants in Hong Kong 
● Open Data from Data.One PSI 
● Open Source Software Tools 
– Python 
– Scrapy Web Scraping Framework 
● Source Codes are released on GitHub 
– https://github.com/sammyfung/LP_Restaurants_Scr 
apy
Creating environment of 
a Scrapy project 
● Requirements 
– Python, Python-Dev, virtualenv, pip 
● Creating a virtual enviornment for python 
project 
– virtualenv ~/env 
– source ~/env/bin/activate 
– pip install scrapy
Creating a Scrapy project 
● Creating a new Scrapy project with spider 
– scrapy startproject LP_Restaurants_Scrapy 
– cd LP_Restaurants_Scrapy 
– scrapy genspider rlxml fehd.gov.hk 
● Creating a scrapy data model 
● Doing some tests with scrapy shell. 
– scrapy shell <URL> 
– http://www.fehd.gov.hk/english/licensing/license/text/LP_Restaurants_EN.XML 
● Writing the parse function of a scrapy spider. 
● Try and test the spider 
– scrapy crawl rlxml -t json -o restaurant_licenses.json
Open Data
Open Source
Creating 
values to us 
and community

More Related Content

Similar to Access Open Data with Open Source Software Tools

Use open source software to develop ideas at work
Use open source software to develop ideas at workUse open source software to develop ideas at work
Use open source software to develop ideas at workSammy Fung
 
Software Freedom and Community
Software Freedom and CommunitySoftware Freedom and Community
Software Freedom and CommunitySammy Fung
 
Open Source Weather Information Project with OpenStack Object Storage
Open Source Weather Information Project with OpenStack Object StorageOpen Source Weather Information Project with OpenStack Object Storage
Open Source Weather Information Project with OpenStack Object StorageSammy Fung
 
How do we develop open source software to help open data ? (MOSC 2013)
How do we develop open source software to help open data ? (MOSC 2013)How do we develop open source software to help open data ? (MOSC 2013)
How do we develop open source software to help open data ? (MOSC 2013)Sammy Fung
 
Local Weather Information and GNOME Shell Extension
Local Weather Information and GNOME Shell ExtensionLocal Weather Information and GNOME Shell Extension
Local Weather Information and GNOME Shell ExtensionSammy Fung
 
Open source, What | Why | How
Open source, What | Why | How Open source, What | Why | How
Open source, What | Why | How Nikhil Agrawal
 
Workshop For pycon13
Workshop For pycon13Workshop For pycon13
Workshop For pycon13Steven Pousty
 
Linux Seminar for Beginners
Linux Seminar for BeginnersLinux Seminar for Beginners
Linux Seminar for BeginnersNAILBITER
 
Python 101 For The Net Developer
Python 101 For The Net DeveloperPython 101 For The Net Developer
Python 101 For The Net DeveloperSarah Dutkiewicz
 
Software Freedom and Open Source Community
Software Freedom and Open Source CommunitySoftware Freedom and Open Source Community
Software Freedom and Open Source CommunitySammy Fung
 
Embedded Android Workshop with Marshmallow
Embedded Android Workshop with MarshmallowEmbedded Android Workshop with Marshmallow
Embedded Android Workshop with MarshmallowOpersys inc.
 
Embedded Android Workshop with Lollipop
Embedded Android Workshop with LollipopEmbedded Android Workshop with Lollipop
Embedded Android Workshop with LollipopOpersys inc.
 
DevOps of Python applications using OpenShift (Italian version)
DevOps of Python applications using OpenShift (Italian version)DevOps of Python applications using OpenShift (Italian version)
DevOps of Python applications using OpenShift (Italian version)Francesco Fiore
 
Embedded Android Workshop with Marshmallow
Embedded Android Workshop with MarshmallowEmbedded Android Workshop with Marshmallow
Embedded Android Workshop with MarshmallowKarim Yaghmour
 
Embedded Android Workshop with Marshmallow
Embedded Android Workshop with MarshmallowEmbedded Android Workshop with Marshmallow
Embedded Android Workshop with MarshmallowOpersys inc.
 
Open Source Concepts
Open Source ConceptsOpen Source Concepts
Open Source ConceptsRituBhargava7
 
Creating Open Data with Open Source (beta2)
Creating Open Data with Open Source (beta2)Creating Open Data with Open Source (beta2)
Creating Open Data with Open Source (beta2)Sammy Fung
 
Embedded Android Workshop with Lollipop
Embedded Android Workshop with LollipopEmbedded Android Workshop with Lollipop
Embedded Android Workshop with LollipopOpersys inc.
 
Embedded Android Workshop with Marshmallow
Embedded Android Workshop with MarshmallowEmbedded Android Workshop with Marshmallow
Embedded Android Workshop with MarshmallowOpersys inc.
 

Similar to Access Open Data with Open Source Software Tools (20)

Use open source software to develop ideas at work
Use open source software to develop ideas at workUse open source software to develop ideas at work
Use open source software to develop ideas at work
 
Software Freedom and Community
Software Freedom and CommunitySoftware Freedom and Community
Software Freedom and Community
 
Open Source Weather Information Project with OpenStack Object Storage
Open Source Weather Information Project with OpenStack Object StorageOpen Source Weather Information Project with OpenStack Object Storage
Open Source Weather Information Project with OpenStack Object Storage
 
How do we develop open source software to help open data ? (MOSC 2013)
How do we develop open source software to help open data ? (MOSC 2013)How do we develop open source software to help open data ? (MOSC 2013)
How do we develop open source software to help open data ? (MOSC 2013)
 
Intro to open_source
Intro to open_sourceIntro to open_source
Intro to open_source
 
Local Weather Information and GNOME Shell Extension
Local Weather Information and GNOME Shell ExtensionLocal Weather Information and GNOME Shell Extension
Local Weather Information and GNOME Shell Extension
 
Open source, What | Why | How
Open source, What | Why | How Open source, What | Why | How
Open source, What | Why | How
 
Workshop For pycon13
Workshop For pycon13Workshop For pycon13
Workshop For pycon13
 
Linux Seminar for Beginners
Linux Seminar for BeginnersLinux Seminar for Beginners
Linux Seminar for Beginners
 
Python 101 For The Net Developer
Python 101 For The Net DeveloperPython 101 For The Net Developer
Python 101 For The Net Developer
 
Software Freedom and Open Source Community
Software Freedom and Open Source CommunitySoftware Freedom and Open Source Community
Software Freedom and Open Source Community
 
Embedded Android Workshop with Marshmallow
Embedded Android Workshop with MarshmallowEmbedded Android Workshop with Marshmallow
Embedded Android Workshop with Marshmallow
 
Embedded Android Workshop with Lollipop
Embedded Android Workshop with LollipopEmbedded Android Workshop with Lollipop
Embedded Android Workshop with Lollipop
 
DevOps of Python applications using OpenShift (Italian version)
DevOps of Python applications using OpenShift (Italian version)DevOps of Python applications using OpenShift (Italian version)
DevOps of Python applications using OpenShift (Italian version)
 
Embedded Android Workshop with Marshmallow
Embedded Android Workshop with MarshmallowEmbedded Android Workshop with Marshmallow
Embedded Android Workshop with Marshmallow
 
Embedded Android Workshop with Marshmallow
Embedded Android Workshop with MarshmallowEmbedded Android Workshop with Marshmallow
Embedded Android Workshop with Marshmallow
 
Open Source Concepts
Open Source ConceptsOpen Source Concepts
Open Source Concepts
 
Creating Open Data with Open Source (beta2)
Creating Open Data with Open Source (beta2)Creating Open Data with Open Source (beta2)
Creating Open Data with Open Source (beta2)
 
Embedded Android Workshop with Lollipop
Embedded Android Workshop with LollipopEmbedded Android Workshop with Lollipop
Embedded Android Workshop with Lollipop
 
Embedded Android Workshop with Marshmallow
Embedded Android Workshop with MarshmallowEmbedded Android Workshop with Marshmallow
Embedded Android Workshop with Marshmallow
 

More from Sammy Fung

Python 爬網⾴工具 - Scrapy 介紹
Python 爬網⾴工具 - Scrapy 介紹Python 爬網⾴工具 - Scrapy 介紹
Python 爬網⾴工具 - Scrapy 介紹Sammy Fung
 
DevRel - Transform article writing from printing to online
DevRel - Transform article writing from printing to onlineDevRel - Transform article writing from printing to online
DevRel - Transform article writing from printing to onlineSammy Fung
 
Introduction to Open Source by opensource.hk (2019 Edition)
Introduction to Open Source by opensource.hk (2019 Edition)Introduction to Open Source by opensource.hk (2019 Edition)
Introduction to Open Source by opensource.hk (2019 Edition)Sammy Fung
 
My Open Source Journey - Developer and Community
My Open Source Journey - Developer and CommunityMy Open Source Journey - Developer and Community
My Open Source Journey - Developer and CommunitySammy Fung
 
Introduction to development with Django web framework
Introduction to development with Django web frameworkIntroduction to development with Django web framework
Introduction to development with Django web frameworkSammy Fung
 
香港中文開源軟件翻譯
香港中文開源軟件翻譯香港中文開源軟件翻譯
香港中文開源軟件翻譯Sammy Fung
 
Open Data and Web API
Open Data and Web APIOpen Data and Web API
Open Data and Web APISammy Fung
 
Global Open Source Development 2011-2014 Review and 2015 Forecast
Global Open Source Development 2011-2014 Review and 2015 ForecastGlobal Open Source Development 2011-2014 Review and 2015 Forecast
Global Open Source Development 2011-2014 Review and 2015 ForecastSammy Fung
 
Mozilla - Openness of the Web
Mozilla - Openness of the WebMozilla - Openness of the Web
Mozilla - Openness of the WebSammy Fung
 
Open Source Technology and Community
Open Source Technology and CommunityOpen Source Technology and Community
Open Source Technology and CommunitySammy Fung
 
Installation of LAMP Server with Ubuntu 14.10 Server Edition
Installation of LAMP Server with Ubuntu 14.10 Server EditionInstallation of LAMP Server with Ubuntu 14.10 Server Edition
Installation of LAMP Server with Ubuntu 14.10 Server EditionSammy Fung
 
Building your own job site with Drupal
Building your own job site with DrupalBuilding your own job site with Drupal
Building your own job site with DrupalSammy Fung
 
Open Source Job Board
Open Source Job BoardOpen Source Job Board
Open Source Job BoardSammy Fung
 
Introduction of Mozilla Hong Kong (COSCUP 2014)
Introduction of Mozilla Hong Kong (COSCUP 2014)Introduction of Mozilla Hong Kong (COSCUP 2014)
Introduction of Mozilla Hong Kong (COSCUP 2014)Sammy Fung
 
Introduction of Open Source Job Board with Drupal CMS
Introduction of Open Source Job Board with Drupal CMSIntroduction of Open Source Job Board with Drupal CMS
Introduction of Open Source Job Board with Drupal CMSSammy Fung
 
Python, web scraping and content management: Scrapy and Django
Python, web scraping and content management: Scrapy and DjangoPython, web scraping and content management: Scrapy and Django
Python, web scraping and content management: Scrapy and DjangoSammy Fung
 
Mozilla Community and Hong Kong
Mozilla Community and Hong KongMozilla Community and Hong Kong
Mozilla Community and Hong KongSammy Fung
 
ITFest 2014 - Open Source Marketing
ITFest 2014 - Open Source MarketingITFest 2014 - Open Source Marketing
ITFest 2014 - Open Source MarketingSammy Fung
 
How Open Data can help entrepreneurs - ITFest 2014 E2
How Open Data can help entrepreneurs - ITFest 2014 E2How Open Data can help entrepreneurs - ITFest 2014 E2
How Open Data can help entrepreneurs - ITFest 2014 E2Sammy Fung
 
Air Pollution Weather Map at OpenDataHK.make.02
Air Pollution Weather Map at OpenDataHK.make.02Air Pollution Weather Map at OpenDataHK.make.02
Air Pollution Weather Map at OpenDataHK.make.02Sammy Fung
 

More from Sammy Fung (20)

Python 爬網⾴工具 - Scrapy 介紹
Python 爬網⾴工具 - Scrapy 介紹Python 爬網⾴工具 - Scrapy 介紹
Python 爬網⾴工具 - Scrapy 介紹
 
DevRel - Transform article writing from printing to online
DevRel - Transform article writing from printing to onlineDevRel - Transform article writing from printing to online
DevRel - Transform article writing from printing to online
 
Introduction to Open Source by opensource.hk (2019 Edition)
Introduction to Open Source by opensource.hk (2019 Edition)Introduction to Open Source by opensource.hk (2019 Edition)
Introduction to Open Source by opensource.hk (2019 Edition)
 
My Open Source Journey - Developer and Community
My Open Source Journey - Developer and CommunityMy Open Source Journey - Developer and Community
My Open Source Journey - Developer and Community
 
Introduction to development with Django web framework
Introduction to development with Django web frameworkIntroduction to development with Django web framework
Introduction to development with Django web framework
 
香港中文開源軟件翻譯
香港中文開源軟件翻譯香港中文開源軟件翻譯
香港中文開源軟件翻譯
 
Open Data and Web API
Open Data and Web APIOpen Data and Web API
Open Data and Web API
 
Global Open Source Development 2011-2014 Review and 2015 Forecast
Global Open Source Development 2011-2014 Review and 2015 ForecastGlobal Open Source Development 2011-2014 Review and 2015 Forecast
Global Open Source Development 2011-2014 Review and 2015 Forecast
 
Mozilla - Openness of the Web
Mozilla - Openness of the WebMozilla - Openness of the Web
Mozilla - Openness of the Web
 
Open Source Technology and Community
Open Source Technology and CommunityOpen Source Technology and Community
Open Source Technology and Community
 
Installation of LAMP Server with Ubuntu 14.10 Server Edition
Installation of LAMP Server with Ubuntu 14.10 Server EditionInstallation of LAMP Server with Ubuntu 14.10 Server Edition
Installation of LAMP Server with Ubuntu 14.10 Server Edition
 
Building your own job site with Drupal
Building your own job site with DrupalBuilding your own job site with Drupal
Building your own job site with Drupal
 
Open Source Job Board
Open Source Job BoardOpen Source Job Board
Open Source Job Board
 
Introduction of Mozilla Hong Kong (COSCUP 2014)
Introduction of Mozilla Hong Kong (COSCUP 2014)Introduction of Mozilla Hong Kong (COSCUP 2014)
Introduction of Mozilla Hong Kong (COSCUP 2014)
 
Introduction of Open Source Job Board with Drupal CMS
Introduction of Open Source Job Board with Drupal CMSIntroduction of Open Source Job Board with Drupal CMS
Introduction of Open Source Job Board with Drupal CMS
 
Python, web scraping and content management: Scrapy and Django
Python, web scraping and content management: Scrapy and DjangoPython, web scraping and content management: Scrapy and Django
Python, web scraping and content management: Scrapy and Django
 
Mozilla Community and Hong Kong
Mozilla Community and Hong KongMozilla Community and Hong Kong
Mozilla Community and Hong Kong
 
ITFest 2014 - Open Source Marketing
ITFest 2014 - Open Source MarketingITFest 2014 - Open Source Marketing
ITFest 2014 - Open Source Marketing
 
How Open Data can help entrepreneurs - ITFest 2014 E2
How Open Data can help entrepreneurs - ITFest 2014 E2How Open Data can help entrepreneurs - ITFest 2014 E2
How Open Data can help entrepreneurs - ITFest 2014 E2
 
Air Pollution Weather Map at OpenDataHK.make.02
Air Pollution Weather Map at OpenDataHK.make.02Air Pollution Weather Map at OpenDataHK.make.02
Air Pollution Weather Map at OpenDataHK.make.02
 

Recently uploaded

Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 

Recently uploaded (20)

Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 

Access Open Data with Open Source Software Tools

  • 1. Access Open Data with Open Source Software Tools Sammy Fung sammy@sammy.hk
  • 2. Sammy Fung ● Developer ● Founder, JobFOL ● President of Open Source Hong Kong
  • 3. Creating values to us and community
  • 5. Open Data ● Discoverable – Available and Searchable on Internet. ● Structured – Open and Machine-readable Format. ● Unconditional – Legal Framework allows to reproduce an repurpose the data.
  • 6.
  • 7.
  • 9. Open Source ● Software Development Model ● Free Software (1985) – Free = Freedom – Run the program (Freedom 0) – Study the source code and change it (Freedom 1) – Redistribute copies (Freeom 2) – Distribute your modified version in same license (Freedom 3) ● Open Source (1998)
  • 10.
  • 11. Open Source Web Application Software Stack ● LAMP – Linux (1991): Operating System – Apache (1995): Web Server – MySQL (1995): Database Server – PHP (1995): Server-side Scripting Language ● Other Alternatives: – LNMP: Replacing Apache with Nginx – Another M of LAMP: MariaDB, MongoDB
  • 12. Python ● Programming Language – Since 1991 – Widely used general purpose – High-level – Open Source ● Another P of LAMP
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20. My Open Data related Projects ● TV Timetable of Live Football Matches (2004) ● Weather Information (2006) ● Public Transportation Information (2006) ● LegCo Vote Information (2013) ● Air Quality Information (2014) ● Restaurant Information (2014)
  • 21.
  • 22. TCTrack ● Plot a map of typhoon path of different observation agencies ● Google Map API – First Typhoon Map in HK using Google API – Sammy.HK TCTrack → Weather Underground → Hong Kong Observatory ● Twitter API – Posting typhoon updates from any potential formation of tropcial cyclone in Northwest Pacific Ocean. ● Data Sources: HKO, JTWC.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
  • 32.
  • 33.
  • 35.
  • 36.
  • 37.
  • 38.
  • 39.
  • 40.
  • 41.
  • 42.
  • 43. Open Data on Hong Kong Restaurant & Food Licenses
  • 44.
  • 45.
  • 46. Licensed Restaurants in Hong Kong ● Open Data from Data.One PSI ● Open Source Software Tools – Python – Scrapy Web Scraping Framework ● Source Codes are released on GitHub – https://github.com/sammyfung/LP_Restaurants_Scr apy
  • 47. Creating environment of a Scrapy project ● Requirements – Python, Python-Dev, virtualenv, pip ● Creating a virtual enviornment for python project – virtualenv ~/env – source ~/env/bin/activate – pip install scrapy
  • 48. Creating a Scrapy project ● Creating a new Scrapy project with spider – scrapy startproject LP_Restaurants_Scrapy – cd LP_Restaurants_Scrapy – scrapy genspider rlxml fehd.gov.hk ● Creating a scrapy data model ● Doing some tests with scrapy shell. – scrapy shell <URL> – http://www.fehd.gov.hk/english/licensing/license/text/LP_Restaurants_EN.XML ● Writing the parse function of a scrapy spider. ● Try and test the spider – scrapy crawl rlxml -t json -o restaurant_licenses.json
  • 51. Creating values to us and community