Mapping Commodity Trading in the 19th Century
Benjamin Bach, INRIA, Paris
Asma Malik, University of Strathclyde, Glasgow
Michael Mauderer, University of St Andrews
Sadiq Sani, Robert Gordon University, Aberdeen
Joe Wandy, University of Glasgow
Outline
● Project Overview
● Data
● Technology
● Demo
● Future Work
Overview
19th Century
● Commodities
● Diseases
● Locations
● Disasters
Process
Tasks
● Retrieve documents mentioning
○ Commodities
○ Locations
○ Time range
● Relations between retrieved terms
○ Spatial relations
○ Temporal relations
○ Co-occurrence relations
Users:
Historians
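The retrieval task above can be sketched as a single query that filters on commodity, location and time range. This is only an illustration: the schema (`documents`, `mentions`), the toy rows, and the use of an in-memory SQLite database in place of the project's PostgreSQL database are all assumptions made for the example, not the project's actual setup.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, year INTEGER);
        CREATE TABLE mentions (doc_id INTEGER, term TEXT, kind TEXT);
    """)
    # Toy rows invented for the example; they are not taken from the real data.
    conn.executemany("INSERT INTO documents VALUES (?, ?, ?)", [
        (1, "Report A", 1820), (2, "Report B", 1855), (3, "Report C", 1910),
    ])
    conn.executemany("INSERT INTO mentions VALUES (?, ?, ?)", [
        (1, "sugar", "commodity"), (1, "Glasgow", "location"),
        (2, "sugar", "commodity"), (2, "Jamaica", "location"),
        (3, "sugar", "commodity"), (3, "Glasgow", "location"),
    ])
    return conn


def find_documents(conn, commodity, location, year_from, year_to):
    """Return ids of documents mentioning both terms within the year range."""
    rows = conn.execute("""
        SELECT DISTINCT d.id
        FROM documents d
        JOIN mentions c ON c.doc_id = d.id AND c.kind = 'commodity' AND c.term = ?
        JOIN mentions l ON l.doc_id = d.id AND l.kind = 'location' AND l.term = ?
        WHERE d.year BETWEEN ? AND ?
        ORDER BY d.id
    """, (commodity, location, year_from, year_to)).fetchall()
    return [row[0] for row in rows]
```

Calling `find_documents(build_demo_db(), "sugar", "Glasgow", 1800, 1899)` keeps only documents that mention both terms and fall inside the requested years.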
Data
● Commodities: 1067
● Time: 1600 - 1952 (352 years)
● Documents: 18 580
● Location occurrences: 91 650 469
● Commodity occurrences: 29 020 013
The Data
● PostgreSQL Database in Edinburgh
○ Not accessible
● PostgreSQL Database in St Andrews
○ Low Performance
● PostgreSQL Database Backup
○ 2.5GB compressed binary data
○ Cannot be imported into Amazon RDS
Solution 1
● Create a more compatible SQL export to import into Amazon RDS
○ 24GB raw text file containing SQL statements
○ Still incompatible
○ Hard to correct errors in a timely manner
Solution 2
● Create an EC2 instance running a PostgreSQL database
○ Powerful enough
○ Enough storage
○ Accessible
Big Data Problems
● Simple things take a long time
● Errors and new problems are found only incrementally
The Pipeline
● D3 for client-side presentation
● Java+SQL for server-side processing
[Diagram: Database → Web Service → Client. The client sends commodities and a date range; data is returned.]
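The exchange between client and web service could look roughly like the sketch below: the client sends commodities and a date range, and the service answers with JSON that a D3 front end can consume. It is written in Python for brevity, whereas the project's server side used Java+SQL. The parameter names (`commodity`, `from`, `to`) and the response layout are hypothetical.

```python
import json
from urllib.parse import parse_qs


def parse_request(query_string):
    """Turn a query string into a validated request (parameter names are assumed)."""
    params = parse_qs(query_string)
    # Accept both repeated parameters and comma-separated values.
    commodities = [
        name
        for value in params.get("commodity", [])
        for name in value.split(",")
        if name
    ]
    year_from = int(params["from"][0])
    year_to = int(params["to"][0])
    if year_from > year_to:
        raise ValueError("'from' must not be later than 'to'")
    return {"commodities": commodities, "from": year_from, "to": year_to}


def build_response(request, rows):
    """Serialize the query and its result rows as JSON for the client."""
    return json.dumps({"query": request, "results": rows}, sort_keys=True)
```

A request such as `commodity=sugar,tea&from=1800&to=1899` would be parsed into two commodities and a year range before the database is queried.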
Initial Sketches
Visualization
- Space and time
-> Finding related terms + documents
  - Find related documents
  - What are the documents talking about?
- Implicit knowledge:
  - Co-occurrences of terms in documents
For every commodity:
1) Get the top 10 documents
2) Limit related terms to 6
3) Sum up co-occurrences
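One possible reading of the three steps above, as a sketch: rank documents by how often they mention the commodity, keep the top 10, sum the counts of every other term across those documents, and keep the 6 strongest. The slide does not say how documents are ranked or how the six terms are chosen, so both choices here are assumptions, as are the function name and the input layout (a mapping from document id to per-term counts).

```python
from collections import Counter


def related_terms(doc_terms, commodity, top_docs=10, top_terms=6):
    """Return up to top_terms (term, summed count) pairs related to a commodity.

    doc_terms maps a document id to a dict of term -> occurrence count.
    """
    # 1) Top documents: those mentioning the commodity most often (assumed ranking).
    ranked = sorted(
        (doc for doc, terms in doc_terms.items() if terms.get(commodity, 0) > 0),
        key=lambda doc: (-doc_terms[doc][commodity], doc),
    )[:top_docs]

    # 3) Sum up co-occurrences: counts of other terms in the selected documents.
    totals = Counter()
    for doc in ranked:
        for term, count in doc_terms[doc].items():
            if term != commodity:
                totals[term] += count

    # 2) Limit related terms: keep the strongest ones, ties broken by name.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))[:top_terms]
```

The resulting pairs could fill one row of a commodity-by-term matrix, with the summed count driving the cell encoding.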
Demo
Future work
- Query by location
- Time diagrams for term frequency over time
- Encode information in matrix cells (#doc, collection, ...)
- Show and browse documents
- Handle big data: diseases, disasters, ...
- Co-occurrences?
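As a rough idea of what the term-frequency time diagrams would need as input, the sketch below aggregates mention counts for one term into decade buckets. The decade granularity and the `(year, term, count)` input layout are assumptions made for the example.

```python
from collections import Counter


def frequency_by_decade(mentions, term):
    """Sum occurrence counts of a term per decade.

    mentions is an iterable of (year, term, count) tuples (assumed layout).
    """
    counts = Counter()
    for year, mentioned_term, count in mentions:
        if mentioned_term == term:
            counts[(year // 10) * 10] += count
    # Sort by decade so the result can be plotted directly along a time axis.
    return dict(sorted(counts.items()))
```

Each key is the first year of a decade, so the output maps directly onto the x-axis of a frequency-over-time chart.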
Thank you for listening!
