Mapping Commodity Trading in the 19th Century
Benjamin Bach, INRIA, Paris
Asma Malik, University of Strathclyde, Glasgow
Michael Mauderer, University of St Andrews
Sadiq Sani, Robert Gordon University, Aberdeen
Joe Wandy, University of Glasgow
Outline
● Project Overview
● Data
● Technology
● Demo
● Future Work
Overview
19th Century: Commodities, Locations, Diseases, Disasters
Process
Tasks
● Retrieve documents mentioning
○ Commodities
○ Locations
○ Time range
● Relations between retrieved terms
○ Spatial relations
○ Temporal relations
○ Co-occurrence relations
Users:
Historians
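
The retrieval task above can be sketched as a filter over document records. This is only an illustration in Python, not the project's Java+SQL code; the `Document` record, its fields, and the sample entries are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    # Hypothetical record; the real schema is not shown in the slides.
    title: str
    year: int
    commodities: set = field(default_factory=set)
    locations: set = field(default_factory=set)


def retrieve(documents, commodities=None, locations=None, year_range=None):
    """Return documents matching every filter that was supplied.

    A document matches if it mentions at least one requested commodity,
    at least one requested location, and falls in the inclusive year range.
    """
    hits = []
    for doc in documents:
        if commodities and not (doc.commodities & set(commodities)):
            continue
        if locations and not (doc.locations & set(locations)):
            continue
        if year_range and not (year_range[0] <= doc.year <= year_range[1]):
            continue
        hits.append(doc)
    return hits


# Made-up sample data for illustration only.
SAMPLE = [
    Document("Report A", 1820, {"tea", "sugar"}, {"Glasgow"}),
    Document("Report B", 1875, {"tea"}, {"Calcutta"}),
    Document("Report C", 1910, {"cotton"}, {"Glasgow"}),
]

tea_1800s = retrieve(SAMPLE, commodities=["tea"], year_range=(1800, 1899))
print([d.title for d in tea_1800s])
```

The spatial, temporal, and co-occurrence relations would then be computed over the returned subset.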
Data
● Commodities: 1067
● Time: 1600 - 1952 (352 years)
● Documents: 18 580
● Location occurrences: 91 650 469
● Commodity occurrences: 29 020 013
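
To give a feel for the scale, the figures above work out to roughly 4,900 location mentions and 1,600 commodity mentions per document on average. A quick check in Python, using only the numbers on this slide:

```python
# Figures taken from the slide above.
documents = 18_580
location_occurrences = 91_650_469
commodity_occurrences = 29_020_013

# Average number of mentions per document.
locations_per_doc = location_occurrences / documents
commodities_per_doc = commodity_occurrences / documents

print(f"{locations_per_doc:.0f} location mentions per document on average")
print(f"{commodities_per_doc:.0f} commodity mentions per document on average")
```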
The Data
● PostgreSQL Database in Edinburgh
○ Not accessible
● PostgreSQL Database in St Andrews
○ Low Performance
● PostgreSQL Database Backup
○ 2.5GB compressed binary data
○ Cannot be imported into Amazon RDS
Solution 1
● Create a more compatible SQL export to import into Amazon RDS
○ 24GB raw text file containing SQL statements
○ Still incompatible
○ Hard to correct errors in a timely manner
Solution 2
● Create an EC2 instance running a PostgreSQL database
○ Powerful enough
○ Enough storage
○ Accessible
Big Data Problems
● Simple things take a long time
● Errors and new problems only surface incrementally
The Pipeline
● D3 for client-side presentation
● Java+SQL for server-side processing
Diagram: Database, Web Service, Client (edge labels: "data"; "Commodities, date range")
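
The exchange in the pipeline (commodities and a date range in, data out) might look roughly like the following. This is a sketch in Python with an in-memory SQLite database standing in for the PostgreSQL server, not the project's Java web service; the `mentions` table, its columns, the query-string format, and `handle_request` are all assumptions made for illustration.

```python
import json
import sqlite3
from urllib.parse import parse_qs


def handle_request(conn, query_string):
    """Parse the client's query and return matching rows as JSON."""
    params = parse_qs(query_string)
    commodities = params["commodities"][0].split(",")
    year_from = int(params["from"][0])
    year_to = int(params["to"][0])

    # Placeholders keep user input out of the SQL text.
    marks = ",".join("?" for _ in commodities)
    sql = (
        "SELECT commodity, location, year, COUNT(*) AS n "
        "FROM mentions "
        f"WHERE commodity IN ({marks}) AND year BETWEEN ? AND ? "
        "GROUP BY commodity, location, year "
        "ORDER BY commodity, location, year"
    )
    rows = conn.execute(sql, (*commodities, year_from, year_to)).fetchall()
    return json.dumps(
        [{"commodity": c, "location": l, "year": y, "count": n} for c, l, y, n in rows]
    )


# In-memory stand-in for the database, filled with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mentions (commodity TEXT, location TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO mentions VALUES (?, ?, ?)",
    [
        ("tea", "Calcutta", 1850),
        ("tea", "Calcutta", 1850),
        ("tea", "London", 1860),
        ("cotton", "Glasgow", 1855),
        ("tea", "London", 1920),
    ],
)

response = handle_request(conn, "commodities=tea,cotton&from=1800&to=1899")
print(response)
```

Aggregating on the server keeps the payload small, so the D3 client only receives counts per commodity, location, and year rather than raw occurrences.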
Initial Sketches
Visualization
- Space and time
-> Finding related terms + documents
  - Find related documents
  - What are the documents talking about?
- Implicit knowledge:
  - Co-occurrences of terms in documents
For every commodity:
1) Get the top 10 documents
2) Limit related terms to 6
3) Sum up co-occurrences
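
One possible reading of the three steps above, sketched in Python. The slides do not say how documents are ranked or at which point the limit of 6 is applied; here documents are ranked by how often they mention the commodity, co-occurring terms are summed across those documents, and the 6 largest totals are kept. The function name, the input shape, and the sample counts are hypothetical.

```python
from collections import Counter

TOP_DOCS = 10   # step 1: documents kept per commodity
TOP_TERMS = 6   # step 2: related terms kept per commodity


def related_terms(commodity, doc_term_counts):
    """doc_term_counts maps a document id to a Counter of term occurrences."""
    # Step 1: rank documents by how often they mention the commodity.
    mentioning = [d for d, counts in doc_term_counts.items() if counts[commodity] > 0]
    top_docs = sorted(
        mentioning, key=lambda d: doc_term_counts[d][commodity], reverse=True
    )[:TOP_DOCS]

    # Step 3: sum the occurrences of every other term across those documents.
    totals = Counter()
    for d in top_docs:
        for term, n in doc_term_counts[d].items():
            if term != commodity:
                totals[term] += n

    # Step 2: keep only the terms with the largest totals.
    return totals.most_common(TOP_TERMS)


# Made-up counts for illustration only.
docs = {
    "doc1": Counter({"tea": 5, "sugar": 3, "London": 2}),
    "doc2": Counter({"tea": 2, "sugar": 1, "Calcutta": 3}),
    "doc3": Counter({"cotton": 7, "Glasgow": 3}),
}

tea_related = related_terms("tea", docs)
print(tea_related)
```

The resulting term totals are the kind of values that could fill the cells of a co-occurrence matrix.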
Demo
Future work
- Query by Location
- Time diagrams for term frequency over time
- Encode information in matrix cells (#docs, collection, ...)
- Show and browse documents
- Handle big data: diseases, disasters, ...
- Co-occurrences?
Thank you for listening!
