Use of Cloud Computing for scalable geospatial
data processing and access
Andrew Turner
CTO, FortiusOne
Geospatial Analysis in the Cloud
Presented at the Government Cloud Service Oriented Architecture Workshop

Published in: Technology

Transcript of "Geospatial Analysis in the Cloud"

  1. Use of Cloud Computing for scalable geospatial data processing and access. Andrew Turner, CTO, FortiusOne, andrew@fortiusone.com. Partner: U.S. Federal Geographic Data Committee
  2. What is GeoCommons? A Brief History
  3. Vulnerability Identification: Chicago, Denver, Atlanta, Los Angeles, Route 1, Route 2, Fiber Density, Electric Transmission Line Density
  4. Map labels: Columbus Circle, Holland Tunnel, WTC. Baseline connectivity of a fiber network provider in NYC. This particular provider is a good proxy for the structure of the entire island of Manhattan, since they have about 80% of the rights-of-way on the island and a large number of egress points off the island. The higher the peak in the map, the more frequently the path is used as a possible routing path.
  5. Lastly, a scenario is run where just 10,000 sq ft of damage is done to the Holland Tunnel and the impact is calculated. The result is an 8.6% loss of network connectivity, 134 times the impact of the WTC simulation. The dramatic impact is seen in the image from the loss as well as the stress put on the GW Bridge route out of the city.
  6. GeoCommons: Version 1
  7. Find interesting data
  8. Find interesting data; Map a relevant area
  9. Find interesting data; Map a relevant area; Visualize to find meaning
  10. Find interesting data; Map a relevant area; Visualize to find meaning; Layer, Modify, and Analyze
  11. Find interesting data; Map a relevant area; Visualize to find meaning; Layer, Modify, and Analyze; Collaborate with others
  12. Find interesting data; Map a relevant area; Visualize to find meaning; Layer, Modify, and Analyze; Collaborate with others; Publish and share results
  13. Visualization
  14. Analysis
  15. Applying Lessons Learned
  16. Modularize: Application Programming Interface, Finder, Maker, Core, RESTful Interfaces
  17. Relational Databases Don’t Scale Well
  18. Datasets as Databases: KML, Shapefile, CSV (Excel), GeoRSS, Documents; Finder, Maker, Core
  19. Datasets as Databases: Upload KML, Shapefile, CSV (Excel), GeoRSS, Documents; Finder, Maker, Core
  20. Datasets as Databases: Upload KML, Shapefile, CSV (Excel), GeoRSS, Documents; Parse & Store; Finder, Maker, Core
  21. Datasets as Databases: Upload KML, Shapefile, CSV (Excel), GeoRSS, Documents; Parse & Store; Finder, Maker, Core
  22. Datasets as Databases: Upload KML, Shapefile, CSV (Excel), GeoRSS, Documents; Parse & Store; Finder, Maker, Core
  23. Datasets as Databases: Upload KML, Shapefile, CSV (Excel), GeoRSS, Documents; Parse & Store; Finder, Maker, Core
  24. Datasets as Databases: Upload KML, Shapefile, CSV (Excel), GeoRSS, Documents; Parse & Store; Download; Finder, Maker, Core
  25. Datasets as Databases: Upload KML, Shapefile, CSV (Excel), GeoRSS, Documents; Parse & Store; Download; Analyze; Finder, Maker, Core
  26. Datasets as Databases: Upload KML, Shapefile, CSV (Excel), GeoRSS, Documents; Parse & Store; Download; Analyze; Visualize; Finder, Maker, Core
  27. Geospatial Catalog and Server
  28. Delivery Mechanisms
  29. Appliances • Sun 4150 • RAID Array
  30. Web Scaled Racks • 3 Appliances • Network File Storage • Load Balancer • Monitoring and Tunnels • Production & Staging racks • Racks in office for development
  31. Limits in Scaling; Limits in Development
  32. Limits in Scaling: People. Limits in Development
  33. Limits in Scaling: People, Power. Limits in Development
  34. Limits in Scaling: People, Power, Size. Limits in Development
  35. Limits in Scaling: People, Power, Size, Cost. Limits in Development
  36. Limits in Scaling: People, Power, Size, Cost, Time. Limits in Development
  37. Limits in Scaling: People, Power, Size, Cost, Time. Limits in Development
  38. Limits in Scaling: People, Power, Size, Cost, Time. Limits in Development: Testing on “clean” machines
  39. Limits in Scaling: People, Power, Size, Cost, Time. Limits in Development: Testing on “clean” machines, Deployment testing of upgrades
  40. Limits in Scaling: People, Power, Size, Cost, Time. Limits in Development: Testing on “clean” machines, Deployment testing of upgrades, Controlled Environments
  41. Leveraging the Cloud (photo: http://www.flickr.com/photos/kky/704056791)
  42. Amazon Web Services
  43. Management Consoles
  44. Processing via MapReduce
  45. Launching New Instances
  46. Elastic Compute Cloud - EC2 • Virtual Servers • Machine Images (AMI) • On-Demand. Diagram: CentOS AMI
  47. Elastic Compute Cloud - EC2 • Virtual Servers • Machine Images (AMI) • On-Demand. Diagram: CentOS AMI, build
  48. Elastic Compute Cloud - EC2 • Virtual Servers • Machine Images (AMI) • On-Demand. Diagram: CentOS AMI, build, bundle, register
  49. Elastic Compute Cloud - EC2 • Virtual Servers • Machine Images (AMI) • On-Demand. Diagram: CentOS AMI, build, bundle, register, instantiate
  50. Elastic Compute Cloud - EC2 • Virtual Servers • Machine Images (AMI) • On-Demand. Diagram: CentOS AMI, build, bundle, register, instantiate
  51. Elastic Compute Cloud - EC2 • Virtual Servers • Machine Images (AMI) • On-Demand. Diagram: CentOS AMI, build, bundle, register, instantiate
  52. Elastic Compute Cloud - EC2 • Virtual Servers • Machine Images (AMI) • On-Demand. Diagram: CentOS AMI, build, bundle, register, instantiate
  53. Elastic Compute Cloud - EC2 • Virtual Servers • Machine Images (AMI) • On-Demand. Diagram: CentOS AMI, build, bundle, register, instantiate
  54. Elastic Block Store - EBS: Create EBS (100 GB)
  55. Elastic Block Store - EBS: Create EBS (100 GB), attach
  56. Elastic Block Store - EBS: Create EBS (100 GB), attach, snapshot
  57. Elastic Block Store - EBS: Create EBS (100 GB), attach, snapshot to S3 (Diff v1)
  58. Elastic Block Store - EBS: Create EBS (100 GB), attach, snapshot to S3 (Diff v1, Diff v2)
  59. Elastic Block Store - EBS: Create EBS (100 GB), attach, snapshot to S3 (Diff v1, Diff v2), Create & Attach
  60. Elastic Block Store - EBS: Create EBS (100 GB), attach, snapshot to S3 (Diff v1, Diff v2), Create & Attach
  61. Elastic Block Store - EBS: Create EBS (100 GB), attach, snapshot to S3 (Diff v1, Diff v2), Create & Attach
  62. Elastic Block Store - EBS: Create EBS (100 GB), attach, snapshot to S3 (Diff v1, Diff v2), Create & Attach
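
The EBS slides show snapshots stored as successive diffs (Diff v1, Diff v2) from which a new volume can be created and attached. The following is a toy sketch of that idea only, not how EBS is implemented: the block-dictionary model and the names `take_snapshot` and `restore` are assumptions made for illustration, and deleted blocks are not handled.

```python
from typing import Dict, List

Blocks = Dict[int, bytes]  # block index -> block contents (hypothetical model)


def take_snapshot(volume: Blocks, previous_state: Blocks) -> Blocks:
    """Return only the blocks that differ from the previously saved state."""
    return {i: data for i, data in volume.items() if previous_state.get(i) != data}


def restore(diffs: List[Blocks]) -> Blocks:
    """Rebuild a volume by applying each stored diff in order."""
    volume: Blocks = {}
    for diff in diffs:
        volume.update(diff)
    return volume


# A toy volume made of three blocks.
volume: Blocks = {0: b"base", 1: b"roads", 2: b"parcels"}

diff_v1 = take_snapshot(volume, {})                  # first snapshot stores every block
volume[1] = b"roads-2009"                            # one block changes
diff_v2 = take_snapshot(volume, restore([diff_v1]))  # second snapshot stores one block

new_volume = restore([diff_v1, diff_v2])             # "Create & Attach" from the diffs
```

Because each snapshot holds only changed blocks, storage grows with the amount of change rather than with the volume size.
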
  63. Public Datasets
  64. Additional Benefits • Federation • Tile generation • Content-delivery System • Simple Queue Service (SQS). S3 Storage: tiles/openstreetmap/9/74/97.png, tiles/openstreetmap/9/74/98.png, tiles/bluemarble/9/74/97.png, tiles/bluemarble/9/74/98.png
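
The tile paths on slide 64 suggest one S3 object per map tile, keyed by layer name and three integers. A minimal sketch of building such keys follows. It assumes the integers are zoom/x/y in the common OpenStreetMap "slippy map" convention, which the slides do not confirm; `lonlat_to_tile` and `tile_key` are hypothetical names.

```python
import math
from typing import Tuple


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> Tuple[int, int]:
    """Convert WGS84 lon/lat to slippy-map tile indices (assumed scheme)."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that edge coordinates such as lon = 180 stay inside the grid.
    return min(n - 1, max(0, x)), min(n - 1, max(0, y))


def tile_key(layer: str, zoom: int, x: int, y: int) -> str:
    """Build a storage key shaped like the paths listed on the slide."""
    return f"tiles/{layer}/{zoom}/{x}/{y}.png"


key = tile_key("openstreetmap", 9, 74, 97)
```

With keys like these, a pre-rendered tile can be fetched straight from storage or a CDN without touching the application server.
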
  65. Cloud Architecture • EC2 image of current system architecture • EBS image of default database, stored to S3 • Current application release in S3 • Start an EC2 instance, attach data, attach code, start up. Diagram: v1.4.3, Default Datasets
  66. Cloud Architecture • EC2 image of current system architecture • EBS image of default database, stored to S3 • Current application release in S3 • Start an EC2 instance, attach data, attach code, start up. Diagram: create instance, v1.4.3, Default Datasets
  67. Cloud Architecture • EC2 image of current system architecture • EBS image of default database, stored to S3 • Current application release in S3 • Start an EC2 instance, attach data, attach code, start up. Diagram: create instance, v1.4.3, Default Datasets
  68. Cloud Architecture • EC2 image of current system architecture • EBS image of default database, stored to S3 • Current application release in S3 • Start an EC2 instance, attach data, attach code, start up. Diagram: create instance, v1.4.3, attach data, Default Datasets
  69. Cloud Architecture • EC2 image of current system architecture • EBS image of default database, stored to S3 • Current application release in S3 • Start an EC2 instance, attach data, attach code, start up. Diagram: create instance, v1.4.3, Snapshot, attach data, Default Datasets, Backup, Backup, Backup
  70. Cloud Architecture • EC2 image of current system architecture • EBS image of default database, stored to S3 • Current application release in S3 • Start an EC2 instance, attach data, attach code, start up. Diagram: create instance, Cache, S3 Downloads, v1.4.3, Snapshot, attach data, Default Datasets, Backup, Backup, Backup
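
The launch sequence on the Cloud Architecture slides (start an EC2 instance, attach data, attach code, start up) could be scripted roughly as below. This is a sketch under stated assumptions, not the presenter's tooling: the method names and keyword arguments mirror the boto3 EC2 client, `RecordingEC2` is a hypothetical stand-in so the example runs without an AWS account, and the instance type, device name, and IDs are placeholders.

```python
class RecordingEC2:
    """Hypothetical stand-in for an EC2 client; records calls instead of making them."""

    def __init__(self):
        self.calls = []

    def run_instances(self, **kwargs):
        self.calls.append(("run_instances", kwargs))
        return {"Instances": [{"InstanceId": "i-example"}]}

    def create_volume(self, **kwargs):
        self.calls.append(("create_volume", kwargs))
        return {"VolumeId": "vol-example"}

    def attach_volume(self, **kwargs):
        self.calls.append(("attach_volume", kwargs))
        return {}


def launch_stack(ec2, ami_id, snapshot_id, zone, release):
    """Start an instance from the system image and attach a data volume."""
    # Assumption: a boot script passed as user data fetches the release from S3.
    user_data = f"#!/bin/sh\n# hypothetical: fetch and start release {release}\n"
    instance_id = ec2.run_instances(
        ImageId=ami_id, InstanceType="m1.large", MinCount=1, MaxCount=1,
        Placement={"AvailabilityZone": zone}, UserData=user_data,
    )["Instances"][0]["InstanceId"]
    # The data volume is created from the snapshot of the default datasets.
    volume_id = ec2.create_volume(
        SnapshotId=snapshot_id, AvailabilityZone=zone
    )["VolumeId"]
    # Against real AWS, wait for the instance and the volume to be ready here.
    ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device="/dev/sdf")
    return instance_id, volume_id


ec2 = RecordingEC2()
instance_id, volume_id = launch_stack(
    ec2, "ami-example", "snap-example", "us-east-1a", "v1.4.3"
)
```

Swapping `RecordingEC2()` for a real client would issue the same three calls, which is what makes the deployment repeatable.
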
  71. Scaling • RESTful architecture • Caching for speed, and CDN support • Amazon Web Services: CloudWatch, Elastic Scaling, Load Balancer
  72. Private Instances
  73. First Users: Meedan, Media
  74. Repeatable
  75. Repeatable
  76. Data Federation: community
  77. Geospatial Federated Search
  78. Geocoding
  79. Geocoding - Scale as Required: Upload CSV, Cache, Geocode, Results, API, Geocoding Engine, TIGER/Line, SQLite
  80. Geocoding - Scale as Required: Upload CSV, Cache, Geocode, Results, API, Geocoding Engine, TIGER/Line, SQLite
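
The geocoding slides outline a pipeline: an uploaded CSV passes through a cache in front of a geocoding engine backed by SQLite. A minimal sketch of that flow is shown here. The table layout, the sample row, and the function names are assumptions, and exact-match lookup stands in for real TIGER/Line address-range interpolation.

```python
import csv
import io
import sqlite3


def build_engine() -> sqlite3.Connection:
    """Create an in-memory lookup table with one made-up address row."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE addresses (address TEXT PRIMARY KEY, lat REAL, lon REAL)")
    db.execute("INSERT INTO addresses VALUES ('1 main st', 38.9, -77.0)")
    return db


def geocode_csv(csv_text: str, db: sqlite3.Connection, cache: dict) -> list:
    """Geocode each row's 'address' column, consulting the cache first."""
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row["address"].strip().lower()
        if key not in cache:  # misses are cached too, stored as None
            cache[key] = db.execute(
                "SELECT lat, lon FROM addresses WHERE address = ?", (key,)
            ).fetchone()
        results.append((row["address"], cache[key]))
    return results


cache = {}
results = geocode_csv("address\n1 Main St\n1 Main St\nNowhere Rd\n", build_engine(), cache)
```

Repeated addresses hit the cache rather than the engine, so additional engine instances are needed only when the miss rate rises.
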
  81. Best Practices Applied to the Government • Built using open, established tools • Full choice - Linux, Windows • Full Control • Repeatable processes • Continual backup • Scaling dynamic and large datasets • Synchronous and Asynchronous analysis
  82. Level of Maturity • Widely adopted • Broad support and ecosystem • Full stack support
  83. Perceived Impediments to Adoption • Single Vendor (open-source alternatives arising) • Maintenance and Location • Data Security
  84. Thank you. Andrew Turner, andrew@fortiusone.com, http://highearthorbit.com/presentations