FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastructure

Using Open Source and Cloud Computing principles, these slides walk through the architectural patterns for building scalable cloud services. The second part of the presentation focuses on profiling common geolocation tasks like importing large datasets and rendering map tiles.

Published in: Technology, Business

Transcript

  • 1. FOSS4G in the Cloud
      Mohamed Sayed, mohamed@fossworx.org
      Version 092013
      License: CC-BY-SA
  • 2. Agenda
      • Disclaimers
      • Goals/Motives
      • The historical path to 'Cloud Computing'
      • 'Definition' of cloud computing
      • FOSS4G in Cloud use cases
      • AWS: Components and Services
      • Building for the cloud
          – Architectural patterns for Cloud Services
          – Cultural changes
          – Processes changes
          – Things to remember
      • Common FOSS4G tasks in AWS
          – Importing OSM data into PostGIS
          – mod_tile/Mapnik
          – GWC/GeoServer
      • Questions?
  • 3. Disclaimers
      • The work presented was funded personally and done during my vacation. All opinions are my own and not my employer's.
      • I am not affiliated with AWS in any way other than being a customer. I choose them when that choice makes sense and would use others where applicable.
      • This is still a work in progress. YMMV.
  • 4. Goals/Motives
      • Goals
          – We will learn or validate some ideas.
          – Get some feedback on what to do next.
          – Help save someone time/money/frustration.
          – Raise awareness about some risks.
      • Motives
          – The new disruption is in data and the services around it. We (Open Source people) should not miss out on that, and I believe I can help.
  • 5. Path to Cloud Computing (diagram)
      • Hardware changes: multicore support, NPT/EPT, I/O offloading
      • Virtualization: KVM/Xen, Solaris Zones, VMware/Parallels, storage/network virtualization
      • Mobile computing: smart phones, tablets, multi-screen
  • 6. Cloud Computing definition (IMHO)
      • Cloud computing is a computing paradigm composed of abstractions, a set of primitives, and a set of interfaces and tools to drive those abstractions and primitives. The abstractions and primitives need not be new in themselves, but their combination and impact are what create 'The Cloud' culture.
  • 7. (Diagram labels) Compute, Storage, Network; Primitives, Abstractions, Foundation; Image, Volumes, Snapshots, Autoscale; Tools, APIs, Config Management
  • 8. Example "High level" Architecture: OpenStack
  • 9. In reality, it sorta looks like this
  • 10. AWS as a Public Cloud
  • 11. FOSS4G Use Cases
      • Disaster Recovery/Backup
      • Static, logic-free web publishing
      • Online FOSS4G as a Service
      • Data transformation jobs
      • Content curation and batch processes
  • 12. Example FOSS4G AWS Use Case: Static publishing blueprint
  • 13. How to Build your Cloud Infrastructure
  • 14. Architectural Patterns
      • The Cookie Cutter/Soloist.
      • The Centrist.
      • The Replicator.
      • The Masters of Colonies.
  • 15. CAP: Cookie Cutter
  • 16. The Cookie Cutter/Soloist
      • Pros:
          – Simple.
          – Scales horizontally with load.
          – Localized failure impact.
      • Cons:
          – Poor support for write-oriented services.
          – Coarse-grained scalability.
          – Node capacity has vertical scalability issues.
  • 17. CAP – The Centrist
  • 18. The Centrist
      • Pros:
          – Scales at the component level.
          – Moderate complexity up to middle-range load.
          – Faster/easier fault isolation/detection.
          – Master/slave data stores are a well-studied concept.
      • Cons:
          – The central data store becomes more critical and a bottleneck.
          – Multi-region deployments suffer from latency.
          – Vertical scaling characteristics are pronounced on the data store.
  • 19. CAP – The Replicator
  • 20. The Replicator
      • Pros:
          – Scales at the component level.
          – Improved read performance.
          – Better Disaster Recovery.
          – Well suited for multi-region deployments.
      • Cons:
          – Writes are still central.
          – Added complexity.
          – Increased bandwidth requirements.
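
The Replicator's trade-off (reads spread over replicas, writes still central) can be sketched as a small router. This is only an illustration under assumptions, not something shown in the slides: the class `ReplicatorRouter`, the node names, and the round-robin read policy are all hypothetical.

```python
import itertools


class ReplicatorRouter:
    """Route writes to one primary and spread reads across replicas."""

    def __init__(self, primary, replicas):
        if not replicas:
            raise ValueError("at least one replica is required")
        self.primary = primary
        # Round-robin is an assumed policy; any load-balancing scheme would do.
        self._replicas = itertools.cycle(replicas)

    def node_for(self, operation):
        """Return the node that should serve a 'read' or 'write' operation."""
        if operation == "write":
            return self.primary  # writes are still central
        if operation == "read":
            return next(self._replicas)
        raise ValueError(f"unknown operation: {operation!r}")


# Hypothetical node names, one replica per region.
router = ReplicatorRouter("db-primary", ["db-replica-eu", "db-replica-us"])
targets = [router.node_for(op) for op in ("read", "read", "write", "read")]
```

Reads alternate between the two replicas while the single write lands on the primary, which is where the "writes are still central" limitation comes from.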
  • 21. Masters of Colonies
  • 22. CAP – Master of Colonies
      • Pros:
          – Improved write performance.
          – Decompose large data sets into smaller ones.
          – Faster data iterations.
          – Good disaster recovery strategy.
      • Cons:
          – Complex!
          – Weak/varying support by various data stores.
          – High maintenance overhead.
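
One way to read "decompose large data sets into smaller ones" is key-based partitioning, where each master owns a slice of the data. The sketch below assumes a stable hash over a string key; the function `colony_for`, the colony names, and the key format are hypothetical, and the slides do not say which partitioning scheme was intended.

```python
import hashlib


def colony_for(key, colonies):
    """Pick the master ('colony') that owns a key, using a stable hash."""
    if not colonies:
        raise ValueError("at least one colony is required")
    # sha256 gives the same answer on every machine and every run,
    # unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(colonies)
    return colonies[index]


# Hypothetical colony names and key.
colonies = ["master-europe", "master-americas", "master-asia"]
owner = colony_for("way/123456", colonies)
```

With plain modulo hashing, adding or removing a colony reassigns most keys, which is one concrete form of the maintenance overhead listed under the cons.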
  • 23. Cultural Changes
      • Get stakeholders' buy-in early.
      • Build a full ownership culture.
      • Adopt an agile approach.
      • Encourage prototyping and experimentation.
      • Automation as a way of life.
  • 24. Processes Changes
      • Software Architecture:
          – Know the floor, and the ceiling.
          – Be as stateless as possible.
          – Graceful failure response.
          – Good logging as a way of life.
      • Release Engineering:
          – The VM as an artifact
          – Automation
          – Versioning
          – Snapshot
      • Automation:
          – Configuration management
          – Orchestration
          – Auto-scaling
  • 25. Things to remember
      • Review any legal implications.
      • Use the cloud primitives.
      • Pay attention to security: security groups, encrypted data at rest, etc.
      • Clean up old stuff.
      • Things fail: don't fight it, just handle it.
      • You will not get it right the first time, but things should look good on the 3rd iteration. (Read The Mythical Man-Month.)
  • 26. FOSS4G in AWS: Performance/Architecture Evaluation
      • Tools used:
          – Siege
          – Sar
          – Oprofile
          – R/AWK/Python/Ruby
      • PostgreSQL queries log.
      • Test client -> Target server as separate nodes.
  • 27. OSM Data into AWS
      • Setup 1
          – M1.Large (2 cores)
          – Standard EBS
          – EU-West region
      • Setup 2
          – M1.Large
          – Provisioned EBS: 8000 IOPS
          – EU-West region
      • Setup 3
          – Hi.4xlarge
          – SSD drive
          – EU-West region
      • Setup 4
          – M2.2xlarge
          – EU-West
          – Ephemeral drives
  • 28. Importing OSM data into AWS: Testing the water
  • 29. Importing OSM data into AWS: Testing the water some more
  • 30. Enough Water Testing: Importing Planet to SSD
      • Guess how long it took to finish
  • 31. Importing Planet into AWS Using SSD
      • It only took 35 hours!
      • Disk utilization: ~250 GB
      • Guess what was the first thing I did when it finished?
  • 32. Importing Planet into AWS
      • I made a copy, of course :)
      • Create a RAID 0 set
      • Create LVM on top of RAID 0
      • Kick off data copy
      • Guess how long it took
  • 33. Importing Planet into AWS
      • It only took 2.5 hours.
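
The two timings can be turned into rough rates. This is a back-of-the-envelope sketch, assuming the ~250 figure on slide 31 means gigabytes (decimal), and that the 2.5-hour copy moved all of it; the function name is hypothetical.

```python
def throughput_mb_per_s(size_gb, hours):
    """Average throughput in MB/s (decimal units: 1 GB = 1000 MB)."""
    return size_gb * 1000 / (hours * 3600)


# Cloning the finished database: ~250 GB in 2.5 hours.
copy_rate = throughput_mb_per_s(250, 2.5)

# How much faster the clone was than the original 35-hour import.
speedup = 35 / 2.5
```

Under those assumptions the clone averaged a little under 28 MB/s and finished 14 times faster than the import, which suggests that keeping a copy of an imported database is far cheaper than re-importing.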
  • 34. Data Import in AWS: OSM full planet
  • 35. Profiling osm2pgsql
      • Data sets used
      • Links/ways/nodes of each set
      • Time
  • 36. Data import notes
      • Create the DB on SSD and clone to EBS:
          – Use case: quickly import the data but make it persistent.
          – Full planet volume takes 2-2.5 hours.
      • Create Provisioned EBS and clone to SSD:
          – Use case: need very fast runtime access.
          – Full planet volume takes 5.4 hours.
      • Can we get an OSM primitives summary per dump and full planet as part of the pbf?
  • 37. Data Import in AWS: Lessons learned
      • It is not only the disk.
      • Risk on multiple levels:
          – Dev teams can't possibly be testing to their full potential (in the data context).
          – Evident in outdated/incorrect documentation for bootstrapping.
  • 38. Rendering – mod_tile/Mapnik
      • Apache module + a Unix daemon.
      • The Apache module uses a process model; renderd is multithreaded.
      • The Apache module sends a command to renderd over a Unix socket.
      • The renderer fetches the data and writes it out.
      • Non-cached data will:
          – Fail on the first attempt (return 404)
          – Pass on the second attempt (~600 msec)
      • Cached data is served in < 10 msec.
      • Very SQL-chatty.
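
Given the observed behaviour for non-cached tiles (404 first, success on the second request), a test client might retry once before counting a failure. The sketch below is a hypothetical helper, not part of mod_tile: `fetch_tile`, its parameters, the injected `fetch` callable, and the tile path are all assumptions, and the fake fetcher only imitates the behaviour the slide reports.

```python
import time


def fetch_tile(fetch, url, attempts=2, delay_s=0.5, sleep=time.sleep):
    """Call fetch(url) -> (status, body); retry while the status is 404."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        status, body = fetch(url)
        if status != 404:
            return status, body
        if attempt < attempts:
            sleep(delay_s)  # give the renderer time to write the tile
    return status, body


# Fake fetcher imitating the reported behaviour: 404 first, then 200.
calls = []


def fake_fetch(url):
    calls.append(url)
    return (404, b"") if len(calls) == 1 else (200, b"PNG")


# The tile path is a made-up example; sleep is stubbed out to keep this instant.
result = fetch_tile(fake_fetch, "/osm/15/16384/10880.png", sleep=lambda s: None)
```

Injecting `fetch` keeps the retry logic independent of any HTTP library, so the same helper can wrap whatever client the load test already uses.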
  • 39. Renderd Threads Profiling
  • 40. Renderd Profiling
  • 41. Renderd Profiling
  • 42. Renderd Profiling
  • 43. Renderd Profiling
  • 44. Rendering – GeoServer/GWC
      • Single layer, ZL 15, RAM disk: 100 tiles/sec.
      • Truncation is very slow. Please version your published layers.
      • Standalone GWC offers a much better scalability model.
      • Possible race conditions in threads writing tiles.
      • Didn't hit the getAlphaTile() issue.
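
To put 100 tiles/sec in perspective, here is a rough estimate of how long seeding every tile at zoom level 15 would take. It assumes a standard XYZ/Web Mercator pyramid with 2^z by 2^z tiles per level, which the slide does not state, and a sustained rate equal to the RAM-disk figure; a different gridset or a bounded seeding area would change the numbers.

```python
def tiles_at_zoom(zoom):
    """Tiles in a full 2**zoom x 2**zoom grid (XYZ / Web Mercator layout)."""
    return 4 ** zoom


def seed_days(zoom, tiles_per_second):
    """Days needed to render every tile of one zoom level at a steady rate."""
    return tiles_at_zoom(zoom) / tiles_per_second / 86400


total = tiles_at_zoom(15)      # 1,073,741,824 tiles
days = seed_days(15, 100)      # roughly 124 days
```

At that scale, full-world seeding of deep zoom levels is rarely practical on one node, which fits the note that standalone GWC scales better.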
  • 45. GWC/GeoServer in AWS: Example deployment
  • 46. Cost?
      • Screenshot of my account activity
  • 47. Released artifacts: Snapshots of OSM data in flat PGSQL
      • 2 drives:
          – snap-f9affde6
          – snap-ffaffde0
      • To use:
          – Create a volume based on each snapshot.
          – Activate with mdadm (RAID 0, 2 drives).
          – Run pvscan, vgscan, vgchange, lvscan.
          – Installing mdadm and rebooting should do this for you automagically on most machines.
          – Mount the volume on your PGDATA path.
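
The "to use" steps can be written out as an ordered command plan. This sketch only builds the argument lists and runs nothing. The device paths, the `/dev/md0` array name, the volume group and logical volume names, and the PGDATA path are all hypothetical; the real names on the snapshots are not given in the slides and would have to be read from the `vgscan`/`lvscan` output. Creating the volumes from the snapshots happens beforehand in AWS and is not covered.

```python
def attach_plan(devices, volume_group, logical_volume, pgdata, md="/dev/md0"):
    """Return the commands (as argv lists) that bring the data online."""
    if len(devices) != 2:
        raise ValueError("the released RAID 0 set has two member drives")
    lv_path = f"/dev/{volume_group}/{logical_volume}"
    return [
        ["mdadm", "--assemble", md, *devices],  # reassemble the RAID 0 array
        ["pvscan"],                             # find LVM physical volumes
        ["vgscan"],                             # find volume groups
        ["vgchange", "-ay", volume_group],      # activate the volume group
        ["lvscan"],                             # list logical volumes
        ["mount", lv_path, pgdata],             # mount on the PGDATA path
    ]


# All names here are placeholders, not values taken from the snapshots.
plan = attach_plan(
    ["/dev/xvdf", "/dev/xvdg"], "vg_osm", "lv_pgdata", "/var/lib/postgresql/data"
)
```

Keeping the plan as data makes it easy to print and review before running it, for example through `subprocess.run` with root privileges.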
  • 48. Backlog
      • Geocoding testing with Twofish and GISGraphy
      • OSRM profiling
      • Suggestions?
  • 49. Many thanks to
      • Geofabrik for compiling all those sets/formats.
      • FOSS4G2013 for this opportunity.
      • And THANK YOU
