Cloud Computing Meets Data Warehousing

Introduction to data warehousing and analytics in the cloud.

Published in: Technology, Business

  • About Omer:
      • Developed Vertica’s internal cloud solution for use with PoCs
      • Created the Vertica for the Cloud offering based on Amazon EC2
      • Heads up Vertica’s cloud and virtualization initiatives
  • Define cloud: available, scalable and efficient… usually managed by someone else
      • Available means anyone can get resources on demand
      • Scalable means you get as much or as little as you need
      • Efficient means you can get resources quickly and in bite-sized chunks
  • Transcript

    • 1. Cloud Computing Meets Data Warehousing: Omer Trajman, Sr. Dir. for Cloud and Virtualization, Vertica Systems, [email_address]
    • 2. What is… Cloud?
      • What are Cloud Services?
      • Other People’s Software
      • What are Cloud Platforms?
      • Other People’s Frameworks
      • What is Cloud Infrastructure?
      • Other People’s Hardware
    • 3. Data Warehousing in the Cloud
      • Applications, e.g. Birst, LogiXML, Lucidera
        • Web app providing data services
        • Analytic SaaS – Full stack solution
      • Platforms, e.g. Google AppEngine, MSFT Azure
        • Programming API (Java, Python, .Net)
        • Integrated data access
      • Infrastructure, e.g. Amazon Web Services
        • Favorite OS on demand (Linux, Solaris, Windows)
        • Additional services (SimpleDB, Queue, Storage)
    • 4. Security is a Tradeoff
      • “Security costs money, but it also costs in time, convenience, capabilities,…”
      • - Bruce Schneier
      • Assess how important it is to secure your data
      • What are the risks with in-house and cloud?
      • Why not keep it under your mattress?
    • 5. Full Stack Offerings
      • Birst
        • Hosted data access
        • Spreadsheet in the Cloud
      • LogiXML
        • Framework for online Analytic SaaS
        • Reporting, Dashboard, ETL
      • Lucidera
        • Vertical apps focused
        • Analysis Tools (e.g. Pipeline Healthcheck)
    • 6. DIY Analytics in the Cloud
      • Google AppEngine, Microsoft Azure
        • Java, Python, .Net driven UI to capture and display
        • GQL or SQL to query (limited joins, no aggregates)
        • Use client tool to upload app
      • Amazon Ecosystems
        • Provision via API or service e.g. RightScale
        • Pick your UI and ETL – Jasper, Pentaho, etc.
        • Pick your DB – MySQL, Oracle, SQL Server, Vertica
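Slide 6 notes that the platform query languages offer limited joins and no aggregates, so totals and averages have to be computed in application code. The following is a minimal sketch of that idea, assuming the rows have already been fetched as plain dictionaries; the field names `region` and `amount` and the helper `aggregate_by` are hypothetical and not part of any platform API.

```python
from collections import defaultdict


def aggregate_by(rows, key, value):
    """Group rows by `key` and compute count, sum and average of `value`.

    This stands in for a SQL GROUP BY when the datastore cannot aggregate.
    """
    totals = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for row in rows:
        bucket = totals[row[key]]
        bucket["count"] += 1
        bucket["sum"] += row[value]
    # Derive the average once all rows for a group have been seen.
    return {
        group: {
            "count": t["count"],
            "sum": t["sum"],
            "avg": t["sum"] / t["count"],
        }
        for group, t in totals.items()
    }


# Hypothetical rows, as an application might receive them from a query.
rows = [
    {"region": "east", "amount": 10.0},
    {"region": "east", "amount": 30.0},
    {"region": "west", "amount": 5.0},
]
summary = aggregate_by(rows, "region", "amount")
```

Aggregating on the client means every row crosses the network first, which is one reason the slide also lists full databases as an alternative for heavier analytics.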
    • 7. Key Questions
      • Is my data safe and secure?
      • Can I get fast access to my data in the cloud?
      • What does this cost?
      • Do I need a detailed plan for growth?
      • How much IT do I need?
    • 8. Securing the Cloud
      • Create a VPN
      • Firewall the host
      • Encrypt the disk
      • Consider where to keep sensitive data
    • 9. Upload and Beyond
      • PUT it, one object at a time
      • Web page upload…a few MB at a time
      • Bulk upload via FTP, SCP at 1+ GB/hour
      • Use an ETL Tool
      • Is data already in the cloud?
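The bulk upload figure on slide 9 makes it easy to estimate how long an initial load would take. Here is a rough sketch of that arithmetic, taking 1 GB/hour as the lower bound quoted on the slide; the 500 GB dataset size is a made-up example value, and the function name is hypothetical.

```python
def upload_hours(dataset_gb, rate_gb_per_hour=1.0):
    """Return the hours needed to upload `dataset_gb` at a sustained rate."""
    if rate_gb_per_hour <= 0:
        raise ValueError("rate_gb_per_hour must be positive")
    return dataset_gb / rate_gb_per_hour


# Example: a hypothetical 500 GB warehouse at the slide's 1 GB/hour floor.
hours = upload_hours(500)
days = hours / 24
```

At that rate a few hundred gigabytes takes weeks rather than hours, which is why the deck asks whether the data is already in the cloud and later mentions upload services (sneakernet).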
    • 10. Economics of DW in the Cloud
      • Getting and keeping data in the cloud
        • Cloud applications are a source of data
        • Upload services (sneakernet)
      • Scale on demand
        • Transparent scaling
        • Managed or API based scaling
      • Pay as you go
        • Operational cost by resource or volume
        • Incremental changes on demand
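The "pay as you go" point on slide 10 can be illustrated with a small cost calculation. This is only a sketch under stated assumptions: the hourly and storage rates below are invented round numbers, not any provider's actual pricing, and `monthly_cost` is a hypothetical helper.

```python
def monthly_cost(node_hours, hourly_rate, stored_gb, storage_rate_per_gb):
    """Operational cost billed by resource (node hours) and volume (GB stored)."""
    return node_hours * hourly_rate + stored_gb * storage_rate_per_gb


# Assumed, illustrative rates (not real prices).
HOURLY_RATE = 0.5        # currency units per node hour
STORAGE_RATE = 0.125     # currency units per GB per month

# Four nodes running only 200 hours each this month, with 1000 GB stored.
on_demand = monthly_cost(4 * 200, HOURLY_RATE, 1000, STORAGE_RATE)

# The same four nodes left running for a full 720-hour month.
always_on = monthly_cost(4 * 720, HOURLY_RATE, 1000, STORAGE_RATE)
```

Under these assumed rates the on-demand month costs about a third of the always-on month, which is the kind of incremental, usage-driven saving the slide describes.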
    • 11. Get Set…
      • Check out Analytic SaaS Vendors
        • Birst, LogiXML, Lucidera
      • Are you a coder? Look for a Framework
        • Google AppEngine, MSFT Azure, Elastic MapReduce
      • Looking for Classic BI?
        • Best of breed on the cloud
        • Amazon, GoGrid, RightScale, Joyent
      • Questions? [email_address]
