Cloud Computing Meets Data Warehousing

Introduction to data warehousing and analytics in the cloud.

  • About Omer:
    • Developed Vertica’s internal cloud solution for use with PoCs
    • Created the Vertica for the Cloud offering based on Amazon EC2
    • Heads up Vertica’s cloud and virtualization initiatives
  • Define cloud: available, scalable and efficient…usually managed by someone else
    • Available means anyone can get resources on demand
    • Scalable means you get as much or as little as you need
    • Efficient means you can get resources quickly and in bite-sized chunks

Transcript

  • 1. Cloud Computing Meets Data Warehousing Omer Trajman Sr. Dir. for Cloud and Virtualization Vertica Systems [email_address]
  • 2. What is…Cloud?
    • What are Cloud Services?
    • Other People’s Software
    • What are Cloud Platforms?
    • Other People’s Frameworks
    • What is Cloud Infrastructure?
    • Other People’s Hardware
  • 3. Data Warehousing in the Cloud
    • Applications, e.g. Birst, LogiXML, Lucidera
      • Web app providing data services
      • Analytic SaaS – Full stack solution
    • Platforms, e.g. Google AppEngine, MSFT Azure
      • Programming API (Java, Python, .Net)
      • Integrated data access
    • Infrastructure, e.g. Amazon Web Services
      • Favorite OS on demand (Linux, Solaris, Windows)
      • Additional services (SimpleDB, Queue, Storage)
  • 4. Security is a Tradeoff
    • “Security costs money, but it also costs in time, convenience, capabilities,…”
    • -Bruce Schneier
    • Assess how important it is to secure your data
    • What are the risks with in-house and cloud?
    • Why not keep it under your mattress?
  • 5. Full Stack Offerings
    • Birst
      • Hosted data access
      • Spreadsheet in the Cloud
    • LogiXML
      • Framework for online Analytic SaaS
      • Reporting, Dashboard, ETL
    • Lucidera
      • Vertical apps focused
      • Analysis Tools (e.g. Pipeline Healthcheck)
  • 6. DIY Analytics in the Cloud
    • Google AppEngine, Microsoft Azure
      • Java, Python, .Net driven UI to capture and display
      • GQL or SQL to query (limited joins, no aggregates)
      • Use client tool to upload app
    • Amazon Ecosystems
      • Provision via API or service e.g. RightScale
      • Pick your UI and ETL – Jasper, Pentaho, etc.
      • Pick your DB – MySQL, Oracle, SQL Server, Vertica
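The "limited joins, no aggregates" note on slide 6 implies that totals have to be computed in application code after the rows are fetched. A minimal sketch of that idea, assuming the query has already returned rows as dictionaries; the `rows` data, the `region` and `amount` field names, and the `sum_by_key` helper are all hypothetical and not part of any datastore API:

```python
from collections import defaultdict


def sum_by_key(rows, key, value):
    """Group rows by `key` and sum `value`, standing in for GROUP BY + SUM."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


# Hypothetical rows, shaped as a datastore query might return them.
rows = [
    {"region": "east", "amount": 120.0},
    {"region": "west", "amount": 80.0},
    {"region": "east", "amount": 30.0},
]

totals = sum_by_key(rows, "region", "amount")
print(totals)
```

This approach pulls every matching row to the client, so it only suits result sets that fit comfortably in memory.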
  • 7. Key Questions
    • Is my data safe and secure?
    • Can I get fast access to my data in the cloud?
    • What does this cost?
    • Do I need a detailed plan for growth?
    • How much IT do I need?
  • 8. Securing the Cloud
    • Create a VPN
    • Firewall the host
    • Encrypt the disk
    • Consider where to keep sensitive data
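The first two steps on slide 8 combine naturally: once a VPN exists, the host firewall can accept traffic only from the VPN's address range. The sketch below illustrates only the allowlist logic, not a real firewall configuration; the `10.8.0.0/24` subnet and the `is_allowed` helper are assumptions made for illustration:

```python
import ipaddress

# Hypothetical VPN subnet; substitute the range your VPN actually assigns.
ALLOWED_NETWORKS = [ipaddress.ip_network("10.8.0.0/24")]


def is_allowed(source_ip):
    """Return True if source_ip falls inside any allowed network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)


print(is_allowed("10.8.0.17"))    # address inside the VPN subnet
print(is_allowed("203.0.113.5"))  # address outside the VPN subnet
```

In practice the same rule would live in the host firewall or the cloud provider's network settings rather than in application code.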
  • 9. Upload and Beyond
    • PUT it, one object at a time
    • Web page upload…a few MB at a time
    • Bulk upload via FTP, SCP at 1+ GB/hour
    • Use an ETL Tool
    • Is data already in the cloud?
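The bulk-upload rate quoted on slide 9 makes it easy to estimate how long an initial load would take. A rough sketch, assuming a sustained rate of 1 GB/hour (the slide's lower bound); the dataset sizes and the `upload_hours` helper are hypothetical:

```python
def upload_hours(dataset_gb, rate_gb_per_hour=1.0):
    """Estimate transfer time in hours at a constant upload rate."""
    if rate_gb_per_hour <= 0:
        raise ValueError("rate_gb_per_hour must be positive")
    return dataset_gb / rate_gb_per_hour


# Hypothetical dataset sizes, in GB.
for size_gb in (10, 100, 1000):
    hours = upload_hours(size_gb)
    print(f"{size_gb} GB: {hours:.0f} h ({hours / 24:.1f} days)")
```

At this rate a terabyte takes weeks, which is why the slide's last question matters: data that already lives in the cloud does not need to be uploaded at all.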
  • 10. Economics of DW in the Cloud
    • Getting and keeping data in the cloud
      • Cloud applications are a source of data
      • Upload services (sneakernet)
    • Scale on demand
      • Transparent scaling
      • Managed or API based scaling
    • Pay as you go
      • Operational cost by resource or volume
      • Incremental changes on demand
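The "pay as you go" point on slide 10 can be made concrete with a small cost model that charges by resource (instance-hours) and by volume (stored GB). This is only a sketch: the rates, node counts, and the `monthly_cost` helper are placeholder assumptions, not any provider's actual pricing:

```python
def monthly_cost(instance_hours, hourly_rate, stored_gb, gb_month_rate):
    """Operational cost: compute billed per hour plus storage billed per GB-month."""
    compute = instance_hours * hourly_rate
    storage = stored_gb * gb_month_rate
    return compute + storage


# Placeholder rates for illustration only; not real provider prices.
HOURLY_RATE = 0.50      # currency units per instance-hour
GB_MONTH_RATE = 0.10    # currency units per GB stored per month

# Four nodes running all month (720 h) versus only during working hours (160 h).
always_on = monthly_cost(4 * 720, HOURLY_RATE, 500, GB_MONTH_RATE)
on_demand = monthly_cost(4 * 160, HOURLY_RATE, 500, GB_MONTH_RATE)

print(f"always on: {always_on:.2f}")
print(f"on demand: {on_demand:.2f}")
```

Under these assumed numbers, storage cost stays fixed while compute cost tracks the hours actually used, which is the incremental, on-demand behavior the slide describes.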
  • 11. Get Set…
    • Check out Analytic SaaS Vendors
      • Birst, LogiXML, Lucidera
    • Are you a coder? Look for a Framework
      • Google AppEngine, MSFT Azure, Elastic MR
    • Looking for Classic BI?
      • Best of breed on the cloud
      • Amazon, GoGrid, RightScale, Joyent
    • Questions? [email_address]