Claremont Report on Database Research: Research Directions (Raghu Ramakrishnan)
 


This is a set of slides from the Claremont Report on Database Research; see http://db.cs.berkeley.edu/claremont/ for more details. These particular slides are from a "Research Directions" talk by Raghu Ramakrishnan. (Uploaded for discussion at the Stanford InfoBlog, http://infoblog.stanford.edu/.)

    Presentation Transcript

    • Web Data Management Raghu Ramakrishnan
    • QUIQ Lessons
      • Structured data management powers scalable collaboration environments
      • ASP
      • Multi-tenancy
      • Massively distributed
      • Fine-grained permissions, hierarchical ACLs
      • RDBMSs were a lousy fit
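To make the "fine-grained permissions, hierarchical ACLs" bullet concrete, here is a minimal sketch of one way such a check could be resolved. Everything in it is an assumption made for illustration: the `Node` class, the `is_allowed` function, the tenant/forum/thread hierarchy, and the "nearest explicit rule wins, default deny" policy are not taken from the slides or from QUIQ.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Node:
    """A resource in a hierarchy (hypothetical: tenant -> forum -> thread)."""
    name: str
    parent: Optional["Node"] = None
    # Explicit grants set on this node: user -> permissions.
    grants: Dict[str, Set[str]] = field(default_factory=dict)
    # Explicit denials set on this node: user -> permissions.
    denials: Dict[str, Set[str]] = field(default_factory=dict)


def is_allowed(node: Node, user: str, permission: str) -> bool:
    """Walk from the node up to the root; the nearest explicit rule wins.

    On a single node a denial is checked before a grant. If no node on the
    path mentions the user and permission, access is denied by default.
    """
    current: Optional[Node] = node
    while current is not None:
        if permission in current.denials.get(user, set()):
            return False
        if permission in current.grants.get(user, set()):
            return True
        current = current.parent
    return False


# One tenant with a forum and a thread; permissions flow down the tree
# unless a lower node overrides them.
tenant = Node("acme", grants={"alice": {"read", "write"}, "bob": {"read"}})
forum = Node("support", parent=tenant)
thread = Node("thread-17", parent=forum, denials={"alice": {"write"}})
```

With this data, `is_allowed(forum, "alice", "write")` is true because the grant is inherited from the tenant, while `is_allowed(thread, "alice", "write")` is false because the thread-level denial is closer. Each check walks one path to the root, which a row-per-permission relational layout does not express naturally.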
    • Cloud Computing: Computing as a Service
      • CPU Intensive
      • Data Intensive
      • Analytic (e.g., SSDS, Hadoop)
      • Packaged Software
      • High-throughput (e.g., Condor)
      • “Transactional” Storage & Serving (e.g., PNUTS, S3, SSDS, UDB)
    • Implications
      • Data management as a service
        • Scientists and others who’ve resisted (installing, maintaining, and) using DBMSs will find it much easier to reap the benefits
        • “Data centers” and “Computing Centers” will come into vogue again
      • Hosted back-ends and RAD tools will make Web application development accessible to all
        • The Web is becoming open
          • E.g., OpenSocial, OpenID
          • Ideas will be the most valuable currency, not the wherewithal to build complex systems
      • Paradigm shifts possible for how we do research in many fields
        • Build applications that embed your algorithms and test them directly in the field, so that computer scientists can interact directly with users (ironically, this would still be a breakthrough of sorts after four decades!)
        • Many other disciplines (e.g., sociology, microeconomics) can design and conduct online experiments involving unprecedented numbers of participants
    • PNUTS: DB in the Cloud
      • CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … )
      • Parallel database
      • Geographic replication
      • Indexes and views
      • Structured, flexible schema
      • Hosted, managed infrastructure
      • Sample rows (the same six rows appear three times on the slide):
        • E 75656 C
        • A 42342 E
        • B 42521 W
        • C 66354 W
        • D 12352 E
        • F 15677 E
      • Goal:
      • Make it easier for applications to reason about updates and cope with asynchrony: an alternative to “transactions” in an asynchronous world
      • What happens to a record with primary key “Brian”?
      • Guarantees:
      • Every reader will always see some consistent, but possibly stale version
      • Readers can request a more up-to-date version, but may pay extra latency
        • Special case: Critical read (writer/readers see their own writes)
      • Writers can verify that the record is still at the version they expect
      • Basic Consistency Model [timeline figure: a Time axis with the events Record inserted, Update, and Delete; version labels v. 1 through v. 4; grouped into Generation 1, Generation 2, and Generation 3]
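The guarantees listed above can be sketched as a toy in-memory store with per-record version numbers. This is only an illustration under stated assumptions, not the PNUTS implementation or API: the class names, the method names (`read_any`, `read_critical`, `read_latest`, `test_and_set_write`), the single "master" copy, and the explicit `propagate()` step standing in for asynchronous replication are all hypothetical. Deletes and generations are not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Version:
    number: int   # per-record version, incremented on every write
    value: dict


class Replica:
    """One copy of the table; it may lag behind the master copy."""

    def __init__(self) -> None:
        self.records: Dict[str, Version] = {}

    def apply(self, key: str, version: Version) -> None:
        # Only move forward, so a replica always holds a version that
        # really existed at some point (consistent but possibly stale).
        current = self.records.get(key)
        if current is None or version.number > current.number:
            self.records[key] = version


class Store:
    def __init__(self, replica_count: int = 2) -> None:
        self.master = Replica()  # assumption: one up-to-date copy per record
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending: List[Tuple[str, Version]] = []  # not yet replicated

    def write(self, key: str, value: dict) -> int:
        current = self.master.records.get(key)
        number = 1 if current is None else current.number + 1
        version = Version(number, value)
        self.master.apply(key, version)
        self.pending.append((key, version))
        return number

    def test_and_set_write(self, key: str, expected: int, value: dict) -> bool:
        # The write succeeds only if the record is still at the expected version.
        current = self.master.records.get(key)
        if current is None or current.number != expected:
            return False
        self.write(key, value)
        return True

    def propagate(self) -> None:
        # Stand-in for asynchronous replication to the other copies.
        for key, version in self.pending:
            for replica in self.replicas:
                replica.apply(key, version)
        self.pending.clear()

    def read_any(self, key: str, replica: int = 0) -> Optional[Version]:
        # Cheapest read: whatever the local replica has, possibly stale.
        return self.replicas[replica].records.get(key)

    def read_critical(self, key: str, required: int, replica: int = 0) -> Optional[Version]:
        # Returns a version at least as new as `required`; falls back to the
        # master copy (extra latency in a real system) if the replica lags.
        local = self.replicas[replica].records.get(key)
        if local is not None and local.number >= required:
            return local
        return self.master.records.get(key)

    def read_latest(self, key: str) -> Optional[Version]:
        return self.master.records.get(key)
```

A short usage example for the record with primary key “Brian”: after `v1 = store.write("Brian", {"status": "busy"})`, `store.propagate()`, and `v2 = store.write("Brian", {"status": "free"})`, a call to `store.read_any("Brian")` still returns version 1, while `store.read_critical("Brian", v2)` and `store.read_latest("Brian")` return version 2. A `store.test_and_set_write("Brian", v1, ...)` is rejected because the record has moved on, which is how a writer detects a concurrent update without a transaction.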
    • Lots of Issues to Re-think
      • Massive distribution & replication
        • Asynchrony
        • Availability
        • Consistency
      • DBA to the world
        • Auto-tuning
        • Multi-tenancy
        • Access control (granularity, online IDs)
        • Encryption
      • App-support
        • Caching
    • Querying the Web
      • Search will become more semantic, with best-effort matchmaking between:
        • Query intent (NLP, query logs …)
        • Interpreted web content
      • Deep web has a lot of structured data
        • How we get a handle on it is an interesting problem
        • But this is only part of the problem … lots of data not here
      • Semantic web isn’t working
      • Site-wrapping doesn’t scale
      • Solutions?
        • Domain-wrapping
        • Mass collaboration
        • ??