Spark and Shark: Lightning-Fast Analytics over Hadoop and Hive Data

Jul 12, 2012

Spark is an open source cluster computing framework that can outperform Hadoop by 30x through a combination of in-memory computation and a richer execution engine. Shark is a port of Apache Hive onto Spark, which provides a similar speedup for SQL queries, allowing interactive exploration of data in existing Hive warehouses. This talk will cover how both Spark and Shark are being used at various companies to accelerate big data analytics, the architecture of the systems, and where they are heading. We will also discuss the next major feature we are developing, Spark Streaming, which adds support for low-latency stream processing to Spark, giving users a unified interface for batch and real-time analytics.

Upload Details

Uploaded via SlideShare as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

Presentation Transcript