
SAS on Your (Apache) Cluster, Serving your Data (Analysts)

by on Jul 10, 2013


SAS is both a language for processing data and an application for doing analytics. SAS has adapted to the Hadoop ecosystem and intends to be a good citizen among the choices for processing large volumes of data on your cluster. As more people inside an organization want to access and process the accumulated data, the “schema on read” approach can degenerate into “redo work someone else might have done already”.
This talk begins by comparing and contrasting different data storage strategies, and describes the flexibility SAS provides to accommodate different approaches. These storage techniques are ranked by convenience, performance, and interoperability, the last judged by both the practicality and the cost of translation. Techniques considered include:
· Storing the raw data (weblogs, CSVs)
· Storing Hadoop metadata, then using Hive/Impala/HAWQ
· Storing in Hadoop-optimized formats (Avro, Protocol Buffers, RCFile, Parquet)
· Storing in proprietary formats
The talk finishes by discussing the array of analytical techniques that SAS has converted to run on your cluster, with particular mention of situations where HDFS is just plain better than the RDBMS that came before it.
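
To make the “schema on read” point above concrete, here is a minimal Python sketch, not taken from the talk. The sample log, the column names, and the `read_with_schema` helper are all hypothetical; the sketch only assumes that raw data is stored as untyped CSV text and that every reader has to supply column types at read time.

```python
import csv
import io

# Hypothetical raw data: a tiny CSV web log stored as-is, with no schema attached.
RAW_LOG = (
    "ts,url,bytes\n"
    "2013-07-10T12:00:00,/index.html,512\n"
    "2013-07-10T12:00:05,/about.html,2048\n"
)

# Hypothetical schema that each reader must bring along at read time.
SCHEMA = {"ts": str, "url": str, "bytes": int}


def read_with_schema(raw_text, schema):
    """Parse raw CSV text and apply column types at read time (schema on read)."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        # Every value arrives as a string; the reader casts it per the schema.
        rows.append({name: cast(record[name]) for name, cast in schema.items()})
    return rows


rows = read_with_schema(RAW_LOG, SCHEMA)
total_bytes = sum(row["bytes"] for row in rows)
```

Every analyst who reads the same raw file has to repeat the parsing and casting step, which is the duplicated work the abstract warns about; storing metadata or using a typed format moves that step to write time.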

Upload Details

Uploaded via SlideShare as Microsoft PowerPoint

Usage Rights

© All Rights Reserved


SAS on Your (Apache) Cluster, Serving your Data (Analysts) Presentation Transcript