Intro to Joyent's Manta Object Storage Service
Transcript

  • 1. Introduction to Manta. Rod Boothby, VP, 415-819-9253, rod@joyent.com. August 12, 2013
  • 2. Object Stores are the Future. [Chart: IDC Worldwide Server Sales in $ Millions vs. Billions of Objects in AWS S3, Oct-06 to Aug-13. The plotted object counts rise from 10 billion to 2,000 billion, while the plotted server sales stay between $9,924 million and $15,700 million.] The number of objects in Amazon S3 is growing fast. Server sales are basically flat.
  • 3. Manta is Joyent's new Object Storage Service. [Diagram: put data into Manta and get data from Manta via a RESTful API.] An object is non-interpreted data of any size that you read from and write to the store.
  • 4. Manta is Live and Available Today: http://www.joyent.com/products/manta
  • 5. A file is an example of an object. The commands below do the following:
      1. Create a file called hello.txt that contains the words "Hello, Manta"
      2. Put the file into Manta
      3. Get the file back from Manta and output its contents

        $ echo "Hello, Manta" > /tmp/hello.txt
        $ mput -f /tmp/hello.txt /$MANTA_USER/stor/hello-foo
        /$MANTA_USER/stor/hello-foo [====================>] 100% 13B
        $ mget /$MANTA_USER/stor/hello-foo
        Hello, Manta
  • 6. Manta Partners support File Interfaces. Partners offer NAS file interfaces that run in existing data centers but back up to the Manta Object Store. The Panzura solution is available today. The other solutions are due to be available by the end of Q4 2013.
  • 7. Manta adds Big Data to Object Storage. [Diagram: send the Big Data job into Manta. Only 1 step: analyze or process data using Manta Jobs.] Manta acts like a Platform as a Service (PaaS) for Big Data analytics. Manta is the only Object Storage System that brings compute directly to the data.
  • 8. Big Data is easy on Manta vs complex on AWS. [Diagram, cloud object store S3: 1 - Download Data, 2 - Analyze or Process Data, 3 - Upload Data Again.] Netflix has open-sourced its Genie management tools for running Hadoop jobs with S3. To analyze data in S3, the Netflix system requires coordinating 9 pieces of software: Hadoop, Hive, Pig, Karyon, Servo, Ribbon, Archaius, Eureka, and Genie. Big Data analytics on AWS/S3 requires 3 complex steps vs 1 simple step on Manta.
  • 9. S3 + EC2 also requires new Sysadmins. Admins are needed because "Genie is not an end-to-end resource management tool - it doesn't provision or launch clusters, and neither does it scale clusters up and down based on their utilization". End users are the data scientists who want to analyze or process data stored in S3.
  • 10. Big Data Made Simple
      • Single store of record for your data
      • Do analysis without the learning curve of server administration
      • Do big data analysis in any language
    "There is no learning curve to run Manta for us, since it runs on Unix." Konstantin Gredeskoul, CTO
  • 11. Manta delivers Value
      • Requests
          • Delete: free
          • POST, PUT, LIST ("GET DIR"): $0.005 per 1,000 requests
          • GET, OPTIONS, HEAD: $0.004 per 10,000 requests
      • Bandwidth
          • All bandwidth in: $0.000 (free)
          • Bandwidth out after 1st TB: $0.120/GB to $0.050/GB
      • Storage

          Storage Tier         Per Individual Copy   Per 2 Copies (default)
          First 1 TB/month     $0.043 per GB         $0.086 per GB
          Next 49 TB/month     $0.036 per GB         $0.072 per GB
          Next 450 TB/month    $0.032 per GB         $0.064 per GB
          Next 500 TB/month    $0.029 per GB         $0.058 per GB
          Next 4000 TB/month   $0.027 per GB         $0.054 per GB
          Next 5000 TB/month   $0.025 per GB         $0.050 per GB

        Default is 2 copies. When submitting an object to the service, you can specify the number of copies stored, from one (1) to six (6).
      • Compute
          • $0.00004 per GB of DRAM per second
          • If you run 1000 parallel tasks on 1000 objects and they each take a second, then you've used 1000 seconds of time. At 1 GB of DRAM per task, the cost for this job would be $0.04.
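The arithmetic on the pricing slide can be checked with a short shell sketch. The rates and tier sizes are copied from the slide; the function names `job_cost` and `storage_cost` are hypothetical helpers, not part of any Manta tool. Two assumptions are not stated on the slide: each task holds a fixed amount of DRAM for its whole run, and 1 TB is counted as 1000 GB.

```shell
#!/bin/sh
# Rate quoted on the slide: $0.00004 per GB of DRAM per second.
RATE_PER_GB_SEC=0.00004

# job_cost TASKS SECONDS_PER_TASK GB_PER_TASK
# Assumption: every task holds GB_PER_TASK of DRAM for its entire run.
job_cost() {
    awk -v t="$1" -v s="$2" -v g="$3" -v r="$RATE_PER_GB_SEC" \
        'BEGIN { printf "%.2f\n", t * s * g * r }'
}

# storage_cost GB [COPIES]
# Monthly storage cost using the slide's per-copy tier rates.
# Assumption: 1 TB = 1000 GB. Data beyond the last listed tier is not priced.
storage_cost() {
    awk -v gb="$1" -v copies="${2:-2}" 'BEGIN {
        n = split("1000 49000 450000 500000 4000000 5000000", size, " ")
        split("0.043 0.036 0.032 0.029 0.027 0.025", rate, " ")
        left = gb + 0
        total = 0
        for (i = 1; i <= n && left > 0; i++) {
            cap = size[i] + 0
            used = (left < cap) ? left : cap
            total += used * (rate[i] + 0) * copies
            left -= used
        }
        printf "%.2f\n", total
    }'
}

job_cost 1000 1 1      # the slide's example: 1000 tasks, 1 second, 1 GB each
storage_cost 2000 2    # 2 TB stored with the default 2 copies
```

Under these assumptions the first call reproduces the slide's $0.04 figure, and the second spans two tiers (1 TB at $0.043 per GB plus 1 TB at $0.036 per GB, doubled for two copies).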
  • 12. Technical Appendix
  • 13. Accessing Manta is Easy
      • Manta REST API
      • Manta CLI & Shell
      • Manta Node.js SDK
      • Manta Python SDK
      • Manta Ruby SDK
      • Manta Java SDK
  • 14. Technical Description of Manta
      • Multi-datacenter Object Store
      • Granular datacenter and copy policies
      • No size limits
      • In-kernel (clustered ZFS DMU)
      • More akin to a MetroCluster NetApp
      • S3: JVM on ext3 on Linux
      • Strongly consistent and transactional data semantics
      • Close to UNIX file-system semantics
  • 15. Analytics Capability: Codename Marlin
      • A facility for running compute jobs directly on Manta storage nodes
      • Complete EC2-like batch compute environment
      • A framework for distributing work to the right physical servers, tracking which pieces are complete, capturing the output, and repeating the whole process to facilitate multi-phase computation on objects at rest
      • Complete Unix environment without any ETL
      • A non-interactive Unix shell environment for doing "work" on Manta objects as local files
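The idea of running ordinary Unix commands on objects as local files, in repeated phases, can be illustrated with a purely local analogy. This sketch does not call Manta at all: temporary files stand in for objects, and `map_phase` and `reduce_phase` are hypothetical names chosen for illustration. It only shows the shape of a two-phase job (one command per object, then one command over all the outputs).

```shell
#!/bin/sh
# Local analogy only: temporary files stand in for Manta objects.
workdir=$(mktemp -d)
printf 'one two three\n' > "$workdir/a.txt"
printf 'four five\n'     > "$workdir/b.txt"

# Map phase: run the same command once per object, one output per input.
map_phase() {
    for obj in "$@"; do
        wc -w < "$obj"
    done
}

# Reduce phase: combine all map outputs into a single result.
reduce_phase() {
    awk '{ total += $1 } END { print total + 0 }'
}

# Count the words in every "object", then sum the per-object counts.
total_words=$(map_phase "$workdir"/*.txt | reduce_phase)
echo "$total_words"

rm -rf "$workdir"
```

Because each phase is just a shell command reading files and writing to standard output, any existing Unix tool can serve as a phase, which is the point the slide makes about avoiding ETL.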
  • 16. Why Marlin is Revolutionary. Customers can run queries, create data pipes, and do transformations and map reduce on objects very quickly, without moving data and without the additional cost of spinning up instances.
  • 17. Big Data Use Case Examples - Part 1
      • Log processing: clickstream analysis, map reduce on logs
      • Image processing: converting formats, generating thumbnails
      • Video processing: transcoding, extracting segments, resizing
      • "Hardcore" data analysis: NumPy, SciPy, R, machine learning, data mining
  • 18. Big Data Use Case Examples - Part 2
      • SQL-like queries over structured data: similar to what Hive provides for Hadoop
      • Data pipelining: MySQL, Postgres plus other clients
      • Text processing: e-discovery and internal search engines
      • Backup and disaster recovery: encrypt and verify integrity without moving or downloading the data
  • 19. Key Security & Sharing Example
      • With rich access controls in Manta, it is possible to run compute on other users' data that has been made available to you
          • Without actually having access to it
          • Without having to ship it
          • Without being able to egress the dataset itself
  • 20. Thank You