Apache jclouds at Maginatics
Experiences with billions of blobs across many
blobstore providers.

Andrew Gaul
jclouds PMC
Agenda
• What is blob storage
• Blobstore compatibility
• Case study: Maginatics Cloud Storage Platform (MCSP)
• Scaling
• Lessons learned
• Future directions
• Conclusion

http://jclouds.apache.org
https://maginatics.com
Maginatics

What is blob storage?
Blobstores offer key-value storage that is:
• Scalable: tens of TB with a few nodes and hundreds of PB with
thousands of nodes
• Inexpensive: built on commodity hardware
• Available/durable: tolerates hardware failures
Blobstores do not offer the guarantees that block storage and file
systems provide:
• Limited interface: get, put, delete
• Eventual consistency: blob reads may return stale or no data
for some limited time
jclouds supports many providers
Multiple public and private implementations allow customer
trade-offs.
Public Object Storage

Private Object Storage

Blobstore compatibility
jclouds abstracts differences between APIs, but semantic differences
remain:
• Atmos: cannot overwrite blob
• AWS-S3: cannot mutate or append to a blob, cannot put blob
without explicit size
• Swift: eventually consistent
Portable applications must use the lowest-common-denominator
functionality:
• Write each blob exactly once; never mutate or append
• Read blobs at any time, but retry because of eventual
consistency
• After deleting a blob, never reuse its name
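
The retry rule above can be sketched in plain Java. This is a minimal illustration, not jclouds code: the class RetryingReader, the method readWithRetry, and the backoff parameters are hypothetical names chosen for this example, and the fetch callback stands in for whatever blobstore read the application performs.

```java
import java.util.Optional;
import java.util.function.Supplier;

final class RetryingReader {
    /**
     * Calls fetch until it returns a value or maxAttempts is exhausted,
     * doubling the sleep between attempts (exponential backoff).
     * Returns Optional.empty() if the blob never became visible.
     */
    static <T> Optional<T> readWithRetry(Supplier<Optional<T>> fetch,
                                         int maxAttempts,
                                         long initialDelayMillis) {
        long delay = initialDelayMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<T> result = fetch.get();
            if (result.isPresent()) {
                return result;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
                delay *= 2;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Simulated eventually consistent store: the blob appears on the third read.
        int[] calls = {0};
        Supplier<Optional<String>> flaky = () ->
                ++calls[0] < 3 ? Optional.empty() : Optional.of("blob-data");
        System.out.println(readWithRetry(flaky, 5, 10).orElse("<missing>"));
    }
}
```

Because blobs are written once and names are never reused, a retried read can only observe "not yet visible" or the final content, which is what makes a simple loop like this safe.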

Maginatics Cloud Storage Platform (MCSP)
• Virtualized, cloud-based storage system
• Layers network file system semantics on top of blob storage
• Run any application on a variety of platforms, including
multiple-client file sharing
• MCSP is a cloud-optimized NAS filer
• Smart client gives LAN performance over WANs
• Flexible deployment options: public, private, hybrid cloud
• Refer to SNIA SDC 2013 slides for technical background

Scaling Throughput
MCSP supports thousands of clients reading and writing
simultaneously.
A single server could become a bottleneck, especially on smaller
instance sizes.
Instead, vend signed URLs to clients to give them direct access
to the blobstore:
• Cryptographically signed URLs allow read or write access to
a specific blob for a specified time
• Can embed other properties like content length and hash
This technique allows a single MCSP server to mediate many
Gbit/s of throughput!
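
To make the idea concrete, here is a minimal sketch of a time-limited signed URL built with an HMAC. It is a generic scheme invented for illustration, not the signing format of any particular provider or of jclouds: the class SignedUrlSketch, the query parameter names expires and signature, and the string-to-sign layout are all assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class SignedUrlSketch {
    /** HMAC-SHA256 over "METHOD\npath\nexpires", encoded as URL-safe Base64. */
    static String sign(byte[] secret, String method, String path, long expiresEpochSec) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            String toSign = method + "\n" + path + "\n" + expiresEpochSec;
            byte[] sig = mac.doFinal(toSign.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(sig);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Server side: builds the URL that is handed to a client. */
    static String signedUrl(byte[] secret, String baseUrl, String method,
                            String path, long expiresEpochSec) {
        return baseUrl + path + "?expires=" + expiresEpochSec
                + "&signature=" + sign(secret, method, path, expiresEpochSec);
    }

    /** Blobstore side: rejects expired requests and mismatched signatures. */
    static boolean verify(byte[] secret, String method, String path,
                          long expiresEpochSec, String signature, long nowEpochSec) {
        if (nowEpochSec > expiresEpochSec) {
            return false;
        }
        byte[] expected = sign(secret, method, path, expiresEpochSec)
                .getBytes(StandardCharsets.UTF_8);
        // Constant-time comparison avoids leaking how many characters matched.
        return MessageDigest.isEqual(expected, signature.getBytes(StandardCharsets.UTF_8));
    }
}
```

Extra properties such as content length or a content hash could be appended to the string-to-sign in the same way, so that a client cannot change them without invalidating the signature.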
Scaling Number of Blobs
MCSP manages 100 TB of blob data across 1 billion blobs.
Some providers require specific naming or sharding for best
performance:
• Atmos: no more than 100,000 blobs per directory, shard across
directories
• AWS-S3: name blob with unique prefixes
• Swift: no more than 1 million blobs per container, shard across
containers
• GCS & HPC: remove the Expect: 100-continue header
• Other quirks: Cleversafe performs better when container listing
is disabled
Surprisingly challenging workload: removing all blobs from a large
container.
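
A minimal sketch of the naming and sharding advice above, assuming the application controls blob names. The class BlobNaming and its methods are hypothetical, and the choice of MD5 and a four-hex-digit prefix is only one of many ways to spread names; the right shard count depends on the provider limits listed above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class BlobNaming {
    private static byte[] md5(String s) {
        try {
            return MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Picks one of shardCount containers (or directories) from a hash of the key. */
    static String containerFor(String baseName, String key, int shardCount) {
        byte[] h = md5(key);
        int v = ((h[0] & 0xff) << 24) | ((h[1] & 0xff) << 16)
                | ((h[2] & 0xff) << 8) | (h[3] & 0xff);
        // floorMod keeps the shard index non-negative even when v is negative.
        return baseName + "-" + Math.floorMod(v, shardCount);
    }

    /** Prepends four hex digits of the hash so names do not share a common prefix. */
    static String prefixedName(String key) {
        byte[] h = md5(key);
        return String.format("%02x%02x/%s", h[4] & 0xff, h[5] & 0xff, key);
    }

    public static void main(String[] args) {
        String key = "volume1/file-000001.dat";
        System.out.println(containerFor("mcsp", key, 1024) + " " + prefixedName(key));
    }
}
```

Both functions are deterministic, so a reader can recompute the container and blob name from the logical key without a lookup table.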
Scaling Blob Sizes
Most MCSP blobs have small sizes, but some use cases require
larger ones.
jclouds supports blobs up to 2 GB across all blobstores:
• Could support 5 GB with Java 7
AWS-S3, Azure, and Swift support multi-part upload, tested
with 40 GB blobs.
Large blobs increase chances of transient network errors and
failures:
• Use a repeatable Payload like ByteSource to allow jclouds to
retry
• Always include an MD5 checksum to guarantee data integrity
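
The two bullets above fit together: a payload that can be reopened can be both hashed and re-sent on retry. The sketch below uses only the JDK; a Supplier<InputStream> stands in for a repeatable payload such as Guava's ByteSource, and the class Md5Sketch and method contentMd5 are hypothetical names. The Base64 form is the encoding used by the HTTP Content-MD5 header.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.function.Supplier;

final class Md5Sketch {
    /**
     * Streams the payload once and returns its MD5 digest in Base64.
     * The supplier must return a fresh stream on every call, which is
     * what makes the payload repeatable for later upload attempts.
     */
    static String contentMd5(Supplier<InputStream> source) {
        try (InputStream in = source.get()) {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
            return Base64.getEncoder().encodeToString(md.digest());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        Supplier<InputStream> payload = () -> new ByteArrayInputStream(data);
        System.out.println(contentMd5(payload));
    }
}
```

A one-shot InputStream cannot be used this way: once it has been consumed for hashing or for a failed upload, there is nothing left to send on the next attempt.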
Lessons Learned
Cross-provider support required substantial effort:
• Long tail of issues with authentication, configuration, error
codes, timeouts, etc.
• S3- and Swift-compatible clones are like snowflakes: no two
are alike
Measuring performance is difficult:
• Blob naming and sharding are important
• Public providers will reshard very active containers for
better performance
• Private blobstores require configuration and tuning
Mock blobstores (filesystem and transient) helped testing.
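
As an illustration of why a mock helps, here is a toy in-memory store with the lowest-common-denominator interface. It is not the jclouds transient provider: the class InMemoryBlobStore and its methods are hypothetical, and the write-once rule is enforced here only to mirror the portability guidance earlier in the deck.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

final class InMemoryBlobStore {
    private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

    /** Write-once put: returns false and leaves the blob untouched if the name exists. */
    boolean putIfAbsent(String name, byte[] data) {
        return blobs.putIfAbsent(name, data.clone()) == null;
    }

    /** Returns a copy of the blob so callers cannot mutate stored data. */
    Optional<byte[]> get(String name) {
        byte[] data = blobs.get(name);
        return data == null ? Optional.empty() : Optional.of(data.clone());
    }

    void delete(String name) {
        blobs.remove(name);
    }

    public static void main(String[] args) {
        InMemoryBlobStore store = new InMemoryBlobStore();
        System.out.println(store.putIfAbsent("a", new byte[] {1, 2, 3})); // first write succeeds
        System.out.println(store.putIfAbsent("a", new byte[] {9}));       // overwrite is refused
    }
}
```

Tests written against a small interface like this run without network access or credentials, and the same tests can later be pointed at a real provider.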
Future Directions
More diagnostic tools, especially for private blobstores.
• Maginatics will contribute a benchmark tool and a compatibility
tester
Modernize with Guava additions, e.g., ByteSource, Hashing,
MediaType.
Simplify implementation:
• De-async?
• Remove annotations?
New providers:
• Modernized Swift (in-progress)
• Google Cloud Storage (GSoC 2014?)
• Amazon Glacier
• Joyent Manta?
Recap
• jclouds can provide portability between blobstore providers
if your application does not strongly depend on blobstore
semantics
• Applications can scale with the correct architecture and
implementation choices
• More work remains to make jclouds an inviting platform for all
Java developers
• The jclouds community has helped Maginatics over the last three
years, and we look forward to continuing to contribute

http://jclouds.apache.org
https://maginatics.com

Thank you.
