Scality S3 Server: Node.js Meetup Presentation

This document discusses Scality's experience building its first Node.js project: a TiVo-like cloud service for 25 million users that required a high degree of parallelism and throughput on the order of terabytes per second. It also covers lessons learned about logging performance, the event loop, and buffers, and lists useful Node.js tools.

CONFIDENTIAL - FOR GARTNER USE ONLY © Scality 2016
Node.js @Scality
Experiences and Lessons Learned
Giorgio Regni, CTO
Lauren Spiegel, Software Engineer

Disrupting storage – unlimited & everywhere

When to use object storage?
1. Need for capacities beyond 100 TB and growing fast
2. Very large number of clients accessing isolated data
3. Objects should be larger than 100 KB; otherwise use a database
[Diagram: Bucket 1 contains Object A, Object B, Object C; Bucket 2 contains Object A, Object B, … Object Z]

Copyright Scality 2014
Our first Node.js project - Building a TiVo in the Cloud
• 25 million users -> designed for a high degree of parallelism
• TB/sec -> needs very efficient network transfer
• Scales out by adding nodes and drives
• Proved 30 GB/sec of ingest with 10 servers and 360 drives
[Diagram: a Comcast Live Recorder + Chunking stage feeds A/V fragments to Scality FanOut on the application server. 1 fragment is sent with X fanout; each fragment is erasure coded (7,3) into 7 data slices and 3 code slices, plus a metadata chunk, and spread over the HDDs of storage servers SS1, SS2 … SS10 (1 to 10 servers).]

Test case and latency (seconds):

Duration  Recordings  Batch Size  Sockets  RPM per Client  Threads per Client  Average  at 95%  at 99%
2 hours   20,000      2500        1000     270             63                  0.159    0.319   0.426
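
The headline figures on this slide can be turned into rough per-server and per-drive numbers. The sketch below is only back-of-the-envelope arithmetic. It assumes that (7,3) means 7 data slices plus 3 code slices (as the diagram labels suggest), that the 30 GB/sec ingest figure is measured before erasure coding, and that the code tolerates the loss of any 3 slices (true for maximum-distance-separable codes such as Reed-Solomon; the slide does not name the code). The function name `estimate` is hypothetical.

```javascript
// Figures quoted on the slide.
const slide = { ingestGBps: 30, servers: 10, drives: 360, dataSlices: 7, codeSlices: 3 };

function estimate({ ingestGBps, servers, drives, dataSlices, codeSlices }) {
  const totalSlices = dataSlices + codeSlices;
  // Bytes written to disk per byte ingested (metadata chunk ignored).
  const overhead = totalSlices / dataSlices;
  // ASSUMPTION: ingest is measured before erasure coding.
  const rawGBps = ingestGBps * overhead;
  return {
    perServerGBps: ingestGBps / servers,
    overhead,
    // ASSUMPTION: an MDS code, so any `codeSlices` slices may be lost.
    maxLostSlices: codeSlices,
    // Decimal units: 1 GB = 1000 MB.
    perDriveMBps: (rawGBps * 1000) / drives,
  };
}

console.log(estimate(slide));
```

Under these assumptions each server ingests 3 GB/sec, the storage overhead is 10/7 (about 1.43x), and each drive absorbs roughly 119 MB/sec of raw writes.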

What we built: Three Key Components
S3-Server
• AWS S3 compatible server
• Open source: https://github.com/scality/s3
• Can use local storage
S3-MetaData
• A distributed metadata database service
• Supports fast bucket & object listing
• Stores ACLs and users/groups
S3-Vault
• Security, identity & authentication service
• Provides accounts/keys
• Supports AWS IAM users & groups
• Interoperable with user directory services (via SAML)

Logging is hard
• Challenges
  • Logging is expensive because it taxes the Node.js process
  • Sending logs as UDP datagrams incurs expensive DNS lookups
  • bunyan and bunyan-logstash perform redundant transformations
• Solution: Werelogs
  • Produces raw JSON logs along the path of least resistance
  • Forwards logs to ELK using Filebeat for indexing
  • Avoids expensive and redundant transformations
  • Tracks requests across the components with UIDs
  • Dumps log history on errors
Open source -> http://github.com/scality/werelogs
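
To make the ideas on this slide concrete, here is a minimal sketch of a per-request logger that emits one raw JSON line per entry, tags every entry with a request UID, and holds back low-level entries until an error occurs. It is not the Werelogs API: the class name `RequestLogger`, the level names, and the field names (`req_id`, `time`, `message`) are hypothetical and chosen only for illustration.

```javascript
const crypto = require('crypto');

// Hypothetical level ordering for this sketch.
const LEVELS = { debug: 0, info: 1, error: 2 };

class RequestLogger {
  // `write` receives one JSON string per log entry, e.g.
  // (line) => process.stdout.write(line + '\n').
  constructor(write, minLevel = 'info') {
    this.write = write;
    this.minLevel = LEVELS[minLevel];
    // One UID per request so entries can be correlated across components.
    this.reqId = crypto.randomBytes(10).toString('hex');
    this.history = []; // entries held back because they were below minLevel
  }

  log(level, message, fields = {}) {
    // Serialize once, straight to JSON, with no intermediate transformations.
    const line = JSON.stringify({
      ...fields,
      time: Date.now(),
      level,
      req_id: this.reqId,
      message,
    });
    if (level === 'error') {
      // Dump the held-back history first so the error has context.
      for (const old of this.history) this.write(old);
      this.history = [];
      this.write(line);
    } else if (LEVELS[level] >= this.minLevel) {
      this.write(line);
    } else {
      this.history.push(line);
    }
  }
}
```

Writing JSON lines to stdout or a file lets a shipper such as Filebeat forward them for indexing, so the Node.js process itself does no network I/O for logging.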

The performance cycle
Code, Benchmark, … Repeat
• Sockets have Nagle's algorithm on by default -> very high latencies
• The event loop can get backed up quickly -> hunt for all CPU-intensive tasks in the main loop
• Buffers are much more efficient when writing the server response
• Micro-optimizations: prefer Date.now() over new Date()
• Beware of libraries doing way too many things for you
• ES6 support: Babel 5 was killing performance -> moved to Babel 6
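
Three of the points above map directly onto core Node.js calls. The following is a minimal sketch, not Scality's server code: `createServer`, `BODY`, and `elapsedMs` are hypothetical names. It disables Nagle's algorithm per connection with `socket.setNoDelay(true)`, writes the response from a pre-built Buffer, and takes timestamps with `Date.now()`. Recent Node.js releases may already disable Nagle's algorithm for HTTP servers by default, in which case the explicit call is harmless.

```javascript
const http = require('http');

// Build the response body once, as a Buffer, instead of per request.
const BODY = Buffer.from(JSON.stringify({ status: 'ok' }));

// Date.now() returns a number directly and avoids allocating a Date object.
function elapsedMs(startedAt) {
  return Date.now() - startedAt;
}

function createServer() {
  const server = http.createServer((req, res) => {
    const startedAt = Date.now();
    res.writeHead(200, {
      'Content-Type': 'application/json',
      'Content-Length': BODY.length,
      'X-Elapsed-Ms': String(elapsedMs(startedAt)), // hypothetical header
    });
    res.end(BODY);
  });
  // Disable Nagle's algorithm so small writes are sent immediately.
  server.on('connection', (socket) => socket.setNoDelay(true));
  return server;
}
```

Calling `createServer().listen(8000)` would start it; the port is arbitrary.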

Nifty Node Tools
Getting going
• Airbnb JavaScript Style Guide + ESLint
• Babel: moved from Babel 5 to Babel 6 with just imports, destructuring and default parameters
• Commander: cool CLI tools in minutes
• Async

Nifty Node Tools
Getting serious
• Level — LevelDB wrapper for node
• Memcached — client library for node
• xml — <parse>yes</parse>
• Profiler — Go fast or go home

Nifty Node Tools
Might as well test
• Mocha
• Istanbul
• lolex
• aws-node-sdk

Nifty Node Tools
Docs and Open Source Code
• Docs are good, but
• Code is even better
• Read the readable stream code and take a nap.
• Then read the transform stream code and create new universes.
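
For readers who have not written a transform stream before, here is a minimal sketch using the core `stream.Transform` class. The `ByteCounter` class is a hypothetical example, not code from the talk: it passes data through unchanged and counts the bytes it has seen.

```javascript
const { Transform } = require('stream');

// Hypothetical pass-through stream that counts bytes as they flow by.
class ByteCounter extends Transform {
  constructor(options) {
    super(options);
    this.bytes = 0;
  }

  // Called once per incoming chunk; with default options, string
  // chunks arrive here already converted to Buffers.
  _transform(chunk, encoding, callback) {
    this.bytes += chunk.length;
    callback(null, chunk); // forward the chunk unchanged
  }
}
```

It can be dropped into any pipeline, for example `source.pipe(new ByteCounter()).pipe(destination)`, where `source` and `destination` stand for any readable and writable streams.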

Download the code!
http://s3.scality.com/
https://github.com/scality/s3

Lauren:
github: laurenspiegel
twitter: @notfollowingyet
Giorgio:
github: @giorgioregni
twitter: @giorgioregni
