RELIABILITY ENGINEERING
FOR ENTERPRISE SERVERLESS
MASASHI TERUI @ JAWS DAYS 2018
SERVERWORKS CO.,LTD.
+ FREELANCER
• Serverless Oji-san
• Serverless Framework Plugin Developer
• Serverlessconf Tokyo 2016/2017 speaker
• Remote worker (in Sapporo)
• The best Cloud Engineer in Hokkaido!! (or so I aspire to be)
MASASHI TERUI
ARCHITECT / DEVELOPER
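AGENDA
1. SERVERLESS: What was it again?
2. SERVERLESS: What does reliability mean?
3. SERVERLESS: Common issues
4. SERVERLESS: Understanding what it really is
5. SERVERLESS: It is not special
6. RELIABILITY: Mindset
7. RELIABILITY: Design
8. RELIABILITY: Implementation
9. RELIABILITY: Monitoring
10. SUMMARY: Expanding the world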
WHAT IS SERVERLESS?
CNCF SERVERLESS WHITEPAPER V1.0
• Serverless computing refers to the concept of building and
running applications that do not require server management
• A platform may provide one or both of the following:
• Functions-as-a-Service (FaaS)
• Backend-as-a-Service (BaaS)
• Products or platforms deliver the following benefits to developers:
• Zero Server Ops
• No Compute Cost When Idle


https://github.com/cncf/wg-serverless/tree/master/whitepaper
SERVERLESS IS NOT GLUE
IN ENTERPRISE APPLICATIONS
“THE ORCHESTRATOR MANAGES THE TRADES USING A GRAPH OF STATES”
SERVERLESS CLOUD NATIVE LANDSCAPE
BUT I PREFER THIS ONE
https://www.slideshare.net/acloudguru/ant-stanley-being-serverless
SERVERLESS USE CASES (FROM CNCF WP)
• Asynchronous, concurrent, easy to parallelize into independent
units of work
• Infrequent or has sporadic demand, with large, unpredictable
variance in scaling requirements
• Stateless, ephemeral, without a major need for instantaneous
cold start time
• Highly dynamic in terms of changing business requirements that
drive a need for accelerated developer velocity
• Non-HTTP-centric and non-elastic scale workloads that weren’t
good fits for an IaaS, PaaS, or CaaS solution (Event Driven
workloads)
“There are many workloads
that are stateful and/or not easy to parallelize”
Is this what you are thinking?
“Asynchronous and Event Driven processing is
too difficult for humans”
WHAT IS SERVERLESS RELIABILITY?
Reliability
RELIABILITY(RASIS)
Broken down, it covers several aspects
• Reliability
• Availability
• Serviceability
• Integrity
• Security
In this talk it is defined as follows:
“Reliability = reliability in the broad sense (RASIS)”
[Diagram: broad-sense Reliability encompassing Reliability, Availability, Serviceability, Integrity, and Security]
IS IT DIFFICULT TO GUARANTEE
RELIABILITY WITH SERVERLESS?
• Strong dependence on the FaaS platform and BaaS products
• Risk of losing business continuity (Reliability, Availability)
• Distributed instances
• Risk of losing traceability (Serviceability)
• Hard to develop
• Everything becomes a function (Serviceability)
• NoSQL fits better than an RDB (Integrity)
COMMON ISSUES WITH SERVERLESS
Issues
COMMON ISSUES WITH FaaS
• How to test the functions?
• Granularity of the functions
• Messaging between the functions and backends
• Handling request and response (Error Handling)
• Log Aggregation, Traceability
• Monitoring
COMMON ISSUES WITH BaaS
• How to choose the services?
• Fault Tolerance
• Monitoring
UNDERSTANDING WHAT SERVERLESS REALLY IS
Mechanism (hint)
MECHANISM OF FAAS
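SERVERLESS PROCESSING MODEL
https://github.com/cncf/wg-serverless/tree/master/whitepaper#detail-view-serverless-processing-model
Grasp the overall picture of the processing
Event-processing requests from event sources are
handled in a distributed manner by functions running on many instances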
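THE INTERNAL FLOW OF PROCESSING
https://github.com/apache/incubator-openwhisk/blob/master/docs/about.md#the-internal-flow-of-processing
Grasp how an event (HTTP) flows through the system
The basic pattern is distributed processing with a stream or queue in between
Even when it looks synchronous from outside, inside it is asynchronous and distributed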
FUNCTIONS INVOKED
ASYNCHRONOUSLY
IN CONTAINERS
FaaS is… functions invoked asynchronously inside containers
WHAT IS BAAS?
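FROM OWNING
TO USING SERVICES
The cloud moved servers from something you own to something you use
Middleware and libraries are now moving the same way
Using BaaS is part of that natural progression
FULLY MANAGED AND ABSTRACTED
MIDDLEWARE AND LIBRARIES
BaaS is… fully managed, abstracted middleware and libraries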
SERVERLESS IS NOT SPECIAL
COMPOSED OF ABSTRACTED
FUNCTIONS AND MIDDLEWARE
Serverless is… just functions and middleware, only abstracted
WE CAN BUILD RELIABLE
SERVERLESS APPLICATIONS
SO… reliability can be built!
HOW TO THINK ABOUT RELIABILITY
Mindset
HOW TO THINK ABOUT RELIABILITY
• Build reliability yourself
• Serverless will help you, but it will not protect your business
• Think simple
• Apply general development and operations practices
• If a practice cannot be applied, look at the underlying serverless mechanism
• Keep simple
• Don't be afraid of increasing the number of functions
• Be afraid of a complicated architecture instead
• Shift your mindset: treat it all as software
• Everything is part of your application
RELIABILITY: DESIGN
Architecting
ALL EVENTS FLOW IN
THE SAME DIRECTION
Make events (data) flow in the same direction
If a result is needed, do not return it synchronously;
have the client come and fetch it
🙅 🙆
ALL EVENTS FLOW IN THE SAME DIRECTION
• Processing naturally becomes asynchronous and functional
• Asynchronous processing is retriable
• Functional processing is reproducible
• Clients fetch the results themselves
• However, polling is not a good choice...
• Pushing is a better choice
• Can we be happy with AppSync? (Pushing via WebSocket)
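The one-way flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory `queue` and `results` objects stand in for a managed queue and a result store, and `submit`, `process_one`, `fetch`, and `flaky_double` are hypothetical names, not part of any platform API.

```python
import uuid
from collections import deque

# Hypothetical in-memory stand-ins for a queue service and a result store.
queue = deque()
results = {}

def submit(payload):
    """Accept a request, enqueue it, and return a job ID immediately."""
    job_id = str(uuid.uuid4())
    queue.append({"job_id": job_id, "payload": payload})
    return job_id

def process_one(handler):
    """Process one queued event; on failure, put it back so it can be retried."""
    if not queue:
        return False
    event = queue.popleft()
    try:
        results[event["job_id"]] = handler(event["payload"])
    except Exception:
        queue.append(event)  # asynchronous processing is retriable
    return True

def fetch(job_id):
    """Client-side read: returns None until the result is ready."""
    return results.get(job_id)

calls = {"n": 0}

def flaky_double(x):
    """Pure computation that fails on its first call to simulate a transient error."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("transient failure")
    return x * 2

job_id = submit(21)
before = fetch(job_id)      # nothing processed yet
process_one(flaky_double)   # fails, event is re-queued
process_one(flaky_double)   # succeeds on retry
after = fetch(job_id)
```

The request never waits for the result: the client holds only the job ID and reads the outcome later. A push channel could replace the `fetch` call without changing the direction of the flow.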
UNIFY THE ENDPOINTS
BETWEEN THE SERVICES
Consolidate the endpoints between services
Wrap and consolidate them to gain control
UNIFY THE ENDPOINTS BETWEEN
THE SERVICES
• Microservices
• Separate services by domain (each BaaS is one of your services)
• A service does not have a single endpoint; it has one endpoint per operation
• Wrap the endpoints to abstract them
• Like “MySQL Server” and “libmysql”
• Do you call “libmysql” directly?
• You can build failover/fail-safe mechanisms
• Like a reverse proxy
• Do you connect to multiple “read replicas” from each app server?
• Traffic control, caching
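As a rough sketch of wrapping endpoints behind one entry point, the class below tries each backend in order and caches successful reads. `ServiceClient`, `primary`, and `replica` are hypothetical names invented for this example; real backends would be SDK or HTTP calls.

```python
class ServiceClient:
    """Single entry point that wraps several backend endpoints (hypothetical)."""

    def __init__(self, endpoints):
        self._endpoints = list(endpoints)  # callables, tried in order
        self._cache = {}

    def get(self, key):
        # Caching: serve repeated reads without touching any backend.
        if key in self._cache:
            return self._cache[key]
        last_error = None
        for endpoint in self._endpoints:
            try:
                value = endpoint(key)
            except ConnectionError as exc:
                last_error = exc  # failover: move on to the next endpoint
                continue
            self._cache[key] = value
            return value
        raise RuntimeError("all endpoints failed") from last_error

def primary(key):
    """Simulated backend that is currently unavailable."""
    raise ConnectionError("primary is down")

def replica(key):
    """Simulated healthy backend."""
    return f"value-for-{key}"

client = ServiceClient([primary, replica])
value = client.get("user-1")
```

Callers depend only on `ServiceClient`, so failover order, caching, or throttling can change in one place.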
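ALL EVENTS IN A SERIES
SHARE THE SAME ID
Assign the same ID to every step in a series of processing
Passing it along makes the whole series traceable by ID
The ID can also be used for various kinds of control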
ALL EVENTS IN A SERIES SHARE
THE SAME ID
• Log aggregation
• A series of events can be traced by the ID
• Monitor the progress
• Log all event messages
• Execution control
• At least once -> Exactly once
• e.g. DynamoDB Conditional Writes
• Make it easy to implement with something like a decorator
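A minimal sketch of the decorator idea, assuming each event carries a `trace_id` field. `InMemoryTable` and `put_if_absent` are hypothetical stand-ins for a table with conditional writes; on DynamoDB the equivalent check would be a conditional put with a condition such as `attribute_not_exists(...)`.

```python
import functools

class ConditionalCheckFailed(Exception):
    """Raised when the conditional write finds an existing record."""

class InMemoryTable:
    """Hypothetical stand-in for a table that supports conditional writes."""

    def __init__(self):
        self._items = {}

    def put_if_absent(self, key, item):
        if key in self._items:
            raise ConditionalCheckFailed(key)
        self._items[key] = item

processed = InMemoryTable()

def exactly_once(handler):
    """Log every event by its ID and skip deliveries that were already handled."""
    @functools.wraps(handler)
    def wrapper(event):
        trace_id = event["trace_id"]
        try:
            processed.put_if_absent((handler.__name__, trace_id), {"status": "started"})
        except ConditionalCheckFailed:
            print(f"[{trace_id}] duplicate delivery skipped")
            return None
        print(f"[{trace_id}] {handler.__name__} received {event}")
        return handler(event)
    return wrapper

charges = []

@exactly_once
def charge(event):
    charges.append(event["amount"])
    return "charged"

event = {"trace_id": "order-123", "amount": 500}
first = charge(event)
second = charge(event)  # same ID delivered again, so the handler does not run
```

This sketch marks the event before running the handler, so a failed handler would not be retried. A fuller version might record a status per ID and clear or update the marker on failure.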
DATA MODELING
• Make friends with DynamoDB
• Data is distributed by Partition Key and indexed (B-tree) by Sort Key and LSI
• A GSI is a projection of sorted (indexed) data
• Consistency can be guaranteed without ACID transactions
• Denormalization
• Strongly consistent reads, conditional writes
• Some situations remain difficult
• Write asynchronously to an RDB
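One way to read "consistency without ACID transactions" is optimistic locking: read an item, then write it back only if its version is unchanged. The sketch below simulates that with an in-memory `Table`. The class, the `version` attribute, and `add_stock` are assumptions made for illustration, not DynamoDB's API.

```python
class ConditionFailed(Exception):
    """Raised when the stored version does not match the expected one."""

class Table:
    """Hypothetical in-memory table with a version-checked conditional write."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        return dict(self._items[key])

    def put(self, key, item, expected_version):
        current = self._items.get(key)
        current_version = current["version"] if current else None
        if current_version != expected_version:
            raise ConditionFailed(key)
        self._items[key] = dict(item)

def add_stock(table, key, delta, max_attempts=3):
    """Read-modify-write that retries when another writer got in first."""
    for _ in range(max_attempts):
        item = table.get(key)
        updated = {"stock": item["stock"] + delta, "version": item["version"] + 1}
        try:
            table.put(key, updated, expected_version=item["version"])
            return updated
        except ConditionFailed:
            continue
    raise RuntimeError("too many conflicting writers")

table = Table()
table.put("item-1", {"stock": 10, "version": 1}, expected_version=None)
result = add_stock(table, "item-1", -3)

# A writer still holding the old version is rejected instead of overwriting.
try:
    table.put("item-1", {"stock": 99, "version": 2}, expected_version=1)
    stale_rejected = False
except ConditionFailed:
    stale_rejected = True
```

The stale writer has to re-read and retry, which keeps a single item consistent without a multi-statement transaction.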
RELIABILITY: IMPLEMENTATION
Implementation
GRANULARITY OF THE
FUNCTIONS
• Testable!!
• Unit testing is king in the
serverless world
• Make dependencies on other
services replaceable
• They can then be swapped for mocks
• Failover/fail-safe becomes easy to add
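A small sketch of a replaceable dependency, using `unittest.mock.Mock` from the Python standard library. `make_handler` and the `storage.save` interface are hypothetical names chosen for this example.

```python
from unittest.mock import Mock

def make_handler(storage):
    """Build a handler whose storage dependency is injected, not hard-coded."""
    def handler(event):
        key = f"orders/{event['order_id']}"
        storage.save(key, event["body"])
        return {"status": "stored", "key": key}
    return handler

# In a unit test the real storage service is replaced by a mock.
fake_storage = Mock()
handler = make_handler(fake_storage)
response = handler({"order_id": "42", "body": "hello"})

# Verify the interaction with the dependency without calling any real service.
fake_storage.save.assert_called_once_with("orders/42", "hello")
```

Because the handler only sees the injected object, the same seam that accepts a mock can also accept a wrapper that adds failover.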
HOW TO TEST THE FUNCTIONS
• Unit testing is king in the serverless world (second time)
• For integration testing, deploy a fresh environment if mocks are not enough
• This is easy with some frameworks (e.g. Serverless Framework, SAM)
• Services outside AWS also need to be easy to deploy (via API)
• Continuous E2E testing with a traceable ID
• It doubles as monitoring
RELIABILITY: MONITORING
Monitoring
RELIABILITY MONITORING
• The best monitoring is notifications from the application itself
• Be sure to catch errors and send notifications about them
• Collect the metrics of the services
• CloudWatch
• Being able to collect metrics is a criterion for choosing services outside AWS
• Continuous E2E testing with a traceable ID
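Catching errors and notifying from inside the application could look like the sketch below. `notify` is a hypothetical notifier that only appends to a list, and the `trace_id` field is assumed to be present on each event.

```python
import functools

notifications = []

def notify(message):
    """Hypothetical notifier; a real one might post to a chat or paging service."""
    notifications.append(message)

def notify_on_error(handler):
    """Catch any error, send a notification with the trace ID, then re-raise."""
    @functools.wraps(handler)
    def wrapper(event):
        try:
            return handler(event)
        except Exception as exc:
            notify(f"{handler.__name__} failed for {event.get('trace_id')}: {exc!r}")
            raise  # keep the failure visible to the platform's own retry logic
    return wrapper

@notify_on_error
def resize_image(event):
    raise ValueError("unsupported format")

try:
    resize_image({"trace_id": "req-1"})
except ValueError:
    pass  # the error still propagates; the notification has already been sent
```

Re-raising after notifying lets the platform's retries and metrics continue to work while a person is alerted with the ID needed to trace the failure.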
SUMMARY
“SERVERLESS IS NOT SPECIAL”
“BUILD RELIABILITY YOURSELF”
“THINK SIMPLE, KEEP SIMPLE”
“EVERYTHING IS PART OF YOUR APPLICATION”
“LET'S EXPAND THE SERVERLESS WORLD”
THANKS!!
bit.ly/jd2018-sls
#jawsdays #jd2018_a
PLEASE TAKE A SURVEY
