How To Use Scala At Work - Airframe In Action at Arm Treasure Data

Taro L. Saito, Ph.D.
Arm Treasure Data
June 29, 2019
Scala Matsuri 2019 - Tokyo
Let's use Scala at work: Airframe use cases at Arm Treasure Data

Copyright 1995-2019 Arm Limited (or its affiliates). All rights reserved.
About Me: Taro L. Saito (Leo)
● Principal Software Engineer at Arm Treasure Data
● Building distributed query engine service
● Living in the US for 4 years
● DBMS & Data Science Background
● Ph.D. in Computer Science
● Database Systems and Genome Sciences Research
● Assistant Professor at the University of Tokyo
● OSS Projects Around Scala
● sbt-sonatype: used for releasing 3000+ Scala projects
● snappy-java: a compression library used in Spark, Parquet, etc.
Self-introduction

New Release from O’Reilly Japan
● Helped with the Japanese translation of Designing Data-Intensive Applications
● Techniques and concepts around distributed data processing systems
● Available at Amazon.co.jp and the O’Reilly Japan web site
● Will be published on July 18, 2019
The translation of the definitive introduction to distributed data systems goes on sale next month

Arm Treasure Data
● 400+ customers
● Founded in 2011
● Raised $54M
● Security
● Acquired by Arm / SoftBank in 2018
Overview of Arm Treasure Data

The Architecture of Arm Treasure Data
● Data Collection: logs, device data, and batch data (2 million records / sec.)
● Cloud Storage: PlazmaDB with table schema (130 trillion records)
● Distributed Data Processing: jobs (1 billion rows processed / sec.)
● Job Management: SQL editor, scheduler, workflows, machine learning
● Built on Treasure Data OSS and third-party OSS
The system architecture of Treasure Data. Where is Scala?

Airframe
[Diagram: Airframe modules and the tasks they cover]
● Modules: airframe-launcher, airframe-log, airframe-config, airframe-codec, airframe-surface, airframe-json, airframe-tablet, airframe-jdbc, airframe-jmx, airframe-fluentd, airframe-http, airframe-http-finagle, airframe DI, sbt-pack
● Tasks: module mix-in, packaging, launching HTTP services, HTTP requests and responses, debug logs, metrics and log data, monitoring runtime states, generating mapping codecs, schema-on-read mapping
● Data: Scala objects, table data (CSV, TSV), JSON, JDBC ResultSets, configuration (e.g., production: port: 10010, user: xxxx)
The Airframe modules (written in Scala) used behind the service

Our OSS Strategy Around Scala
● Gather the best practices of Scala into Airframe OSS
● Gain real-world experience by operating 24/7 services
[Diagram labels: Knowledge, Experiences, Design Decisions | Programming, OSS, Outcome | Products, 24/7 Services, Business Values]
Our OSS strategy around Scala, with Airframe at its core

Airframe
3 Years Ago...
● Various internal and third-party Scala/Java libraries
● Managed in different repositories, with different release cycles
● High learning cost
■ The knowledge was confined to engineers’ brains
[Diagram labels: Knowledge, Experiences, Design Decisions | Programming, Various Libraries, Outcome | Products, 24/7 Services, Business Values]
Three years ago, Airframe did not exist and a variety of libraries were mixed together

Libraries in use: logger, launcher, object mapper, JDBC reader, json4s, jackson, …
5 Years Ago...
● No Scala engineer in the company
● Scala in 2014: Scala 2.9.x
● Was not good enough to use:
■ e.g., no string interpolation like s"... ${x}..."
[Diagram labels: Knowledge, Experiences, Design Decisions | Programming, Ruby, Java, Outcome | Products, 24/7 Services, Business Values]
Five years ago, we had no Scala engineers and no Scala code

Today’s Agenda
● How to introduce Scala to your company
● Learn the best practices of using Scala at work
● From 20 Airframe modules
What this talk covers

Airframe
How Can We Introduce Scala?
● Saying “I want to use Scala”
● It will not work, especially if you or your team are not familiar with Scala
● Your managers need more information on whether it’s good enough or not
● Even if you are a tech lead:
● You need some confidence in using Scala in production
● How can we establish such confidence in using Scala?
How do we introduce Scala? How do we gain the confidence that it is OK to use Scala?

Start With A Small Investment in Scala
● Guidelines
● Think about how you can save your time with Scala
● If you can save 1 minute a day, you can spend 6 hours on this improvement
■ Save 1 minute / day = 365 minutes / year = 6 hour investment
■ Save 10 minutes / week = 520 minutes / year = 8.6 hour investment
■ Save 1 hour / week = 52 hours / year = 2.2 day investment
● Time is your most valuable asset
● Save your time by using Scala
Start making “small investments” that save time by using Scala
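The break-even arithmetic on this slide can be checked with a few lines of plain Scala. This is only a worked example; the object and method names (TimeInvestment, minutesPerYear, toHours, toDays) are made up for illustration and are not part of Airframe.

```scala
object TimeInvestment {
  // Minutes saved over one year when `minutes` are saved `timesPerYear` times.
  def minutesPerYear(minutes: Double, timesPerYear: Int): Double =
    minutes * timesPerYear

  val oneMinutePerDay: Double   = minutesPerYear(1, 365) // 365 minutes
  val tenMinutesPerWeek: Double = minutesPerYear(10, 52) // 520 minutes
  val oneHourPerWeek: Double    = minutesPerYear(60, 52) // 3120 minutes = 52 hours

  def toHours(minutes: Double): Double = minutes / 60
  def toDays(minutes: Double): Double  = minutes / 60 / 24
}
```

The yearly saving is the upper bound on how much time an improvement may cost and still pay for itself within a year: roughly 6 hours, 8.7 hours, and 2.2 days for the three cases.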

The First Scala Code in TD
● prestop (presto + top)
● Non-production service code
● A handy query monitoring tool for Presto, written in Scala
● Displays complex JSON data with fancy ANSI colors
The first Scala program at Treasure Data

airframe-log
● Scala 2.10: My small investment to test Scala macros and string interpolation
● A Modern Logging Library for Scala (at Medium)
● ANSI color and source code location display
● Just add the LogSupport trait to your class
Make program development more efficient with log messages

airframe-launcher
● Needed to handle complex command line options and nested commands
● e.g., $ prestop -e production monitor (other options …)
● Enabled annotation-based command line definitions
Make it easy to build complex command-line programs

airframe-config: Application Configuration Flow
● YAML config (embedded into Docker)
● Override credentials, then bind to config objects
YAML:
development:
  addr: api-dev.com
production:
  addr: api.com
Config Object:
case class ServerConfig(
  addr: String,
  port: Int = 8080,
  password: String
)
Flow:
● The command-line option -e production selects the production section (addr: api.com)
● Credentials and local configurations are merged in
● Object mapping fills default parameters (e.g., port = 8080) and produces an immutable object
Turn the application configuration flow into a library
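The flow above (select an environment, merge overrides, bind to an immutable object with defaults) can be sketched in plain Scala. This is not airframe-config's API: the Map-based stand-in for parsed YAML and the names ConfigFlowSketch and load are assumptions made for illustration only.

```scala
object ConfigFlowSketch {
  // Same shape as the config object on the slide; port has a default value.
  final case class ServerConfig(addr: String, port: Int = 8080, password: String)

  // Stand-in for the parsed YAML file: environment name -> settings.
  val yaml: Map[String, Map[String, String]] = Map(
    "development" -> Map("addr" -> "api-dev.com"),
    "production"  -> Map("addr" -> "api.com")
  )

  // Select the environment, merge credentials and local overrides on top,
  // then bind the merged settings to an immutable ServerConfig.
  // Throws NoSuchElementException if addr or password is missing.
  def load(env: String, overrides: Map[String, String]): ServerConfig = {
    val merged       = yaml.getOrElse(env, Map.empty[String, String]) ++ overrides
    val withDefaults = ServerConfig(addr = merged("addr"), password = merged("password"))
    // Only replace the default port when the merged settings provide one.
    merged.get("port").map(p => withDefaults.copy(port = p.toInt)).getOrElse(withDefaults)
  }
}
```

For example, ConfigFlowSketch.load("production", Map("password" -> "secret")) keeps the default port because neither the YAML section nor the overrides set one.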

airframe-launcher
> _
sbt-pack plugin
● An sbt plugin to create standalone Scala packages
● A single-folder package with bin and lib folders containing all dependent JARs
● Generates command-line launcher scripts
● My small investment in 2012 to save packaging time
Package programs with sbt-pack and easily create Docker images

[Diagram labels: airframe-launcher, airframe-config, YAML config file, Standalone Scala Package, sbt-pack, Dockerfile]
Medium-Size Investment: Find A Common Pattern
● Extract a common problem pattern and create a solution
● Data -> Object Mapping
● How many data readers and object mappers do we need?
● How can we save the time spent handling such varied data types?
[Diagram: YAML -> YAML Parser + Object Mapper -> Config Object; JDBC ResultSet -> Object-Relation Mapper -> Table Object; JSON -> JSON Parser + Object Mapper -> Object]
Mapping input data to Scala objects is a common need; it calls for a medium-term investment

airframe-msgpack: MessagePack as Universal Data Format
● MessagePack (msgpack.org)
● Compact JSON-like binary format
● Describes data types and data values at the same time (self-describing)
[Diagram: JDBC ResultSet, YAML, and JSON are packed into MessagePack; MessagePack is unpacked into an Object, and an Object is packed back into MessagePack]
Using MessagePack as the intermediate format means only one object mapper implementation is needed

MessagePack
PlazmaDB: MessagePack DBMS
● Fluentd -> MessagePack -> Arm Treasure Data
● Automatically generates table schema from MessagePack data
● Applies schema-on-read to provide table data for Presto/Hive/Spark, etc.
[Diagram: Data Collection delivers schema-free data; the table schema is updated and used to generate a reader set (Int Column Reader, String Column Reader) for the Table Reader in Distributed Data Processing]
Arm Treasure Data is a MessagePack-based schema-on-read system

Schema-On-Read Data Processing with MessagePack
● Users can store data of arbitrary types (no table design is required)
● Data can be read as the target type required by the application (e.g., SQL query)
[Diagram: Logs are packed as MessagePack values (Int, Float, Boolean, String, Array, Map, Binary) and unpacked by IntCodec into SQL BigInt: Int as is, Float via toInt, Boolean as 0 or 1, String via parseInt (e.g., “100” (string) -> 100 (int)), anything else as an error or null]
At read time, data is converted to the type the application requires (schema-on-read)
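The conversion rules in this diagram can be sketched as a single pattern match. This is a plain-Scala illustration, not the actual IntCodec from airframe-codec: the Value hierarchy and the readAsBigInt name are hypothetical, and None stands in for the "error or null" outcome.

```scala
import scala.util.Try

object SchemaOnReadSketch {
  // Hypothetical stand-ins for the MessagePack value types listed on the slide.
  sealed trait Value
  final case class IntValue(v: Long)            extends Value
  final case class FloatValue(v: Double)        extends Value
  final case class BooleanValue(v: Boolean)     extends Value
  final case class StringValue(v: String)       extends Value
  final case class ArrayValue(v: Seq[Value])    extends Value
  final case class MapValue(v: Map[String, Value]) extends Value
  final case class BinaryValue(v: Array[Byte])  extends Value

  // Read any stored value as the integer type the application asks for.
  def readAsBigInt(value: Value): Option[Long] = value match {
    case IntValue(v)     => Some(v)                         // already an integer
    case FloatValue(v)   => Some(v.toLong)                  // truncate the fraction
    case BooleanValue(b) => Some(if (b) 1L else 0L)         // 0 or 1
    case StringValue(s)  => Try(s.trim.toLong).toOption     // parse, None if not a number
    case _               => None                            // Array, Map, Binary: no integer form
  }
}
```

With this shape, the stored type never has to match the query's type: the string "100" and the integer 100 both read back as 100.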

CSV
command-line
arguments
airframe-codec: Schema-On-Read Pack/Unpack Interface
● Apply schema-on-read to Scala objects
[Diagram: Input -> Pack -> MessagePack -> Unpack -> Output, in both directions]
Apply the schema-on-read data conversion interface via MessagePack to Scala

Pre-defined Codecs in airframe-codec
● Primitive Codecs
● ByteCodec, CharCodec, ShortCodec, IntCodec, LongCodec
● FloatCodec, DoubleCodec
● StringCodec
● BooleanCodec
● TimeStampCodec
● Collection Codec
● ArrayCodec, SeqCodec, ListCodec, IndexSeqCodec, MapCodec, etc.
● OptionCodec
● JsonCodec (airframe-json)
● Java-specific Codec
● FileCodec, ZonedDateTimeCodec, JDBCResultSetCodec, etc.
● Adding Custom Codecs
● Implement MessageCodec[X] interface
Supports mapping to almost all data types needed in Scala

MessageCodec.of[A]: Combination of Codecs
[Diagram: MessageCodec.of[A] combines IntCodec, StringCodec, and DoubleCodec to pack to and unpack from MessagePack]
Codecs are composed according to the type of the object
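The idea of building an object codec out of field codecs can be sketched by hand in plain Scala. Everything here is hypothetical: Packed is a tiny stand-in for MessagePack values, and the composite codec is written manually, whereas the slide's MessageCodec.of[A] derives it from the type.

```scala
import scala.util.Try

object CodecCompositionSketch {
  // Tiny stand-in for MessagePack values: only the shapes this sketch needs.
  sealed trait Packed
  final case class PInt(v: Long)                extends Packed
  final case class PDouble(v: Double)           extends Packed
  final case class PStr(v: String)              extends Packed
  final case class PArr(items: Vector[Packed])  extends Packed

  trait Codec[A] {
    def pack(a: A): Packed
    def unpack(p: Packed): Option[A]
  }

  val intCodec: Codec[Int] = new Codec[Int] {
    def pack(a: Int): Packed = PInt(a.toLong)
    def unpack(p: Packed): Option[Int] = p match {
      case PInt(v) => Some(v.toInt)
      case PStr(s) => Try(s.toInt).toOption
      case _       => None
    }
  }

  val stringCodec: Codec[String] = new Codec[String] {
    def pack(a: String): Packed = PStr(a)
    def unpack(p: Packed): Option[String] = p match {
      case PStr(s)    => Some(s)
      case PInt(v)    => Some(v.toString)
      case PDouble(v) => Some(v.toString)
      case _          => None
    }
  }

  val doubleCodec: Codec[Double] = new Codec[Double] {
    def pack(a: Double): Packed = PDouble(a)
    def unpack(p: Packed): Option[Double] = p match {
      case PDouble(v) => Some(v)
      case PInt(v)    => Some(v.toDouble)
      case PStr(s)    => Try(s.toDouble).toOption
      case _          => None
    }
  }

  final case class Item(id: Int, name: String, price: Double)

  // Composite codec: one field codec per constructor parameter, in order.
  val itemCodec: Codec[Item] = new Codec[Item] {
    def pack(a: Item): Packed =
      PArr(Vector(intCodec.pack(a.id), stringCodec.pack(a.name), doubleCodec.pack(a.price)))
    def unpack(p: Packed): Option[Item] = p match {
      case PArr(Vector(id, name, price)) =>
        for {
          i <- intCodec.unpack(id)
          n <- stringCodec.unpack(name)
          d <- doubleCodec.unpack(price)
        } yield Item(i, n, d)
      case _ => None
    }
  }
}
```

Because each field codec is lenient about its input type, the composite codec inherits schema-on-read behavior: an id stored as the string "1" still unpacks into an Int field.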

airframe-surface
● Reading Type Signatures From ScalaSig
● The Scala compiler embeds Scala type signatures (ScalaSig) in class files
● Surface.of[A]
■ Returns A’s parameter names and types
[Diagram: for class A(data: List[B]), javac produces class A with data: List[java.lang.Object] because type erasure removes generic type information; scalac produces the same erased class plus ScalaSig: data: List[B]; Surface.of[A] recovers data: List[B] (cf. scala.reflect.runtime.universe.TypeTag)]
Get object type information from ScalaSig

[WIP] Scala.js RPC
● Scala.js
● Compiling Scala code into JavaScript for Web Browsers
● airframe-codec: Passing model class data between Scala and Scala.js
[Diagram: UserInfo on the Scala server side is packed to MessagePack, sent over XML RPC, and unpacked to UserInfo on the Scala.js client side, and vice versa]
airframe-codec can also be used to exchange data with Scala.js (browser side)

[WIP] airframe-sql
● Universal stream SQL engine
● Processing various types of data through MessagePack
[Diagram: MessagePack -> Stream SQL (query processing: filter/aggregation/join, etc.) -> MessagePack]
Process any data format with SQL through MessagePack

JDBC
ResultSet
Pack
YAML
JSON
Scala In Production
A Technical Debt In TD (2015-2016)
● Prestogres: PostgreSQL gateway to Presto
● Enabled using PostgreSQL JDBC/ODBC drivers to access Presto
● The so-called magic of Sada (founder)
● Was good for the first use cases
● Many Problems:
● Hacks around pgpool-II were hard to debug
● Hard to support customers upon errors
● SQL incompatible with Presto
● Nobody could fix these issues
■ including the creator!
A hack called Prestogres had become technical debt

Replacing Prestogres with Prestobase
● Prototyped in Scala within a week after a quick chat with Sada
● Utilizing Airframe assets
● Deployed as a production service in 3 months
Prototyped Prestobase in Scala; released as a service three months later

airframe-di
● Created a dependency injection library for Scala
● For Prestobase development
● Scala-friendly Syntax
● Useful for combining hundreds of modules
● Based on airframe-surface, airframe-log
● See also:
● Airframe Meetup #1 Report (2018)
Airframe DI for Scala was born during the development of Prestobase

Airframe OSS
● Lightweight Building Blocks for Scala
● Collection of our investments in Scala
● Repackaged into wvlet.airframe in 2016
● airframe-log
● airframe-launcher
● airframe-config
● airframe-surface
● airframe-di
● airframe-codec
● ...
● As of 2019, Airframe has 20 modules
● 35+ releases in 2018
● Already had 17+ releases in 2019
● Contributing to the Scala Community Build
● To test the latest Scala versions
In 2016, various tools were consolidated into Airframe: 20 modules and a frequent release cycle

Airframe
Monorepo
● Cross build
● For 3 + 1 Scala versions
■ 2.13, 2.12, 2.11, and Scala.js
● 20 modules
■ 4 x 20 = 80 artifacts!
● Challenge
● Publishing took 3 hours with sbt-release
● Bottleneck
● Sequential run of compile -> test -> publish for all artifacts
Airframe uses a single-repository layout to centralize maintenance

Release Automation on Travis CI
● Single-Step Release
● Triggered by git tag
● Running Tasks In Parallel
● Run tests for each Scala version
● Update doc & release notes
■ Generate release notes from git logs
● Publish
■ sbt-pgp & sbt-sonatype
○ GPG signature
○ Copy to Maven Central
● Finishes in around 10 to 20 minutes
● Blog: 3 Tips For Maintaining Scala Projects
Fully automated releases on Travis CI make frequent releases possible

sbt-sonatype plugin
● An sbt plugin for releasing projects to Maven Central
● open staging repository -> verify -> close -> promote -> drop
● A small investment
● Made during the 2015 New Year holiday => paid off by saving Airframe release time
● 3000+ Scala projects are using sbt-sonatype
sbt-sonatype was created over the New Year holiday. It is used by many Scala libraries

airframe-http
● Created a simple HTTP framework
● Based on Airframe modules:
■ airframe-surface
■ airframe-codec
■ airframe-msgpack
■ etc.
● Blog
● Building Low-Friction Web Service Over Finagle
● Save the time for choosing a web framework:
● Many frameworks exist:
● e.g., Finatra, Finch, akka-http, Spring, RESTEasy, OpenAPI, Swagger, etc.
Airframe assets also make it easy to build a web framework

airframe-http-client
● Error handling of HTTP requests is difficult
● 4xx, 5xx status codes
● Should we retry the request?
■ IOException, EOFException
■ TimeoutException
■ InterruptedException
■ SSLException
■ InvocationTargetException
● HTTP client
● request retries
● response mapping
■ JSON, MessagePack format
● airframe-codec
Turn error-prone HTTP request error handling into a library

airframe-control
● Everything can fail …
● Network disconnection
● Server crash
● ...
● Retry
● Exponential backoff
■ 2x, 4x, ...
● Jittering
■ 1 sec., 2 * rand, 4 * rand, …
● Customize error type classifiers
● retryable failures
● non-retryable failures
Turn retry handling into a reusable pattern
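The retry pattern listed above (exponential backoff, jitter, and a classifier that separates retryable from non-retryable failures) can be sketched in plain Scala as follows. This is not airframe-control's API; RetrySketch, backoffMillis, jitteredMillis, and retry are hypothetical names, and the sleep function is a parameter so that tests can skip the real waiting.

```scala
import scala.util.{Failure, Random, Success, Try}

object RetrySketch {
  // Exponential backoff: base, 2 x base, 4 x base, ...
  def backoffMillis(baseMillis: Long, attempt: Int): Long =
    baseMillis * (1L << attempt)

  // Jitter: scale the backoff by a random factor in [0, 1).
  def jitteredMillis(baseMillis: Long, attempt: Int, rand: Random): Long =
    (backoffMillis(baseMillis, attempt) * rand.nextDouble()).toLong

  // Run `body`, retrying up to `maxRetry` times while `isRetryable` accepts the failure.
  def retry[A](
      maxRetry: Int,
      baseMillis: Long,
      isRetryable: Throwable => Boolean,
      sleep: Long => Unit = (ms: Long) => Thread.sleep(ms)
  )(body: => A): A = {
    @annotation.tailrec
    def loop(attempt: Int): A = Try(body) match {
      case Success(v) => v
      case Failure(e) if attempt < maxRetry && isRetryable(e) =>
        sleep(backoffMillis(baseMillis, attempt))
        loop(attempt + 1)
      case Failure(e) => throw e // non-retryable, or retries exhausted
    }
    loop(0)
  }
}
```

A caller would pass a classifier such as e => e.isInstanceOf[java.io.IOException], so that network errors are retried while programming errors fail immediately. Swapping backoffMillis for jitteredMillis inside loop spreads out retries from many clients.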

airframe-http-recorder
● Testing against actual web services is time-consuming
● Record & Replay HTTP responses
● Reproducible results
● Runnable on small machines (e.g., Travis CI)
Record HTTP requests to make web service testing more efficient

[Diagram: Recording mode: an HTTP request passes through the HTTP Recorder to the real web service, and the response is recorded. Replay mode: the HTTP Recorder answers the request from the recorded responses]
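The two modes can be sketched with an in-memory map in plain Scala. This is not airframe-http-recorder's API: Request, Response, and Recorder are hypothetical types, and the real service is modeled as a simple function.

```scala
import scala.collection.mutable

object HttpRecorderSketch {
  final case class Request(method: String, path: String)
  final case class Response(status: Int, body: String)

  class Recorder(realService: Request => Response) {
    private val recorded = mutable.Map.empty[Request, Response]

    // Recording mode: forward to the real service and remember the response.
    def record(request: Request): Response = {
      val response = realService(request)
      recorded.update(request, response)
      response
    }

    // Replay mode: answer from recorded responses only, never calling the real service.
    def replay(request: Request): Option[Response] = recorded.get(request)
  }
}
```

Once responses are recorded, tests can run in replay mode with reproducible results and without network access, which is what makes them suitable for small CI machines.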
Data Analysis with Scala
Data-Driven System Optimization
● TD is one of the biggest users of TD
● Query logs
● Collecting all Presto query logs since 2015
● Query statements, performance statistics, logs, etc.
● Logs are our valuable assets
● To understand user activities and enable data-driven optimizations
[Diagram: User queries -> Collect query logs -> Analyze query logs (machine learning, query optimization) -> Optimize system]
Collecting and analyzing logs is essential for optimizing the system

airframe-fluentd
● Collect Scala Application Logs To Fluentd
● Scala Objects -> MessagePack -> Fluentd
Fluentd accepts MessagePack, so the output of airframe-codec can be passed to it directly

[Diagram: Scala objects -> airframe-codec -> airframe-fluentd -> Collect query logs -> Analyze query logs (machine learning, query optimization) -> Optimize system]
airframe-jmx
● Add @JMX annotation to your application metrics
● It’s also useful to check the application version, configurations, etc.
● JMX clients can check these metrics
● e.g., jconsole
Check application state from outside the JVM with JMX and collect metrics

airframe-metrics
● Human-Readable Data Format (ElapsedTime, DataSize, etc.)
● Handy Time Window String Support
Make time spans, intervals, and data sizes easy for humans to handle, so log analysis is more efficient
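To illustrate what a human-readable format for sizes and durations looks like, here is a small plain-Scala formatter. It is not the ElapsedTime or DataSize API of airframe-metrics: the names, the 1024-based units, and the two-decimal output are all assumptions chosen for this sketch.

```scala
import java.util.Locale

object HumanReadableSketch {
  private val units = Vector("B", "kB", "MB", "GB", "TB", "PB")

  // Format a byte count with the largest unit that keeps the value below 1024.
  def dataSize(bytes: Long): String = {
    var value = bytes.toDouble
    var unit  = 0
    while (value >= 1024 && unit < units.size - 1) {
      value /= 1024
      unit += 1
    }
    if (unit == 0) s"${bytes}B"
    else "%.2f%s".formatLocal(Locale.ROOT, value, units(unit))
  }

  // Format a duration in milliseconds as ms, s, m, or h.
  def elapsed(millis: Long): String =
    if (millis < 1000L) s"${millis}ms"
    else if (millis < 60000L) "%.2fs".formatLocal(Locale.ROOT, millis / 1000.0)
    else if (millis < 3600000L) "%.2fm".formatLocal(Locale.ROOT, millis / 60000.0)
    else "%.2fh".formatLocal(Locale.ROOT, millis / 3600000.0)
}
```

Values such as 1.50kB or 1.50m are much easier to scan in query logs than raw byte and millisecond counts.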

Taking Snapshots of Data Analysis Tasks
● Save Long-Running Task Results As MessagePack (binary)
● Save the cost of re-computation
[Diagram: First run: run the task, compute (e.g., 10 min) Result: Seq[A], pack it to MessagePack, and save the snapshot to storage. Second run: load the snapshot and unpack it instead of recomputing]
Use Airframe assets to cache data analysis results and work more efficiently
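The first-run and second-run behavior can be sketched with a file-backed cache in plain Scala. As a simplification, this sketch stores a Seq[String] as newline-separated UTF-8 text rather than MessagePack, so values containing newlines are not supported; SnapshotSketch and cached are hypothetical names.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

object SnapshotSketch {
  // Return the snapshot at `path` if it exists; otherwise compute, save, and return.
  def cached(path: Path)(compute: => Seq[String]): Seq[String] =
    if (Files.exists(path)) {
      // Second run: load the snapshot instead of recomputing.
      val text = new String(Files.readAllBytes(path), StandardCharsets.UTF_8)
      if (text.isEmpty) List.empty[String] else text.split("\n").toList
    } else {
      // First run: run the (possibly long) task and save its result.
      val result = compute
      Files.write(path, result.mkString("\n").getBytes(StandardCharsets.UTF_8))
      result
    }
}
```

The by-name compute parameter is what saves the cost: on a second run the expensive block is never evaluated.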

Airframe
[Diagram: the same Airframe module overview as at the start of the talk]
Code assets have been built up around Airframe

Resolving Technical Debts with Airframe Upgrade
● Migrate common programming patterns into Airframe
● Upgrade Airframe Version
● YY.MM.patch versioning: 19.5.x, 19.6.x, …
■ Easy to see how far behind the latest version the project is
● Reduce code and logic duplications across components
[Diagram labels: Knowledge, Experiences, Design Decisions | Programming, OSS, Outcome | Products, 24/7 Services, Business Values]
Resolve technical debt whenever Airframe is upgraded

Airframe
Scala At Arm Treasure Data
● Scala is now an official language at Arm Treasure Data
● 0 -> 10+ engineers who can write Scala
● Use cases are growing:
● Query optimization, API, Spark, data analysis, storage systems, service operation, etc.
● We are happy to share our Scala assets through Airframe!
Add Your GitHub Star! wvlet/airframe
Arm Treasure Data now has plenty of Scala engineers, and the range of Scala use cases keeps growing

Presto Conference Tokyo 2019
● July 11 (Thu), 2019, 13:30 ~ (Free)
● https://techplay.jp/event/733772
● Inviting Presto Creators (Martin, Dain, David)
● Presto Software Foundation
● Talks from big Presto users in Japan
● Yahoo! JAPAN, LINE, Arm Treasure Data
● Presto Source Code Navigation
Presto Conference Tokyo 2019 will be held on July 11 (Thu) from 13:30 (free admission)

Thank You!
Danke!
Merci!
谢谢!
ありがとう!
Gracias!
Kiitos!
Scala at Treasure DataTaro L. Saito
3.1K views33 slides
Introduction to Presto at Treasure Data by
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure DataTaro L. Saito
1.7K views28 slides
Workflow Hacks #1 - dots. Tokyo by
Workflow Hacks #1 - dots. TokyoWorkflow Hacks #1 - dots. Tokyo
Workflow Hacks #1 - dots. TokyoTaro L. Saito
3.1K views31 slides

More from Taro L. Saito(17)

Tips For Maintaining OSS Projects by Taro L. Saito
Tips For Maintaining OSS ProjectsTips For Maintaining OSS Projects
Tips For Maintaining OSS Projects
Taro L. Saito314 views
Learning Silicon Valley Culture by Taro L. Saito
Learning Silicon Valley CultureLearning Silicon Valley Culture
Learning Silicon Valley Culture
Taro L. Saito4.8K views
Presto At Treasure Data by Taro L. Saito
Presto At Treasure DataPresto At Treasure Data
Presto At Treasure Data
Taro L. Saito5.4K views
Scala at Treasure Data by Taro L. Saito
Scala at Treasure DataScala at Treasure Data
Scala at Treasure Data
Taro L. Saito3.1K views
Introduction to Presto at Treasure Data by Taro L. Saito
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure Data
Taro L. Saito1.7K views
Workflow Hacks #1 - dots. Tokyo by Taro L. Saito
Workflow Hacks #1 - dots. TokyoWorkflow Hacks #1 - dots. Tokyo
Workflow Hacks #1 - dots. Tokyo
Taro L. Saito3.1K views
Presto @ Treasure Data - Presto Meetup Boston 2015 by Taro L. Saito
Presto @ Treasure Data - Presto Meetup Boston 2015Presto @ Treasure Data - Presto Meetup Boston 2015
Presto @ Treasure Data - Presto Meetup Boston 2015
Taro L. Saito1.9K views
Presto As A Service - Treasure DataでのPresto運用事例 by Taro L. Saito
Presto As A Service - Treasure DataでのPresto運用事例Presto As A Service - Treasure DataでのPresto運用事例
Presto As A Service - Treasure DataでのPresto運用事例
Taro L. Saito9.9K views
Presto as a Service - Tips for operation and monitoring by Taro L. Saito
Presto as a Service - Tips for operation and monitoringPresto as a Service - Tips for operation and monitoring
Presto as a Service - Tips for operation and monitoring
Taro L. Saito6.8K views
Treasure Dataを支える技術 - MessagePack編 by Taro L. Saito
Treasure Dataを支える技術 - MessagePack編Treasure Dataを支える技術 - MessagePack編
Treasure Dataを支える技術 - MessagePack編
Taro L. Saito13.7K views
Weaving Dataflows with Silk - ScalaMatsuri 2014, Tokyo by Taro L. Saito
Weaving Dataflows with Silk - ScalaMatsuri 2014, TokyoWeaving Dataflows with Silk - ScalaMatsuri 2014, Tokyo
Weaving Dataflows with Silk - ScalaMatsuri 2014, Tokyo
Taro L. Saito3K views
Spark Internals - Hadoop Source Code Reading #16 in Japan by Taro L. Saito
Spark Internals - Hadoop Source Code Reading #16 in JapanSpark Internals - Hadoop Source Code Reading #16 in Japan
Spark Internals - Hadoop Source Code Reading #16 in Japan
Taro L. Saito21.7K views
Streaming Distributed Data Processing with Silk #deim2014 by Taro L. Saito
Streaming Distributed Data Processing with Silk #deim2014Streaming Distributed Data Processing with Silk #deim2014
Streaming Distributed Data Processing with Silk #deim2014
Taro L. Saito4.1K views
Silkによる並列分散ワークフロープログラミング by Taro L. Saito
Silkによる並列分散ワークフロープログラミングSilkによる並列分散ワークフロープログラミング
Silkによる並列分散ワークフロープログラミング
Taro L. Saito2.5K views
2011年度 生物データベース論 2日目 木構造データ by Taro L. Saito
2011年度 生物データベース論 2日目 木構造データ2011年度 生物データベース論 2日目 木構造データ
2011年度 生物データベース論 2日目 木構造データ
Taro L. Saito2.5K views
Relational-Style XML Query @ SIGMOD-J 2008 Dec. by Taro L. Saito
Relational-Style XML Query @ SIGMOD-J 2008 Dec.Relational-Style XML Query @ SIGMOD-J 2008 Dec.
Relational-Style XML Query @ SIGMOD-J 2008 Dec.
Taro L. Saito2.2K views

Recently uploaded

Keynote Talk: Open Source is Not Dead - Charles Schulz - Vates by
Keynote Talk: Open Source is Not Dead - Charles Schulz - VatesKeynote Talk: Open Source is Not Dead - Charles Schulz - Vates
Keynote Talk: Open Source is Not Dead - Charles Schulz - VatesShapeBlue
178 views15 slides
TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f... by
TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f...TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f...
TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f...TrustArc
130 views29 slides
State of the Union - Rohit Yadav - Apache CloudStack by
State of the Union - Rohit Yadav - Apache CloudStackState of the Union - Rohit Yadav - Apache CloudStack
State of the Union - Rohit Yadav - Apache CloudStackShapeBlue
218 views53 slides
Uni Systems for Power Platform.pptx by
Uni Systems for Power Platform.pptxUni Systems for Power Platform.pptx
Uni Systems for Power Platform.pptxUni Systems S.M.S.A.
60 views21 slides
Network Source of Truth and Infrastructure as Code revisited by
Network Source of Truth and Infrastructure as Code revisitedNetwork Source of Truth and Infrastructure as Code revisited
Network Source of Truth and Infrastructure as Code revisitedNetwork Automation Forum
49 views45 slides
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R... by
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...ShapeBlue
105 views15 slides

Recently uploaded(20)

Keynote Talk: Open Source is Not Dead - Charles Schulz - Vates by ShapeBlue
Keynote Talk: Open Source is Not Dead - Charles Schulz - VatesKeynote Talk: Open Source is Not Dead - Charles Schulz - Vates
Keynote Talk: Open Source is Not Dead - Charles Schulz - Vates
ShapeBlue178 views
TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f... by TrustArc
TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f...TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f...
TrustArc Webinar - Managing Online Tracking Technology Vendors_ A Checklist f...
TrustArc130 views
State of the Union - Rohit Yadav - Apache CloudStack by ShapeBlue
State of the Union - Rohit Yadav - Apache CloudStackState of the Union - Rohit Yadav - Apache CloudStack
State of the Union - Rohit Yadav - Apache CloudStack
ShapeBlue218 views
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R... by ShapeBlue
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...
ShapeBlue105 views
Data Integrity for Banking and Financial Services by Precisely
Data Integrity for Banking and Financial ServicesData Integrity for Banking and Financial Services
Data Integrity for Banking and Financial Services
Precisely76 views
Digital Personal Data Protection (DPDP) Practical Approach For CISOs by Priyanka Aash
Digital Personal Data Protection (DPDP) Practical Approach For CISOsDigital Personal Data Protection (DPDP) Practical Approach For CISOs
Digital Personal Data Protection (DPDP) Practical Approach For CISOs
Priyanka Aash103 views
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti... by ShapeBlue
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
ShapeBlue69 views
Live Demo Showcase: Unveiling Dell PowerFlex’s IaaS Capabilities with Apache ... by ShapeBlue
Live Demo Showcase: Unveiling Dell PowerFlex’s IaaS Capabilities with Apache ...Live Demo Showcase: Unveiling Dell PowerFlex’s IaaS Capabilities with Apache ...
Live Demo Showcase: Unveiling Dell PowerFlex’s IaaS Capabilities with Apache ...
ShapeBlue52 views
Igniting Next Level Productivity with AI-Infused Data Integration Workflows by Safe Software
Igniting Next Level Productivity with AI-Infused Data Integration Workflows Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Safe Software373 views
Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ... by ShapeBlue
Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...
Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...
ShapeBlue114 views
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue by ShapeBlue
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlueCloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue
ShapeBlue68 views
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha... by ShapeBlue
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
Mitigating Common CloudStack Instance Deployment Failures - Jithin Raju - Sha...
ShapeBlue113 views
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue by ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlueWhat’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
ShapeBlue191 views
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or... by ShapeBlue
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
ShapeBlue128 views
"Surviving highload with Node.js", Andrii Shumada by Fwdays
"Surviving highload with Node.js", Andrii Shumada "Surviving highload with Node.js", Andrii Shumada
"Surviving highload with Node.js", Andrii Shumada
Fwdays49 views
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue by ShapeBlue
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlueElevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue
ShapeBlue149 views
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ... by ShapeBlue
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
ShapeBlue97 views

How To Use Scala At Work - Airframe In Action at Arm Treasure Data

  • 1. How To Use Scala At Work: Airframe In Action At Arm Treasure Data
      Taro L. Saito, Ph.D., Arm Treasure Data
      June 29, 2019, Scala Matsuri 2019 - Tokyo
      Caption: Let's use Scala at work: Airframe use cases at Arm Treasure Data

  • 2. About Me: Taro L. Saito (Leo)
      ● Principal Software Engineer at Arm Treasure Data
      ● Building a distributed query engine service
      ● Living in the US for 4 years
      ● DBMS & Data Science Background
      ● Ph.D. in Computer Science
      ● Database Systems and Genome Sciences Research
      ● Assistant Professor at the University of Tokyo
      ● OSS Projects Around Scala
      ● sbt-sonatype: used for releasing 3000+ Scala projects
      ● snappy-java: a compression library used in Spark, Parquet, etc.
      Caption: Self-introduction
      Copyright 1995-2019 Arm Limited (or its affiliates). All rights reserved.

  • 3. New Release from O'Reilly Japan
      ● Helped with the Japanese translation of Designing Data-Intensive Applications
      ● Techniques and concepts around distributed data processing systems
      ● Available at Amazon.co.jp and the O'Reilly Japan web site
      ● To be published on July 18, 2019
      Caption: The Japanese translation of the definitive introduction to distributed data systems goes on sale next month

  • 4. Arm Treasure Data
      ● Founded in 2011
      ● 400+ customers
      ● Raised $54M
      ● Security
      ● Acquired by Arm / Softbank in 2018
      Caption: Overview of Arm Treasure Data

  • 5. The Architecture of Arm Treasure Data
      [Diagram: Data Collection (logs, device data, batch data; 2 million records / sec.); Cloud Storage (PlazmaDB, table schema; 130 trillion records); Distributed Data Processing (jobs; 1 billion rows processed / sec.). Other components: Job Management, SQL Editor, Scheduler, Workflows, Machine Learning. Legend: Treasure Data OSS, Third Party OSS]
      Caption: The system architecture of Treasure Data. Where is Scala?

  • 6. Airframe
      [Diagram: Airframe modules: airframe-launcher, airframe-log, airframe-config, airframe-codec, sbt-pack, airframe-fluentd, airframe-json, airframe-surface, airframe-tablet, airframe-jmx, airframe-jdbc, airframe-http, airframe-http-finagle, airframe DI. Labels: Module Mix-In, Packaging, HTTP Requests and Responses, Data, Scala Objects, Table Data (CSV, TSV), JSON, Monitor Runtime States, Generate Mapping Codec, Metrics & Log Data, JDBC ResultSets, Launch HTTP Services, Debug Logs, Schema-On-Read Mapping. Sample config: production: port: 10010 user: xxxx ...]
      Caption: The Airframe modules (written in Scala) used behind the service

  • 7. Our OSS Strategy Around Scala
      ● Gather the best practices of Scala into Airframe OSS
      ● Gain real-world experience by operating 24/7 services
      [Diagram: Knowledge, Experiences, Design Decisions, Products, 24/7 Services, Business Values, Programming, OSS, Outcome, Airframe]
      Caption: Our OSS strategy around Scala, with Airframe at its core

  • 8. 3 Years Ago...
      ● Various internal and third-party Scala/Java libraries
      ● Managed in different repositories, with different release cycles
      ● High learning cost
          ■ The knowledge was confined to engineers' brains
      [Diagram: Knowledge, Experiences, Design Decisions, Products, 24/7 Services, Business Values, Programming, Various Libraries (logger, launcher, object mapper, JDBC reader, json4s, jackson, ...), Outcome]
      Caption: Three years ago Airframe did not exist, and a variety of libraries were mixed together

  • 9. 5 Years Ago...
      ● No Scala engineer in the company
      ● Scala in 2014: Scala 2.9.x
      ● Was not good enough to use:
          ■ e.g., no string interpolation like s"... ${x} ..."
      [Diagram: Knowledge, Experiences, Design Decisions, Products, 24/7 Services, Business Values, Programming, Ruby, Java, Outcome]
      Caption: Five years ago there were no Scala engineers and no Scala code

  • 10. Today's Agenda
      ● How to introduce Scala to your company
      ● Learn the best practices of using Scala at work
      ● From 20 Airframe modules
      Caption: What this talk covers

  • 11. How Can We Introduce Scala?
      ● Saying "I want to use Scala"
      ● It will not work, especially if you or your team are not familiar with Scala
      ● Your managers need more information to judge whether it is good enough
      ● Even if you are a tech lead:
      ● You need some confidence in using Scala in production
      ● How can we establish such confidence in using Scala?
      Caption: How do we introduce Scala? How do we gain the confidence that it is OK to use Scala?

  • 12. Start With a Small Investment in Scala
      ● Guidelines
      ● Think about how you can save your time with Scala
      ● If you can save 1 minute a day, you can spend 6 hours on this improvement
          ■ Save 1 minute / day = 365 minutes / year = 6 hour investment
          ■ Save 10 minutes / week = 520 minutes / year = about 8.7 hour investment
          ■ Save 1 hour / week = 52 hours / year = 2.2 day investment
      ● Time is your most valuable asset
      ● Save your time by using Scala
      Caption: Start making "small investments" that save time "by using Scala"
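
The break-even arithmetic on this slide can be checked with a few lines of Scala. This is only a sketch: the object and function names (`TimeInvestment`, `minutesSavedPerYear`, and so on) are made up for illustration and are not part of Airframe.

```scala
object TimeInvestment {
  // Minutes saved in a year when each occurrence saves `minutes`
  // and the task occurs `timesPerYear` times.
  def minutesSavedPerYear(minutes: Int, timesPerYear: Int): Int =
    minutes * timesPerYear

  // Convert minutes to hours and to 24-hour days.
  def hours(minutes: Int): Double = minutes / 60.0
  def days(minutes: Int): Double  = hours(minutes) / 24.0

  val perDay: Int   = minutesSavedPerYear(1, 365)  // 1 minute / day
  val perWeek: Int  = minutesSavedPerYear(10, 52)  // 10 minutes / week
  val hourWeek: Int = minutesSavedPerYear(60, 52)  // 1 hour / week
}
```

With these definitions, `TimeInvestment.hours(TimeInvestment.perDay)` is a little over 6 hours and `TimeInvestment.days(TimeInvestment.hourWeek)` is a little under 2.2 days, which matches the rounded figures on the slide.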

  • 13. The First Scala Code in TD
      ● prestop (presto + top)
      ● Non-production service code
      ● A handy query monitoring tool for Presto, written in Scala
      ● Displays complex JSON data with fancy ANSI colors
      Caption: The first Scala program at Treasure Data

  • 14. airframe-log
      ● Scala 2.10: My small investment to test Scala macros and string interpolation
      ● A Modern Logging Library for Scala (at Medium)
      ● ANSI color and source code location display
      ● Just add the LogSupport trait to your class
      Caption: Make program development more efficient with log messages

  • 15. airframe-launcher
      ● Needed to handle complex command-line options and nested commands
      ● e.g., $ prestop -e production monitor (other options …)
      ● Enabled annotation-based command-line definitions
      Caption: Make it easy to build complex command-line programs

  • 16. airframe-config: Application Configuration Flow
      ● YAML config (embedded into Docker)
      ● Override credentials, then bind to config objects
      [Diagram: a YAML file (development: addr: api-dev.com; production: addr: api.com) is selected with the airframe-launcher command option "-e production", merged with credentials and local configurations, and object-mapped to an immutable config object, case class ServerConfig(addr: String, port: Int = 8080, password: String), using default parameters (e.g., port = 8080)]
      Caption: Turning the application configuration flow into a library

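The configuration flow on this slide (pick an environment, merge credentials, bind to an immutable object with defaults) can be sketched in plain Scala. This is a minimal illustration under stated assumptions, not the airframe-config API: the YAML file is represented as an already-parsed `Map`, and `ConfigFlowSketch` and `load` are hypothetical names.

```scala
object ConfigFlowSketch {
  // The config object from the slide; `port` has a default value.
  final case class ServerConfig(addr: String, port: Int = 8080, password: String)

  // Per-environment settings, as they might look after parsing the YAML file.
  val yaml: Map[String, Map[String, String]] = Map(
    "development" -> Map("addr" -> "api-dev.com"),
    "production"  -> Map("addr" -> "api.com")
  )

  // Select an environment, let `overrides` (credentials, local settings) win
  // over the YAML values, then bind the result to an immutable ServerConfig.
  // A missing "addr" or "password" key throws NoSuchElementException.
  def load(env: String, overrides: Map[String, String]): ServerConfig = {
    val merged   = yaml.getOrElse(env, Map.empty[String, String]) ++ overrides
    val defaults = ServerConfig(addr = merged("addr"), password = merged("password"))
    // Replace the default port only when a value is explicitly configured.
    merged.get("port").map(p => defaults.copy(port = p.toInt)).getOrElse(defaults)
  }
}
```

For example, `ConfigFlowSketch.load("production", Map("password" -> "secret"))` yields a `ServerConfig` for `api.com` that keeps the default port 8080.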
  • 17. sbt-pack plugin
      ● An sbt plugin to create standalone Scala packages
      ● A single-folder package with bin and lib folders containing all dependent JARs
      ● Generates command-line launcher scripts
      ● My small investment in 2012 to save packaging time
      [Diagram: airframe-launcher, airframe-config, YAML config file, sbt-pack, Standalone Scala Package, Dockerfile]
      Caption: Package programs with sbt-pack and easily create Docker images

  • 18. Medium-Size Investment: Find A Common Pattern
      ● Extract a common problem pattern and create a solution
      ● Data -> Object Mapping
      ● How many data readers and object mappers do we need?
      ● How can we save the time spent handling such varied data types?
      [Diagram: YAML -> YAML Parser + Object Mapper -> Config Object; JDBC ResultSet -> Object-Relation Mapper -> Table Object; JSON -> JSON Parser + Object Mapper -> Object]
      Caption: Mapping input data to Scala objects is a common need. It calls for a medium-term investment

  • 19. airframe-msgpack: MessagePack as Universal Data Format
      ● MessagePack (msgpack.org)
      ● Compact JSON-like binary format
      ● Describes data types and data values at the same time (self-describing)
      [Diagram: Object <-> Pack / Unpack <-> MessagePack <-> Pack / Unpack <-> JDBC ResultSet, YAML, JSON]
      Caption: With MessagePack as the intermediate format, only one object mapper implementation is needed

  • 20. PlazmaDB: MessagePack DBMS
      ● Fluentd -> MessagePack -> Arm Treasure Data
      ● Automatically generates the table schema from MessagePack data
      ● Applies schema-on-read to provide table data for Presto/Hive/Spark, etc.
      [Diagram: Data Collection, Schema-free Data, Update Schema, Table Schema, Generate Reader Set, Table Reader (Int Column Reader, String Column Reader), Distributed Data Processing]
      Caption: Arm Treasure Data is a MessagePack-based schema-on-read system

  • 21. Schema-On-Read Data Processing with MessagePack
      ● Users can store arbitrarily typed data (no table design is required)
      ● Data can be read in the target type required by the application (e.g., a SQL query)
      [Diagram: MessagePack values of type Int, Float, Boolean, String, Array, Map, or Binary (from logs, CSV, command-line arguments) are read as SQL BigInt through IntCodec: String via parseInt, Float via toInt, Boolean as 0 or 1, other types as an error or null. Example: "100" (string) is packed, then unpacked as 100 (int)]
      Caption: Adapt data to the type the application requires at read time (schema-on-read)

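The conversion rules in this diagram can be sketched as a single function that reads a loosely typed value as an `Int`. This is an illustration of the schema-on-read idea only, not the airframe-codec `IntCodec` implementation; `SchemaOnReadSketch` and `readAsInt` are hypothetical names, and `None` stands in for the "error or null" outcome.

```scala
import scala.util.Try

object SchemaOnReadSketch {
  // Read an arbitrarily typed value in the target type Int.
  // Returns None where the slide says "Error or null".
  def readAsInt(v: Any): Option[Int] = v match {
    case i: Int     => Some(i)                              // already the target type
    case l: Long    => if (l.isValidInt) Some(l.toInt) else None
    case f: Float   => Some(f.toInt)                        // toInt truncates
    case d: Double  => Some(d.toInt)
    case b: Boolean => Some(if (b) 1 else 0)                // 0 or 1
    case s: String  => Try(s.trim.toInt).toOption           // parseInt
    case _          => None                                 // Array, Map, Binary, ...
  }
}
```

Under these rules the string `"100"` and the integer `100` both read as `Some(100)`, while a collection value reads as `None`.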
  • 22. airframe-codec: Schema-On-Read Pack/Unpack Interface
      ● Apply schema-on-read to Scala objects
      [Diagram: Input -> Pack -> MessagePack -> Unpack -> Output]
      Caption: Applying the schema-on-read data conversion interface, via MessagePack, to Scala

  • 23. Pre-defined Codecs in airframe-codec
      ● Primitive Codecs
          ● ByteCodec, CharCodec, ShortCodec, IntCodec, LongCodec
          ● FloatCodec, DoubleCodec
          ● StringCodec
          ● BooleanCodec
          ● TimeStampCodec
      ● Collection Codecs
          ● ArrayCodec, SeqCodec, ListCodec, IndexedSeqCodec, MapCodec, etc.
          ● OptionCodec
          ● JsonCodec (airframe-json)
      ● Java-specific Codecs
          ● FileCodec, ZonedDateTimeCodec, JDBCResultSetCodec, etc.
      ● Adding Custom Codecs
          ● Implement the MessageCodec[X] interface
      Caption: Supports mapping to almost all of the data types needed in Scala

  • 24. MessageCodec.of[A]: Combination of Codecs
      [Diagram: MessageCodec.of[A] combines IntCodec, StringCodec, and DoubleCodec to Pack to and Unpack from MessagePack]
      Caption: Codecs are composed according to the object's type
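
The idea of building an object codec out of field codecs can be sketched with a small type class. This is not the airframe-codec implementation: the `Codec` trait below is hypothetical, it packs to a tab-separated string instead of MessagePack, and the `Person` codec is composed by hand, whereas the slide's `MessageCodec.of[A]` derives the combination from the type.

```scala
import scala.util.Try

object CodecCombinationSketch {
  // A toy pack/unpack interface (strings stand in for MessagePack bytes).
  trait Codec[A] {
    def pack(a: A): String
    def unpack(s: String): Option[A]
  }

  val intCodec: Codec[Int] = new Codec[Int] {
    def pack(a: Int): String = a.toString
    def unpack(s: String): Option[Int] = Try(s.toInt).toOption
  }
  val stringCodec: Codec[String] = new Codec[String] {
    def pack(a: String): String = a
    def unpack(s: String): Option[String] = Some(s)
  }
  val doubleCodec: Codec[Double] = new Codec[Double] {
    def pack(a: Double): String = a.toString
    def unpack(s: String): Option[Double] = Try(s.toDouble).toOption
  }

  final case class Person(id: Int, name: String, score: Double)

  // An object codec built from one codec per field.
  // Limitation of this toy format: names must not contain tab characters.
  val personCodec: Codec[Person] = new Codec[Person] {
    def pack(p: Person): String =
      Seq(intCodec.pack(p.id), stringCodec.pack(p.name), doubleCodec.pack(p.score)).mkString("\t")
    def unpack(s: String): Option[Person] = s.split("\t", -1) match {
      case Array(i, n, d) =>
        for {
          id    <- intCodec.unpack(i)
          name  <- stringCodec.unpack(n)
          score <- doubleCodec.unpack(d)
        } yield Person(id, name, score)
      case _ => None
    }
  }
}
```

Packing a `Person` and unpacking the result returns the original value, and input whose fields fail their field codec unpacks to `None`.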

  • 25. airframe-surface
      ● Reading Type Signatures From ScalaSig
      ● The Scala compiler embeds Scala type signatures (ScalaSig) in class files
      ● Surface.of[A]
          ■ returns A's parameter names and types
      [Diagram: class A(data: List[B]) compiled with javac or scalac becomes class A with data: List[java.lang.Object], because type erasure removes generic type information; scalac also embeds ScalaSig: data: List[B]; Surface.of[A] returns data: List[B]. Also labeled: scala.reflect.runtime.universe.TypeTag]
      Caption: Obtain an object's type information from ScalaSig

  • 26. [WIP] Scala.js RPC
      ● Scala.js
      ● Compiles Scala code into JavaScript for web browsers
      ● airframe-codec: Passing model class data between Scala and Scala.js
      [Diagram: Scala (server side) UserInfo <-> Pack / Unpack <-> MessagePack <-> Pack / Unpack <-> UserInfo in Scala.js (client side); XML RPC]
      Caption: airframe-codec can also be used to pass data to and from Scala.js (the browser side)

  • 27. [WIP] airframe-sql
      ● Universal stream SQL engine
      ● Processes various types of data through MessagePack
      [Diagram: JDBC ResultSet, YAML, JSON -> Pack -> MessagePack stream -> SQL query processing (Filter/Aggregation/Join, etc.) -> MessagePack]
      Caption: Process data in any format with SQL through MessagePack

  • 28. Scala In Production
  • 29. A Technical Debt In TD (2015-2016)
      ● Prestogres: PostgreSQL gateway to Presto
      ● Enabled using PostgreSQL JDBC/ODBC drivers to access Presto
      ● So-called Sada (founder)'s magic
      ● Was good for the first use cases
      ● Many Problems:
      ● Hacks around pgpool-II were hard to debug
      ● Hard to support customers upon errors
      ● SQL incompatible with Presto
      ● Nobody could fix these issues
          ■ including the creator!
      Caption: The hack called Prestogres had become a technical debt

  • 30. Replacing Prestogres with Prestobase
      ● Prototyped in Scala within a week after a quick chat with Sada
      ● Utilizing Airframe assets
      ● Deployed as a production service in 3 months
      Caption: Built a Prestobase prototype in Scala. Released as a service three months later

  • 31. airframe-di
      ● Created a dependency injection library for Scala
      ● For Prestobase development
      ● Scala-friendly syntax
      ● Useful for combining hundreds of modules
      ● Based on airframe-surface and airframe-log
      ● See also:
      ● Airframe Meetup #1 Report (2018)
      Caption: Airframe DI for Scala was born during Prestobase development

  • 32. Airframe OSS
      ● Lightweight Building Blocks for Scala
      ● Collection of our investments in Scala
      ● Repackaged into wvlet.airframe in 2016
          ● airframe-log
          ● airframe-launcher
          ● airframe-config
          ● airframe-surface
          ● airframe-di
          ● airframe-codec
          ● ...
      ● As of 2019, Airframe has 20 modules
      ● 35+ releases in 2018
      ● Already had 17+ releases in 2019
      ● Contributing to the Scala Community Build
      ● To test the latest Scala versions
      Caption: In 2016 the various tools were consolidated as Airframe. 20 modules and a frequent release cycle

  • 33. Monorepo
      ● Cross build
      ● For 3 + 1 Scala versions
          ■ 2.13, 2.12, 2.11, and Scala.js
      ● 20 modules
          ■ 4 x 20 = 80 artifacts!
      ● Challenge
      ● Publishing took 3 hours with sbt-release
      ● Bottleneck
      ● Sequential run of compile -> test -> publish for all artifacts
      Caption: Airframe uses a single repository to consolidate maintenance

  • 34. Release Automation on Travis CI
      ● Single-Step Release
      ● Triggered by git tag
      ● Running Tasks In Parallel
      ● Run tests for each Scala version
      ● Update doc & release notes
          ■ Generate release notes from git logs
      ● Publish
          ■ sbt-pgp & sbt-sonatype
              ○ GPG signature
              ○ Copy to Maven Central
      ● Finishes in around 10 to 20 minutes
      ● Blog: 3 Tips For Maintaining Scala Projects
      Caption: Releases are fully automated on Travis CI, which enables frequent releases

  • 35. sbt-sonatype plugin
      ● An sbt plugin for releasing projects to Maven Central
      ● open staging repository -> verify -> close -> promote -> drop
      ● A small investment
      ● Made during the 2015 New Year holiday => Paid off by saving Airframe release time
      ● 3000+ Scala projects are using sbt-sonatype
      Caption: sbt-sonatype is a project created during the New Year holiday. It is used by many Scala libraries

  • 36. airframe-http
      ● Created a simple HTTP framework
      ● Based on Airframe modules:
          ■ airframe-surface
          ■ airframe-codec
          ■ airframe-msgpack
          ■ etc.
      ● Blog
      ● Building Low-Friction Web Service Over Finagle
      ● Save the time spent choosing a web framework:
      ● Many frameworks exist:
      ● e.g., Finatra, Finch, akka-http, spring, RESTeasy, open-api, swagger, etc.
      Caption: Using Airframe assets, a web framework was also easy to build

  • 37. airframe-http-client
      ● Error handling of HTTP requests is difficult
      ● 4xx, 5xx status codes
      ● Should we retry the request?
          ■ IOException, EOFException
          ■ TimeoutException
          ■ InterruptedException
          ■ SSLException
          ■ InvocationTargetException
      ● HTTP client
      ● request retries
      ● response mapping
          ■ JSON, MessagePack format
      ● airframe-codec
      Caption: Turning error-prone HTTP request error handling into a library

  • 38. airframe-control
      ● Everything can fail …
      ● Network disconnection
      ● Server crash
      ● ...
      ● Retry
      ● Exponential backoff
          ■ 2x, 4x, ...
      ● Jittering
          ■ 1 sec., 2 * rand, 4 * rand, …
      ● Customize error type classifiers
      ● retryable failures
      ● non-retryable failures
      Caption: Turning retry handling into a pattern
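
The retry pattern on this slide (exponential backoff, jitter, and an error classifier) can be sketched in plain Scala. This is not the airframe-control API: `RetrySketch`, `retry`, `backoffMillis`, and `isRetryable` are hypothetical names, and the sketch applies the random jitter factor to every wait, including the first one.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Random, Success, Try}

object RetrySketch {
  // Wait before retry number `attempt` (0-based): base * 2^attempt,
  // scaled by a random jitter factor in [0, 1).
  def backoffMillis(attempt: Int, baseMillis: Long, rand: Random): Long =
    (baseMillis * math.pow(2, attempt) * rand.nextDouble()).toLong

  // Run `body`, retrying up to `maxRetry` times while `isRetryable`
  // classifies the failure as retryable. Returns the last outcome.
  def retry[A](
      maxRetry: Int,
      baseMillis: Long = 1000,
      rand: Random = new Random(),
      isRetryable: Throwable => Boolean = _ => true
  )(body: => A): Try[A] = {
    @tailrec
    def loop(attempt: Int): Try[A] = Try(body) match {
      case s @ Success(_) => s
      case Failure(e) if attempt < maxRetry && isRetryable(e) =>
        Thread.sleep(backoffMillis(attempt, baseMillis, rand))
        loop(attempt + 1)
      case failure => failure // non-retryable, or retries exhausted
    }
    loop(0)
  }
}
```

A caller might pass `isRetryable = { case _: java.io.IOException => true; case _ => false }` so that network errors are retried while programming errors fail immediately.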

  • 39. airframe-http-recorder
      ● Testing against actual web services is time consuming
      ● Record & replay HTTP responses
      ● Reproducible results
      ● Runnable on small machines (e.g., Travis CI)
      [Diagram: Recording mode: HTTP Request -> HTTP Recorder -> Request -> Real Web Service -> Response, with responses being recorded. Replay mode: HTTP Request -> HTTP Recorder -> Response taken from the recorded responses]
      Caption: Record HTTP requests to make web service testing more efficient

  • 40. Data Analysis with Scala
  • 41. Data-Driven System Optimization
      ● TD is one of the biggest users of TD
      ● Query logs
      ● Collecting all Presto query logs since 2015
      ● Query statements, performance statistics, logs, etc.
      ● Logs are our valuable assets
      ● To understand user activities and enable data-driven optimizations
      [Diagram: User Query, Logs, Collect Query Logs, Analyze Query Logs, Machine Learning, Query Optimization, Optimize System]
      Caption: Collecting and analyzing logs is important for system optimization

  • 42. airframe-fluentd
      ● Collect Scala application logs into Fluentd
      ● Scala Objects -> MessagePack -> Fluentd
      [Diagram: Scala Objects -> airframe-codec -> airframe-fluentd -> Collect Query Logs, Analyze Query Logs, Machine Learning, Query Optimization, Optimize System]
      Caption: Fluentd accepts MessagePack, so the output of airframe-codec can be passed to it

  • 43. airframe-jmx
      ● Add the @JMX annotation to your application metrics
      ● It's also useful for checking the application version, configurations, etc.
      ● JMX clients can check these metrics
      ● e.g., jconsole
      Caption: With JMX, check the application state from outside the JVM and collect metrics

  • 44. airframe-metrics
      ● Human-readable data formats (ElapsedTime, DataSize, etc.)
      ● Handy time window string support
      Caption: Make time spans, ranges, and data sizes human-friendly to streamline log analysis

  • 45. Taking Snapshots of Data Analysis Tasks
      ● Save long-running task results as MessagePack (binary)
      ● Save the cost of re-computation
      [Diagram: First run: Task Run -> Compute (e.g., 10 min) -> Result: Seq[A] -> Pack -> MessagePack -> Save to Storage (Snapshot). Second run: Load -> Unpack -> Result: Seq[A]]
      Caption: Use Airframe assets to cache data analysis results and work more efficiently
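
The first-run/second-run flow can be sketched as a compute-or-load helper. This is an assumption-laden illustration rather than the author's code: it stores a `Seq[String]` as UTF-8 text lines instead of MessagePack, and `SnapshotSketch` and `cached` are hypothetical names. It needs Scala 2.13 or later for `scala.jdk.CollectionConverters`.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import scala.jdk.CollectionConverters._

object SnapshotSketch {
  // Second run: load the saved snapshot if it exists.
  // First run: compute the result, save it, and return it.
  // Limitation of this toy format: elements must not contain newlines.
  def cached(snapshot: Path)(compute: => Seq[String]): Seq[String] =
    if (Files.exists(snapshot)) {
      Files.readAllLines(snapshot, StandardCharsets.UTF_8).asScala.toSeq
    } else {
      val result = compute // the expensive task, e.g. 10 minutes
      Files.write(snapshot, result.mkString("\n").getBytes(StandardCharsets.UTF_8))
      result
    }
}
```

Because `compute` is a by-name parameter, the expensive task is not evaluated at all when a snapshot file is already present.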

  • 46. Airframe
      [Diagram: the same Airframe module map as slide 6: airframe-launcher, airframe-log, airframe-config, airframe-codec, sbt-pack, airframe-fluentd, airframe-json, airframe-surface, airframe-tablet, airframe-jmx, airframe-jdbc, airframe-http, airframe-http-finagle, airframe DI]
      Caption: Code assets have formed around Airframe

  • 47. Resolving Technical Debts with Airframe Upgrade
      ● Migrate common programming patterns into Airframe
      ● Upgrade the Airframe version
      ● YY.MM.patch versioning: 19.5.x, 19.6.x, …
          ■ Easy to see how far behind the latest version a project is
      ● Reduce code and logic duplication across components
      [Diagram: Knowledge, Experiences, Design Decisions, Products, 24/7 Services, Business Values, Programming, OSS, Outcome, Airframe]
      Caption: Resolve technical debts as part of upgrading Airframe

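The benefit of YY.MM.patch versioning, seeing at a glance how far behind a project is, can be expressed as a small calculation. This is a hypothetical helper for illustration only; `VersionLagSketch`, `yearMonth`, and `monthsBehind` are not part of Airframe.

```scala
import scala.util.Try

object VersionLagSketch {
  // Extract (year, month) from a "YY.MM.patch" version string.
  // The patch part is ignored, so "19.5.x" parses as (19, 5).
  def yearMonth(version: String): Option[(Int, Int)] =
    version.split('.') match {
      case Array(yy, mm, _*) => Try((yy.toInt, mm.toInt)).toOption
      case _                 => None
    }

  // Number of months between the version in use and the latest release.
  def monthsBehind(current: String, latest: String): Option[Int] =
    for {
      (cy, cm) <- yearMonth(current)
      (ly, lm) <- yearMonth(latest)
    } yield (ly - cy) * 12 + (lm - cm)
}
```

For example, a project on 18.12.0 when the latest release is 19.6.1 is six months behind.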
  • 48. Scala At Arm Treasure Data
      ● Scala is now an official language at Arm Treasure Data
      ● 0 -> 10+ engineers who can write Scala
      ● Use cases are growing:
      ● Query optimization, API, Spark, data analysis, storage systems, service operation, etc.
      ● We are happy to share our Scala assets through Airframe!
      ● Add Your GitHub Star! wvlet/airframe
      Caption: Arm Treasure Data now has a solid base of Scala engineers, and the range of Scala use cases keeps growing

  • 49. Presto Conference Tokyo 2019
      ● July 11 (Thu), 2019, 13:30 ~ (Free)
      ● https://techplay.jp/event/733772
      ● Inviting Presto Creators (Martin, Dain, David)
      ● Presto Software Foundation
      ● Talks from big Presto users in Japan
      ● Yahoo! JAPAN, LINE, Arm Treasure Data
      ● Presto Source Code Navigation
      Caption: Presto Conference Tokyo 2019 will be held on July 11 (Thu) from 13:30 (free admission)

  • 50. Thank You! Danke! Merci! 谢谢! ありがとう! Gracias! Kiitos!