SlideShare a Scribd company logo

SMACK Dev Experience

Chih-Hsuan Hsu
Chih-Hsuan Hsu
Chih-Hsuan HsuPilotTV Data Engineer

2017 DataCon Talk

SMACK Dev Experience

1 of 45
Download to read offline
使用SMACK開發小型 Hybrid DB 系統
(踩過的坑)心得分享
許致軒 (Joe)
PilotTV / Data Engineer
● Chih-Hsuan Hsu (Joe)
● PilotTV Data Engineer
● Interestd in SMACK/ELK architecture
● 技術書籍譯者
○ Spark學習手冊
○ Cassandra技術手冊
● LinkIn:www.linkedin.com/in/joechh
● Mail:joechh731126@gmail.com
2
About Me
既有系統改造Story
● RDB負載與日俱增
● 單點失效
● 想將Batch Mode改造成Straming-based Data Flow
● 第一階段為了簡化只採用SMACK中的Spark、Kafka與Cassandra
○ 因為還不熟Mesos跟Akka…QQ
3
系統改造前/(預期)改造後
ETL
New ETL with
Kafka Producer
4
New ETL
● 以Java實做
● ETL結果會產生Json-format串流資料
● 透過Kafka Producer API將Json Streaming送到Kafka Cluster
︰用預設值建立的Kafka Producer throughput太低
New ETL with
Kafka Producer
5
開始研究Kafka Producer參數實驗(0.8.2)
參數 預設值 可用選項
producer.type sync sync, async
compression.codec none none, gzip, snappy
batch.num.messages 200 unlimited
request.required.acks 0 -1, 0, 1
queue.buffering.max.messages 10000 unlimited
https://kafka.apache.org/082/documentation.html
6

More Related Content

What's hot

Establish The Core of Cloud Computing Application by Using Hazelcast (Chinese)
Establish The Core of  Cloud Computing Application  by Using Hazelcast (Chinese)Establish The Core of  Cloud Computing Application  by Using Hazelcast (Chinese)
Establish The Core of Cloud Computing Application by Using Hazelcast (Chinese)Joseph Kuo
 
2016-07-12 Introduction to Big Data Platform Security
2016-07-12 Introduction to Big Data Platform Security2016-07-12 Introduction to Big Data Platform Security
2016-07-12 Introduction to Big Data Platform SecurityJazz Yao-Tsung Wang
 
Data Analyse Black Horse - ClickHouse
Data Analyse Black Horse - ClickHouseData Analyse Black Horse - ClickHouse
Data Analyse Black Horse - ClickHouseJack Gao
 
Serverless Event Streaming with Pulsar Functions-xiaolong
Serverless Event Streaming with Pulsar Functions-xiaolongServerless Event Streaming with Pulsar Functions-xiaolong
Serverless Event Streaming with Pulsar Functions-xiaolongStreamNative
 
Elastic stack day-2
Elastic stack day-2Elastic stack day-2
Elastic stack day-2YI-CHING WU
 
Distributed Data Analytics at Taobao
Distributed Data Analytics at TaobaoDistributed Data Analytics at Taobao
Distributed Data Analytics at TaobaoMin Zhou
 
京东实时消息队列JDQ技术实践与探索
京东实时消息队列JDQ技术实践与探索京东实时消息队列JDQ技术实践与探索
京东实时消息队列JDQ技术实践与探索confluent
 
The Practice of Apache Pulsar for Logging in China Mobile - Pulsar Summit Asi...
The Practice of Apache Pulsar for Logging in China Mobile - Pulsar Summit Asi...The Practice of Apache Pulsar for Logging in China Mobile - Pulsar Summit Asi...
The Practice of Apache Pulsar for Logging in China Mobile - Pulsar Summit Asi...StreamNative
 
Track A-3 Enterprise Data Lake in Action - 搭建「活」的企業 Big Data 生態架構
Track A-3 Enterprise Data Lake in Action - 搭建「活」的企業 Big Data 生態架構Track A-3 Enterprise Data Lake in Action - 搭建「活」的企業 Big Data 生態架構
Track A-3 Enterprise Data Lake in Action - 搭建「活」的企業 Big Data 生態架構Etu Solution
 
How to plan a hadoop cluster for testing and production environment
How to plan a hadoop cluster for testing and production environmentHow to plan a hadoop cluster for testing and production environment
How to plan a hadoop cluster for testing and production environmentAnna Yen
 
淘宝Hadoop数据分析实践
淘宝Hadoop数据分析实践淘宝Hadoop数据分析实践
淘宝Hadoop数据分析实践Min Zhou
 
ELK Stack - Kibana操作實務
ELK Stack - Kibana操作實務ELK Stack - Kibana操作實務
ELK Stack - Kibana操作實務Kedy Chang
 
豆瓣网技术架构变迁
豆瓣网技术架构变迁豆瓣网技术架构变迁
豆瓣网技术架构变迁reinhardx
 
准实时海量数据分析系统架构探究
准实时海量数据分析系统架构探究准实时海量数据分析系统架构探究
准实时海量数据分析系统架构探究Min Zhou
 
Elastic stack day-1
Elastic stack day-1Elastic stack day-1
Elastic stack day-1YI-CHING WU
 
The Evolution of Data Systems
The Evolution of Data SystemsThe Evolution of Data Systems
The Evolution of Data Systems宇 傅
 
Cephfs架构解读和测试分析
Cephfs架构解读和测试分析Cephfs架构解读和测试分析
Cephfs架构解读和测试分析Yang Guanjun
 
分布式文件实践经验交流
分布式文件实践经验交流分布式文件实践经验交流
分布式文件实践经验交流凯 李
 

What's hot (20)

Establish The Core of Cloud Computing Application by Using Hazelcast (Chinese)
Establish The Core of  Cloud Computing Application  by Using Hazelcast (Chinese)Establish The Core of  Cloud Computing Application  by Using Hazelcast (Chinese)
Establish The Core of Cloud Computing Application by Using Hazelcast (Chinese)
 
2016-07-12 Introduction to Big Data Platform Security
2016-07-12 Introduction to Big Data Platform Security2016-07-12 Introduction to Big Data Platform Security
2016-07-12 Introduction to Big Data Platform Security
 
Data Analyse Black Horse - ClickHouse
Data Analyse Black Horse - ClickHouseData Analyse Black Horse - ClickHouse
Data Analyse Black Horse - ClickHouse
 
Serverless Event Streaming with Pulsar Functions-xiaolong
Serverless Event Streaming with Pulsar Functions-xiaolongServerless Event Streaming with Pulsar Functions-xiaolong
Serverless Event Streaming with Pulsar Functions-xiaolong
 
Elasticsearch 簡介
Elasticsearch 簡介Elasticsearch 簡介
Elasticsearch 簡介
 
Elastic stack day-2
Elastic stack day-2Elastic stack day-2
Elastic stack day-2
 
Distributed Data Analytics at Taobao
Distributed Data Analytics at TaobaoDistributed Data Analytics at Taobao
Distributed Data Analytics at Taobao
 
京东实时消息队列JDQ技术实践与探索
京东实时消息队列JDQ技术实践与探索京东实时消息队列JDQ技术实践与探索
京东实时消息队列JDQ技术实践与探索
 
The Practice of Apache Pulsar for Logging in China Mobile - Pulsar Summit Asi...
The Practice of Apache Pulsar for Logging in China Mobile - Pulsar Summit Asi...The Practice of Apache Pulsar for Logging in China Mobile - Pulsar Summit Asi...
The Practice of Apache Pulsar for Logging in China Mobile - Pulsar Summit Asi...
 
Track A-3 Enterprise Data Lake in Action - 搭建「活」的企業 Big Data 生態架構
Track A-3 Enterprise Data Lake in Action - 搭建「活」的企業 Big Data 生態架構Track A-3 Enterprise Data Lake in Action - 搭建「活」的企業 Big Data 生態架構
Track A-3 Enterprise Data Lake in Action - 搭建「活」的企業 Big Data 生態架構
 
How to plan a hadoop cluster for testing and production environment
How to plan a hadoop cluster for testing and production environmentHow to plan a hadoop cluster for testing and production environment
How to plan a hadoop cluster for testing and production environment
 
淘宝Hadoop数据分析实践
淘宝Hadoop数据分析实践淘宝Hadoop数据分析实践
淘宝Hadoop数据分析实践
 
ELK Stack - Kibana操作實務
ELK Stack - Kibana操作實務ELK Stack - Kibana操作實務
ELK Stack - Kibana操作實務
 
豆瓣网技术架构变迁
豆瓣网技术架构变迁豆瓣网技术架构变迁
豆瓣网技术架构变迁
 
准实时海量数据分析系统架构探究
准实时海量数据分析系统架构探究准实时海量数据分析系统架构探究
准实时海量数据分析系统架构探究
 
Elastic stack day-1
Elastic stack day-1Elastic stack day-1
Elastic stack day-1
 
The Evolution of Data Systems
The Evolution of Data SystemsThe Evolution of Data Systems
The Evolution of Data Systems
 
Cephfs架构解读和测试分析
Cephfs架构解读和测试分析Cephfs架构解读和测试分析
Cephfs架构解读和测试分析
 
分布式文件实践经验交流
分布式文件实践经验交流分布式文件实践经验交流
分布式文件实践经验交流
 
Hantuo openstack
Hantuo openstackHantuo openstack
Hantuo openstack
 

Similar to SMACK Dev Experience

Kafka cluster best practices
Kafka cluster best practicesKafka cluster best practices
Kafka cluster best practicesRico Chen
 
Kmeans in-hadoop
Kmeans in-hadoopKmeans in-hadoop
Kmeans in-hadoopTianwei Liu
 
ELK 交流学习
ELK 交流学习ELK 交流学习
ELK 交流学习杨文 陈
 
Spark性能调优分享
Spark性能调优分享Spark性能调优分享
Spark性能调优分享Wenchun Xu
 
How do we manage more than one thousand of Pegasus clusters - backend part
How do we manage more than one thousand of Pegasus clusters - backend partHow do we manage more than one thousand of Pegasus clusters - backend part
How do we manage more than one thousand of Pegasus clusters - backend partacelyc1112009
 
Pegasus: Designing a Distributed Key Value System (Arch summit beijing-2016)
Pegasus: Designing a Distributed Key Value System (Arch summit beijing-2016)Pegasus: Designing a Distributed Key Value System (Arch summit beijing-2016)
Pegasus: Designing a Distributed Key Value System (Arch summit beijing-2016)涛 吴
 
Hacking Nginx at Taobao
Hacking Nginx at TaobaoHacking Nginx at Taobao
Hacking Nginx at TaobaoJoshua Zhu
 
How does Apache Pegasus used in SensorsData
How does Apache Pegasusused in SensorsDataHow does Apache Pegasusused in SensorsData
How does Apache Pegasus used in SensorsDataacelyc1112009
 
基于Symfony框架下的快速企业级应用开发
基于Symfony框架下的快速企业级应用开发基于Symfony框架下的快速企业级应用开发
基于Symfony框架下的快速企业级应用开发mysqlops
 
Spark在苏宁云商的实践及经验分享
Spark在苏宁云商的实践及经验分享Spark在苏宁云商的实践及经验分享
Spark在苏宁云商的实践及经验分享alipay
 
Java线上应用问题排查方法和工具(空望)
Java线上应用问题排查方法和工具(空望)Java线上应用问题排查方法和工具(空望)
Java线上应用问题排查方法和工具(空望)ykdsg
 
配置Oracle 10g 双向流复制
配置Oracle 10g 双向流复制配置Oracle 10g 双向流复制
配置Oracle 10g 双向流复制maclean liu
 
The Construction and Practice of Apache Pegasus in Offline and Online Scenari...
The Construction and Practice of Apache Pegasus in Offline and Online Scenari...The Construction and Practice of Apache Pegasus in Offline and Online Scenari...
The Construction and Practice of Apache Pegasus in Offline and Online Scenari...acelyc1112009
 
Lamp高性能设计
Lamp高性能设计Lamp高性能设计
Lamp高性能设计锐 张
 
Apache Kylin Data Summit 2019: Kyligence Presentation
Apache Kylin Data Summit 2019: Kyligence PresentationApache Kylin Data Summit 2019: Kyligence Presentation
Apache Kylin Data Summit 2019: Kyligence PresentationTyler Wishnoff
 
Exadata那点事
Exadata那点事Exadata那点事
Exadata那点事freezr
 
Golang 高性能实战
Golang 高性能实战Golang 高性能实战
Golang 高性能实战rfyiamcool
 
探索 ISTIO 新型 DATA PLANE 架構 AMBIENT MESH - GOLANG TAIWAN GATHERING #77 X CNTUG
探索 ISTIO 新型 DATA PLANE 架構 AMBIENT MESH - GOLANG TAIWAN GATHERING #77 X CNTUG探索 ISTIO 新型 DATA PLANE 架構 AMBIENT MESH - GOLANG TAIWAN GATHERING #77 X CNTUG
探索 ISTIO 新型 DATA PLANE 架構 AMBIENT MESH - GOLANG TAIWAN GATHERING #77 X CNTUGYingSiang Geng
 

Similar to SMACK Dev Experience (20)

Kafka cluster best practices
Kafka cluster best practicesKafka cluster best practices
Kafka cluster best practices
 
Kmeans in-hadoop
Kmeans in-hadoopKmeans in-hadoop
Kmeans in-hadoop
 
ELK 交流学习
ELK 交流学习ELK 交流学习
ELK 交流学习
 
Spark性能调优分享
Spark性能调优分享Spark性能调优分享
Spark性能调优分享
 
How do we manage more than one thousand of Pegasus clusters - backend part
How do we manage more than one thousand of Pegasus clusters - backend partHow do we manage more than one thousand of Pegasus clusters - backend part
How do we manage more than one thousand of Pegasus clusters - backend part
 
Pegasus: Designing a Distributed Key Value System (Arch summit beijing-2016)
Pegasus: Designing a Distributed Key Value System (Arch summit beijing-2016)Pegasus: Designing a Distributed Key Value System (Arch summit beijing-2016)
Pegasus: Designing a Distributed Key Value System (Arch summit beijing-2016)
 
Hacking Nginx at Taobao
Hacking Nginx at TaobaoHacking Nginx at Taobao
Hacking Nginx at Taobao
 
Cdc@ganji.com
Cdc@ganji.comCdc@ganji.com
Cdc@ganji.com
 
Kafka in Depth
Kafka in DepthKafka in Depth
Kafka in Depth
 
How does Apache Pegasus used in SensorsData
How does Apache Pegasusused in SensorsDataHow does Apache Pegasusused in SensorsData
How does Apache Pegasus used in SensorsData
 
基于Symfony框架下的快速企业级应用开发
基于Symfony框架下的快速企业级应用开发基于Symfony框架下的快速企业级应用开发
基于Symfony框架下的快速企业级应用开发
 
Spark在苏宁云商的实践及经验分享
Spark在苏宁云商的实践及经验分享Spark在苏宁云商的实践及经验分享
Spark在苏宁云商的实践及经验分享
 
Java线上应用问题排查方法和工具(空望)
Java线上应用问题排查方法和工具(空望)Java线上应用问题排查方法和工具(空望)
Java线上应用问题排查方法和工具(空望)
 
配置Oracle 10g 双向流复制
配置Oracle 10g 双向流复制配置Oracle 10g 双向流复制
配置Oracle 10g 双向流复制
 
The Construction and Practice of Apache Pegasus in Offline and Online Scenari...
The Construction and Practice of Apache Pegasus in Offline and Online Scenari...The Construction and Practice of Apache Pegasus in Offline and Online Scenari...
The Construction and Practice of Apache Pegasus in Offline and Online Scenari...
 
Lamp高性能设计
Lamp高性能设计Lamp高性能设计
Lamp高性能设计
 
Apache Kylin Data Summit 2019: Kyligence Presentation
Apache Kylin Data Summit 2019: Kyligence PresentationApache Kylin Data Summit 2019: Kyligence Presentation
Apache Kylin Data Summit 2019: Kyligence Presentation
 
Exadata那点事
Exadata那点事Exadata那点事
Exadata那点事
 
Golang 高性能实战
Golang 高性能实战Golang 高性能实战
Golang 高性能实战
 
探索 ISTIO 新型 DATA PLANE 架構 AMBIENT MESH - GOLANG TAIWAN GATHERING #77 X CNTUG
探索 ISTIO 新型 DATA PLANE 架構 AMBIENT MESH - GOLANG TAIWAN GATHERING #77 X CNTUG探索 ISTIO 新型 DATA PLANE 架構 AMBIENT MESH - GOLANG TAIWAN GATHERING #77 X CNTUG
探索 ISTIO 新型 DATA PLANE 架構 AMBIENT MESH - GOLANG TAIWAN GATHERING #77 X CNTUG
 

SMACK Dev Experience