Here2Shop DevOps Practice
Hochi Chuang
Who am I
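• Familiar with Java development technologies - client & server
• Automated integration and testing
• Building cloud systems
• Startup experience
• Goal -> build a large-scale system environment, in Taiwan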
About Here2shop
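• Uses Big Data technology to make the relationship between merchants and consumers more mutually beneficial
• Committed to word-of-mouth marketing and a social platform that disrupts tradition
• You don't need to understand e-commerce, because e-commerce understands you
• Official site: https://www.here2shop.com/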
Agenda
• DevOps @ Here2Shop
• Development
• Test
• Deployment
• Monitoring
• EC platform considerations and security
• Lessons learned
DevOps @ Here2Shop
• Java + Spring + Jenkins
• Test (JUnit + selenium)
• Deploy (AWS)
• Monitor (logs + selenium)
• Feedback (human involved)
Development
• Agile Development
sprint — 2 weeks
daily scrum
review & planning
• Role
Scrum master
Product owner
Team
Case: Agile?
• EC - online web application
Prioritise!! - bugs, features, data, etc…
Plan - exceptions… business model, invoices, special specifications, 3rd party API
Flexibility - easy to refactor…
Code quality & style - peer co-working
• redmine ticket
commit subject: “refs|fixes|close #xxx: doing something”
• Code
github flow @ gitlab
• merge request
gitlab + jenkins (gitlab merge request builder)
• deploy to DEV environment
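The commit subject convention above can be checked mechanically, for example from a Git hook or a Jenkins step. A minimal sketch, assuming the three keywords shown on the slide and a numeric ticket id; the function name `parse_commit_subject` is hypothetical and not part of Redmine or GitLab:

```python
import re

# Pattern for subjects such as "fixes #123: doing something".
# The keyword list mirrors the slide; a numeric ticket id is an assumption.
SUBJECT_RE = re.compile(r"^(refs|fixes|close) #(\d+): (.+)$")


def parse_commit_subject(subject):
    """Return (keyword, ticket_id, summary), or None if the subject does not match."""
    match = SUBJECT_RE.match(subject.strip())
    if match is None:
        return None
    keyword, ticket, summary = match.groups()
    return keyword, int(ticket), summary
```

A hook could reject any commit for which this returns None, which keeps ticket references consistent without relying on memory.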
QA vs master
• Every work goes into QA branch first
• DEV machine has the latest code
• Staging machine has the subset of code that passed tests
• master branch is always deploy-able
BUT…
• DEVs be super CAREFUL!!!
merged? (QA or master)
• Complicated issue state
Resolved
Verified
Feedback
• Qualified code?
No code review
Peer comments
GitHub flow @ GitLab…
We Expect
Case: in real world
• open source tools NOT integrated well
a) polling to build periodically…
b) cannot auto-update ticket status…
c) automation not yet ready…
• Keep DEV process in everyone’s mind!!!
• https://about.gitlab.com/2014/09/29/gitlab-flow/
• http://www.15yan.com/story/6yueHxcgD9Z/
Test
Continuous Test
• from: Understanding DevOps part 4
• DEV
deploy by each merge request
JUnit passed + BVT (build verification test)
• Daily automation
Jenkins + Selenium plugin (browse, login, logout, update product, search, purchase, etc…)
• Acceptance Test on Staging - accessible from outside
done by non-R&D team members
features work as designed
data validation
3rd party API integration - payment processing, SMS
social media integration - Facebook, LINE, etc…
• Production
selenium - per hour
availability detector - uptimebutler.com, webmon.com
change detector
site links validation - Xenu
Visual Studio load test
Xenu
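Site links validation starts with collecting every link on a page. The sketch below shows only that first step, using the Python standard library; it is not how Xenu works internally, and the names `LinkCollector` and `collect_links` are hypothetical. Each collected URL would then be requested and its status checked.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from the href/src attributes of a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def collect_links(base_url, html):
    """Return all link targets found in the given HTML, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```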
Case: something to know…
• wrong CSS layout - Sikuli
• Site speed tester
Google PageSpeed Insights
GTmetrix — https://gtmetrix.com/
• Google Webmaster Tools
Structured Data, Data Highlighter, HTML Improvements
Deployment
to AWS
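first approach - manual
[diagram: jars, bastion, Web Server 1, Web Server 2]
# copy the build directory to the bastion host
scp -r v001_20151203 bastion:~/
# then copy it on to each web server
scp -r v001_20151203 172.1.0.xxx:~/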
Painful and error-prone
need to CHANGE!!
• static resources
CDN, so resources need versioning!!
e.g. //cdn1r.here2shop.com/00396/css/default.css
• app server retrieves the latest build by itself
Jenkins S3 plugin + script
• HA without downtime
AWS API + script
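The versioned CDN path shown above can be generated at build time so that each release gets fresh URLs and stale cached copies are never served. A minimal sketch, assuming the `00396` segment is a zero-padded build number (the slide does not say what it encodes); `versioned_url` is a hypothetical helper:

```python
def versioned_url(cdn_host, build_number, path):
    """Build a protocol-relative CDN URL prefixed with a zero-padded build number."""
    # Assumption: the version segment is the build number padded to five digits.
    return "//{}/{:05d}/{}".format(cdn_host, build_number, path.lstrip("/"))


if __name__ == "__main__":
    print(versioned_url("cdn1r.here2shop.com", 396, "css/default.css"))
```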
AWS CLI
• HA of ELB
# update service: take the instance out of rotation, update it, put it back
aws autoscaling enter-standby --instance-ids i-dadfc329 --auto-scaling-group-name prod-asg --should-decrement-desired-capacity
aws autoscaling exit-standby --instance-ids i-dadfc329 --auto-scaling-group-name prod-asg
# check the instance state
aws autoscaling describe-auto-scaling-instances --instance-ids i-dadfc329
# create a new instance and attach it to the auto scaling group
ec2-run-instances ami-xxxxxxxx -t m3.medium -s subnet-xxxxxxxx -k prod-key -g sg-xxxxxxxx --associate-public-ip-address true
aws autoscaling attach-instances --instance-ids i-109228e5 --auto-scaling-group-name prod-asg
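The standby commands above lend themselves to a scripted rolling update: one instance at a time leaves the group, gets updated, and returns. The sketch below only builds the command list and executes nothing. `rolling_update_plan` and `./update-service.sh` are hypothetical names; the AWS CLI subcommands and flags are the ones on the slide.

```python
import shlex


def rolling_update_plan(instance_ids, asg_name, update_command):
    """Return the commands (as argv lists) for updating instances one by one."""
    plan = []
    for instance_id in instance_ids:
        target = ["--instance-ids", instance_id,
                  "--auto-scaling-group-name", asg_name]
        # 1. Take the instance out of service.
        plan.append(["aws", "autoscaling", "enter-standby", *target,
                     "--should-decrement-desired-capacity"])
        # 2. Run the (hypothetical) update step against that instance.
        plan.append(list(update_command) + [instance_id])
        # 3. Put the instance back in service.
        plan.append(["aws", "autoscaling", "exit-standby", *target])
    return plan


if __name__ == "__main__":
    for argv in rolling_update_plan(["i-dadfc329"], "prod-asg", ["./update-service.sh"]):
        print(shlex.join(argv))
```

Printing the plan first makes it easy to review before handing each argv list to `subprocess.run`.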
semi-auto
[diagram: Jenkins, S3, jars, bastion, Web Server 1, Web Server 2]
# get latest jars from S3 bucket
java -jar latest-build.jar
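For the app server to fetch the latest build by itself, it has to decide which build is the newest. A minimal sketch, assuming build names follow the `v001_20151203` pattern from the earlier slide and that it means `v<build number>_<yyyymmdd>` (the slides do not define it); `latest_build` is a hypothetical helper and the list of names would come from an S3 listing:

```python
import re

# Assumed naming scheme: v<build number>_<yyyymmdd>, e.g. v001_20151203.
BUILD_RE = re.compile(r"^v(\d+)_(\d{8})$")


def latest_build(names):
    """Return the newest build name (by date, then build number), or None."""
    candidates = []
    for name in names:
        match = BUILD_RE.match(name)
        if match:
            build, date = match.groups()
            candidates.append((date, int(build), name))
    return max(candidates)[2] if candidates else None
```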
Next goals
• pack static resources and separate from service jar
• one click to deploy
make 10+ deploys per day!!
• integrate with Hubot + slack
• rollback mechanism
challenge with Hibernate ORM
Monitoring
• CloudWatch -> alert notification
• still in the stone age -> login, tail, vi, find & watch…
• lots of actions that need a human
Tools
• Papertrail / Fluentd
• Nagios
We Hope…
Lessons learned
Case I: Spring Boot
• Spring Boot is great for microservices, but for a large project…
• pro
‣ convention over configuration
‣ standalone jar
• con
‣ running in Eclipse & running the standalone jar do NOT behave the same
‣ hard to replace a single static file…
Case II: Security Issue
• Redirect security concerns
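nginx -> origin, md5 checksum by Lua
location ~ ^/(ad|edm)/(.*) {
    # only allow requests with no referer or a referer from our own sites
    valid_referers none blocked server_names *.here2shop.com;
    if ($invalid_referer) {
        return 403;
    }
    rewrite_by_lua "
        -- shared secret; must match the key used to sign the links
        local HASH_KEY = 'secret_pass';
        local arg_r = ngx.var['arg_r'];
        local arg_checksum = ngx.var['arg_m'];
        -- reject requests that lack the redirect URL or its checksum
        if not arg_r or not arg_checksum then
            return ngx.exit(403);
        end
        local redirect_url = ngx.unescape_uri(arg_r);
        local redirect_url_checksum = ngx.md5(redirect_url..HASH_KEY);
        if redirect_url_checksum == arg_checksum then
            return ngx.redirect(redirect_url, 302);
        else
            return ngx.exit(403);
        end
    ";
}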
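The nginx rule above only verifies links; something on the application side has to produce them. A minimal sketch of that signing step, assuming the `r` and `m` query parameters and the md5(url + key) scheme from the config; `signed_redirect_path` is a hypothetical helper:

```python
import hashlib
from urllib.parse import quote


def signed_redirect_path(prefix, redirect_url, hash_key):
    """Return a path such as /ad/?r=<escaped url>&m=<md5 hex of url + key>."""
    # Same input as the Lua side: the unescaped URL followed by the shared key.
    checksum = hashlib.md5((redirect_url + hash_key).encode("utf-8")).hexdigest()
    return "/{}/?r={}&m={}".format(prefix, quote(redirect_url, safe=""), checksum)
```

The key passed here must be the same value as `HASH_KEY` in the nginx config, otherwise every redirect is answered with 403.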
Case III: more Security
• expose iframe
all sites: X-Frame-Options: SAMEORIGIN
specific site: Content-Security-Policy: frame-ancestors http://example.com
Case IV: Facebook
• Facebook doesn't like the CloudFront domain…
d8adrk2lu91bp.cloudfront.net -> treated as a malicious domain
use a custom domain instead: cdn1r.here2shop.com
Case V: caching
• 10k transactions in 16 hours
• concurrent: ~500
• requests hanging on a single table -> moved to Redis
• transaction: from 5 min to 10 seconds
Thank You
https://www.here2shop.com
mail to: hochi.chuang@here2shop.com
Q&A

2015 jcconf-h2s-devops-practice
