[246] Foursquare Data Lifecycle, 설현준

DeView 2016
Data Lifecycle @ Foursquare

@pabpbapapb
설현준, Data Infrastructure Engineer

Agenda
● What is Foursquare?
● Foursquare Codebase
● Data Lifecycle
○ Aggregation
○ Processing
○ Serving
○ Storing
● Monitoring

What is Foursquare?
#Foursquare
#Swarm
#Enterprise
#Pilgrim
#API

Foursquare Codebase
#scala
#python
#monorepo
#pants

Architecture
Foursquare's server code is written in Scala.

Architecture
Foursquare's internal tools are written in Python.

Architecture
Foursquare's database is MongoDB.

Architecture
Foursquare's type system is Thrift.

Architecture
● Scala build time?
● Scala difficulty?
● Python + Scala?
● Thrift with Scala?
● Mongo <-> Thrift?

Architecture
Foursquare's build tool is Pants.

Open Source
● Thrift <-> Scala <-> Mongo
○ Spindle
● Multi-language codebase
○ Pants
● SOA-based Monorepo
○ Works well with Pants!
○ fsq.io

Open Source
https://github.com/foursquare/fsqio

Themes
1. Save money
2. Solve problems
3. Make everything work together
4. Iron Triangle

Data Lifecycle
#aggregation
#processing
#serving
#storing

Aggregation
● Application transactions
○ MongoDB
● Server-side logging
○ Thrift / JSON
● 3rd party data
○ S3

Aggregation - Mongo
Old method
● Take an LVM snapshot
● Upload the snapshot to HDFS
○ Tar the data files and upload them to HDFS
● MongoDump to sequence files
○ Download and untar the snapshot, then start a mongod process
○ Scan all records and write them out to BSON sequence files in HDFS
○ One-time conversion of BSON → Thrift sequence files

Aggregation - Mongo
New method
● A log tailer copies Mongo transactions
● Upload to HDFS
● Merge the diff into yesterday's data to produce today's data
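
The merge step of the new method can be sketched as follows. This is a minimal illustration, not Foursquare's job: the transaction layout (`op`, `_id`, `doc`) and the function name `apply_transactions` are hypothetical simplifications, and the real work would run as a Hadoop job over HDFS files rather than in-memory dicts.

```python
from typing import Any, Dict, Iterable

Snapshot = Dict[str, Dict[str, Any]]  # _id -> document


def apply_transactions(yesterday: Snapshot, transactions: Iterable[dict]) -> Snapshot:
    """Build today's snapshot from yesterday's snapshot plus tailed transactions.

    Transactions must arrive in the order they were applied to the database.
    The record layout (op / _id / doc) is a hypothetical simplification.
    """
    today = dict(yesterday)  # shallow copy; yesterday's snapshot is left untouched
    for tx in transactions:
        op, key = tx["op"], tx["_id"]
        if op in ("i", "u"):   # insert, or update carrying the full document
            today[key] = tx["doc"]
        elif op == "d":        # delete
            today.pop(key, None)
        else:
            raise ValueError(f"unknown op: {op!r}")
    return today


yesterday = {"v1": {"name": "Cafe"}, "v2": {"name": "Bar"}}
transactions = [
    {"op": "u", "_id": "v1", "doc": {"name": "Cafe Nero"}},
    {"op": "d", "_id": "v2"},
    {"op": "i", "_id": "v3", "doc": {"name": "Bakery"}},
]
today = apply_transactions(yesterday, transactions)
```

Compared with the old method, only the day's transactions have to be read; the full collection is never re-scanned from a database snapshot.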

Aggregation - Mongo
Incremental Approach

Aggregation - Thrift
● Server code logs Thrift events to Event Logger
● Log ingesting MapReduce job writes to HDFS

Aggregation - Thrift
In the future, we plan to use Kafka 10 and Gobblin.

S3
● Used for Pinpoint
● 3rd-party data is delivered via S3
● Foursquare algorithms are applied to the external data

S3
In the future, we plan to use EMR.

Luigi
Open source project from Spotify

Luigi
Luigi is a Python package that helps build complex pipelines and handles all the plumbing typically associated with long-running batch jobs.
It handles:
● dependency resolution
● workflow management
● visualization
● failure handling
● command line integration
● and much more...

Luigi
● Easy to define dependencies between tasks
● Idempotency
● Can script work other than Hadoop jobs
● Scalding support
● Slack + Email alerts
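
The first two points, dependency definition and idempotency, can be illustrated with a small stand-alone sketch. It does not import Luigi; the method names mirror Luigi's `requires`/`output`/`run`/`complete`, but the `Task` base class, the `build` function, and the two example tasks are hypothetical and written only for this illustration.

```python
import os
import tempfile


class Task:
    """Hypothetical base class: a task is complete when its output file exists."""

    def __init__(self, workdir: str):
        self.workdir = workdir

    def requires(self):
        return []

    def output(self) -> str:
        raise NotImplementedError

    def run(self) -> None:
        raise NotImplementedError

    def complete(self) -> bool:
        return os.path.exists(self.output())


def build(task: Task, log: list) -> None:
    """Run dependencies depth-first, skipping tasks whose output already exists."""
    if task.complete():
        return  # idempotency: finished work is never redone
    for dep in task.requires():
        build(dep, log)
    task.run()
    log.append(type(task).__name__)


class ExtractLogs(Task):
    def output(self):
        return os.path.join(self.workdir, "logs.txt")

    def run(self):
        with open(self.output(), "w") as f:
            f.write("checkin\ncheckin\nlike\n")


class CountEvents(Task):
    def requires(self):
        return [ExtractLogs(self.workdir)]

    def output(self):
        return os.path.join(self.workdir, "counts.txt")

    def run(self):
        with open(self.requires()[0].output()) as f:
            n = sum(1 for _ in f)
        with open(self.output(), "w") as f:
            f.write(str(n))


workdir = tempfile.mkdtemp()
first_run, second_run = [], []
build(CountEvents(workdir), first_run)   # runs ExtractLogs, then CountEvents
build(CountEvents(workdir), second_run)  # nothing to do: outputs already exist
```

Because completeness is defined by the presence of the output, re-running a failed pipeline only repeats the steps that did not finish.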

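Scalding
• MapReduce
○ The Map / Reduce model is not a good fit for every kind of data processing
○ Joins are very complex to implement
• Cascading
○ A Java wrapper that lets you write data flows instead of raw MapReduce
○ You define the data flow, and the execution engine translates it into MapReduce jobs
○ Suffers from Java's characteristic verbosity
• Scalding
○ A Scala library built on top of Cascading
○ Data processing is expressed in a functional programming style
○ Code is concise and easy to maintain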

Scalding
Data flow framework from Twitter
Open sourced!

Scalding
• Data flow frameworks allow data processing jobs to be expressed as a series of operations on streams of data.
• Pros
○ Composable - Share series of operations between jobs.
○ Simplifies - Complex joins are much easier to write.
○ Brevity - Faster iteration
○ Functional programming style is great for writing data flows.
• Cons
○ Adds complexity.
○ Debugging may require looking behind the framework's "magic."
○ Impacts performance
■ Time
■ Memory pressure
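
The "composable" point can be shown without any framework. The sketch below uses plain Python generators rather than Scalding or Cascading; `compose`, `tokenize`, `drop_short`, and the two example jobs are hypothetical names chosen for this illustration.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator

Op = Callable[[Iterable], Iterator]


def compose(*ops: Op) -> Op:
    """Chain stream operations left to right into a single operation."""
    def pipeline(stream: Iterable) -> Iterator:
        for op in ops:
            stream = op(stream)
        return iter(stream)
    return pipeline


def tokenize(lines: Iterable[str]) -> Iterator[str]:
    for line in lines:
        yield from line.lower().split()


def drop_short(words: Iterable[str]) -> Iterator[str]:
    return (w for w in words if len(w) > 2)


# A shared series of operations, reused by both jobs below.
clean_words = compose(tokenize, drop_short)


def word_count(lines: Iterable[str]) -> Counter:
    return Counter(clean_words(lines))


def distinct_words(lines: Iterable[str]) -> set:
    return set(clean_words(lines))


lines = ["Swarm is a check-in app", "Foursquare is a city guide app"]
counts = word_count(lines)
vocabulary = distinct_words(lines)
```

Both jobs share `clean_words`, so a change to tokenization is made once. In a real data flow framework the same structure is additionally planned and distributed by the engine.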

Scalding
Example: Word Count (Hadoop MapReduce, Java)
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in an input line.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer (also used as combiner): sums the counts for each word.
  public static class IntSumReducer
      extends Reducer<Text,IntWritable,Text,IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  // Driver: wires mapper, combiner, and reducer into a job and runs it.
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Scalding
Example: Word Count (Cascading, Java)
// Source: read the input as lines of text.
Scheme sourceScheme = new TextLine( new Fields( "line" ) );
Tap source = new Hfs( sourceScheme, inputPath );

// Sink: write (word, count) pairs, replacing any existing output.
Scheme sinkScheme = new TextLine( new Fields( "word", "count" ) );
Tap sink = new Hfs( sinkScheme, outputPath, SinkMode.REPLACE );

// Pipe assembly: split each line into words, group by word, count each group.
Pipe assembly = new Pipe( "wordcount" );
String regex = "(?<!\\pL)(?=\\pL)[^ ]*(?<=\\pL)(?!\\pL)";
Function function = new RegexGenerator( new Fields( "word" ), regex );
assembly = new Each( assembly, new Fields( "line" ), function );
assembly = new GroupBy( assembly, new Fields( "word" ) );
Aggregator count = new Count( new Fields( "count" ) );
assembly = new Every( assembly, count );

// Plan the flow and run it.
Properties properties = new Properties();
FlowConnector.setApplicationJarClass( properties, Main.class );
FlowConnector flowConnector = new FlowConnector( properties );
Flow flow = flowConnector.connect( "word-count", source, sink, assembly );
flow.complete();

Scalding
Example: Word Count (Scalding, Scala)
package com.twitter.scalding.examples

import com.twitter.scalding._

class WordCountJob(args : Args) extends Job(args) {
  TextLine( args("input") )
    .flatMap('line -> 'word) { line : String => tokenize(line) }
    .groupBy('word) { _.size }
    .write( Tsv( args("output") ) )

  // Lowercase the text, strip punctuation, and split on whitespace.
  def tokenize(text : String) : Array[String] = {
    text.toLowerCase.replaceAll("[^a-zA-Z0-9\\s]", "").split("\\s+")
  }
}

Scalding
● Joins: a single .join() call is all it takes
● Covers everything from simple data processing to distributed matrix computation with Algebird
● The biggest advantage of all: Thrift
○ Type-safe API
● SpindleSequenceFile

Data Service
What kinds of data do we need to serve?

Data Service
● Similar Venues
● User Venue Visit (visit data)
● User Region Aggregation (per-region / per-user statistics)
● and much more...

Data Service
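● Which areas of New York are the most crowded on Fridays?
● I want to open a store in Seoul; which area has the most foot traffic?
● Which gender/age/location group would respond best to a Valentine's Day ad campaign?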

Data Service
Q: Is there a format that can be served from an online service and also used as Hadoop input/output?

HFile
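● Immutable K/V storage format
● Typed using Thrift
● Fast seeks, because the file is sorted when it is written
● Sharding supports high QPS on a single dataset
● Easily managed with a file server built in-house at Foursquare
● Generated as the output of MapReduce or Scalding jobs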
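
The properties listed above (immutable, sorted at write time, sharded) can be sketched in a few lines. This is not the HFile on-disk format or Foursquare's file server; `ImmutableSortedFile`, `ShardedDataset`, and the use of CRC32 for shard assignment are assumptions made for illustration.

```python
import bisect
import zlib
from typing import List, Optional, Tuple


class ImmutableSortedFile:
    """Read-only K/V store; keys are sorted once, at write time."""

    def __init__(self, records: List[Tuple[bytes, bytes]]):
        ordered = sorted(records)
        self._keys = [k for k, _ in ordered]
        self._values = [v for _, v in ordered]

    def get(self, key: bytes) -> Optional[bytes]:
        i = bisect.bisect_left(self._keys, key)  # binary search over sorted keys
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None


def shard_of(key: bytes, num_shards: int) -> int:
    """Stable shard assignment (CRC32 is an illustrative choice)."""
    return zlib.crc32(key) % num_shards


class ShardedDataset:
    """Routes each lookup to the one shard that can hold the key."""

    def __init__(self, records: List[Tuple[bytes, bytes]], num_shards: int):
        buckets = [[] for _ in range(num_shards)]
        for k, v in records:
            buckets[shard_of(k, num_shards)].append((k, v))
        self._shards = [ImmutableSortedFile(b) for b in buckets]

    def get(self, key: bytes) -> Optional[bytes]:
        return self._shards[shard_of(key, len(self._shards))].get(key)


records = [(b"venue:3", b"Bakery"), (b"venue:1", b"Cafe"), (b"venue:2", b"Bar")]
dataset = ShardedDataset(records, num_shards=4)
```

Since the data never changes after it is written, shards can be replicated and served in parallel without coordination, which is what makes high QPS on one dataset practical.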

HFile
Example: Similar Venues
[Diagram: Users (Mongo), User/Visits (HFile), and Venues (Mongo) feed a per-user co-visit computation, which produces Similar Venues (HFile).]
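
A per-user co-visit computation of the kind named in this example might look like the sketch below. The slides do not describe Foursquare's actual similarity algorithm; ranking by raw co-visit count, the function name `similar_venues`, and the sample data are all assumptions for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import Dict, Iterable, List, Tuple


def similar_venues(visits: Iterable[Tuple[str, str]], top_n: int = 3) -> Dict[str, List[str]]:
    """For each venue, rank other venues by how many users visited both."""
    venues_by_user = defaultdict(set)
    for user, venue in visits:
        venues_by_user[user].add(venue)

    co_visits = defaultdict(Counter)
    for venues in venues_by_user.values():
        for a, b in combinations(sorted(venues), 2):
            co_visits[a][b] += 1
            co_visits[b][a] += 1

    # Highest co-visit count first; ties broken alphabetically for determinism.
    return {
        venue: [other for other, _ in
                sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]]
        for venue, counts in co_visits.items()
    }


visits = [
    ("u1", "cafe"), ("u1", "bakery"), ("u1", "bar"),
    ("u2", "cafe"), ("u2", "bakery"),
    ("u3", "cafe"), ("u3", "bar"), ("u3", "gym"),
]
result = similar_venues(visits)
```

In the pipeline from the diagram, the resulting venue-to-venues mapping would be written out as an HFile so the online service can look it up by venue key.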

SQL
Solutions that let non-engineers analyze data easily
● Hive
● Redshift
● Presto

Hive
● Thrift-based records can be queried with SQL
● The most widely used SQL solution at Foursquare so far
● No additional installation needed, since Hadoop is already in use
● Joins are convenient here too

Hive
Engineer -> MapReduce
Analyst -> SQL
Sales, business development -> ???

Hive
Beekeeper
● Templates
● User-defined functions
● Substitutions
● Scheduled tasks
● Search previous tasks

Hive
Stolen from http://www.slideshare.net/bayersven/sven-bayerhivequerycompilerdatageeks

Redshift
● Relational DB from AWS
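● Can be queried without going through Hadoop
● As a columnar DB, it delivers quite good performance if you pay attention to optimization
● Mainly used for experiment result analysis and dashboard computation
● Requires moving data between HDFS and Redshift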

Presto
● Open source SQL engine from Facebook
● Can be queried without going through Hadoop
● In-memory computation
● Really, Really, Really fast
● With the Hive connector, Thrift data on Foursquare's HDFS can be used as-is

Presto
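● Dedicated Presto boxes
○ $$$; hardware sits idle when there are no queries
● Co-location on Hadoop boxes
○ Deployment is tricky, and iterating toward a working deployment process is painfully slow
○ The approach used at Netflix and Facebook
● Yarn
○ Competes for resources with the work Hadoop itself has to do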

Presto
● Presto-Yarn
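○ OSS that lets Yarn deploy and manage Presto through Apache Slider
○ Finding the right configuration is hard (though much easier than deploying by hand), but once configured, deployment and management are very easy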

Presto
Stolen from http://blog.tzolov.net/2014/07/hadoop-yarn-mr2-memory-model.html

Presto
From our experience:
● Fewer "larger" boxes > many "smaller" boxes
● Each worker needs more than 1 vcore
● Container memory <-> JVM memory <-> Presto memory
● Yarn labels can help debug
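
The three memory settings have to nest: Presto's query memory lives inside the JVM heap, and the JVM lives inside the YARN container. A small pre-deployment check could look like this sketch; the function name, the parameter names, and the 512 MB non-heap overhead default are hypothetical, and real values depend on the cluster.

```python
def check_memory_nesting(container_mb: int, jvm_heap_mb: int, presto_query_mb: int,
                         jvm_overhead_mb: int = 512) -> list:
    """Return a list of problems; an empty list means the sizes nest correctly.

    container_mb     YARN container size
    jvm_heap_mb      JVM maximum heap of the Presto worker
    presto_query_mb  per-node memory Presto may use for queries
    jvm_overhead_mb  assumed non-heap headroom (hypothetical default)
    """
    problems = []
    if jvm_heap_mb + jvm_overhead_mb > container_mb:
        problems.append("JVM heap plus overhead exceeds the YARN container")
    if presto_query_mb >= jvm_heap_mb:
        problems.append("Presto query memory must be smaller than the JVM heap")
    return problems


ok = check_memory_nesting(container_mb=16384, jvm_heap_mb=14336, presto_query_mb=8192)
bad = check_memory_nesting(container_mb=8192, jvm_heap_mb=8192, presto_query_mb=8192)
```

Running such a check before each deployment is one way to shorten the configuration search described above.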

Backup
● HDFS space isn’t free
○ Daily HDFS <-> S3 backup
○ Deleted from HDFS after a set period
● S3 also isn’t free
○ Moved to Glacier after a set period
○ Glacier Pricing: $0.007 per GB / month
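
The tiering policy and the quoted Glacier price can be combined into a quick estimate. In this sketch the price constant comes from the slide, while the function names and the day thresholds passed in are hypothetical; the slides only say "after a set period".

```python
GLACIER_USD_PER_GB_MONTH = 0.007  # price quoted on the slide


def monthly_cost_usd(size_gb: float, usd_per_gb_month: float) -> float:
    return size_gb * usd_per_gb_month


def storage_tier(age_days: int, hdfs_days: int, s3_days: int) -> str:
    """Where a dataset lives, given its age.

    The thresholds are parameters because the slides do not state them.
    """
    if age_days < hdfs_days:
        return "hdfs+s3"   # still on HDFS, already backed up to S3
    if age_days < s3_days:
        return "s3"        # deleted from HDFS
    return "glacier"       # transitioned from S3 to Glacier


# 100 TB (102,400 GB) in Glacier comes to roughly 717 USD per month at this price.
cost_100tb = monthly_cost_usd(100 * 1024, GLACIER_USD_PER_GB_MONTH)
```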

Retention
● HFile, Hive tables have retention policy
○ As long as the collection itself does not grow, HDFS usage stays within a fixed bound
● Needed data was deleted because of retention?
○ Re-run the job to regenerate it
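
A retention policy over daily outputs reduces to a date comparison. This is a minimal sketch; the function name `expired_partitions`, the 7-day window, and the example dates are assumptions, not Foursquare's settings.

```python
from datetime import date, timedelta
from typing import Iterable, List


def expired_partitions(partitions: Iterable[date], today: date,
                       retention_days: int) -> List[date]:
    """Daily partitions that fall outside the retention window, oldest first."""
    cutoff = today - timedelta(days=retention_days)
    return sorted(p for p in partitions if p < cutoff)


today = date(2016, 10, 24)
partitions = [today - timedelta(days=n) for n in range(10)]  # last 10 daily outputs
to_delete = expired_partitions(partitions, today, retention_days=7)
```

Deleting derived outputs this way is safe only because the jobs are reproducible: the inputs are kept, so any removed partition can be rebuilt by re-running the job.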

Compression
● Default compression: Snappy
○ Fast read, low compression
● Data that must stay on HDFS but is rarely used as it ages: Gzip
○ Slow read, high compression
● Log backup
○ Snappy -> Gzip, Gzip to S3, replace Snappy with Gzip after n days
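
The age-based codec choice can be written as a one-line policy. In this sketch `cold_after_days` stands in for the slide's "n days", and the function names are hypothetical; only the Gzip side is exercised, using Python's standard `gzip` module, because Snappy is not part of the standard library.

```python
import gzip


def codec_for(age_days: int, cold_after_days: int) -> str:
    """Pick a codec by age: Snappy while data is hot, Gzip once it is cold."""
    return "snappy" if age_days < cold_after_days else "gzip"


def recompress_to_gzip(raw: bytes) -> bytes:
    """Gzip already-decompressed bytes at the highest compression level."""
    return gzip.compress(raw, compresslevel=9)


# Synthetic log lines, used only to exercise the function.
log = b"".join(b"ts=%d event=checkin venue=%d\n" % (i, i % 50) for i in range(2000))
archived = recompress_to_gzip(log)
```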

Monitoring
#timberlake
#inviso
#elasticsearch
#kibana

Hardware Stats
Graphite
● Simple graph metric store/display

Hardware Stats
Statmon - Statistics monitor
● Counters
● Gauges
○ Heap space, CPU, queue length, etc...
● Timing
○ Helps detect bottlenecks
Gmond
● exports hardware stats
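
The three metric types (counters, gauges, timing) differ in how values are recorded. The sketch below is a hypothetical in-process registry written for illustration; it is not Statmon's API, which the slides do not show.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Stats:
    """Hypothetical in-process registry; not Statmon's actual API."""

    def __init__(self):
        self.counters = defaultdict(int)
        self.gauges = {}
        self.timings = defaultdict(list)

    def incr(self, name: str, by: int = 1) -> None:
        self.counters[name] += by      # running total of events

    def gauge(self, name: str, value: float) -> None:
        self.gauges[name] = value      # last observed value wins

    @contextmanager
    def timed(self, name: str):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record elapsed seconds even if the timed block raises.
            self.timings[name].append(time.perf_counter() - start)


stats = Stats()
for _ in range(3):
    stats.incr("requests")
stats.gauge("queue_length", 12)
with stats.timed("db_lookup"):
    sum(range(1000))
```

Collecting a list of durations per name, rather than a single average, is what makes slow outliers and therefore bottlenecks visible.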

Hardware Stats
Useful stats:
● Hadoop
○ CPU usage / role, rack
○ Network stats (HDFS <-> AWS)
● Kafka
○ Bytes In/Bytes Out
○ Producer requests/s, Consumer fetch/s
○ GC time
○ SSD read/write time

Hadoop Stats
Cloudera Manager
● HDFS alerts
○ HDFS Bytes/Blocks read/written
○ RPC Connections
● YARN alerts
○ RM health
○ Jobs that run too long
○ Failing tasks

Timberlake
● Problem: Jobtracker is slow.
● The biggest obstacle to productivity: slow processes
● Run a job -> Jobtracker is slow, so monitoring is a chore -> resources get consumed with nobody watching -> everyone suffers because of one job

Timberlake
● Problem: Jobtracker is uninformative
● Why is my job failing?
● Getting to the logs takes click, click, click...

Timberlake
github.com/stripe/timberlake
● Real-time job tracker written in Golang/React
● Tracks jobs and MR status in real time
● Shows job I/O and workflow, and makes logs easy to reach on failure

Inviso
github.com/netflix/inviso
● The best OSS for Hadoop analysis
● How it works
○ A Python process loops every minute
■ xml/json generated after a job completes or fails -> ES
○ A Tomcat server queries ES and renders the view
● Supports MR1, MR2, EMR, S3

Inviso
● By default, Inviso supports ES 1.0+
● Porting to ES 2+ makes Kibana usable
○ Convert every . to _
○ Timestamp handling
○ Inviso-imported stats != all available stats
○ Add the stats you want, remove the ones you don't need
■ CPU time
■ Pool-based resource usage
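
The first porting step, converting every `.` to `_`, can be applied to a job document before it is indexed. This is a minimal sketch; `sanitize_field_names` and the sample document are hypothetical and not taken from Inviso's code.

```python
from typing import Any


def sanitize_field_names(value: Any) -> Any:
    """Recursively replace "." with "_" in every dict key; values are unchanged."""
    if isinstance(value, dict):
        return {k.replace(".", "_"): sanitize_field_names(v) for k, v in value.items()}
    if isinstance(value, list):
        return [sanitize_field_names(v) for v in value]
    return value


# Sample document shaped like flattened job metadata (illustrative only).
job = {
    "mapreduce.job.name": "similar-venues",
    "counters": [{"cpu.time.ms": 1200}],
}
doc = sanitize_field_names(job)
```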

Dr. Elephant
● OSS from LinkedIn
● Performance Monitoring for Hadoop & Spark
● Uses heuristics to score job performance

Dr. Elephant
● Server eng. writing jobs vs. Data eng. writing jobs
● Dr. E is only good for the former, not the latter

Wrapping Up
● Engineering based on philosophy
● Solve problems
○ It would be better if we solved problems before
they became problems
● Always be monitoring
○ Monitoring isn’t really fun
○ So make it easier/more fun to monitor!
