Scalding - Hadoop Word Count in less than 70 lines of code

Twitter Scalding is built on top of Cascading, which is in turn built on top of Hadoop. In essence, it is a very readable and extensible DSL for writing MapReduce jobs.
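As a preview of where the deck ends up, here is a minimal sketch of such a job (the full, annotated version appears in the transcript below; the split regex is illustrative, the deck itself leaves tokenization to a helper):

    import com.twitter.scalding._

    class WordCountJob(args: Args) extends Job(args) {
      TextLine(args("input"))
        .flatMap('line -> 'word) { line: String => line.split("\\s+") }
        .groupBy('word) { _.size }
        .write(Tsv(args("output")))
    }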

Transcript

  • 1. Scalding: Hadoop Word Count in < 70 lines of code. Konrad "ktoso" Malawski, JARCamp #3, 12.04.2013
  • 2. Scalding: Hadoop Word Count in 4 lines of code. Konrad "ktoso" Malawski, JARCamp #3, 12.04.2013
  • 3. softwaremill.com / java.pl / sckrk.com / geecon.org / krakowscala.pl / gdgkrakow.pl
  • 4-13. Agenda: Why Scalding? (10%) + Hadoop Basics (20%) + Enter Cascading (40%) + Hello Scalding (30%) = 100%
  • 14. Why Scalding? Word Count in Types:

    type Word = String
    type Count = Int

    String => Map[Word, Count]
  • 15-23. Why Scalding? Word Count in Scala:

    val text = "a a a b b"

    def wordCount(text: String): Map[Word, Count] =
      text
        .split(" ")
        .map(a => (a, 1))
        .groupBy(_._1)
        .map { a => a._1 -> a._2.map(_._2).sum }

    wordCount(text) should equal (Map("a" -> 3, "b" -> 2))
  • 24-29. Stuff > Memory. Scala collections: fun, but memory bound! Every step below materializes its result in memory:

    val text = "so many words... waaah! ..."  // in memory
    text                                       // in memory
      .split(" ")                              // in memory
      .map(a => (a, 1))                        // in memory
      .groupBy(_._1)                           // in memory
      .map(a => (a._1, a._2.map(_._2).sum))    // in memory
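To make the slide's point concrete: even a lazy, streaming rewrite only keeps the corpus itself off the heap; the counts map still grows with the number of distinct words, and a single machine still does all the work. A minimal sketch (the file path is hypothetical):

    import scala.io.Source

    // getLines() streams the file; flatMap stays lazy on the Iterator
    val counts = Source.fromFile("huge.txt").getLines()
      .flatMap(_.split(" "))
      .foldLeft(Map.empty[String, Int]) { (acc, w) =>
        acc.updated(w, acc.getOrElse(w, 0) + 1)
      }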
  • 30. Apache Hadoop (HDFS + MR) http://hadoop.apache.org/
  • 31-32. Why Scalding? Word Count in Hadoop MR:

    package org.myorg;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.*;

    import java.io.IOException;
    import java.util.Iterator;
    import java.util.StringTokenizer;

    public class WordCount {

      public static class Map extends MapReduceBase
          implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> output,
                        Reporter reporter) throws IOException {
          String line = value.toString();
          StringTokenizer tokenizer = new StringTokenizer(line);
          while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            output.collect(word, one);
          }
        }
      }

      public static class Reduce extends MapReduceBase
          implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> output,
                           Reporter reporter) throws IOException {
          int sum = 0;
          while (values.hasNext()) {
            sum += values.next().get();
          }
          output.collect(key, new IntWritable(sum));
        }
      }

      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
      }
    }
  • 33. Trivia: how old is Hadoop? (It dates back to 2006, when it was spun out of Apache Nutch.)
  • 34-35. Cascading: www.cascading.org/
  • 36-38. Cascading is: Taps & Pipes & Sinks
  • 39-47. 1: Distributed Copy

    // source Tap
    Tap inTap = new Hfs(new TextDelimited(true, "\t"), inPath);

    // sink Tap
    Tap outTap = new Hfs(new TextDelimited(true, "\t"), outPath);

    // a Pipe connects taps
    Pipe copyPipe = new Pipe("copy");

    // build the Flow
    FlowDef flowDef = FlowDef.flowDef()
      .addSource(copyPipe, inTap)
      .addTailSink(copyPipe, outTap);

    // run!
    flowConnector.connect(flowDef).complete();
  • 48-53. 1: DCP - Full Code

    public class Main {
      public static void main(String[] args) {
        String inPath = args[0];
        String outPath = args[1];

        Properties props = new Properties();
        AppProps.setApplicationJarClass(props, Main.class);
        HadoopFlowConnector flowConnector = new HadoopFlowConnector(props);

        Tap inTap = new Hfs(new TextDelimited(true, "\t"), inPath);
        Tap outTap = new Hfs(new TextDelimited(true, "\t"), outPath);

        Pipe copyPipe = new Pipe("copy");

        FlowDef flowDef = FlowDef.flowDef()
          .addSource(copyPipe, inTap)
          .addTailSink(copyPipe, outTap);

        flowConnector.connect(flowDef).complete();
      }
    }
  • 54-66. 2: Word Count

    String docPath = args[0];
    String wcPath = args[1];

    Properties properties = new Properties();
    AppProps.setApplicationJarClass(properties, Main.class);
    HadoopFlowConnector flowConnector = new HadoopFlowConnector(properties);

    // create source and sink taps
    Tap docTap = new Hfs(new TextDelimited(true, "\t"), docPath);
    Tap wcTap = new Hfs(new TextDelimited(true, "\t"), wcPath);

    // specify a regex operation to split the "document" text lines into a token stream
    Fields token = new Fields("token");
    Fields text = new Fields("text");
    RegexSplitGenerator splitter = new RegexSplitGenerator(token, "[ \\[\\](),.]");
    // only returns "token"
    Pipe docPipe = new Each("token", text, splitter, Fields.RESULTS);

    // determine the word counts
    Pipe wcPipe = new Pipe("wc", docPipe);
    wcPipe = new GroupBy(wcPipe, token);
    wcPipe = new Every(wcPipe, Fields.ALL, new Count(), Fields.ALL);

    // connect the taps, pipes, etc., into a flow
    FlowDef flowDef = FlowDef.flowDef()
      .setName("wc")
      .addSource(docPipe, docTap)
      .addTailSink(wcPipe, wcTap);

    // write a DOT file and run the flow
    Flow wcFlow = flowConnector.connect(flowDef);
    wcFlow.writeDOT("dot/wc.dot");
    wcFlow.complete();
  • 67-69. 2: Word Count - how it's made: the connector compiles the FlowDef into a graph representation of jobs! See http://www.cascading.org/2012/07/09/cascading-for-the-impatient-part-2/
  • 70-76. How it's made:

    val flow = FlowDef
    // pseudo code...
    val jobs: List[MRJob] = flowConnector(flow)
    // pseudo code...
    HadoopCluster.execute(jobs)
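In real Cascading code (as in the earlier slides, here called from Scala), the same steps look roughly like this sketch; flowConnector and flowDef are the objects built above:

    // the planner turns the declarative FlowDef into a DAG of MapReduce jobs
    val flow = flowConnector.connect(flowDef)
    // completing the flow submits those jobs to the cluster and awaits them
    flow.complete()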
  • 77-79. Cascading tips

    Pipe assembly = new Pipe("assembly");
    assembly = new Each(assembly, DebugLevel.VERBOSE, new Debug());
    // ...

    // head and tail have the same name
    FlowDef flowDef = new FlowDef()
      .setName("debug")
      .addSource("assembly", source)
      .addSink("assembly", sink)
      .addTail(assembly);

    flowDef.setDebugLevel(DebugLevel.NONE);
    // with DebugLevel.NONE the flowConnector will NOT create the Debug pipe!
  • 80. Scalding = Scala + Cascading. Twitter Scalding: github.com/twitter/scalding
  • 81. Scalding API
  • 82-92. map

    Scala:

    val data = 1 :: 2 :: 3 :: Nil
    val doubled = data map { _ * 2 } // Int => Int

    Scalding:

    IterableSource(data, 'number)
      .map('number -> 'doubled) { n: Int => n * 2 } // Int => Int
    // 'number stays in the Pipe, 'doubled becomes available in the Pipe;
    // you must choose the type of n explicitly!
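The "must choose type!" note is there because the fields-based Pipe carries no compile-time types, so Scala cannot infer the lambda's argument type. A small sketch of the contrast:

    IterableSource(List(1, 2, 3), 'number)
      .map('number -> 'doubled) { n: Int => n * 2 } // compiles: type given
    // .map('number -> 'doubled) { _ * 2 }          // would not compile: type of _ unknown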
  • 93-104. mapTo

    Scala:

    var data = 1 :: 2 :: 3 :: Nil
    val doubled = data map { _ * 2 } // Int => Int
    data = null // release the reference

    Scalding:

    IterableSource(data, 'number)
      .mapTo('number -> 'doubled) { n: Int => n * 2 } // Int => Int
    // 'number is removed, only 'doubled stays in the Pipe
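One way to see mapTo, using the discard operation from the API list later in the deck (a sketch, not from the slides): it behaves like a map followed by discarding the input field:

    IterableSource(List(1, 2, 3), 'number)
      .map('number -> 'doubled) { n: Int => n * 2 }
      .discard('number)
    // leaves the same single 'doubled field in the Pipe as the mapTo above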
  • 105-116. flatMap

    Scala:

    val data = "1" :: "2,2" :: "3,3,3" :: Nil // List[String]

    val numbers = data flatMap { line => // String
      line.split(",") // Array[String]
    } map { _.toInt } // List[Int]

    numbers should equal (List(1, 2, 2, 3, 3, 3))

    Scalding:

    TextLine(data) // like List[String]
      .flatMap('line -> 'word) { line: String => line.split(",") } // like List[String]
      .map('word -> 'number) { w: String => w.toInt } // like List[Int]
    // the second step is a separate pipe operation: the map happens outside the flatMap
  • 117-128. flatMap

    Scala:

    val data = "1" :: "2,2" :: "3,3,3" :: Nil // List[String]

    val numbers = data flatMap { line => // String
      line.split(",").map(_.toInt) // Array[Int]
    }

    numbers should equal (List(1, 2, 2, 3, 3, 3))

    Scalding:

    TextLine(data) // like List[String]
      .flatMap('line -> 'word) { line: String => line.split(",").map(_.toInt) } // like List[Int]
    // here the map happens inside the flatMap function, in plain Scala
  • 129-140. groupBy

    Scala:

    val data = 1 :: 2 :: 30 :: 42 :: Nil // List[Int]
    val groups = data groupBy { _ < 10 }
    groups // Map[Boolean, List[Int]]
    groups(true) should equal (List(1, 2))
    groups(false) should equal (List(30, 42))

    Scalding:

    IterableSource(List(1, 2, 30, 42), 'num)
      .map('num -> 'lessThanTen) { i: Int => i < 10 }
      .groupBy('lessThanTen) { _.size('size) }
    // groups all rows with an equal 'lessThanTen value, then computes each group's size
  • 141-145. groupBy

    Scalding:

    IterableSource(List(1, 2, 30, 42), 'num)
      .map('num -> 'lessThanTen) { i: Int => i < 10 }
      .groupBy('lessThanTen) { _.sum('total) }
    // total = [3, 72]
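These snippets can be exercised locally. A sketch of how one might wrap and test the map example with Scalding's JobTest harness; DoublerJob and the file names are illustrative, and the harness assumes Scalding's test support on the classpath:

    import com.twitter.scalding._

    // illustrative job wrapping the 'map' example above
    class DoublerJob(args: Args) extends Job(args) {
      Tsv(args("input"), 'number)
        .map('number -> 'doubled) { n: Int => n * 2 }
        .write(Tsv(args("output")))
    }

    // in a test suite:
    JobTest(new DoublerJob(_))
      .arg("input", "in.tsv")
      .arg("output", "out.tsv")
      .source(Tsv("in.tsv", 'number), List(Tuple1(1), Tuple1(2), Tuple1(3)))
      .sink[(Int, Int)](Tsv("out.tsv")) { out =>
        assert(out.toList == List((1, 2), (2, 4), (3, 6)))
      }
      .run
      .finish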
  • 146-157. Scalding API: project / discard, map / mapTo, flatMap / flatMapTo, rename, filter, unique, groupBy / groupAll / groupRandom / shuffle, limit, debug, group operations, joins
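A sketch chaining a few of the listed operations on the fields API (field and file names are illustrative):

    Tsv("people.tsv", ('name, 'age, 'city))
      .project('name, 'age)                  // drop every other field
      .rename('age -> 'years)
      .filter('years) { y: Int => y >= 18 }  // keep adults only
      .unique('name, 'years)
      .write(Tsv("adults.tsv"))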
  • 158-161. Distributed Copy in Scalding

    class WordCountJob(args: Args) extends Job(args) {
      val input = Tsv(args("input"))
      val output = Tsv(args("output"))
      input.read.write(output)
    }

    The End.
  • 162-163. Main Class - "Runner"

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.util.ToolRunner
    import com.twitter.scalding

    object ScaldingJobRunner extends App {
      // args comes from the App trait
      ToolRunner.run(new Configuration, new scalding.Tool, args)
    }
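A sketch of how such a runner is typically invoked (jar name and paths are hypothetical; scalding.Tool takes the job class name followed by the job's own arguments):

    // hadoop jar my-assembly.jar ScaldingJobRunner \
    //   WordCountJob --hdfs --input books.txt --output counts.tsv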
  • 164-172. Word Count in Scalding

    class WordCountJob(args: Args) extends Job(args) {
      val inputFile = args("input")
      val outputFile = args("output")

      // the whole job is these 4 lines:
      TextLine(inputFile)
        .flatMap('line -> 'word) { line: String => tokenize(line) }
        .groupBy('word) { _.size }
        .write(Tsv(outputFile))

      def tokenize(text: String): Array[String] = implemented // elided on the slide
    }
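The slide leaves tokenize unimplemented; a plausible stand-in, purely as an assumption:

    // hypothetical tokenize: lowercase, split on non-word characters, drop empties
    def tokenize(text: String): Array[String] =
      text.toLowerCase.split("[^\\w']+").filter(_.nonEmpty)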
  • 173-177. Word Count in Scalding

    run pl.project13.scala.oculus.job.WordCountJob --tool.graph
    => pl.project13.scala.oculus.job.WordCountJob0.dot

    The generated .dot files visualize the planned MAP and REDUCE phases.
  • 178-180. Word Count in Scalding

    TextLine(inputFile)
      .flatMap('line -> 'word) { line: String => tokenize(line) }
      .groupBy('word) { _.size('count) }
      .write(Tsv(outputFile))
  • 181-184. Why Scalding? Hadoop inside, Cascading abstractions, Scala conciseness.
  • 185. Ask Stuff! Dzięki! Thanks! ありがとう! Konrad Malawski @ java.pl, t: ktosopl / g: ktoso / b: blog.project13.pl