Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Flink Forward Berlin 2017: Maciek Próchniak - TouK Nussknacker - creating Flink Jobs with GUI

361 views

Published on

Last year we (TouK) introduced Flink in one of the biggest polish telcoms in the domain of real time marketing and fraud detection. One of the most significant problems in adoption was lack of programming skills at our client - the users were supposed to be analytics/business people. Therefore, we developed a custom platform - TouK Nussknacker - which allows users to design processes with GUI by drawing diagrams. Our project is going to be open-sourced soon - this will happen before Flink Forward. We believe it can make stream processing with Flink more accessible in many use cases, especially in companies that don't have their own development teams. During the talk I’m going to describe architecture of our platform, why we made certain design decisions and about our future plans. I’ll also describe our experiences - when being able to use GUI is great and when it’s better to develop jobs as normal code. If time permits I’ll also show a quick demo of our solution.

Published in: Data & Analytics
  • Be the first to comment

  • Be the first to like this

Flink Forward Berlin 2017: Maciek Próchniak - TouK Nussknacker - creating Flink Jobs with GUI

  1. 1. TouK Nussknacker GUI for Flink
  2. 2. Maciek Próchniak (that’s me)
  3. 3. How it all began? POC with Apache Flink Great results :)
  4. 4. Configuration Client not (really) happy
  5. 5. Scala expression for filters Client still not very happy
  6. 6. Ok, we’ll build you a GUI Finally, approving nod...
  7. 7. Main assumptions ● Model & integration in “normal” code ● Expressions accessible for semi-technicals - like SQL ● Facilitate testing and experimentation
  8. 8. Basics
  9. 9. Architecture
  10. 10. Model + process JSON Config JAR JSON JSON
  11. 11. Model
  12. 12. Model ● Data - POJOs, case classes ● Source/Sink - Flink API + ε ● Processor/Enricher - Services
  13. 13. Config creator trait ProcessConfigCreator extends Serializable { def services(config: Config) : Map[String, WithCategories[Service]] def sourceFactories(config: Config): Map[String, WithCategories[SourceFactory[_]]] def sinkFactories(config: Config): Map[String, WithCategories[SinkFactory]] ... }
  14. 14. Service/Enricher @MethodToInvoke def invoke(@ParamName("clientId") clientId: String) (implicit ec: ExecutionContext): Future[Client] = { ... }
  15. 15. Business rules - expressions ● Filters ● Aggregation definitions ● Actions - mail subject, sms content How to define them...?
  16. 16. Expressions Spring Expression Language! ● Accessible enough ● More or less fast enough ● Ability to do basic validation/code-completion
  17. 17. Limitations of SPEL ● Speed ● Synchronous ● Type safety??
  18. 18. Advanced?
  19. 19. What about... ● Windows? ● State? ● Joins? ● Other Flink goodies?
  20. 20. Knowing where to stop Completness Difficulty
  21. 21. Knowing where to stop
  22. 22. Custom stream transformer - POFO @MethodToInvoke(returnType = classOf[EventCount]) def execute(@ParamName("key") key: LazyInterpreter[String], @ParamName("length") length: String) = FlinkCustomStreamTransformation((start: DataStream[InterpretationResult]) => { start.keyBy(key.syncInterpretationFunction) .transform("eventsCounter", new CounterFunction(lengthInMillis)) })
  23. 23. Tests/Ops
  24. 24. “Convincing” to test We’ll do it “our” way: first we run on production, then deploy to test
  25. 25. Tests - running
  26. 26. Tests - running
  27. 27. Generate test data
  28. 28. Taking control ● Do we process data? ● What is latency? ● Where do we filter out events? ● Are there errors?
  29. 29. Taking control
  30. 30. DEMO
  31. 31. Where can I use it? ● Telco ● Banking ● Media ● RTM ● Fraud detection
  32. 32. Adhoc jobs vs process Ad-hoc Query SQL RBEA CI CD Process Monitoring SAS Scala Testing
  33. 33. Does it work in production? ● 1 year in production ● One of largest Polish telco’s ● RTM + Fraud detection
  34. 34. Does it work in production?
  35. 35. Road ahead? ● Asynchronous services ● Pluggable security ● Define model via GUI/Schema registry
  36. 36. Road ahead? ● Integration with CEP and SQL ● DAGs instead of Trees in UI? ● Governance? ● Nussknacker as a Service?
  37. 37. Try it out! https://github.com/touk/nussknacker
  38. 38. Thanks @mpproch @touk_pl

×