RFX - Full-Stack Technology for Real-time Big Data

Introducing RFX as a full-stack framework for real-time Big Data processing.

Published in: Data & Analytics

  1. 1. RFX - Full-Stack Technology for Real-time Big Data
     Key questions:
       1. What is RFX?
       2. Why RFX?
       3. How to use RFX?
       4. The vision ...
     by TrieuNT@fpt.com.vn on 27/01/2016, http://engineering.adsplay.net
  2. 2. History
     ● Applied the Lambda Architecture
       ○ https://en.wikipedia.org/wiki/Lambda_architecture
     ● In 2012, we used Apache Storm (version 0.7), http://storm.apache.org
     ● We wanted to improve on it and turn it into a full-stack framework
     ● In 2013, I started RFX with a “Reactive philosophy in mind” for common Big Data problems
     ● Since 2014, RFX has been the main tool for our daily real-time big data tasks at FPT
     ● Core engineers:
       ○ TrieuNT@fpt.com.vn
       ○ DuHC@fpt.com.vn
  3. 3. What is RFX?
     ● RFX stands for “Reactive Function X”
     ● “Function X” is a feature in a specific product
     ● “Reactive” means every function can “feel” and “react” to optimize the UX for a user in a specific context
     ● The framework is built from open source projects:
       ○ Computing unit: Akka Actor ( http://akka.io )
       ○ Network communication: Netty ( http://netty.io )
       ○ Data processing: Apache { Kafka, Hadoop, Spark }
       ○ Redis ( http://redis.io )
       ○ Front-end: MEAN stack (MongoDB, ExpressJS, AngularJS, NodeJS)
  4. 4. Projects and Products using RFX
     1. http://vnexpress.net
        a. Counting article pageviews
        b. Recommendation engine
     2. https://eclick.vn
        a. Click analytics
        b. Impression analytics
     3. http://itvad.vn
        a. Video PlayView Analytics
        b. User Behaviour Analytics
        c. Heatmap Analytics
        d. Device Analytics
        e. Revenue Ad Optimization
     4. …
  5. 5. Projects and Products using RFX
  6. 6. Projects and Products using RFX
  7. 7. Why RFX?
     ● Divide code into micro-services:
       ○ Analytical layer ( rfx-stream )
       ○ Business logic layer ( rfx-query )
       ○ Machine Learning layer (Apache Spark)
       ○ Database layer (Redis, Mongo, Hadoop)
       ○ Front-end layer (MEAN stack)
     ● Focus on best practices and reusability
     ● Foundation for scalability (system and business)
     ● Test-driven development for real-time analytics
     ● Continuous integration & improvement
  8. 8. Why RFX?
  9. 9. Why RFX?
  10. 10. Reactive Function (X) Philosophy
  11. 11. Core elements of rfx-stream
  12. 12. Why RFX?
  13. 13. Core backend modules
      rfx-track:
        ● collecting all events from the JavaScript delivery
      rfx-stream:
        ● processing stream data (PipelineProcessing pattern)
        ● processing real-time analytics
        ● processing business logic (by reactive function)
      rfx-cronjob:
        ● synchronizing real-time data to the report database (copying data from Redis to MongoDB)
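The division of labour between rfx-stream and rfx-cronjob can be illustrated with a minimal Python sketch. It is not RFX code: the stage functions, the event fields (`type`, `url`) and the two in-memory stores are all hypothetical stand-ins, with a `Counter` playing the role of Redis and a plain dict playing the role of the MongoDB report database.

```python
from collections import Counter
from typing import Callable, Iterable, List, Optional

Event = dict
# A stage returns a (possibly transformed) event, or None to drop it.
Stage = Callable[[Event], Optional[Event]]


def parse_stage(event: Event) -> Optional[Event]:
    """Normalize a raw event; drop it if required fields are missing."""
    if "type" not in event or "url" not in event:
        return None
    return {"type": event["type"].lower(), "url": event["url"].strip()}


def pageview_filter(event: Event) -> Optional[Event]:
    """Keep only pageview events."""
    return event if event["type"] == "pageview" else None


def run_pipeline(events: Iterable[Event], stages: List[Stage],
                 realtime_store: Counter) -> None:
    """Push each event through every stage, then count survivors per URL."""
    for event in events:
        current: Optional[Event] = event
        for stage in stages:
            current = stage(current)
            if current is None:
                break  # a stage dropped the event
        else:
            realtime_store[current["url"]] += 1


def sync_to_report(realtime_store: Counter, report_store: dict) -> None:
    """Cron-style step: copy real-time counters into the report store."""
    report_store.update(realtime_store)
```

In this sketch, `run_pipeline` would be called continuously as events arrive, while `sync_to_report` would run on a schedule, which mirrors the stream/cronjob split described on the slide.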
  14. 14. Core frontend modules
      rfx-report:
        ● visualizing data in real time
        ● monitoring real-time events
      rfx-agent:
        ● tracking user activity: heatmap data, ...
        ● logging user activity to rfx-track (via a network protocol: HTTP, TCP or UDP)
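As a rough illustration of the agent-to-tracker logging path, the Python sketch below serializes one activity event as JSON and sends it as a single UDP datagram. The wire format, the field names (`kind`, `payload`, `ts`) and the helper names are assumptions made for this example; the slides do not describe the actual rfx-agent protocol.

```python
import json
import socket
import time
from typing import Optional


def encode_event(kind: str, payload: dict,
                 timestamp: Optional[float] = None) -> bytes:
    """Serialize one activity event as compact UTF-8 JSON (assumed format)."""
    event = {
        "kind": kind,
        "payload": payload,
        "ts": time.time() if timestamp is None else timestamp,
    }
    return json.dumps(event, separators=(",", ":")).encode("utf-8")


def decode_event(datagram: bytes) -> dict:
    """Inverse of encode_event, as a tracker endpoint might apply it."""
    return json.loads(datagram.decode("utf-8"))


def send_event_udp(datagram: bytes, host: str, port: int) -> int:
    """Fire-and-forget send; returns the number of bytes handed to the OS."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(datagram, (host, port))
```

UDP keeps the sender cheap because it never waits for the tracker, at the cost of possible event loss; HTTP or TCP would be the choice when delivery matters more than latency.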
  15. 15. What problems can be solved with RFX
      1. Processing logs:
         a. Pageview
         b. Ad impression
         c. Click analytics
         d. Heatmap user data
      2. Real-time user segmentation
      3. Reacting to user behaviour
      4. Automatic UX optimization
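Real-time user segmentation can be sketched as counters that are updated per event and checked against threshold rules. The Python example below is only an illustration under that assumption: the `SegmentTracker` class, the rule format and the segment names are hypothetical and are not taken from RFX.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


class SegmentTracker:
    """Assigns users to segments as soon as their event counts cross a rule."""

    def __init__(self, rules: Dict[str, Tuple[str, int]]) -> None:
        # rules: segment name -> (event type, minimum number of events)
        self.rules = rules
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, user_id: str, event_type: str) -> Set[str]:
        """Record one event and return the user's current segments."""
        self.counts[user_id][event_type] += 1
        return self.segments(user_id)

    def segments(self, user_id: str) -> Set[str]:
        """Evaluate every rule against the user's counters."""
        user_counts = self.counts.get(user_id, {})
        return {
            name
            for name, (event_type, minimum) in self.rules.items()
            if user_counts.get(event_type, 0) >= minimum
        }
```

Because `observe` returns the segments immediately, a caller could react to a user entering a segment within the same event, which is the kind of "react to user behaviour" step the slide lists next.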
  16. 16. Vision for RFX
  17. 17. Vision for RFX http://engineering.adsplay.net/2015/10/08/iris-big-data-query-for-human
  18. 18. Vision for RFX to be Fast Data Intelligence Platform
  19. 19. Quick demo for playview analytics deployed at http://itvad.vn
  20. 20. Quick demo for device analytics
  21. 21. Research links
      ● Ad Click Prediction: http://research.google.com/pubs/pub41159.html
      ● Software Engineering for Machine Learning: https://sites.google.com/site/software4ml/accepted-papers
      ● Fault-tolerant and Scalable Joining of Continuous Data Streams: http://research.google.com/pubs/pub41318.html
      ● Dynamic Ad Layout Revenue Optimization for Display Advertising: http://wan.poly.edu/KDD2012/forms/workshop/ADKDD12/doc/a2.pdf
      ● Behavioral analytics: http://en.wikipedia.org/wiki/Behavioral_analytics
      ● Real-time User Segmentation: http://www.slideshare.net/Hadoop_Summit/doctor-nguyen-june27425pmroom230av2
      ● Implementing a real-time data pipeline: https://chimpler.wordpress.com/2014/07/01/implementing-a-real-time-data-pipeline-with-spark-streaming/
      ● Distributed Event Processing Rule Engine: http://eugenedvorkin.com/distributed-event-processing-rule-engine-with-storm-spring-and-groovy/
  22. 22. http://www.rfxlab.com http://engineering.adsplay.net
