Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

PNDA - Platform for Network Data Analytics

106,575 views

Published on

PNDA brings together a number of open source technologies to provide a simple, scalable open big data analytics Platform for Network Data Analytics.

Copyright © 2018 PNDA Project a Series of LF Projects, LLC
For web site terms of use, trademark policy and other project policies please see https://lfprojects.org

Published in: Data & Analytics

PNDA - Platform for Network Data Analytics

  1. 1. PNDA
  2. 2. • Volume of network data into terabytes • Siloed data limits ability to perform correlation and causal analysis • Relational databases limit the ability to • Application of big data analytics to the network dataset is key to providing both real- time and historical insights • Data science is driving the bifurcation of the OSS stack Network data is becoming a big data problem 3-fold increase in total IP Traffic >60% increase in devices and connections Telemetry data streamed in near real-time Source: Cisco VNI/GCI Global IP Traffic Forecast
  3. 3. What changes? PMO FMO Orientation Single domain Cross domain Realisation Small data, tool driven Big data, data driven Data collection Polled Streamed Data aggregation and analysis Coupled Decoupled Domain data schema Schema-on-write Schema-on-read Analysis Prescriptive Prescriptive + Stochastic + ML Customisation Design time Run time
  4. 4. • Tight coupling of data aggregation/store/ analysis • Multiple analytics pipelines implemented from open source components • Common design patterns ~75% of effort wasted / duplicated • Siloes limit the potential of big data analytics and lead to industry divergence Today’s siloed analytics pipelines Telemetry Metrics Data sources HDFS Data store Spark Streaming MapR Data analysis Hbase Storm Kafka Streamsets Data aggregation Kafka Impala Query Outputs Dashboard & ReportingNiFi Logs
  5. 5. What is PNDA? PNDA brings together a number of open source technologies to provide a simple, scalable open big data analytics Platform for Network Data Analytics Linux Foundation Collaborative Project based on the Apache ecosystem
  6. 6. • Simple, scalable open data platform • Provides a common set of services for developing analytics applications • Accelerates the process of developing big data analytics applications whilst significantly reducing the TCO • PNDAprovides a platform for convergence of network data analytics PNDA PNDA Plugins ODL Logstash OpenBPM pmacct XR Telemetry Real -time DataDistribution File Store Platform Services: Installation, Mgmt, Security, Data Privacy App Packaging and Mgmt Stream Batch Processing SQL Query OLAP Cube Search/ Lucene NoSQL Time Series Data Exploration Metric Visualisation Event Visualisation PNDA Mnged App PNDA Mnged App Unmnged App Unmnged App Query Visualisation and Exploration PNDA Applications PNDA Producer API PNDA Consumer API
  7. 7. • Horizontally scalable platform for analytics and data processing applications • Support for near-real-time stream processing and in-depth batch analysis on massive datasets • PNDA decouples data aggregation from data analysis • Consuming applications can be either platform apps developed for PNDA or client apps integrated with PNDA • Client apps can use one of several structured query interfaces or consume streams directly. • Leverages best current practise in big data analytics PNDA PNDA Plugins ODL Logstash OpenBPM pmacct XR Telemetry Real -time DataDistribution File Store Platform Services: Installation, Mgmt, Security, Data Privacy App Packaging and Mgmt Stream Batch Processing SQL Query OLAP Cube Search/ Lucene NoSQL Time Series Data Exploration Metric Visualisation Event Visualisation PNDA Mnged App PNDA Mnged App Unmnged App Unmnged App Query Visualisation and Exploration PNDA Applications PNDA Producer API PNDA Consumer API
  8. 8. Why PNDA? There are a bewildering number of big data technologies out there, so how do you decide what to use? We've evaluated and chosen the best tools, based on technical capability and community support. PNDA combines them to streamline the process of developing data processing applications.
  9. 9. PNDA Technologies
  10. 10. Why PNDA? Innovation in the big data space is extremely rapid, but combining multiple technologies into an end-to-end solution can be extremely complex and time-consuming PNDA removes this complexity and allows you to focus on developing the analytics applications, not on developing the pipeline – significantly reducing the effort required and time-to- insight
  11. 11. PNDA Software Components
  12. 12. • Platform for data aggregation, distribution, processing and storage • Automated installation, creation, and configuration • Openstack, AWS and baremetal • Typical install ~1hr • Modular install • Open producer and consumer APIs • Avro platform schema • Plugins for Logstash, pmacct, OpenBMP, OpenDaylight, Cisco XR- telemetry, bulk ingest … • Data distribution – Apache Kafka • Data store: • Automated data partitioning and storage (HDFS) • OpenTSDB – time series analysis • Hbase - NoSQL • Support for batch and stream processing: • Apache Spark and Spark Streaming • Jupyter notebook server for app prototyping and data exploration • Impala-based SQL query support • Grafana for time series visualisation • PNDAapplication packaging • PNDAmanagement and dashboard PNDA 3.4 Capabilities
  13. 13. • The PNDA console provides a dashboard across all components in a cluster • Inbuilt platform test agents verify the operation of all components • Active platform testing verifies the end-to-end data pipeline PNDA Console
  14. 14. • Ingested data should be encapsulated in PNDA Avro schema and published on a pre-defined Kafka topic or set of topics Publishing Data to PNDA
  15. 15. PNDA Plugins Data Type Data Aggregator Data Aggregator Reference PNDA Producer Plugin Reference BGP (inc. BGP LS) OpenBMP http://www.openbmp.org/#!index.md#Usi ng_Kafka_for_Collector_Integration http://pnda.io/pnda- guide/producer/openbmp.html BGP PMACCT (BGP listener) http://www.pmacct.net/ http://pnda.io/pnda- guide/producer/pmacct.html Bulk Ingest PNDA Bulk Ingest Tool http://pnda.io/pnda-guide/bulkingest/ ISIS PMACCT (ISIS listener) http://www.pmacct.net/ http://pnda.io/pnda- guide/producer/pmacct.html Cisco XR streaming telemetry Pipeline https://github.com/cisco/bigmuddy- network-telemetry-collector CollectD (CollectD supports multiple plugins as listed here https://collectd.org/wiki/index.php/T able_of_Plugins) Logstash https://www.elastic.co/guide/en/logstash/ current/plugins-codecs-collectd.html http://pnda.io/pnda-guide/repos/prod- logstash-codec-avro/ IoT sensor via HTTP Node-RED https://nodered.org Logstash (Logstash supports multiple plugins as listed here https://www.elastic.co/guide/en/logs tash/current/input-plugins.html) Logstash http://pnda.io/pnda-guide/repos/prod- logstash-codec-avro/ NETCONF Notifications ODL http://www.opendaylight.org/ http://pnda.io/pnda- guide/producer/opendl.html Netflow / IPFIX Logstash https://www.elastic.co/guide/en/logstash/ current/plugins-codecs-netflow.html http://pnda.io/pnda-guide/repos/prod- logstash-codec-avro/ Netflow / IPFIX / sFlow pmacct http://www.pmacct.net/ http://pnda.io/pnda- guide/producer/pmacct.html Openstack Work in progress sFlow Logstash https://github.com/ashangit/logstash- codec-sflow http://pnda.io/pnda-guide/repos/prod- logstash-codec-avro/ SNMP Metrics and Traps ODL https://wiki.opendaylight.org/view/SNMP_ Plugin:Getting_Started http://pnda.io/pnda- guide/producer/opendl.html SNMP Traps Logstash https://www.elastic.co/guide/en/logstash/ current/plugins-inputs-snmptrap.html http://pnda.io/pnda-guide/repos/prod- logstash-codec-avro/ Syslog Logstash https://www.elastic.co/guide/en/logstash/ current/plugins-inputs-syslog.html http://pnda.io/pnda-guide/repos/prod- logstash-codec-avro/ Syslog (RFC3164 or RFC5424 - needed for newer IOS/IOS XR/ NX OS etc.) Logstash https://gist.github.com/donaldh/89b73049 81f96497c94fe4d98bb03d71 http://pnda.io/pnda-guide/repos/prod- logstash-codec-avro/
  16. 16. Design time vs. runtime pico standard
  17. 17. BGP Analytics Pipeline Open BPM Collector BGPBGP BMP Logstash PNDA Cluster Gobblin HDFS Kafka Spark OpenTSDB BGP Data Service UI Impala
  18. 18. PNDA Applied to NFV Infrastruct ure Analytics Data Aggregators Open Data Platform (PNDA) Analytics Applications Open Source Custom Licensed Alerts Metrics Telemetry Logs Data Sources Inventory Orchestration NFVO VNFM VIM NFVI VNF Data CenterCoreUser State Data Access Aggregation Related as loosely coupled systems Context Network Control
  19. 19. Convergence of network data analytics Operational Intelligence Planning Intelligence Security Intelligence
  20. 20. • PNDA 3.5 • ElasticSearch integration • CentOS / RHEL • Offline install • Future • Apache Kylin • Apache Ambari • Containerisation • Deep-learning framework • Red PNDA – the smallest PNDA yet! What’s coming?
  21. 21. Come and join us!

×