Big Data for Testing
Heading for Post-Processing and Analytics
Speakers
Yujun Zhang
NFV System Engineer at ZTE Corporation.
He is the current PTL of QTIP in OPNFV and the creator of MitmStack in OpenStack.
His main interests are performance testing, analysis and tuning.
Donald Hunter
Principal Engineer in the Chief Technology and
Architecture Office at Cisco.
He leads the MEF OpenLSO Analytics project which
uses PNDA.io as a reference implementation for big
data analytics in the MEF LSO Framework.
Donald's long-term focus has been software
architecture leadership for element management
systems, diagnostics and network provisioning
applications in Cisco's product portfolio.
Content
NOW - what does current test data look like
FUTURE - what is expected by the community
ANALYTICS - introducing PNDA.io, a platform for analytics
SAMPLES - what has been done in other domains
NEXT - what shall we do in Euphrates
NOW
What does current test data look like?
As of 22 May 2017:
● ~160k result records
● 30 projects
● 142 cases
● 45 Pods
● 23 Scenarios
Test Data Collected
OPNFV TestResults site: http://testresults.opnfv.org/test/swagger/spec.html
Data Schema
Top level model
project : project name
case : case name
pod : pod name
version : platform version (Arno-R1, ...)
installer : installer name (fuel, ...)
build_tag : Jenkins build tag name
scenario : the test scenario (previously version)
criteria : the global criteria status (passed or failed)
trust_indicator : evaluates the stability of the test case
start_date : date and time the test started
stop_date : date and time the test stopped
details : customizable, test-specific payload
Key Points
- Common for all records
- Customizable schema in details
Schema for results: http://testresults.opnfv.org/test/swagger/spec.html#!/APIs/queryTestResults
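As an illustration, a minimal Python sketch of querying this API (assumptions: the response wraps records in a "results" list and record keys match the top-level model above; verify against the Swagger spec linked on this slide):

import requests

# Fetch the 10 most recent functest records from the results API
# (endpoint from the slide above).
API = "http://testresults.opnfv.org/test/api/v1/results"
resp = requests.get(API, params={"project": "functest", "last": 10})
resp.raise_for_status()

for rec in resp.json().get("results", []):
    # Common fields shared by every record; "details" is test-specific.
    print(rec.get("project"), rec.get("case"), rec.get("pod"),
          rec.get("criteria"), rec.get("start_date"))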
Typical Func Test Details
FuncTest Details
- "details":
"duration": " 27.79",
"success": "100.00",
"nb tests": 12
"module": "authenticate "
- "details":
"duration": " 80.06",
"success": "100.00",
"nb tests": 11
"module": "glance "
Key Points
- Success rate as indicator
- Breakdown into modules
rally sanity results: http://testresults.opnfv.org:80/test/api/v1/results?case=rally_sanity&last=10&project=functest
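A sketch of how the per-module breakdown could be rolled up into one overall indicator (the "records" list below is hypothetical, shaped like the details above):

# Weight each module's success rate by its number of tests.
records = [
    {"duration": "27.79", "success": "100.00", "nb tests": 12,
     "module": "authenticate"},
    {"duration": "80.06", "success": "100.00", "nb tests": 11,
     "module": "glance"},
]

total = sum(r["nb tests"] for r in records)
passed = sum(float(r["success"]) / 100.0 * r["nb tests"] for r in records)
print("overall success: %.2f%% across %d tests"
      % (100.0 * passed / total, total))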
Typical Perf Test Details
StorPerf Details
"status": "OK",
"agent_count": 4,
"metrics": {...},
"timestart": 1479912550.192721,
"volume_size": 1,
"pod_name": "intel-pod9",
"public_network": "ext-net",
"duration": 152.46885204315186,
"scenario_name": "ceph_warmup",
"disk_type": "SSD"
Key Points
- Test conditions included in details
- Breakdown in metrics
storperf results: http://testresults.opnfv.org:80/test/api/v1/results?last=10&project=storperf
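Since each record carries its own test conditions, results can be sliced by condition directly; a sketch (the "runs" list is hypothetical, shaped like the details above):

from collections import defaultdict

# Group run durations by (scenario, disk type) carried in "details".
runs = [
    {"scenario_name": "ceph_warmup", "disk_type": "SSD", "duration": 152.47},
    {"scenario_name": "ceph_warmup", "disk_type": "SSD", "duration": 149.90},
]

by_condition = defaultdict(list)
for run in runs:
    by_condition[(run["scenario_name"], run["disk_type"])].append(run["duration"])

for condition, durations in sorted(by_condition.items()):
    print(condition, "mean duration: %.1fs" % (sum(durations) / len(durations)))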
Typical Perf Test Metrics
StorPerf Metrics
"ws.queue-depth.8.block-size.16384.read.iops": 0,
"ws.queue-depth.8.block-size.16384.write.latency":
18333.634166666667,
"ws.queue-depth.8.block-size.16384.duration": 152,
"ws.queue-depth.8.block-size.16384.read.latency": 0,
"ws.queue-depth.8.block-size.16384.write.iops":
436.33833333333337,
"ws.queue-depth.8.block-size.16384.write.throughput":
6979.75,
"ws.queue-depth.8.block-size.16384.read.throughput": 0
Key Points:
- Flattened dictionary (not nested)
- Dict keys concatenated from metric properties
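Because the keys follow a fixed concatenation pattern, they can be unflattened; a sketch (the key layout is inferred from the samples above):

# Split "<workload>.queue-depth.<n>.block-size.<bytes>[.<read|write>].<metric>"
# back into its properties.
def parse_metric_key(key):
    parts = key.split(".")
    return {
        "workload": parts[0],  # e.g. "ws"
        "queue_depth": int(parts[2]),
        "block_size": int(parts[4]),
        "direction": parts[5] if len(parts) == 7 else None,
        "metric": parts[-1],
    }

print(parse_metric_key("ws.queue-depth.8.block-size.16384.write.latency"))
# {'workload': 'ws', 'queue_depth': 8, 'block_size': 16384,
#  'direction': 'write', 'metric': 'latency'}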
Report data embedded
StorPerf Report Data
- "rs.queue-depth.2.block-size.16384":
"iops":
"read":
"steady_state": true,
"series": [...],
"range": 80.7440000000006,
"average": 2566.9578000000006,
"slope": -7.916618181818701
"write":
...
- “wr.queue-depth.2.block-size.2048”:
...
Key Points
- Metrics grouped in a multi-level dict
- Data broken down into series
- Statistics generated for each metric
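A sketch of how the embedded statistics could be derived from a raw series (assumption: slope is a least-squares fit over the sample index; StorPerf's actual x-axis may differ):

# Compute average, range and slope for a metric series.
def series_stats(series):
    n = len(series)
    mean_x = (n - 1) / 2.0                  # mean of indices 0..n-1
    mean_y = sum(series) / float(n)
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(series))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return {
        "average": mean_y,
        "range": max(series) - min(series),
        "slope": cov / var,                 # least-squares gradient
    }

print(series_stats([2570.1, 2566.3, 2564.5]))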
Scenario Reporting
functest status: http://testresults.opnfv.org/reporting/functest/release/danube/index-status-fuel.html
yardstick status: http://testresults.opnfv.org/reporting/yardstick/release/danube/index-status-compass.html
Testing can be expensive
FUTURE
What is expected by the community?
Values expected from the test data
Trends over time
Comparison of test results between different SUTs or conditions
Traceability from performance indicators to collected metrics and raw data
Anomaly detection
Correlation analysis between performance and SUT factors
Share data, develop collaboratively
TESTING PIPELINE
TEST → COLLECT → CALCULATE → AGGREGATE → REPORT
- TEST: produce raw data
- COLLECT: collect metrics by parsing the raw data
- CALCULATE: calculate indicators and statistics from metrics
- AGGREGATE: aggregate data to create a synthesis from different test cases and iterations
- REPORT: push synthesis data for reporting
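A toy end-to-end sketch of these stages as composable functions (names and data are illustrative, not an existing OPNFV API):

import json

def test():                      # TEST: produce raw data
    return ['{"latency": 12}', '{"latency": 15}', '{"latency": 11}']

def collect(raw):                # COLLECT: parse raw data into metrics
    return [json.loads(r)["latency"] for r in raw]

def calculate(metrics):          # CALCULATE: indicators and statistics
    return {"avg_latency": sum(metrics) / float(len(metrics))}

def aggregate(runs):             # AGGREGATE: synthesis across cases/iterations
    return {"iterations": len(runs), "indicators": runs}

def report(synthesis):           # REPORT: push synthesis data for reporting
    print(json.dumps(synthesis))

report(aggregate([calculate(collect(test())) for _ in range(3)]))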
Introducing PNDA.io
A Platform For Analytics
What is PNDA?
PNDA brings together a number of open source technologies to provide a simple, scalable, open big data analytics Platform for Network Data Analytics.
It is a Linux Foundation Collaborative Project based on the Apache ecosystem.
Why PNDA?
There are a bewildering number of big data technologies out there,
so how do you decide what to use?
We've evaluated and chosen the best tools, based on technical
capability and community support.
PNDA combines them to streamline the process of developing data
processing applications.
• Simple, scalable open data platform
• Provides a common set of services for developing analytics applications
• Accelerates the process of developing big data analytics applications whilst significantly reducing the TCO
• PNDA provides a platform for convergence of network data analytics
[PNDA architecture diagram: plugins (ODL, Logstash, OpenBPM, pmacct, Telemetry) feed a real-time data distribution layer backed by a file store; platform services cover installation, management, security and data privacy, plus app packaging and management; processing and query services span stream and batch processing, SQL query, OLAP cube, Search/Lucene, NoSQL and time series stores; PNDA managed apps and unmanaged apps sit on producer and consumer APIs, alongside data exploration, metric visualisation, event visualisation, and query/visualisation/exploration front ends.]
• Horizontally scalable platform for analytics and data processing applications
• Support for near-real-time stream processing and in-depth batch analysis on massive datasets
• PNDA decouples data aggregation from data analysis
• Consuming applications can be either platform apps developed for PNDA or client apps integrated with PNDA
• Client apps can use one of several structured query interfaces or consume streams directly (see the sketch below)
• Leverages best current practice in big data analytics
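For instance, a minimal sketch of an unmanaged client app consuming a stream directly (assumptions: PNDA's real-time distribution layer is Kafka; the topic name and broker address below are hypothetical placeholders):

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "test.results",                       # hypothetical topic
    bootstrap_servers="pnda-kafka:9092",  # placeholder broker address
    auto_offset_reset="latest",
)
for message in consumer:
    # Payload handling is app-specific; PNDA events are typically
    # Avro-encoded, so a real client would decode them here.
    print(message.topic, len(message.value), "bytes")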
SAMPLES
What has been done in other domains?
Examples from other domains
Event analytics to detect recurring failures, malicious behaviour and future reliability trends
https://pndablog.wordpress.com/2017/05/25/an-analytics-based-approach-to-service-assurance-part-2-is-analytics-the-answer/
BGP message analytics to identify the cause of unstable AS paths over time
https://pndablog.wordpress.com/2017/05/25/bgp-security-how-big-data-can-help-detect-attacks/
Analysis of OpenStack VM metrics to detect patterns that lead to loss of service
http://pnda.io/usecases
https://pndablog.wordpress.com/
Operational Intelligence · Planning Intelligence · Security Intelligence
NEXT
What shall we do in Euphrates?
Roadmap in Euphrates
Deploy a PNDA instance in the OPNFV infrastructure
Sink output from upstream test projects into the PNDA instance (see the sketch below)
Develop value-add analysis with dashboards to augment what http://testresults.opnfv.org/reporting/index.html already provides
Focus on providing “test intelligence”
Prepare a path to using PNDA analytics in a production OPNFV world
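A simplified sketch of the "sink" step above (assumptions: ingest goes through PNDA's Kafka layer; a real producer would publish Avro-encoded events against PNDA's platform schema, and the topic/broker names below are hypothetical placeholders):

import json
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="pnda-kafka:9092")  # placeholder
result = {
    "project": "functest",
    "case": "rally_sanity",
    "criteria": "passed",
    "details": {"success": "100.00"},
}
# Simplification: JSON payload instead of the Avro envelope a real
# PNDA producer would use.
producer.send("opnfv.test.results", json.dumps(result).encode("utf-8"))
producer.flush()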
Questions?
https://wiki.opnfv.org/display/testing
https://wiki.opnfv.org/display/bamboo/
