Big Data Europe in 4th Edition Wind Power Big Data and IoT forum
1. Big Data Europe
for System Monitoring
BigDataEurope in 4th Wind Power Big Data and IoT Forum, 9 Nov. 2017
F. Mouzakis, D. Foussekis
and BDE consortium
2. BigDataEurope in 4th Wind Power Big Data and IoT Forum Workshop, Berlin 8&9/11/17
System Monitoring p.2
Overview
Big Data landscape
Project outline
Big Data Europe platform
Data acquisition challenge
Case in WT CMS research
BDE opportunities
www.big-data-europe.eu
Thanks to BIS for inviting BDE!
3. Big Data Landscape (Matt Turck)
4. Big Data Landscape (Matt Turck): Open Source
5. Project Scope
Develop an adaptable, simple-to-get-started solution to boost adoption of Big Data technologies in the EU.
Push the use of data technologies within the 7 key societal sectors.
6. BDE Consortium
7. Tool groups
Big Data Technologies:
Data Storage Technologies
Data Processing
Workflow Coordination
Querying/Processing
Search
Data Export/Import
Data Analysis
Text Mining
8. Open source technologies for Big Data Apps
9. Big Data Technologies vs 3Vs
Volume
Velocity
Variety
Storm
10. Apache Hadoop
A highly scalable storage platform designed to process very large data sets across hundreds to thousands
of computing nodes that operate in parallel. It provides a cost-effective storage solution for large data
volumes with no format requirements.
YARN provides the resource management and HDFS provides the scalable, fault-tolerant, cost-efficient
storage for big data.
Hadoop YARN / MapReduce: YARN is the framework for job scheduling and cluster resource management;
MapReduce is the framework for parallel processing of large data sets.
Hadoop Distributed File System (HDFS™): A scalable, fault-tolerant, distributed file system that
provides high-throughput access to application data. Demonstrated case: 200PB in 4500 servers.
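The MapReduce processing model mentioned above can be illustrated with a minimal sketch in plain Python. This is not Hadoop code: the function names and the sample records are hypothetical and only show the three phases (map, shuffle, reduce) that the framework distributes over the cluster.

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit (key, value) pairs; here (sensor name, reading)."""
    for sensor, value in records:
        yield sensor, value

def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: collapse each group to one result; here the maximum."""
    return {key: max(values) for key, values in groups.items()}

# Hypothetical sample records, not pilot data.
records = [
    ("wind_speed", 7.2),
    ("power_kw", 410.0),
    ("wind_speed", 9.1),
    ("power_kw", 395.5),
]
result = reduce_phase(shuffle_phase(map_phase(records)))
print(result)
```

In a real cluster the map and reduce functions run on many nodes in parallel and the shuffle moves data between them; the sketch runs everything in one process.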
11. Hadoop Distributed File System (HDFS)
Architecture: Operates in blocks of 128MB. Decouples file system metadata from data in different servers
(name-nodes vs data-nodes). HDFS clients first contact the name-node for data location and then transfer data
to or from the specified data-nodes.
Fault-tolerance - Replication: Each block is replicated on at least 3 servers.
Replication is increased to provide high availability of data in high demand, through MapReduce.
Rack awareness: Considers a node’s physical location when allocating storage and scheduling tasks.
Minimal data motion: Hadoop moves compute processes to the data on HDFS and not the other way around.
Processing tasks can occur on the physical node where the data resides, which significantly reduces network I/O
and provides very high aggregate bandwidth.
Optimized for Read operations: Hadoop was designed for large scale processing, i.e. you usually read a
large amount of data, process it and save the results.
Typical failure rate in long-running clusters is 2-3 nodes per 1000 nodes per day. On new (recently out of the factory) nodes, the rate is three times higher (Yahoo report).
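As a rough worked example of the block and replication figures above, the sketch below estimates how many blocks a file occupies and how much raw disk space it consumes. It assumes the 128 MB block size is 128 MiB and a replication factor of exactly 3; the function name is hypothetical.

```python
import math

BLOCK_SIZE = 128 * 1024**2   # 128 MB block size quoted above, taken as MiB (assumption)
REPLICATION = 3              # replication factor quoted above

def hdfs_footprint(file_size_bytes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return (blocks, block replicas, raw bytes on disk) for one file."""
    blocks = math.ceil(file_size_bytes / block_size)
    replicas = blocks * replication
    # A partially filled last block only occupies the bytes actually written.
    raw_bytes = file_size_bytes * replication
    return blocks, replicas, raw_bytes

blocks, replicas, raw_bytes = hdfs_footprint(1024**3)  # a 1 GiB file
print(blocks, replicas, raw_bytes)
```

The same arithmetic shows why replication triples the raw capacity needed for a given volume of measurement data.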
12. Apache Spark
A fast and general engine for large-scale data processing.
An in-memory data processing engine, typically much faster than Hadoop MapReduce.
Creates chunks of data (dataset from external data), then applies parallel operations.
Provides APIs in Java, Python and Scala.
Works with Resilient Distributed Datasets (RDDs) - fault-tolerant collections of elements that can be
operated on in parallel.
Spark applications scale automatically when augmenting the number of Spark worker nodes in the cluster.
Typically runs on Hadoop but also standalone.
Can access diverse data sources: HDFS, HBase, Cassandra, S3, etc.
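The idea of splitting a dataset into chunks and applying parallel operations can be sketched in plain Python. This is not the Spark API: the helper names are hypothetical, and the partitions are processed sequentially here, whereas Spark would hand each partition to a worker node.

```python
from functools import reduce

def split(data, n_parts):
    """Split a list into n_parts contiguous partitions of near-equal size."""
    size, extra = divmod(len(data), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(data[start:end])
        start = end
    return parts

def partial(part):
    """Per-partition aggregate: (count, sum)."""
    return len(part), sum(part)

def merge(a, b):
    """Combine two partial aggregates into one."""
    return a[0] + b[0], a[1] + b[1]

data = [float(x) for x in range(1, 11)]          # hypothetical readings
partials = [partial(p) for p in split(data, 3)]  # each call could run on a different worker
count, total = reduce(merge, partials)
mean = total / count
print(count, mean)
```

Because only small partial aggregates are merged, the result does not depend on how many partitions (workers) are used.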
13. Virtual Machines vs Containers
VIRTUAL MACHINES
Virtual machines (VMs) are an abstraction of physical hardware turning one server into
many servers. The hypervisor allows multiple VMs to run on a single machine. Each VM
includes a full copy of an operating system, one or more apps, necessary binaries and
libraries - taking up tens of GBs. VMs can also be slow to boot.
CONTAINERS
Containers are an abstraction at the app layer that packages code and
dependencies together. Multiple containers can run on the same machine and
share the OS kernel with other containers, each running as isolated processes in
user space. Containers take up less space than VMs (container images are
typically tens of MBs in size), and start almost instantly.
14. Docker
An open platform for developing, shipping, and running applications. Docker provides the ability
to package and run an application in an isolated environment called a container.
A Docker container is a lightweight, stand-alone, executable package of a piece of software that
includes everything needed to run it: code, runtime, system tools, system libraries and settings.
This guarantees that the software will always run the same, regardless of its environment.
Docker Compose: A tool for defining and running multi-container Docker applications.
Docker Swarm: Native clustering for Docker. It turns a pool of Docker hosts into a single,
virtual host. A swarm is a group of machines that are running Docker and joined into a cluster.
15. Big Data Europe Platform goals
Low total cost of ownership
Simple to get started with Big Data
Cater for widely varying use cases
Embrace emerging Big Data technologies
Simple integration with custom components
16. BDE Platform
The BDE platform consists of 3 layers:
• the hardware layer
• a resource manager – Docker Swarm – and
• Big Data applications running on top
An application can be seen as a pipeline consisting of multiple components,
like HDFS, Spark and Kafka, which are wired together in order to solve a specific Big Data
problem. The components will be packaged in Docker containers and glued together
with Docker Compose.
17. Platform architecture
18. What problems does BDE solve?
Big Data is famously characterized by the 4 Vs:
Volume: the platform is designed to handle arbitrarily large amounts of data.
Velocity: the platform is designed to handle real time data, such as climate, energy and
transport sensor data. More complex computational tasks can be handled through batch
processing, that is, one chunk at a time, with results returned after processing has been
completed.
Variety: the platform makes use of Linked Data technologies to ‘semantify the data,’ that is, to
add meaning to the data in whatever format it is in, allowing data from different sources, from
different domains and with different licensing conditions to be integrated with relative ease.
Veracity: the provenance of all data handled by the platform is tracked.
19. Install BDE platform
https://www.big-data-europe.eu/howto-install-the-bde-platform/
https://github.com/big-data-europe
YouTube: Getting Started With BDE Platform
Technical support: TENFORCE, Aad Versteden, aad.versteden@tenforce.com
Join the tech Webinar on 16th November
20. CRES Wind Department - Testing
Wind potential measurement campaigns & Lidar
Power Performance Measurements
Electrical Power Quality of Wind Turbine Generators
Mechanical Load measurements
Blade testing (<30m) and coupon testing
Research on WT testing
21. CRES challenges
Exploit the capabilities of the new distributed DAQ, with its high-throughput
rates, for research
Utilize open Big Data technology tools
Case infrastructure:
Fully instrumented WT in WF with no fiber optic connection (24 Mbps ADSL,
upload speeds <1 Mbps)
Cluster of 16 x [ 8core-CPUs, 16GB RAM + 256GB SSD ]
150TB in 5 x NAS [8 x 4TB in 1disk-redundancy RAID]
23. General requirements in System monitoring
Modular distributed system using standard Ethernet network
Specs compliant with International Standards
Robust and reliable for 24/7 standalone non-stop operation
Scalable and reconfigurable
Time Synchronization across all modules (GPS or Master module)
Embedded processing capabilities
Send notifications and alarms
Data storage
24. Typical DAQ system core component (1/2)
NI-CompactRIO platform http://www.ni.com/compactrio/
Designed for harsh environments
-20 °C to 55 °C temperature, 5 g RMS vibration, 30 g shock
Low power requirements
~10W at 9-30V for battery powered standalone operation
Data acquisition based on FPGA hardware offering fast I/O response times
and increased reliability
Source: NI.com
25. Typical DAQ system core component (2/2)
Various modules for Analogue I/O and Digital I/O
High precision (24-bit delta-sigma) A/D converters (strain gauge modules)
16-bit 100kHz Simultaneous Sampling AI modules.
XML - Configuration file
Raw (high speed) data sent over the Ethernet and stored on a NAS.
All data packets are time-stamped with common GPS or local time.
Backup storage on each module USB port, in case of network loss.
Source: NI.com
26. Field-Programmable Gate Array (FPGA)
- Reconfigurable Hardware chip.
- Analogous to a printed circuit board with unconnected components on it.
- Connections in an FPGA circuit are dynamically defined in software.
27. Benefits of FPGA in wind energy applications
LabVIEW code running in circuitry at a 40 MHz rate.
Example of a digital debounce filter:
Rejection of pulses that do not hold their value for a predefined number of clock cycles (here: 2).
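The debounce rule described on this slide can be sketched in software as follows. This is a plain Python illustration, not the LabVIEW FPGA code from the slide: the function name and the sample sequence are hypothetical, and one list element stands for one clock cycle.

```python
def debounce(samples, hold_cycles=2):
    """Filter a digital signal: the output changes only after the input has
    held a new value for `hold_cycles` consecutive samples."""
    if not samples:
        return []
    output = []
    state = samples[0]   # currently accepted (stable) level
    candidate = state    # level currently being evaluated
    run = 0              # consecutive samples seen at the candidate level
    for s in samples:
        if s == state:
            # Input returned to the stable level: discard any pending change.
            candidate, run = state, 0
        elif s == candidate:
            run += 1
        else:
            candidate, run = s, 1
        if candidate != state and run >= hold_cycles:
            state, run = candidate, 0
        output.append(state)
    return output

raw = [0, 1, 0, 0, 1, 1, 1, 0, 0]   # single-cycle glitch at index 1
clean = debounce(raw)
print(clean)
```

The single-cycle pulse at index 1 is rejected, while the longer pulse passes with a delay of one sample, which is the price of the hold requirement.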
28. Benefits of FPGA in wind energy applications
Inline processing:
Reduce the noise of a periodic signal by averaging the samples as acquired on the FPGA
Example: average 1,024 records, each 1 million samples long, with 16 bits of resolution
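The inline averaging idea can be sketched in plain Python as a point-by-point running sum over records. This is an illustration only, not FPGA code; the function name and the tiny records are hypothetical. For uncorrelated noise, averaging N records reduces the noise RMS by a factor of sqrt(N), i.e. 32 for 1,024 records.

```python
def average_records(records):
    """Point-by-point mean of equal-length records, accumulated one record
    at a time so that only a running sum has to be kept in memory."""
    n_samples = len(records[0])
    acc = [0.0] * n_samples
    for rec in records:
        for i, value in enumerate(rec):
            acc[i] += value
    return [a / len(records) for a in acc]

# Two tiny hypothetical records of the same periodic signal.
averaged = average_records([[1, 2, 3], [3, 2, 1]])
print(averaged)

# Raw size of the example on the slide: 1,024 records x 1 million samples x 16 bits.
raw_bytes = 1024 * 1_000_000 * 16 // 8
print(raw_bytes)
```

Averaging on the acquisition hardware means that only one record of 1 million samples has to be transferred instead of roughly 2 GB of raw records.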
29. Typical configuration
Gigabit Local Area Network (LAN)
Network Attached Storage (NAS)
System Monitoring PC
30. Raw Data storage – Existing File Formats
ASCII files
+ Human-readable
+ Portable / MS Excel
- Significantly larger disk footprint
- Slow read and write
Binary
+ Compact file size
+ Fast streaming (read & write)
- Not human-readable
- Not easily exchangeable
XML
+ Stores complex data structures
+ Web browser / text editor
- Even larger disk footprint
- Front-end schema design
- Does not stream
Database files
+ Store data centrally
+ Organize and query test results with SQL
- Time intensive startup effort
- Requires maintenance
- Potentially high cost
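The footprint difference between ASCII and binary storage listed above can be checked with a short sketch. The sample values, the 6-decimal text format and the 32-bit float binary layout are assumptions chosen for illustration.

```python
import struct

samples = [i * 0.001 for i in range(1000)]   # hypothetical readings

# ASCII: one value per line with 6 decimals, human-readable.
ascii_bytes = "\n".join(f"{s:.6f}" for s in samples).encode("ascii")

# Binary: little-endian 32-bit floats, 4 bytes per value.
binary_bytes = struct.pack(f"<{len(samples)}f", *samples)

# Reading the binary data back needs the layout to be known in advance.
restored = struct.unpack(f"<{len(samples)}f", binary_bytes)

print(len(ascii_bytes), len(binary_bytes))
```

The binary form is more compact and needs no text parsing, but it cannot be interpreted without knowing the layout, which is the exchangeability drawback noted in the list.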
31. Raw Data storage – Structured binary
Binary Structured (TDMS – NI Technical Data Management Streaming)
Characteristics
• Single streaming binary file
• Three levels of hierarchy for better organization: file, groups, and channels
• Customizable, descriptive properties at each level
• User-defined metadata for campaign properties
Considerations
• Third party format
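The three-level hierarchy with properties at each level can be pictured with a small data-structure sketch. This is not the TDMS file format or any NI API: the class names, property keys and values are hypothetical and only mirror the file / group / channel organization described above.

```python
from dataclasses import dataclass, field

@dataclass
class Channel:
    name: str
    properties: dict = field(default_factory=dict)   # e.g. unit, sample rate
    data: list = field(default_factory=list)

@dataclass
class Group:
    name: str
    properties: dict = field(default_factory=dict)
    channels: dict = field(default_factory=dict)     # channel name -> Channel

@dataclass
class MeasurementFile:
    name: str
    properties: dict = field(default_factory=dict)   # campaign-level metadata
    groups: dict = field(default_factory=dict)       # group name -> Group

# Hypothetical example content.
wt_file = MeasurementFile("campaign_001", {"turbine": "example WT"})
gearbox = Group("gearbox", {"location": "HSS bearing"})
gearbox.channels["vibration_x"] = Channel(
    "vibration_x", {"unit": "g", "sample_rate_hz": 64000}, [0.01, -0.02, 0.03]
)
wt_file.groups[gearbox.name] = gearbox

print(wt_file.groups["gearbox"].channels["vibration_x"].properties["unit"])
```

Keeping descriptive properties next to the data at every level is what lets a single streamed file remain self-describing.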
32. CMS research based on long term data: Primary Content/Data Involved
Type: SCADA and CMS data on a Wind Turbine specifically instrumented for BDE
Status: Open
Sensors: Operational parameters, Vibration, Mechanical Loading, Power Quality, etc.
Format: Third party (TDMS, Technical Data Management Streaming); preprocessing foreseen
Acquisition technology: Field Programmable Gate Arrays (FPGA)
Sampling rate: from 10s/s for operational parameters, up to 64ks/s for vibration & power quality, up to 10Ms/s for the acoustic emission system
Streaming volume: 4 distributed units yield ~30Gb/hour continuously
Analytics: engineering signal analysis, research on parametrics and loop with updated methodologies on raw data
Operation of pilot: 1 year (ongoing); target volume 150TB
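A back-of-the-envelope calculation relates the streaming volume to the target volume and to the site's uplink. It assumes that "~30Gb/hour" means 30 gigabytes per hour (the unit is ambiguous in the slide) and uses 1 Mbps as the upper bound of the upload speed quoted for the wind farm connection; all variable names are illustrative.

```python
GB = 10**9
TB = 10**12

rate_bytes_per_hour = 30 * GB    # "~30Gb/hour", read here as gigabytes (assumption)
target_bytes = 150 * TB          # target volume of the pilot
uplink_bits_per_s = 1_000_000    # upper bound of the quoted "<1Mbps" upload speed

# Continuous recording time needed to reach the target volume.
hours_to_target = target_bytes / rate_bytes_per_hour
days_to_target = hours_to_target / 24

# Sustained bandwidth needed to upload the stream in real time.
required_bits_per_s = rate_bytes_per_hour * 8 / 3600
uplink_shortfall = required_bits_per_s / uplink_bits_per_s

print(hours_to_target, round(days_to_target), round(uplink_shortfall, 1))
```

Under these assumptions the stream needs well over 60 times the available uplink, which is consistent with storing raw data on local NAS units rather than uploading it.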
33. CRES Wind Farm
Monitored WT: Neg-Micon 750kW
34. CRES CM system on NegMicon 750kW WT
The monitoring system provides, indicatively, the following data for the condition monitoring system:
- system operational statuses/parameters
- wind turbine output electrical power
- nacelle yaw position, yaw motor electrical power
- wind speed from nacelle anemometer (1~10s/s)
and additionally:
- mechanical loads on tower top and base cross section (~100s/s)
- rotor thrust & tower torsion
- HSS torque (on the shaft coupling the gearbox with the generator) (64ks/s)
- power quality current & voltage (64ks/s)
- vibration at various gearbox stages (64ks/s)
- acoustic emission signals at gearbox (10Ms/s)
35. SC3 Pilot measuring system
Gearbox and Drive train DAQ systems
Operation DAQ system
Supervising PC
Local storage unit
36. SC3 Pilot sample sensors
Bearing vibration sensors at gearbox HSS (accelerometers)
Power Quality voltage and current probes
37. SC3 Pilot sample sensors
Conventional vibration CMS (dynamic content ~3kHz)
Acoustic emission CMS (dynamic content 100~400kHz)
38. Big Data Landscape (Matt Turck)
39. Pilot Data Acquisition module
Access through TeamViewer
40. Pilot Data Acquisition module
Bending Moment signals for Mechanical Loading Monitoring
41. Pilot Data Acquisition module
Electrical signals for Power Quality Monitoring
42. Pilot Data Acquisition module
Vibration signal for Gearbox monitoring
43. Pilot Data Acquisition module – On line component
OPC server monitor on DAQ app
44. Pilot Data Acquisition module – On line component
OPC: An Industrial Interoperability Standard for Open Platform Communications
https://opcfoundation.org/
45. Pilot Data Acquisition module – On line component
OPC Server operating at: opc.tcp://193.92.104.xxx:49580
Object name: PrjBDE.WT.StatAve
Encryption: SHA256
Data type: 1D float array, 52 elements
Transferred parameters: On-line statistical properties
Refresh rate: 2Hz
Server: CompactRIO running NI Real-Time Linux OS
Client: Industrial PC running LabVIEW
46. Pilot Data Acquisition module – AE component
◎ 4 AE (R30) sensors were installed
◎ DAQ system records data at a rate of 10~20GB/hour depending on threshold (research item)
◎ Data preprocessor module (binary to binary)
◎ Additional feature module
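Since the recorded AE volume depends on a threshold, a minimal sketch of threshold-based hit detection may help to picture the principle. This is a generic illustration, not the pilot's acquisition code: the function name, the `hold_off` parameter and the sample signal are hypothetical.

```python
def detect_hits(signal, threshold, hold_off):
    """Return (start, end) sample indices of hits. A hit starts at the first
    sample whose magnitude exceeds `threshold` and ends at the last such
    sample that is followed by `hold_off` samples without a crossing."""
    hits = []
    start = last = None
    for i, x in enumerate(signal):
        if abs(x) > threshold:
            if start is None:
                start = i
            last = i
        elif start is not None and i - last >= hold_off:
            hits.append((start, last))
            start = last = None
    if start is not None:
        # The signal ended while a hit was still open.
        hits.append((start, last))
    return hits

signal = [0, 0, 5, -4, 0, 3, 0, 0, 0, 0, 6, 0]   # hypothetical AE samples
hits = detect_hits(signal, threshold=2, hold_off=3)
print(hits)
```

Raising the threshold yields fewer and shorter hits, so less data is stored, which is why the threshold setting drives the recorded volume.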
47. Basic analytics
Raw time series
Statistics and correlations
Dynamic analysis
Fatigue analysis
48. Research aspects
◎ Add new descriptive features and optimize parameters
◎ Describe normal operation signature
◎ Back-to-back comparison with vibration CMS
Source: Hit Detection and Determination in AE Bursts, InTechOpen
50. Pilot concept
150TB Raw Data
Preprocessing (i.e. 60 sec chunks)
Analysis Code
Cluster execution
Storage of Intermediate Results
Database
Final Results
Standalone NAS grid
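The preprocessing step of the pilot concept, cutting raw data into 60-second chunks before analysis, can be sketched as follows. The function name, the statistics chosen and the sample data are assumptions for illustration; in the pilot each chunk would be handed to analysis code running on the cluster.

```python
def chunk_stats(samples, sample_rate_hz, chunk_seconds=60):
    """Yield (chunk index, min, max, mean) for consecutive fixed-length chunks.
    The last chunk may be shorter than the others."""
    chunk_len = int(sample_rate_hz * chunk_seconds)
    for k, start in enumerate(range(0, len(samples), chunk_len)):
        chunk = samples[start:start + chunk_len]
        yield k, min(chunk), max(chunk), sum(chunk) / len(chunk)

# Hypothetical signal: 150 samples at 1 sample per second.
samples = list(range(150))
stats = list(chunk_stats(samples, sample_rate_hz=1))
print(stats)
```

Each chunk is processed independently, so chunks can be distributed over cluster nodes and their intermediate results stored before the final results are assembled.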
51. Final Remarks – BDE opportunities
Big Data Europe offers an easy-to-use generic platform to encourage the community’s
entrance to big data technologies
The BDE platform will be maintained and will be part of future research proposals
You are invited to implement your case with BDE:
https://www.big-data-europe.eu/bdi-components/
https://www.big-data-europe.eu/howto-install-the-bde-platform/
https://github.com/big-data-europe
Getting Started With BDE Platform
Tech. Support is provided by BDE tech team. Contact: TENFORCE Aad Versteden aad.versteden@tenforce.com
52. Final Remarks – BDE opportunities
Stay tuned: www.big-data-europe.eu
Join the tech Webinar on 16th November, when the technical team will be setting
out what the Big Data Europe Integrator Platform can do, how it does it, and how
you can use it to derive more value from your data. This launch webinar also
includes our latest achievements in semantifying the Big Data stack.
Meet the tech team at EDF 2017
53. Final Remarks – BDE opportunities
Join the 3rd Big Data Europe Workshop in:
Join the 3rd Webinar in December
54. Thank you for your attention
BDE consortium