Real time video analytics with InfoSphere Streams, OpenCV and R

Unstructured data is a fast-growing area and a source of many innovative Big Data & Analytics solutions. It is often assumed that unstructured data means text, even though text is just a small part of it. A lot of that "new data" is sensor data and especially multimedia (audio, video). Even though this part is growing extremely fast, it is very rarely used in analytics today, and even less in a real time context.

To experience what it means and how it feels to work with this new data in real time (and whether it is possible to make sense of it), Wilfried Hoge and I created a demo that shows our own experience and explains important concepts and implementation approaches. The demo shows drilling equipment of the kind used to build tunnels, and how to analyze the output on the conveyor belt visually with machine learning approaches.

Published in: Technology

  1. Real time video analytics with InfoSphere Streams, OpenCV and R
     data2day conference 2014, Karlsruhe
     Stephan Reimann – IT Specialist Big Data – stephan.reimann@de.ibm.com, @stereimann
     Wilfried Hoge – IT Architect Big Data – hoge@de.ibm.com, @wilfriedhoge
     © 2013 IBM Corporation
  2. Motivation: Use machine data to make machines smarter
     – Modern machines produce an incredible amount of data
     – Use machine-generated data to
       – make machines more efficient
       – reduce downtimes with better maintenance management
       – prevent failures
       -> make machines smarter
     – Also use unstructured data such as video
     – Use that data in real time
  3. The demo scenario: Imagine tunnel drilling equipment whose conveyor belt is continuously monitored by a video camera
     – What if you could detect a problem in real time and take appropriate action, such as stopping the machine to prevent damage?
     – Our demo analyzes the data from a single camera to keep it easy to understand; in a real-life scenario there are usually many structured and unstructured data sources, which are most likely combined (e.g. analyzing the image data together with speed information)
     – And since we did not have one, we created one
  4. Streaming analytics is a paradigm shift from pull to push: analytics in real time, directly "on the wire", where data does not need to be persisted
     Traditional approach (Data -> Repository -> Analysis -> Insight)
     – Historical fact finding
     – Analyze persisted data
     – (Micro-) Batch philosophy
     – PULL approach
     Streaming analytics (Data -> Analysis -> Insight)
     – Analyze the current moment / the now
     – Analyze data directly "in motion", without storing it
     – Analyze data at the speed it is created
     – PUSH approach
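
The pull/push contrast on this slide can be sketched in a few lines of plain Python. This is only an illustration, not InfoSphere Streams code: `pull_average`, `RunningAverage` and the sample readings are hypothetical names and values chosen for the example. The pull variant needs the whole persisted data set, while the push variant updates its result for every arriving value and keeps no history.

```python
from typing import Callable, List


def pull_average(repository: List[float]) -> float:
    """Pull style: data was persisted first, then queried as a batch."""
    return sum(repository) / len(repository)


class RunningAverage:
    """Push style: each value updates the result on arrival and is then discarded."""

    def __init__(self, on_insight: Callable[[float], None]) -> None:
        self.count = 0
        self.mean = 0.0
        self.on_insight = on_insight

    def push(self, value: float) -> None:
        self.count += 1
        # Incremental mean: only the count and the current mean are kept.
        self.mean += (value - self.mean) / self.count
        self.on_insight(self.mean)


readings = [2.0, 4.0, 6.0, 8.0]  # made-up sensor values
insights: List[float] = []
stream = RunningAverage(insights.append)
for reading in readings:
    stream.push(reading)  # an insight is available after every single value
```

After the loop, `insights` holds one result per reading, and the last one equals what `pull_average(readings)` computes from the stored batch.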
  5. LIVE - DEMO
  6. We used standard algorithms from OpenCV to extract the interesting part of the pictures by learning and removing the background
     – We are only interested in the objects on the conveyor belt, not in the conveyor belt itself
     – But we don't know which objects will pass by; there may be many different ones
     – One approach is to describe the background and filter it out, in other words: outlier analysis
     – We used a standard algorithm (CodeBook) from OpenCV (an open source image analytics library)
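
The idea of describing the background and treating everything else as an outlier can be sketched as follows. This is a deliberately simplified stand-in, not OpenCV's CodeBook implementation: it assumes grayscale frames given as nested lists and learns a single value range per pixel, whereas a codebook model keeps several such ranges per pixel. The class name `RangeBackgroundModel` and the `tolerance` parameter are hypothetical.

```python
from typing import List

Frame = List[List[int]]  # grayscale image as rows of 0..255 values


class RangeBackgroundModel:
    """Per-pixel [low, high] range learned from background-only frames."""

    def __init__(self, tolerance: int = 10) -> None:
        self.tolerance = tolerance
        self.low: Frame = []
        self.high: Frame = []

    def learn(self, frame: Frame) -> None:
        """Widen each pixel's range so that it covers the new frame."""
        if not self.low:
            self.low = [row[:] for row in frame]
            self.high = [row[:] for row in frame]
            return
        for y, row in enumerate(frame):
            for x, value in enumerate(row):
                self.low[y][x] = min(self.low[y][x], value)
                self.high[y][x] = max(self.high[y][x], value)

    def foreground_mask(self, frame: Frame) -> Frame:
        """Return 255 where a pixel falls outside its learned range, else 0."""
        t = self.tolerance
        return [
            [
                0 if self.low[y][x] - t <= value <= self.high[y][x] + t else 255
                for x, value in enumerate(row)
            ]
            for y, row in enumerate(frame)
        ]


model = RangeBackgroundModel(tolerance=5)
model.learn([[50, 52], [49, 51]])  # two frames of the empty belt (made-up values)
model.learn([[53, 50], [48, 55]])
mask = model.foreground_mask([[51, 200], [47, 10]])  # right column holds an object
```

Pixels that stay within the learned range are marked as background (0); pixels far outside it are marked as foreground (255).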
  7. The background removal included several steps such as data preparation & cleansing and background detection & removal
  8. The background removal included several steps such as data preparation & cleansing and background detection & removal
     – Filter: Select the area of interest
     – Cleanse: Reduce the noise level
     – Analyze & Transform: Learn the background and create a mask (black = background, white = foreground)
     – Cleanse: Reduce the noise level of the background detection
     – Transform: Combine the background detection with the original image; it's basically a logical AND
     – Just for visualization: Create the blue separator image
     – Publish: The Export operators provide the data to other streaming analytics applications (here: the visualization & the color analytics) via publish & subscribe
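
Two of the steps above, cleansing the mask and combining it with the original image, can be sketched in plain Python. The slide does not say which noise filter the demo used, so the isolated-pixel removal in `remove_speckles` is an assumption made for illustration; `apply_mask` shows the logical AND on grayscale values. Both function names and the sample data are hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale image or mask as rows of 0..255 values


def remove_speckles(mask: Image) -> Image:
    """Keep a foreground pixel only if at least one 4-neighbour is foreground too."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] == 0:
                continue
            neighbours = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            if any(0 <= ny < h and 0 <= nx < w and mask[ny][nx] for ny, nx in neighbours):
                out[y][x] = 255
    return out


def apply_mask(image: Image, mask: Image) -> Image:
    """Bitwise AND of each pixel with its mask value (255 keeps, 0 blanks)."""
    return [
        [pixel & m for pixel, m in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


raw_mask = [
    [255, 0, 0, 0],      # lone pixel in the corner: detection noise
    [0, 0, 255, 255],
    [0, 0, 255, 0],
]
image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
]
clean_mask = remove_speckles(raw_mask)
foreground = apply_mask(image, clean_mask)
```

The lone corner pixel disappears from the mask, and only the pixels under the remaining white area survive in `foreground`.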
  9. One approach to image analytics is extracting features and using a variety of statistical/mathematical concepts to deduce the semantics
     [Diagram: multimedia data -> features (Background, Frequencies, Spectrum, Edges, Camera Motion, Energy, Zero-crossings, Shape, Texture, Motion, Color, ...) -> models built from labeled data (positive and negative examples) and unlabeled data (AdaBoost, K-means, Regression, Bayes Net, Nearest Neighbor, Neural Net, Deep Belief Nets, GMM, Clustering, Markov Model, Decision Tree, Expectation Maximization, Factor Graph, SVMs, Ensemble Classifiers, Active Learning) -> semantics (Scenes, Locations, Settings, Objects, Activities, Actions, Behaviors, People, Places, Faces, Events, ...)]
  10. Typical image features used for analytics include color, shapes, texture and many more; we focused on color for the demo
      [Diagram: taxonomy of visual features (among others Color Histogram, Color Correlogram, Color Moments, Dominant Colors, Wavelet Texture, Tamura Texture, Local Binary Patterns, Edge Histogram, Shape Moments, Fourier Shape, Hough Circle, Curvelets, Interest Points) and spatial granularities (Global, Grid, Pyramid, Layout, Center, Cross, Horizontal, Vertical)]
  11. We have calculated several color features and the object's area; now we can use them for calculations / analytics
      – Features: area (in pixels), absolute color values, color histogram
      – The cool thing: Now you have attributes! It's structured! You can use it directly or combine it with other data sources, e.g. calculate conveyor belt throughput based on area and speed information.
      – Import: Receives the data from the background separation app via subscribe
      – Analytics: Calculates the color attributes and the area
      – Visualization: Writes the text and draws the color histogram
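
How an image turns into structured attributes can be sketched like this. The function `color_features`, the choice of four histogram bins and the sample pixels are assumptions for illustration; the slide names only the three kinds of features (area, absolute color values, color histogram), not how the demo computed them.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0..255


def color_features(
    pixels: List[List[Pixel]], mask: List[List[int]], bins: int = 4
) -> Dict[str, object]:
    """Area, mean color and a per-channel histogram over the masked pixels."""
    selected = [
        pixel
        for row, mask_row in zip(pixels, mask)
        for pixel, m in zip(row, mask_row)
        if m  # non-zero mask value means foreground
    ]
    area = len(selected)
    histogram = [[0] * bins for _ in range(3)]
    sums = [0, 0, 0]
    for pixel in selected:
        for channel in range(3):
            sums[channel] += pixel[channel]
            histogram[channel][pixel[channel] * bins // 256] += 1
    mean = [s / area for s in sums] if area else [0.0, 0.0, 0.0]
    return {"area": area, "mean_rgb": mean, "histogram": histogram}


pixels = [
    [(200, 10, 10), (0, 0, 0)],
    [(220, 30, 20), (180, 20, 0)],
]
mask = [
    [255, 0],
    [255, 255],
]
features = color_features(pixels, mask)
```

The result is an ordinary record of numbers, which can be joined with other structured data such as belt speed.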
  12. We have "marked" (labeled) the structured data from the color analytics application and used it to train a model to detect object classes
      – Describing explicitly what characterizes an object class is difficult or impossible, so we let the algorithm behind the model learn it from the numbers
      – The algorithm just needs the marked data (= training data set)
      – Marked data means we provided the information about which object class was visible at which time
  13. The model is created from the training data when the application is started, and predicts the object class for each image in real time
      – We used R (open source software for statistics and advanced analytics) to create the predictive model
      – The model is created when the streaming analytics application is started
      – Once the application is running, the score and the prediction are calculated for each individual image (in other words, the predictive model is applied); this is called scoring
      – In our demo the model is trained only once at startup and remains constant afterwards, but it is also possible to refresh models continuously or at certain intervals
      – Import: Receives the data from the color analytics app via subscribe
      – Visualization: Visualizes the results
      – Visualization: Writes the prediction as text on the image
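
The train-once, score-per-image pattern can be sketched as follows. The demo built its model in R; this Python sketch swaps in a one-split decision tree (a "stump") purely to show the pattern, and the object classes ("brick", "gravel"), the feature layout (mean red, mean green) and all numbers are invented for the example.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, object class label)


def majority(labels: List[str]) -> str:
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(samples: List[Sample]) -> Dict[str, object]:
    """Fit a one-split decision tree by minimizing training errors."""
    best: Optional[Dict[str, object]] = None
    for feature in range(len(samples[0][0])):
        for threshold in sorted({x[feature] for x, _ in samples}):
            left = [y for x, y in samples if x[feature] <= threshold]
            right = [y for x, y in samples if x[feature] > threshold]
            if not left or not right:
                continue
            left_label, right_label = majority(left), majority(right)
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if best is None or errors < best["errors"]:
                best = {"feature": feature, "threshold": threshold,
                        "left": left_label, "right": right_label, "errors": errors}
    if best is None:
        raise ValueError("training data needs at least two distinct feature values")
    return best


def score(model: Dict[str, object], features: Sequence[float]) -> str:
    """Apply the trained model to one feature vector."""
    side = "left" if features[model["feature"]] <= model["threshold"] else "right"
    return model[side]


# Marked training data: (mean red, mean green) -> class visible at that time.
TRAINING_DATA: List[Sample] = [
    ((200, 20), "brick"), ((210, 30), "brick"),
    ((90, 95), "gravel"), ((100, 90), "gravel"),
]
MODEL = train_stump(TRAINING_DATA)  # created once at application start

# Scoring: the fixed model is applied to every incoming feature vector.
predictions = [score(MODEL, features) for features in [(205, 25), (95, 92)]]
```

Refreshing the model continuously or at intervals, as the slide mentions, would amount to calling `train_stump` again on newer marked data and replacing `MODEL`.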
  14. The demo has shown image analytics on one feature and model; in reality a combination of several features & models is used
      [Diagram as on slide 9, with the feature Color and the model Decision Tree called out]
  15. A freely available webcast from IBM Research provides further insights into image and video analytics and the theory behind it
      – IBM Analytics Education Series: Lecture 7 - Multimedia - Image and Video Analytics
  16. We used InfoSphere Streams for the real time analytics and extended it with R and OpenCV for the implementation
      – R: Open Source software for statistics and advanced analytics, http://cran.r-project.org/
      – OpenCV: Open Source computer vision and machine learning software library, http://opencv.org/ & InfoSphere Streams OpenCV Toolkit on GitHub
      – InfoSphere Streams: Software for real time analytics on any kind of Big Data
        – Free Quickstart Edition: ibm.co/streamsqs
        – Developer Community: www.ibmdw.net/streamsdev/ (+ Tutorials, Labs, Forum, ...)
        – GitHub Community: github.com/IBMStreams (+ Toolkits, Toolkits, Toolkits)
  17. InfoSphere Streams is the result of an IBM research project, designed for high throughput and low latency, and to make streaming analytics easy
      InfoSphere Streams capabilities
      – Scale out: Millions of events per second
      – Complex data & analytics: All kinds of data; complex analytics: everything you can express via an algorithm
      – Low latency: Analyzes data at the speed it is created; latencies down to μs; immediate action in real time
      How it works
      – Define apps as flow graphs consisting of sources (inputs), operators & sinks (outputs)
      – Extend the functionality with your own code if required, for full flexibility
      – The clustered, distributed runtime on commodity hardware scales nearly without limit
      – GUIs for rapid development and operations make streaming analytics easy
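
The shape of such a flow graph (source, operators, sink) can be mimicked with Python generators. This is not SPL and not the Streams API; `source`, `filter_op`, `map_op` and `sink` are hypothetical names, and the generators only mirror how an application is wired together, not the clustered, distributed runtime.

```python
from typing import Callable, Iterable, Iterator, List


def source(values: Iterable[int]) -> Iterator[int]:
    """Source: emits tuples one at a time."""
    yield from values


def filter_op(stream: Iterable[int], predicate: Callable[[int], bool]) -> Iterator[int]:
    """Operator: forwards only the tuples that satisfy the predicate."""
    return (t for t in stream if predicate(t))


def map_op(stream: Iterable[int], fn: Callable[[int], int]) -> Iterator[int]:
    """Operator: transforms every tuple."""
    return (fn(t) for t in stream)


def sink(stream: Iterable[int], out: List[int]) -> None:
    """Sink: consumes the stream and hands each result to the outside world."""
    for t in stream:
        out.append(t)


results: List[int] = []
# Wire the graph: source -> filter (even values) -> map (times ten) -> sink
graph = map_op(filter_op(source(range(6)), lambda t: t % 2 == 0), lambda t: t * 10)
sink(graph, results)
```

Each stage handles one tuple at a time, so no stage ever needs the full data set in memory.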
  18. InfoSphere Streams is already used in a broad range of real time analytics applications across industries
      – Industries: Telecommunication, Transport, Manufacturing, Security, Radio astronomy, Healthcare, Industrie 4.0, Energy & Utilities, Connected Car
      – Maturity: Present / In production, Trends, Prototypes
      – InfoSphere Streams ...
        – ... optimizes the traffic in Stockholm and Dublin
        – ... analyzes acoustic signals to protect sensitive areas
        – ... optimizes the quality of mobile networks
        – ... is the foundation for real-time campaigns to increase customer satisfaction and revenues
        – ... analyzes and selects images in real time within the world's largest radio telescope
        – ... and is a core component within many innovation initiatives
  19. Where technology meets business potential: Start making sense of your data (in real time), it is possible!
      – Gain value from your data
        – Make maintenance more predictable to reduce downtimes
        – Detect error patterns to prevent failures
        – Better understand complex systems and their dependencies to improve efficiency
      – There are many opportunities to gain value from data. Let's talk about how to make sense of your data! http://www-05.ibm.com/de/events/workshop/bigdata/
