Ray Newmark: I’d like to thank everyone for joining us today to learn a little more about how they can achieve the performance they require. My name is Ray Newmark and I’m the Vice President of Sales and Marketing for Pervasive Software’s DataRush business and I’ll be your host. But don’t worry, I’m going to get out of the way very shortly and let our Chief Technologist Jim Falgout have the floor.

A few housekeeping notes:
This webinar is being recorded and will be available on our website for viewing.
At any time during the webinar, you may enter questions in the Q&A window. We will address them at the end of the presentation. If we can’t get to your question during the time allotted, we will respond to you by email.
We will have a few survey questions for you as we go through the presentation. You’ll have the opportunity to enter your answers and see the polling results.
Big Data and Dataflow: Made for Each Other
The Need for Other Compute Models
“… in addition, these data stores often expose a proprietary interface for application programming (e.g. PL/SQL or TSQL), but not the full power of procedural programming. More programmer-friendly parallel dataflow languages await discovery, I think. MapReduce is one (small) step in that direction.”
Engineer-to-Engineer Lectures, Jeff Hammerbacher, June 2010
Support for Other Programming Paradigms
“MapReduce NextGen provides a completely generic computation framework to support MapReduce and other paradigms.”
The Next Generation of Apache Hadoop MapReduce, Arun C Murthy, February 2011
What is dataflow
Based on operators that provide a specific function (nodes)
Data queues (edges) connecting operators
High Productivity Message Passing Architecture
Natural Fit for Big Data
[Figure: example pipeline of operators: find, grep, awk, sort]
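To make the node-and-edge picture concrete, here is a minimal sketch in Python of a four-stage pipeline loosely modeled on the find, grep, awk, sort example. It is not DataRush code and does not use any DataRush API. The operator names (`source`, `filter_op`, `map_op`, `sort_op`), the `run_pipeline` helper, and the sample paths are all assumptions made up for illustration. Each operator runs in its own thread and talks to its neighbors only through bounded queues.

```python
import queue
import threading

_DONE = object()  # sentinel that marks the end of a stream (illustrative convention)


def source(items, out_q):
    """Emit each input item downstream (plays the role of `find`)."""
    for item in items:
        out_q.put(item)
    out_q.put(_DONE)


def filter_op(predicate, in_q, out_q):
    """Pass along only the items that match (plays the role of `grep`)."""
    while (item := in_q.get()) is not _DONE:
        if predicate(item):
            out_q.put(item)
    out_q.put(_DONE)


def map_op(fn, in_q, out_q):
    """Transform each item (plays the role of `awk`)."""
    while (item := in_q.get()) is not _DONE:
        out_q.put(fn(item))
    out_q.put(_DONE)


def sort_op(in_q, out_q):
    """Buffer the whole stream, then emit it in order (plays the role of `sort`)."""
    buffered = []
    while (item := in_q.get()) is not _DONE:
        buffered.append(item)
    for item in sorted(buffered):
        out_q.put(item)
    out_q.put(_DONE)


def run_pipeline(paths):
    """Wire the operators (nodes) together with bounded queues (edges)."""
    q1, q2, q3, q4 = (queue.Queue(maxsize=4) for _ in range(4))
    stages = [
        threading.Thread(target=source, args=(paths, q1)),
        threading.Thread(target=filter_op,
                         args=(lambda p: p.endswith(".log"), q1, q2)),
        threading.Thread(target=map_op,
                         args=(lambda p: p.rsplit("/", 1)[-1], q2, q3)),
        threading.Thread(target=sort_op, args=(q3, q4)),
    ]
    for stage in stages:
        stage.start()

    # The caller drains the final edge while the stages are still running.
    results = []
    while (item := q4.get()) is not _DONE:
        results.append(item)
    for stage in stages:
        stage.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(["/var/log/b.log", "/tmp/notes.txt", "/var/log/a.log"]))
```

The bounded queues give simple back-pressure: a fast upstream operator blocks when its output queue is full, so no stage needs to know how fast its neighbors run. A production dataflow engine would add scheduling, partitioning, and error handling that this sketch leaves out.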
Where it’s been applied
Bioinformatics
  Next Generation Sequencing
  Nearly 1 TCUP throughput using Smith-Waterman
  Scalable BFAST implementation
Telecom
  Analyzing Call Data Records (network logs)
  Operational intelligence
  Fraud and waste detection
Public Sector
  State income tax revenue recovery
  Cyber security
Financial Services
  Mortgage analysis
Healthcare
  Claims processing and analysis
  Fraud detection
Network
  Analyzing network log data
  Cyber security