The Briefing Room with Robin Bloor and Pervasive Software
Slides from the Live Webcast on May 1, 2012
The old methods of delivering data for analysts and other business users will simply not scale to meet new demands. Hadoop is rapidly emerging as a powerful and economical platform for storing and processing Big Data. And yet, the biggest obstacle to implementing Hadoop solutions is the scarcity of Hadoop programming skills.
Check out this episode of The Briefing Room to learn from veteran analyst Robin Bloor, who will explain why modern information architectures must embrace the new, massively parallel world of computing as it relates to several enterprise roles: traditional business analysts, data scientists, and line-of-business workers. He'll be briefed by David Inbar and Jim Falgout of Pervasive Software, who will explain how Pervasive RushAnalyzer™ was designed to accommodate the new reality of Big Data.
For more information visit: http://www.insideanalysis.com
Watch us on YouTube: http://www.youtube.com/playlist?list=PL5EE76E2EEEC8CF9E
3. Reveal the essential characteristics of enterprise
software, good and bad
Provide a forum for detailed analysis of today’s
innovative technologies
Give vendors a chance to explain their product to
savvy analysts
Allow audience members to pose serious questions...
and get answers!
Twitter Tag: #briefr
Tuesday, May 1, 12
5. Ultimately, analytics is about businesses making optimal
decisions, although the range of technologies that inhabit
this area is wide: statistical analysis, data mining, process
mining, predictive analytics, predictive modeling, business
process modeling, and complex event processing.
With the advent of big data, analytics has become “big
analytics” with organizations diving into large heaps of data
that previously was not available or usable.
Open source technologies (Hadoop, etc.), in conjunction with
the cloud, have expanded the range of what is possible in
the cloud and considerably reduced the price of leveraging
new and often very substantial data sources.
6. Robin Bloor is Chief Analyst at The Bloor Group.
Robin.Bloor@Bloorgroup.com
7. Pervasive Software, a provider of data integration and
database software, introduced Pervasive DataRush, a
parallel data flow development platform, several years
ago.
Aside from marketing that capability, it has been using it
to build data integration and data flow enabled BI
products that exploit the DataRush capability.
Pervasive RushAnalyzer is one of the new parallel BI products
that has been built using DataRush. It is aimed squarely at
solving problems in the management and analysis of big
data, and delivering new capabilities.
8. David Inbar is Senior Director, Pervasive Big Data Products &
Solutions, leading the business and product management
functions for Pervasive’s Big Data Products group. Previously
he led the global marketing and international channels
teams for Pervasive’s Integration Products group as well as
the company’s Innovation Lab. David has driven innovative
business models and technology adoption strategies for many
application development and data management products.
Jim Falgout is Chief Technologist, Pervasive Big Data
Products and Solutions. As Chief Technologist for Pervasive’s
Big Data team, Jim Falgout is responsible for setting
innovative design principles that guide Pervasive engineering
teams as they develop new big data-focused releases and
products. Jim is responsible for the architectural design of a
software development platform for parallel applications that
deliver high throughput on big data.
9. May 1, 2012
Drinking from the Fire Hose:
Practical Approaches to
Big Data Preparation and Analytics
The Briefing Room
bigdata.pervasive.com
10. The Internet is the Fuel for the Fire
Source: IBM Corporation
11. The Real Culprit: an Internet of Things
Source: McKinsey Global Institute report on Big Data, May 2011
15. Big Data Analytics Software Requirements
Additional Requirements
• Must be usable by business users and analysts
• Graphical/visual environment
• Option to extend via scripting
• Scalable and cross-platform: laptop, desktop, Hadoop cluster
19. Pervasive RushAnalyzer Key Differentiators
• Comprehensive ETL and data preparation
• Analytics data scientists will love: machine learning
• Works with existing toolsets
• No cost to get started
• Scales from laptop to server to Hadoop clusters
• True distributed computing on Hadoop clusters
22. At the moment Big Data is often managed as “a project on
the side”, isolated from the normal data flows associated
with data warehousing.
This situation will not last. Either the large data heaps are
ephemeral or they are here to stay. But once you start
gathering data, you don’t usually stop.
If the big data heaps are here to stay, they require data
flow architecture. In that sense the Hadoop-Hive-HBase-Pig
arrangement is really just a big prototype.
That data flow architecture must serve both big data
analysis and traditional data warehousing.
24. We not only have the challenges of big data and big data
flow, we also have the problem of data pool proliferation
and the opportunities provided by data mashup/discovery.
If we extrapolate from now, we run into a complexity of
data flows that can no longer be managed by point-to-point
thinking.
In effect we get a combinatorial explosion, which
dictates the need, in fact the necessity, for data flow
architecture and data analysis architecture.
If it didn’t deliver value, no-one would do it.
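A rough way to see why point-to-point thinking stops scaling is to count the flows. The sketch below is illustrative only and is not from the briefing: it assumes every data pool may feed every other pool directly, and compares that with a hypothetical shared data flow hub. The function names are made up for this example, and pairwise counting is only a lower bound on the real complexity.

```python
def point_to_point_flows(pools: int) -> int:
    """Directed flows if every pool may feed every other pool directly."""
    return pools * (pools - 1)


def hub_flows(pools: int) -> int:
    """Flows if each pool only sends to and receives from one shared hub."""
    return 2 * pools


if __name__ == "__main__":
    # Compare the two arrangements as the number of data pools grows.
    for n in (5, 10, 50):
        print(f"{n} pools: {point_to_point_flows(n)} point-to-point flows, "
              f"{hub_flows(n)} hub flows")
```

Under these assumptions, 10 pools already imply 90 possible direct flows against 20 through a hub, and the gap widens quickly as pools proliferate.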
25. The PC revolution, the Internet revolution, and the mobile
revolution were all surprises, even for those who saw them
coming. They all brought more data and more data
distribution.
The coming Embedded revolution could be characterized
as “the web of intelligent things” - things that know their
state, report their state, can respond to their state or can
respond collectively.
Think of:
A cup that knows what’s in it
A house that knows who’s home
A car that knows how much you had to drink
26. The Challenge is Speed and Complexity
Big Data has only just begun: think of current big data projects as the early spreadsheets.
Data flow architecture is already an issue.
Complexity is increasing.
Speed is the enabler or the barrier.
27. Questions
It is not clear to me what product classification this falls
under. It appears to be a data flow architecture design and
implementation capability. Is that the case?
What does RushAnalyzer complement? What does it
compete with?
What interfaces does it have to different data sources?
Clearly this is very fast operationally, because of the
underlying parallelism. Can you give us some idea of how
this compares in speed terms with, for example, a Hadoop
arrangement aimed at a similar set of capabilities?
What skills are required to make best use of this capability?
28. Questions
Who have been the early adopters of this kind of capability
and what kind of business problems are they trying to solve?
Which vertical business sectors have shown most interest
and which have shown least interest?
Quo vadis?