Five Critical Success Factors for Big Data and Traditional BI
The Briefing Room with Dr. Robin Bloor and VelociData
Live Webcast Dec. 10, 2013
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?AT=pb&SP=EC&rID=7909837&rKey=b0bac7d09bf1a638

Most Big Data discussions focus on analytics, but business users need more than that. They need speed, because most opportunities these days are transient and must be acted on quickly. Bottlenecks in the delivery of analytic results often occur on the gathering and transformation side, where massive volumes of data must be validated, converted, masked or otherwise transformed before hitting the analytics engine. Big Data is rapidly overrunning conventional approaches, creating requirements for accelerated, hybrid systems.

Register for this episode of the Briefing Room to hear veteran IT Analyst Dr. Robin Bloor, as he explains how a combination of innovations is dramatically changing how companies can solve serious data transformation challenges. Robin will be briefed by Ron Indeck of VelociData, who will tout their record-breaking data operations appliance. He'll also discuss five critical success factors for achieving optimal performance, including the necessary infrastructure for executing data transformations at wire speed.

Visit InsideAnalysis.com for more information

Published in: Technology
Five Critical Success Factors for Big Data and Traditional BI

  1. 1. Grab some coffee and enjoy the pre-show banter before the top of the hour!
  2. 2. Five Critical Success Factors for Big Data and Traditional BI The Briefing Room
  3. 3. Welcome Host: Eric Kavanagh eric.kavanagh@bloorgroup.com Twitter Tag: #briefr The Briefing Room
  4. 4. Mission •  Reveal the essential characteristics of enterprise software, good and bad •  Provide a forum for detailed analysis of today's innovative technologies •  Give vendors a chance to explain their product to savvy analysts •  Allow audience members to pose serious questions... and get answers!
  5. 5. Topics This Month: INNOVATORS January: ANALYTICS February: BIG DATA 2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room
  6. 6. Data Discovery & Visualization INNOVATORS
  7. 7. Analyst: Robin Bloor Robin Bloor is Chief Analyst at The Bloor Group robin.bloor@bloorgroup.com
  8. 8. VelociData •  VelociData offers purpose-built big data operations appliances •  Its solutions combine field-programmable gate arrays (FPGAs), graphics processing units (GPUs) and central processing units (CPUs) to enable high-speed parallelism •  VelociData can improve data transformation and data quality performance by several orders of magnitude
  9. 9. Guests: Ron Indeck and Chris O’Malley Ron Indeck is President, CTO and Founder of VelociData Chris O’Malley is CEO of VelociData
  10. 10. VelociData: Solving the Need for Speed in Big DataOps The Bloor Group – December 10, 2013 Fall 2013 www.velocidata.com @velocidata tel.: 314.785.0601 info@velocidata.com
  11. 11. Dr. Ronald Indeck – Founder and President, VelociData •  Founder and CTO, Exegy •  Former Professor, Washington University •  Das Family Distinguished Professor •  Director, Center for Security Technologies •  Former President, Institute of Electrical & Electronics Engineers (IEEE) Magnetics Society •  Past Recipient Bar Association Inventor of the Year
  12. 12. Five Critical Success Factors for Leveraging Data 1.  Don’t ignore data ingest and transformation 2.  Data Integration speed and cost really count 3.  Hadoop alone does not solve the problem 4.  VelociData eliminates data ingest bottlenecks 5.  Big Data project risks can be mitigated effectively
  13. 13. Why Data is Breaking the Seams of Conventional Options. Competitive advantage is achieved by seizing the opportunity presented in transient business moments; this is creating a crisis between the growth of data sources and the relentless quest for faster insights. •  Volume: Data volume growing exponentially at 55% annually •  Variety: Must harness numerous new data sources •  Velocity: Must reconcile data moving at differing speeds: batch, streaming, archived. These factors are compounded by Hadoop, which offers data management at ~80% less cost than conventional approaches, justifying storage of everything over longer periods of time. This is spawning business ideas for monetizing data, creating use cases that require massive acceleration of data operations able to handle the scale and complexity of the 3Vs. Following conventional best practices no longer satisfies critical business applications. CSF #1: Don’t ignore data ingest and transformation
  14. 14. What are Conventional Options for Accelerating DataOps? Conventional options for improving data operations performance under the following requirements: •  high volume (e.g., 10M+ row, densely populated tables) •  high growth (e.g., >60% annually) •  multiple varieties and sources (structured and unstructured) •  high velocity (e.g., data available in less than an hour). Options: •  Add cores to existing ETL processes •  Add MIPS to existing IBM mainframe data integration jobs •  Push down optimization (ELT) •  Hadoop (ELT) •  Entirely new engineered system platform. Comparison criteria: Complexity, Cost, Scalability, Performance. CSF #2: Data integration speed and cost really count CSF #3: Hadoop alone doesn’t solve the problem
  15. 15. VelociData Solution Palette VelociData Suites VelociData Solutions Lookup and Replace Examples Conventional VelociData (records/second) (records/second) Data enrichment by populating fields from a master file <3000 600,000 500 700,000 XML → Fixed; Binary → Char 1000-2000 800,000 2013-01-02 → 01/02/2013 1000-3000 800,000 Cardio Pulmonologist → CP Type Conversions Format Conversions Rearrange, add, drop, or resize fields to change layouts 1000 650,000 Surrogate Key Generation Hash multiple field values into a unique pseudo-key 3000 > 1,000,000 Generate MD5 or SHA hash keys 3000 > 1,000,000 Data Masking Data Transform Obfuscate data for non-production uses: Persistent or Dynamic; Format preserving encryption; AES-256 500-1000 > 1,000,000 600 400,000 Validate a value based on a list of acceptable values (e.g., all states in the US; all countries in the world) 1000-3000 750,000 Validates based on patterns such as emails, dates, phone numbers, … 1000-3000 > 1,000,000 3000 > 1,000,000 200 > 200,000 Standardization, verification, and cleansing USPS Address Processing (CASS certification in process) Data Quality Domain Data Validation Field Validation Data type validation and bounds checking Data Platform Offload Mainframe Data Offload Copybook parsing & data layout discovery; EBCDIC, COMP, COMP-3, … → ASCII, Integer, Float, … Results are system dependent but data intended to provide magnitude comparison CSF #4: VelociData eliminates data ingest bottlenecks
  16. 16. The New World Data Challenges Being Solved •  Credit card company reduces MIPS and improves performance to integrate historical and fresh data into Hadoop analytics process by processing 10 million records per minute •  Financial processing network masks 5 million fields per second of production data to sell opportunity information to retailers •  Health benefits provider enables customer support by shortening a data integration process from 16 hours to 45 seconds •  Property casualty company shortens a daily task of processing 450 million records from 5 hours to less than 1 hour •  Retailer now processes XML data to integrate 360-degree customer data from in-store, online, and mobile sources in real time CSF #5: Big Data project risks can be mitigated effectively
  17. 17. VelociData: Continuous Innovation 3Q13: •  Format Preserving Encryption and Data Masking •  Extensive Mainframe Data Conversion •  Extensive XML Processing 4Q13: •  Expanded Hashing and Key Generation Options •  Additional Mainframe Record Types •  Scalable Deployment Management
  18. 18. Let’s Start the Conversation Now For more information visit: http://velocidata.com Helpful Resources: Alternatives for Data Integration: http://velocidata.com/our-solution Industry Analyst Research Reports: http://velocidata.com/resources Data Ops – Meeting Big Data Organizational Challenges: http://velocidata.com/blog Join us on social media: Twitter: @VelociData LinkedIn: http://www.linkedin.com/company/velocidata?trk=company_name Google+: https://plus.google.com/112063174918659483670/posts Phone: +1-314-785-0601 E-Mail: rindeck@VelociData.com / info@VelociData.com We will send a follow-up email containing this presentation and links to contact us
  19. 19. Questions?
  20. 20. How We Achieve Orders of Magnitude in Acceleration VelociData Big Data Operations Appliance •  Purpose-built solutions that combine a mix of software, firmware, and massively parallel hardware to provide acceleration often approaching wire speeds •  Heterogeneous compute environment that includes FPGAs, GPUs, and CPUs to offer a level of internal parallelism that can dramatically outperform software on general-purpose computers •  Business Micro Supercomputer in a 4U rack form factor
  21. 21. Business Value for Most Architectures [Diagram] Big Data Operations Appliance to Maximize Data Transformation Acceleration to Wire Speed. Inputs: CSV, XML, zOS Data, RDBMS, Social Media, Sensor, Hadoop. Wire Rate Transformations: •  Normalize •  Encrypt/Mask •  Cleanse •  Enrich. Outputs: •  Hadoop •  ETL Server •  Data Warehouse •  Database Appliances •  BI Tools •  Downstream zOS Process •  Cloud
  22. 22. Platform Processes Offloaded to VelociData. Wire-rate transformations – purpose-built for better price performance. •  Mainframe: Too expensive to keep adding mainframe MIPS? •  Hadoop: Are self-service business analytics users frustrated with the time required to transform unstructured and legacy data into something useful for decision making? •  MPP Platforms (Teradata, Netezza): Is using the MPP Platform for ELT and Push Down Optimization not an optimal use of resources? •  ETL Server: ETL server having trouble keeping up with exploding data growth? VelociData feeds Hadoop pre-processed, quality data for real-time BI efforts. Seamlessly offload to VelociData the heavy lifting ETL/ELT processes from Ab Initio, IBM, and Informatica.
  23. 23. Common ETL Bottlenecks [Diagram] Extract: CSV, Mainframe, XML, RDBMS, Social Media, Sensor, Hadoop. Transform (ETL Server; Candidates for Acceleration): Lookup & replace; Field validation: datatype; Field validation: bounds checking; Business rules validation; Aggregation; USPS address standardization; Entity resolution; Exception / error handling. Load: Primary RDBMS, Staging DB.
  24. 24. ETL Processes Offloaded to VelociData [Diagram] Benefits: Keep Existing Input Interfaces; Accelerate Bottlenecks at Wire Speed; Reduce ETL Server Workload; Faster Total Processing Time. Extract: CSV, Mainframe, XML, RDBMS, Social Media, Sensor, Hadoop. Transform (ETL Server): Lookup & replace; Aggregation; Field validation: datatype; Business rules validation; Entity resolution; Field validation: bounds checking; USPS address standardization; Exception / error handling. Load: Primary RDBMS, Staging DB.
  25. 25. Perceptions & Questions Analyst: Robin Bloor
  26. 26. Technology Evolution (Bloor Curve)
  27. 27. Disruption on Disruption •  We are no longer certain that the pattern still holds •  We used to encounter new technologies that were 10x because of Moore’s Law •  Now we encounter new technologies that are 100x or even 1000x •  This is not because of Moore’s Law but because of parallelism
  28. 28. Parallelism Will Become the Norm •  This is not just about software •  It is also about hardware architectures •  But it affects all software •  Eventually everything will execute in parallel •  Everything will go much faster
  29. 29. CPUs, GPUs and FPGAs •  CPUs, GPUs and FPGAs are commodities •  They can be harnessed to deliver extreme parallelism on a single server •  The use of such chips can deliver acceleration above 100x for some applications
  30. 30. The Memory Cascade •  On chip speed v RAM: L1(32K) = 100x; L2(246K) = 30x; L3(8-20Mb) = 8.6x •  RAM v SSD: RAM = 300x •  SSD v Disk: SSD = 10x
  31. 31. Going Forward The old limitations are no longer SO LIMITING
  32. 32. •  Can one VelociData Appliance serve many applications? •  What of data cleansing functionality (e.g., cleansing rules, deduplication, etc.)? •  Please explain wire-speed in a little more detail.
  33. 33. •  How long does it take to implement and what is the process? Please describe. •  With Hadoop, what are the possibilities? •  What does the roadmap look like?
  34. 34. Twitter Tag: #briefr The Briefing Room
  35. 35. Upcoming Topics This Month: INNOVATORS January: ANALYTICS February: BIG DATA 2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room
  36. 36. Thank You for Your Attention
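
Slide 15 names several record-level operations: lookup and replace, date format conversion, surrogate key generation, and domain and pattern validation. The sketch below shows what those operations look like in plain Python so the terms are concrete. It is illustrative only and says nothing about how the VelociData appliance implements them. The function names, the state list, and the email pattern are assumptions made for this example; only the "Cardio Pulmonologist" to "CP" pair and the date example come from the slide.

```python
import hashlib
import re
from datetime import datetime

# Hypothetical lookup table; only this one pair appears on slide 15.
SPECIALTY_CODES = {"Cardio Pulmonologist": "CP"}

# Hypothetical domain list (a few US state codes, not the full set).
VALID_STATES = {"CA", "IL", "MO", "NY"}

# Deliberately simple email pattern, for illustration only.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def lookup_and_replace(value, table):
    """Replace a value with its code; unknown values pass through unchanged."""
    return table.get(value, value)


def convert_date(iso_date):
    """Format conversion, e.g. 2013-01-02 -> 01/02/2013."""
    return datetime.strptime(iso_date, "%Y-%m-%d").strftime("%m/%d/%Y")


def surrogate_key(*fields):
    """Hash several field values into one pseudo-key (SHA-256 chosen here)."""
    # A separator keeps ("ab", "c") and ("a", "bc") from hashing alike.
    joined = "\x1f".join(fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def validate_record(record):
    """Return the names of the fields that fail validation."""
    errors = []
    if record["state"] not in VALID_STATES:
        errors.append("state")  # domain validation against a list
    if not EMAIL_PATTERN.match(record["email"]):
        errors.append("email")  # pattern validation
    return errors


print(lookup_and_replace("Cardio Pulmonologist", SPECIALTY_CODES))  # CP
print(convert_date("2013-01-02"))  # 01/02/2013
```

A conventional ETL server runs steps like these one record at a time in software, which is the workload the deck proposes to offload.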
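
Slide 15 also lists mainframe data offload: converting EBCDIC text and COMP-3 (packed decimal) fields to ASCII and numeric types. As a rough illustration of what that conversion involves, here is a minimal Python sketch. It assumes EBCDIC code page 037 (installations vary) and the common packed-decimal sign nibbles; the function names are hypothetical and copybook parsing is not covered.

```python
from decimal import Decimal


def ebcdic_to_text(raw):
    """Decode EBCDIC bytes, assuming code page 037 (sites may use others)."""
    return raw.decode("cp037")


def unpack_comp3(raw, scale=0):
    """Decode a COMP-3 (packed decimal) field into a Decimal.

    Each byte holds two 4-bit digits; the last nibble is the sign.
    `scale` is the number of implied decimal places from the copybook.
    """
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign = nibbles.pop()  # the final nibble carries the sign
    value = 0
    for digit in nibbles:
        if digit > 9:
            raise ValueError("invalid packed-decimal digit")
        value = value * 10 + digit
    if sign in (0x0B, 0x0D):
        value = -value
    elif sign not in (0x0A, 0x0C, 0x0E, 0x0F):
        raise ValueError("invalid packed-decimal sign nibble")
    # Shift the decimal point left by `scale` places.
    return Decimal(value).scaleb(-scale)


print(ebcdic_to_text(bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])))  # HELLO
print(unpack_comp3(b"\x12\x34\x5C", scale=2))  # 123.45
```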
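
The case studies on slide 16 quote throughput in mixed units (records per minute, hours per batch). To compare them with the records-per-second figures on slide 15, they can be put on a common footing. This is a small arithmetic sketch using only the numbers stated on slide 16; the helper names are made up for this example, and "less than 1 hour" is treated as exactly 1 hour, so that rate is a lower bound.

```python
SECONDS_PER_MINUTE = 60
SECONDS_PER_HOUR = 3600


def records_per_second(records, seconds):
    """Average throughput for a job that handles `records` in `seconds`."""
    return records / seconds


def speedup(before_seconds, after_seconds):
    """How many times faster the 'after' run is than the 'before' run."""
    return before_seconds / after_seconds


# Credit card company: 10 million records per minute.
credit_card_rate = records_per_second(10_000_000, SECONDS_PER_MINUTE)

# Property casualty company: 450 million records, 5 hours down to under 1 hour.
insurer_rate_before = records_per_second(450_000_000, 5 * SECONDS_PER_HOUR)
insurer_rate_after = records_per_second(450_000_000, 1 * SECONDS_PER_HOUR)

# Health benefits provider: 16 hours down to 45 seconds.
health_speedup = speedup(16 * SECONDS_PER_HOUR, 45)

print(f"credit card: {credit_card_rate:,.0f} records/s")
print(f"insurer before: {insurer_rate_before:,.0f} records/s")
print(f"insurer after: at least {insurer_rate_after:,.0f} records/s")
print(f"health benefits speedup: {health_speedup:,.0f}x")
```

On those figures the credit card workload averages about 166,667 records per second, the insurer moves from 25,000 to at least 125,000 records per second, and the health benefits process runs 1,280 times faster.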