
Big data - A Really Big Enchilada?

561 views


A deep dive - came out of a recent discussion with colleagues

Published in: Technology, Education


  1. Big Data: A really ‘Big’ deal or just another hammer looking for a nail?
     Keshav Deshpande, Software Developer, kdeshpande1@verizon.net
  2. What is so ‘big’ about Big Data
     A little bit of theory: the V’s of Big Data
       • Volume → scaled at terabyte/petabyte levels
       • Variety → structured, unstructured, hybrid data formats
       • Velocity → data generated at internet speeds (tera, exa range)
     Often, Veracity is added to this list → reliability of the data
     Implications on IT Solutions Architectures
       • Current computing paradigm → Data layer/Middleware/UI layer (n-tier architectures)
           • fetch data from Data Layer
           • ship data to Middleware for processing (or to UI layer for display)
           • ship data back to Data Layer for storage
     At ‘Big Data’ scale, this approach simply does not perform/scale!
  3. If you can’t go to the mountain, let the mountain come to you!
     Proposed ‘solution’
       • Ship processing to where the data is located, instead of shipping data to where the processing is located
       • Process smaller chunks of data, in parallel, then combine the results
     OK, so with this scheme, we are assured of ‘scale’ and even ‘performance’. So what do I do with it?
     Remember the hammer and nail? It seems we have ourselves a hammer, so let’s look for the ‘nails’...
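
The "process smaller chunks in parallel, then combine the results" idea on slide 3 can be sketched in a few lines of Python. This is only an illustration, not the author's implementation: the function names (`count_words`, `merge_counts`, `word_count`) and the sample chunks are hypothetical, and a thread pool on one machine stands in for workers that would, in a real cluster, run next to the data they process.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def count_words(chunk):
    """'Map' step: process one chunk and return a small partial result."""
    return Counter(word.lower() for line in chunk for word in line.split())


def merge_counts(left, right):
    """'Reduce' step: combine two partial results into one."""
    return left + right


def word_count(chunks, max_workers=4):
    """Process every chunk in parallel, then combine the partial results."""
    # Assumption: threads stand in for remote workers co-located with the data.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(count_words, chunks))
    return reduce(merge_counts, partials, Counter())


# Hypothetical data: each inner list plays the role of a chunk stored on one node.
chunks = [
    ["big data big deal"],
    ["hammer looking for a nail"],
    ["big hammer"],
]
totals = word_count(chunks)
print(totals.most_common(2))
```

Only the small per-chunk counters travel back to the caller, which is the point of shipping the processing rather than the data.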
  4. Big Data Characteristics
       • Besides storing/retrieving/processing data at scale:
           • parallel and distributed nature, necessitated by the 3 (or 4) V’s
           • high level of concurrency: storing, retrieving or processing
           • high level of asynchrony
               • non-blocking, fire-and-forget
               • call and then notify when “answer” is ready
       • However, data is still ‘raw’
           • Needs to be retrieved (mined) and processed (analyzed) to get at ‘Information’ or ‘Actionable Intelligence’
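
The two asynchrony styles named on slide 4, "fire-and-forget" and "call and then notify when the answer is ready", might look roughly like this with Python's standard `asyncio` module. The coroutine `slow_query`, its delays, and the task names are assumptions made up for this sketch.

```python
import asyncio


async def slow_query(name, delay):
    """Hypothetical stand-in for a slow store, retrieve, or process call."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def main():
    log = []

    # Fire-and-forget: schedule the work and move on without using its result.
    # A reference is kept so the task is not garbage-collected mid-flight.
    background = asyncio.create_task(slow_query("audit-write", 0.01))

    # Call, then notify: the callback runs once the "answer" is ready.
    report = asyncio.create_task(slow_query("report", 0.02))
    report.add_done_callback(lambda task: log.append(task.result()))

    # The caller is not blocked while both tasks are pending.
    log.append("caller keeps working")

    # Awaited here only so the demo exits cleanly after both tasks finish.
    await asyncio.gather(background, report)
    return log


log = asyncio.run(main())
print(log)
```

The caller's own entry lands in the log before the report's, because nothing in `main` waits on either task until the final `gather`.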
  5. The real Big Data challenges, then, are:
     Information is:
       • not just confined to relationships between data entities (like in vanilla RDBMS)
           • both data and associated meta-data are information
           • increasingly expressed as graphs (sparse or dense) → entity relations are still important, but they are now multi-dimensional
       • very rich; data (and metadata) include
           • data entities (vertices)
           • inter-relationships (links and edges)
           • degrees of separation between vertices, links and edges
       • RDBMS-like design approaches fall short, under-perform, and do not scale
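
To make the graph vocabulary on slide 5 concrete (vertices, edges, degrees of separation), here is a minimal sketch that stores a graph as an adjacency dictionary and measures degrees of separation with a breadth-first search. The names in the sample graph and the function `degrees_of_separation` are hypothetical; a production graph store would look quite different.

```python
from collections import deque


def degrees_of_separation(graph, start, goal):
    """Return the fewest edges between two vertices, or None if unconnected."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


# Hypothetical data: keys are vertices, values are the vertices they link to.
graph = {
    "alice": {"bob"},
    "bob": {"alice", "carol"},
    "carol": {"bob", "dave"},
    "dave": {"carol"},
    "erin": set(),
}
print(degrees_of_separation(graph, "alice", "dave"))
```

Answering the same question in a relational schema would typically take one self-join per hop, which is one way to read the slide's remark that RDBMS-like designs fall short here.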
  6. Big Data Processing
     What is involved?
       • Retrieving data from large, distributed data stores → mining of data for nuggets of information
       • Analysis of data, but at internet scale → to provide actionable intelligence
           • Analytics → processing required to wring intelligence out of raw data
       • Information Visualization → present analysis to the user
           • Dashboards/UI Composites
     All of the above, but in real-time (or near real-time)
  7. Big Data Processing
     An emerging trend: data in constant motion
       • Conventionally, data is at rest. Implication → data is stale instantly
           • any analysis on at-rest data is after-the-fact or post-mortem, if you will...
       • Data in motion → implies as-it-happens, event-based, very loosely-coupled, asynchronous, non-blocking
       • Analytics and BI at the point of streaming → real-time, complex event processing
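
As a rough illustration of analytics "at the point of streaming" from slide 7, the sketch below evaluates each event as it arrives and flags bursts inside a sliding time window, instead of storing everything and analyzing it later. The class `SlidingWindowCounter`, the function `detect_bursts`, and the sample timestamps are all hypothetical, and the sketch assumes events arrive in time order.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events seen in the last `window_seconds` seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def observe(self, timestamp):
        """Record one event and return how many events are inside the window."""
        self.timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop events that have aged out of the window.
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()
        return len(self.timestamps)


def detect_bursts(event_times, window_seconds, threshold):
    """Yield the timestamp of each event that brings the window to the threshold."""
    counter = SlidingWindowCounter(window_seconds)
    for t in event_times:  # any iterable works, including a live stream
        if counter.observe(t) >= threshold:
            yield t


# Hypothetical event timestamps, in seconds.
alerts = list(detect_bursts([0, 1, 2, 10, 11, 11.5, 12], window_seconds=3, threshold=3))
print(alerts)
```

Because `detect_bursts` is a generator, each alert is produced as soon as the triggering event is seen, and memory use is bounded by the number of events in one window.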
  8. Applications of Big Data
     By no means an exhaustive listing:
       • Business Intelligence → derive Insights → better Decision-making
       • Insights → a crystal ball for possible future states
           • predictive and prescriptive analytics
       • Automating development of such insight, developing algorithms
           • machine learning
     Outcomes
       • Predictive Analytics from both historical and real-time data
       • Automated (and perpetual) Machine Learning
  9. Please stay in touch at kdeshpande1@verizon.net
