Satyam open analytics nyc

Published in: Technology

Transcript

  • 1. BIG DATA ANALYTICS & PITFALLS TO AVOID
      Dr. Satyam Priyadarshy
      June 17, 2013 – New York City
      © Dr. Satyam Priyadarshy
  • 2. Agenda
      The Big Data buzzword is creating a lot of confusion for companies. One needs to understand Big Data within one's own context, along with the 7 V’s of Big Data and the KARMA score, to avoid some of the serious pitfalls in leveraging Big Data. A case study will show how to drive value out of Big Data in a meaningful manner.
  • 3. BIG DATA Buzz - Should Business Care?
      The Big Data future is bright. Organizations that can effectively leverage Big Data without sinking in the Big Data Hole will realize additional business value, a loyal customer base and increased profits.
      2.5 exabytes of new data generated per day.
      What we know:
        • A top business priority
        • Big opportunities available
        • Everyone is talking about it
        • Emerging technology helps
        • Adds value definitely
      But...
        • Definition and leverage are not clear
        • Big challenges for companies
        • The path to execute is less understood
        • Realization is complex but getting easier
        • Expertise is in demand but supply is short
  • 4. BIG DATA - 7 V’s that describe
      The Big Data definition is evolving. The origin of the word dates back to 1990. Typically 4 V’s defined Big Data, but I strongly recommend the 7 V’s that describe Big Data. (Source: chiefknowledgeguru.com)
      80% of data generated is unstructured.
        • VARIATION: No single configuration of the 6 V’s below fits everyone. There is variation for each business.
        • VELOCITY: Moving away from batch processing to real-time addition of massive data for near real-time analysis.
        • VARIETY: Structured and unstructured data, e.g. POS data, sensor data, transaction data, call center data, supply chain data, new media data, etc.
        • VERACITY: Reliability and predictability of ‘not so’ precise data types, e.g. sentiment data, weather data and its impact on business.
        • VOLUME: The ever-growing data, from terabytes to petabytes to zettabytes.
        • VALUE: Unless value is realized, Big Data is just a Big Hole.
        • VIRTUAL: Data resides in virtual environments, e.g. POS, private and public clouds, geo-located, inside and outside firewalls.
  • 5. KARMA matters
      Knowledge
        • Business, Technology, People Strategy
        • Big Data Sources, Lifecycle
        • Re-invest based on actions
      Action
        • Scalable Architecture, Infrastructure, Tools & Technology, Resources
        • Mining the Big Data with a targeted and open mind to find Gold and other items
      Recognition
        • Revenue by selling new insights
        • Increase profit margins
        • Add new features to products & services
      Market
        • Grow share
        • Customer centricity
      Advance
        • Innovate with the help of Big Analytics
        • Gather even more Big Data and keep going through this cycle
  • 6. KARMA SCORE is calculated using the maturity level of these capabilities
        • Parallel Processing, API, Query, Reporting
        • Data Mining, Analytics, Pattern, Statistics
        • Machine Learning, Inference predictions
        • Tools, Technologies, Human Resources
        • Service to support business – Data, Information, Knowledge, Process
        • Presentation – Visualization, Mobility, Collaboration, Exploration
        • Actions – Improve Product/Services, Grow Revenue/Profits, Agility
        • Collection of Raw Data, Structured & Unstructured, Discovery, Staging
        • Extract, Load, Transform
        • Data Connectors, Access, Use, Move
        • Data Storage: Hadoop, NoSQL, Key-value, MPP, In-memory, blobs, etc.
        • Policy, Privacy, Security, Metadata, Risk, Total cost of ownership, Access control
        • Data Lifecycle, Data Assets, SLA, ROI, ROA, Data Quality
        • Physical Store, Virtual Storage, Encryption, Masking, Archive, Disaster Recovery
      Capability areas named on the slide: Data Governance and Management; Big Data; Big Math and Big Analytics; Big Value, Big Actions
  • 7. Whatever your KARMA Score is, one can leverage Big Data eventually. The Great Enabler is the OPEN SOURCE Revolution of the last decade or so.
  • 8. In a Zoo / In an Open Environment
      OPEN SOURCE Creates a HAPPY, FLOURISHING Environment
  • 9. Open Source – Key Characteristics
        • FREE (*)
        • NOT CAGED, NOT BLACK BOX
        • MODIFICATIONS ALLOWED
        • MODIFIED VERSIONS REDISTRIBUTABLE
        • LIVES IN HARMONY WITH OTHERS
  • 10. Open Source – BIG DATA PLAYERS
      THESE TOOLS ENABLE YOU TO DIG THE GOLD IN BIG DATA
      (This is not a comprehensive list of tools/technologies)
  • 11. ACTION for finding the GOLD
      PROBLEM SOLVING / OPERATIONAL / STRATEGIC – FUTURISTIC
      Basic Analytics / Advanced Analytics / Holistic Analytics
      GO FOR THE GOLD
      ADDRESSES
        • Current Concerns
        • Reduce Costs
        • Eliminate Issues
      ADDRESSES GROWTH
        • Customer Centric
        • Easily Incorporate New Data
        • Innovation Related
        • Emerging Trends Adoption
      BIG DATA, BIG MATH, BIG ANALYTICS
        • Descriptive Statistics
        • Inferential Statistics
  • 12. THAT’S A GOLD MINE
  • 13. WHAT’S IN A GOLD MINE?
      Suites: Gold Suite, BASE Suite, Iron-Manganese Suite
      Elements shown: Gold, Arsenic, Mercury, Tungsten, Silver, Copper, Lead, Zinc, Bismuth, Cadmium, Molybdenum, Silver, Iron, Manganese, Cobalt, Nickel, Yttrium
      TO GET GOLD ONE HAS TO DIG DEEPER.
      IF YOU FOUND SILVER WHILE DIGGING FOR GOLD, WHAT WOULD YOU DO?
  • 14. CASE STUDY – DDoS Attack
      Slide labels: PROBLEM, BIG ANALYTICS, THE GOLD, KNOWLEDGE, ACTIONS, RECOGNITION
      PROBLEM
        DNS servers are persistently attacked to create DDoS attacks. Can we predict?
        CHALLENGES:
        • 7+ TB / day
        • Varied formats based on request and type of attacks
      BIG ANALYTICS
        Hadoop based data storage
        APPROACH
        • Hive / MapR queries and R for statistical analysis
        • Interconnection of data with known “data” sources for identification
        • Tableau and (open source DS3.js and Ploticus) for visualization
        • Iteratively optimized queries for speed
      THE GOLD
        45 days later. It’s Science not BI.
        Source of attacks identified
        • After integrating
        • Distributed targets
        • Multiple attack types
        • Slow performance over binary data sets
        • A step closer to a solution, but it requires more work to get it near real-time for actionable insights.
        • Feedback loop to known datasets to enhance predictability and performance
  • 15. CASE STUDY - DDoS Attack – Pattern Based Study
      [Two scatter plots. Panel titles: "Single Day - Outlier Events - 10K Size :: Zones Hit from Multiple Sources" and "Single Day - Outlier Events - 2K Size :: Zones Hit from Multiple Sources". Axis names: Traffic Volume, Unique Z Ratio. Labeled points: ABC.TLD, ABC.TLD, SBGOLD.TLD.]
      AFTER DIGGING FURTHER
  • 16. PITFALLS…
      Big Data can help MOST BUSINESSES
      BIG DATA PITFALLS
        • Lack of knowledge – Tools, Data Science
        • Too much data. Initially most of it was discarded
        • Executives not sure
        • Belief that Big Data has all the answers
      HOW TO OVERCOME
        • Expert, Education, Execution
        • Deploy Hadoop clusters with cheap storage and store with the best possible compression
        • The whole mine is NOT GOLD. Show insights and coach
        • Education, best practices and insights after mining; find useful patterns initially
  • 17. PITFALLS…
      Big Data can help MOST BUSINESSES
      BIG DATA PITFALLS
        • Silo culture
        • Multiple copies of the ‘same’ data in different formats
        • Well-established Enterprise Data Warehouse
        • Intuition-based culture
      HOW TO OVERCOME
        • Devastating for companies. A Single Source of Truth is key to success
        • Keep raw data (along with a DR site), transform during analysis
        • Keep it for simple, operational analytics; augment with Big Data for innovation and future growth
        • If you can only focus on Gold and you find Silver and other precious metals, you miss the mark. Show insights and move on to Gold
  • 18. Simple way to see some Big Data Challenges
      1st
        • Data acquisition
        • Storage
        • Processing
      2nd
        • Data transport & dissemination
        • Data management & curation
        • Big Analytics – Tools, Technology, Know-How
      3rd
        • Privacy, Security and Disaster Recovery
        • Technical/Scientific Talent
        • Cost of all of the above
  • 19. KARMA matters
      Knowledge
        • Business, Technology, People Strategy
        • Big Data Sources, Lifecycle
        • Re-invest based on actions
      Action
        • Scalable Architecture, Infrastructure, Tools & Technology, Resources
        • Mining the Big Data with a targeted and open mind to find Gold and other items
      Recognition
        • Revenue by selling new insights
        • Increase profit margins
        • Add new features to products & services
      Market
        • Grow share
        • Customer centricity
      Advance
        • Innovate with the help of Big Analytics
        • Gather even more Big Data and keep going through this cycle
  • 20. THANK YOU
      UNDERSTAND YOUR BIG DATA KARMA SCORE AND understand the Big Picture, THE Direction and LEAD
        • Helps build a strong foundation
        • Focus on OUR MOST VALUED CUSTOMERS
        • INCREASE PROFITABILITY
  • 21. Appendix
  • 22. The Pitfalls for Adopting Big Data
        • The Big Data definition of 4 V’s – Velocity, Volume, Variety, Veracity – is incomplete.
        • The belief that Big Data solves everything for everyone.
        • Big Data abounds, but its dimensions are to be understood.
        • The Loudest Often Wins (LOW) or the highest paid person’s opinion (HIPPO) prevails.
        • That a data-driven approach trumps intuition is a hard nut to crack. Really!!
        • Data for Data’s Sake
        • Talent Gap
        • Data, Data Everywhere
        • Infighting
        • Aiming Too High
      Reference: Wall Street Journal, March 11, 2013, page R4
  • 23. Brief History of Analytics
      [Timeline; axis marks: 1890, 1920, 1950, 1980, 2010]
        • Time Management (by Frederick Winslow Taylor)
        • Zero Defects Analysis and Pacing of Assembly Line (Ford)
        • Statistical Process Control (Walter Shewhart)
        • Operational Research popularized (Royal Air Force)
        • Social Network Analysis
        • Business Intelligence term coined (H. P. Luhn)
        • Artificial Intelligence (John McCarthy)
        • Exploratory Data Analysis - visualization (John Tukey)
        • Business Intelligence popularized (Gartner)
        • Expert Systems (using AI)
        • The Visual Display of Quantitative Information (Edward Tufte)
        • Data Mining (part of AI) and Web analytics
        • Big Analytics
  • 24. DEFINITIONS of Analytics for Business
        • ANALYTICS – Any data-driven process that provides insights.
        • ADVANCED ANALYTICS – Helps in understanding cause-effect relationships, predicting future events, and choosing the best possible action.
        • BIG ANALYTICS FOR BUSINESS – Relevant for the business, provides actionable insights for increasing revenue/profit, measures value, and leverages “Big Data”.
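
Slide 6 says the KARMA score is calculated from the maturity level of capabilities in four areas, but the transcript gives neither a rating scale nor a formula. Purely as an illustration, the sketch below assumes a 1 to 5 maturity rating per capability and an unweighted average. The ratings, the scale, and the `karma_score` function are all hypothetical and should not be read as the author's method.

```python
from statistics import mean

# Hypothetical 1-5 maturity ratings for a few capabilities in each area.
# The area names come from slide 6; the numbers and the scale are assumptions.
ratings = {
    "Data Governance and Management": [2, 3, 2],
    "Big Data": [4, 3, 3, 4],
    "Big Math and Big Analytics": [2, 2, 1, 3],
    "Big Value, Big Actions": [1, 2, 2],
}

def karma_score(ratings_by_area):
    """Return (overall, per_area): unweighted means of assumed maturity ratings."""
    per_area = {area: mean(levels) for area, levels in ratings_by_area.items()}
    overall = mean(list(per_area.values()))
    return overall, per_area

overall, per_area = karma_score(ratings)
for area, score in per_area.items():
    print(f"{area}: {score:.2f}")
print(f"Overall (assumed formula): {overall:.2f}")
```

A weighted average, or a "weakest area" minimum, would be equally plausible readings of the slide; the point is only that a per-area view shows where maturity lags.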
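
The DDoS case study (slides 14 and 15) looks for outlier events in which zones are hit from multiple sources. The talk reports doing this with Hive queries and R over Hadoop storage; the sketch below is only a small plain-Python stand-in for the idea. The `(source, zone)` record shape, the sample addresses, and the median-plus-MAD threshold are assumptions chosen for illustration, not details from the slides.

```python
from collections import defaultdict
from statistics import median

def sources_per_zone(records):
    """Count the distinct source addresses seen querying each zone."""
    sources = defaultdict(set)
    for source, zone in records:
        sources[zone].add(source)
    return {zone: len(addrs) for zone, addrs in sources.items()}

def outlier_zones(records, k=3.0):
    """Flag zones whose distinct-source count exceeds median + k * MAD."""
    counts = sources_per_zone(records)
    values = list(counts.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    # Floor the spread at 1 so a zero MAD does not flag every small deviation.
    threshold = med + k * max(mad, 1)
    return sorted(zone for zone, n in counts.items() if n > threshold)

# Made-up (source, zone) pairs: three quiet zones and one hit from 20 sources.
records = [("10.0.0.1", "a.tld"), ("10.0.0.2", "b.tld"), ("10.0.0.3", "c.tld")]
records += [(f"192.0.2.{i}", "abc.tld") for i in range(20)]
print(outlier_zones(records))
```

At the 7+ TB/day volume the slides mention, the same grouping would be expressed as a distributed query rather than run in memory; the thresholding step stays the same.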
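
Two of the remedies on the pitfalls slides (16 and 17) are to store data with good compression and to keep raw data, transforming it only during analysis. A minimal sketch of that pattern follows, assuming line-oriented records and local gzip files in place of a Hadoop cluster; `store_raw`, `read_transformed`, and the sample records are hypothetical names and data introduced here.

```python
import gzip
import json
import os
import tempfile

def store_raw(lines, path):
    """Write raw records unmodified, one per line, with gzip compression."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        for line in lines:
            fh.write(line + "\n")

def read_transformed(path, transform):
    """Yield transform(record) for each stored record; the file is not changed."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            yield transform(line.rstrip("\n"))

# Made-up records; the real log formats are not given in the slides.
raw = ['{"src": "192.0.2.1", "zone": "abc.tld"}',
       '{"src": "192.0.2.2", "zone": "abc.tld"}']
path = os.path.join(tempfile.mkdtemp(), "queries.log.gz")
store_raw(raw, path)

# Two different analyses read the same raw file with different transforms.
zones = [rec["zone"] for rec in read_transformed(path, json.loads)]
sizes = list(read_transformed(path, len))
print(zones, sizes)
```

Because the stored file is never rewritten, a later analysis can apply a different transform to the same records, which is the practical benefit of keeping a single raw copy instead of several pre-transformed ones.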
