Applying Big Data


Published on

What discourse should you take?

Published in: Technology, Business
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide
  • Thank you for coming to Big Data for BusinessNo other speakers, see if there is interest
  • Concentrate on Veracity…not too muchDevelopment is not rigid, and agility may not be the only option.We have evidence that other approaches may yield just as good or better results.Data modeling is tantamount to a proper deployment
  • Discuss why these are important to recognize for deployment and designThese styles all have similar issues with finding the right value
  • Real time data flow is now the next step in finding answers.We still have to develop the right questions, or the right methods to finding the right questions
  • Thank Michael Walker for the graphicThis illustrates a great abstraction for discourses at the business necessity perspective
  • That’s a lot of data!End with the possibility of data aggregation sources, novel and extablished
  • IBM has a pretty good grasp of Big Data’s implementations
  • There are, fortunately or unfortunately, far fewer use cases than there are companies to provide solutions for those use cases
  • Decision journeys for customers and predicting usage/purchasing patternsUtilized heavily in the Amazon space (both by Amazon and by their market source partners)
  • Return on investment is proven, but not guaranteed for your businessFinding one massive return might justify the costs, but guaranteeing small returns will win more arguments
  • There are a slew of resources available, and I will post this presentation along with other materials on the meetup page.Thanks again for coming, let’s have a discussion, and make sure to fill up
  • This will be available online in a few days
  • Applying Big Data

    1. 1. Applying Big DataPresented by John Dougherty, Viriton4/25/
    2. 2. Big Data Buzzwords• Volume, Velocity, and Variety• Agility/Agile Development• Modeling DataThe 3V’s originated in the early 2000’s. META (Gartner, now)Volume…self contained. Velocity = Speed of transactionVariety = Data profiling from multiple data sourcesThe Agile Manifesto, created February 2001 (Remember Scrum?)Incorporation into Big Data software becoming mandatoryAdaptive and Predictive approaches are hotly contestedData Modeling is paramount, given Big or Small datasetsDesign must be confronted at ingress and egressHybrid data modeling and remodeling existing modelsVeracity has been added, but has not yet been fully adopted
    3. 3. Big Data Buzzwords – Agile Dev.• Informatics• Daily Batch• Classic Dept.Informaticists are leveraged across multiple disciplinesThere is no strict definition for a data scientist/informaticistGreatest likelihood to adopt an agile/adaptive modelDevelopment _should_ be incorporated into existing processworkflows. Seamlessness should be the goal.Utilizing an agile approach to finding new uses to existing dataLeast likely to need/adopt new development approachesRelevant data must still be filtered throughStaff should not be re-learning the wheel with deployment
    4. 4. • Example of Hybrid Modeling• Every project/objective must have properlydefined models to reach maximum efficacy• Data silos are losing their complicit positioning• Transitioning modeling to enumerationBig Data Buzzwords – Data Models
    5. 5. Big Data Buzzwords – Question InceptionConnecting these lines is a greatexample of the work that lies ahead inidentifying the objectives and goals ofthe business environment
    6. 6. Big PictureThere is a lot of dataAs of 2009, Google generates at least >2 EB peryear, >2TB indexed URLs, >9B page views per dayFacebook houses one billion users; utilizing >500TBper day, housing 35% or more of the world’s photosYouTube houses >1EB of data, >72 hours of videoper minute, >4B views per dayTwitter >125B tweets per year, >390M perday, approximately 4500 per second~2.3B people use the internet today, of which, 90% ofthe world’s data has been generated within the lasttwo yearsThe Internet of Things(connected devices and data)What will you be aggregating?In 2002, recorded media and electronic information flowsgenerated about 22 exabytes (1018) of informationIn 2006, the amount of digital informationcreated, captured, and replicated was 161 EB
    7. 7. Use CasesIBM’s 5 High Value Use CasesBig Data ExplorationFind, visualize, understand all big data to improve decision making. Big data exploration addresses the challengethat every large organization faces: information is stored in many different systems and silos and people needaccess to that data to do their day-to-day work and make important decisions.Enhanced 360º View of the CustomerExtend existing customer views by incorporating additional internal and external information sources. Gain a fullunderstanding of customers—what makes them tick, why they buy, how they prefer to shop, why they switch, whatthey’ll buy next, and what factors lead them to recommend a company to others.Security/Intelligence ExtensionLower risk, detect fraud and monitor cyber security in real time. Augment and enhance cyber security andintelligence analysis platforms with big data technologies to process and analyze new types (e.g. socialmedia, emails, sensors, Telco) and sources of under-leveraged data to significantly improve intelligence, securityand law enforcement insightOperations AnalysisAnalyze a variety of machine and operational data for improved business results. The abundance and growth ofmachine data, which can include anything from IT machines to sensors and meters and GPS devices requirescomplex analysis and correlation across different types of data sets. By using big data for operationsanalysis, organizations can gain real-time visibility into operations, customer experience, transactions and behavior.Data Warehouse AugmentationIntegrate big data and data warehouse capabilities to increase operational efficiency. Optimize your datawarehouse to enable new types of analysis. Use big data technologies to set up a staging area or landing zone foryour new data before determining what data should be moved to the data warehouse. Offload infrequently accessedor aged data from warehouse and application databases using information integration software and tools.
    8. 8. • Applied since data science began, 1970’s• Many different products available, augmenting existingsolutions, and providing all-in-one• SAS• SAP (though called predictive analytics, still fits)• Same problems incur with extensibility as do withdesign/deploymentUse CasesVisual AnalyticsScience: Visual analytics is thescience of analytical reasoningfacilitated by interactive visualinterfacesSensor AnalyticsInternet of Things: The first speaking of the gargantuan brontobyte(1 Bit = Binary Digit · 8 Bits = 1 Byte · 1024 Bytes = 1 Kilobyte · 1024 Kilobytes = 1 Megabyte · 1024 Megabytes = 1 Gigabyte · 1024 Gigabytes = 1 Terabyte · 1024Terabytes = 1 Petabyte · 1024 Petabytes = 1 Exabyte· 1024 Exabytes = 1 Zettabyte · 1024 Zettabytes = 1 Yottabyte · 1024 Yottabytes = 1 Brontobyte· 1024 Brontobytes =1 Geopbyte)
    9. 9. • ROI Metrics are difficult to predict, but follow atrend of double and triple digits• What keeps the CEO up at night, decisionjourneys• An anecdotal report (questionarre) shows44% of CMO’s can measure their ROI• Design and development will continue to betantamount to a successful returnROI?
    10. 10. ROINucleus Research• Becoming an analytic enterprise requires Big Data• Average ROI of 241%• Increased productivity• A major metropolitan police department achieved an 863 percent ROI when it combined its criminalrecords database with a national crime database created by a major university.• Reduced labor costs• A major resort earned an ROI of 1,822 percent when it integrated shift scheduling processes with data from anational weather service, enabling managers to avoid unnecessary shift assignments and increase staffutilization.Four Stages of anAnalytic EnterpriseTelco reduces costs associated to CO managementand circuit deployment by 230%QoS data expected to expand well into Petabytes forthe Telco industry
    11. 11. Moving ForwardHow to formulate the right questions• Communication between C-Suite and VP isn’t enough• Considering old data, wholistic approaches work best• Objectives and goals begin with dialogue at the highestlevels• What are the questions we should be asking?Should we start now? Yes.Brought to you by:
    12. 12. BIBLIOGRAPHY - Big Data for All Businesses - NYC Mayor’s use case - Data Modeling, the big challenges - Origin of VVV - CIA CTO Presentation - Rose Business Technologies, Workflow - Nucleus Research