Value proposition for big data isv partners 0714


    Key Concept: So why do we need to pursue this opportunity together? Big data and analytics are two sides of the same coin. You can have all the information in the world, but if you can’t make sense of how to use it…what is the point? And you can have all of the most advanced analytics in the world but if you have no information to fuel it, then what value can it provide…none.
    Key Points:
    On one side of the coin there is a mass of data - data that is complex, data that is moving at high speed, and often contains lots of unstructured text. It needs to be captured, managed and delivered.
    On the flip side there is analytics. The analytics is tasked with finding interesting and relevant trends and patterns in big data to inform decisions, optimize processes and even drive new business models.
    And just as a coin can’t have only one side, you can’t have big data without analytics, or indeed analytics without big data – you need both to create the value.
    That is why IBM is defining Big Data AND Analytics as the category and as the growth market…it isn’t one or the other, it is and MUST BE both.
    And as Ginni Rometty stated in her speech to the Council on Foreign Relations…we have the capability to help clients move from gut-feel to fact-based decisions…and those organizations that fail to make that shift will lose.
  • We’re all familiar by now with the 4V’s of Big Data.
    Volume (data at scale) is about rising volumes of data in all of your systems – which presents a challenge for both scaling those systems and also the integration points among them. And Variety (data in many forms) is about managing many types of data, and understanding and analyzing them in their native form.
    I hear too often that big data is exclusively founded on volume and variety and that equates to Hadoop. While Hadoop is an entry point, it’s not the only entry point, nor does it provide a comprehensive picture of a customer or the operations of the business.
    The two biggest areas of interest for the Smarter Enterprise are velocity and veracity. Velocity (data in motion) is about ingesting data and analyzing it in motion, which opens up the opportunity to compete on speed to insight and speed to act.
    Veracity (data uncertainty): according to the IBM GTO study in 2012, by 2015, 80% of all available data will be uncertain, and rising uncertainty = declining confidence. As the complexity of big data rises, it becomes harder to establish veracity, which is essential for confident decisions.
    Transition: People’s thinking is shifting about their usage of analytics and their data. Yes, people use reporting and analysis as table stakes for their business operations, but with the transformation of the front office, and the emergence of systems of engagement, those old notions are changing.
  • Big Data & Analytics is proven and the potential value is only limited by imagination, time and money.
    The Ottawa Hospital – located in Ottawa, Canada.
    Enable the hospital to capture its current baseline response times to consult requests, and model and implement procedural improvements to reach a goal of 15 minutes or less.
    Business Problem:
    Hospitals have vast amounts of siloed patient data that cannot easily be used to help hospital decision-makers analyze and improve patient care. This can lead to inconsistent procedures and policies, such as in patient discharge procedures, or ordering Emergency Department (ED) physician consults.
    When decision-making is based on experience or anecdotal information, there is no way to analyze and systematically improve problems. Worse, patients might stay in the hospital longer, wait longer in the ED, and their care might be compromised by poor decision-making.
    Ottawa Hospital is implementing a collaborative Care Process Management system that uses advanced analytics and statistical modeling to help improve its staffing and better coordinate care based on actual patient needs.
    Solution improves accessibility to information as well as collaboration across the care team, reducing time now spent in trying to find data or even people within the hospital system.
    Scenario modeling and simulation help the hospital select the physicians who can provide the best and fastest response to ED consult requests, matching patient situation and provider skills.
    Business Analytics - Cognos BI, SPSS
    GBS S&T, Global Services - Application Services® (GBS-AIS)
    Lotus Connections®; Lotus Sametime®
    WebSphere®: IBM Blueworks Live® ILOG JRules®
    Industry Solutions: Healthcare: Analytics and Reporting; Collaborative Care and Health Information Exchange
    FleetRisk Advisors – located in Alpharetta, Georgia.
    FleetRisk Advisors’ services have helped their customers (trucking companies) reduce the incidence of minor accidents by 20 percent and serious accidents by as much as 80 percent.
    Customers have also seen an increase in driver retention of roughly 30 percent in an industry in which roughly 100 percent turnover is normal.
    Business Problem:
    FleetRisk wanted to turn its expertise and its ability to capture a lot of data into a way of helping trucking companies reduce accidents and retain their valuable drivers.
    But to truly succeed, FleetRisk Advisors needed to extract even deeper predictive insights than its manual processes made possible and generate these insights faster so that customers would have the time to take truly preventive action.
    For each of the company’s customers’ truck drivers, a predictive model translates some 4,500 data elements, from a diverse range of sources, into quantitative risk ratings related to the likelihood of on-the-job accidents, giving operators cues to where they need to intervene to prevent accidents and save lives. A similar score-based approach is used to rate an employee’s risk of defecting to another trucking company.
    PureData System for Analytics
    SPSS Collaboration and Deployment Services
    SPSS Modeler, Modeler Desktop, Modeler Server
    Standard Chartered Bank – headquarters in London, UK.
    Saved an estimated USD20 million and cut process times by more than 80 percent with an ECM solution. As part of Standard Chartered Bank’s eOps program, the solution also helps the communities the bank serves by providing earning opportunities for rural people and reducing the bank’s carbon footprint.
    Standard Chartered has won technological and social awards for its eOps system in England, Singapore and India.
    Business Problem:
    Standard Chartered’s brand promise “Here for Good” takes an inclusive view of sustainability, recognizing that the bank thrives when the communities it serves thrive. As a result, the bank conducts business in a way that supports customers and clients while also having a positive impact on wider society.
    Standard Chartered standardizes on the IBM ECM solution across approximately 56 countries, helping more than 30,000 staff members save time processing roughly 50 million documents annually.
    The eOps program uses the ECM solution to further boost operating efficiency while improving people’s lives: people in the villages and smaller towns the bank serves can do virtual work and get paid for it.
    IBM® FileNet® Content Manager, IBM Content Collector for Email, IBM Enterprise Records and ILOG
    Aperity – located in Louisville, Kentucky. Aperity provides sales teams and marketers with forecasting, segmentation, and simulation techniques for retail distribution and channel management.
    Aperity leverages PureData™ System for Analytics to provide Sales forecasts for 18+ months with 95% accuracy
    Highly accurate forecasting on more than 500,000 stores in minutes. Facilitate precise inventory management to prevent over- or under-stocking
    Business Problem:
    The ability to precisely track and forecast marketing and sales is essential to the success of retail and CPG companies, especially given the need to keep operating expenses to a minimum.
    Without accurate and timely reports, a company may miss distribution opportunities, misread market changes, completely overlook competitive activity, and ultimately lose money.
    Aperity developed its iSalesBrandManagement tool using IBM PureData for Analytics due to its ability to run complex analyses rapidly, and its simplicity to get up and running quickly without requiring ongoing maintenance or high operating costs.
    Fuzzy Logix’s in-database computation engine called DB Lytix runs complex analytics on the IBM Netezza data warehouse appliance
    Start Today Co – located in Japan, the company offers fashion-related services around its mainstay, ZOZOTOWN, one of the largest fashion shopping websites in Japan.
    Three to five times higher email open rates. Five to ten times improved conversion rate
    Estimated 90 percent reduction in time required to plan and implement new promotional campaigns.
    Business Problem:
    Start Today Co., Ltd. uses email as its main vehicle for outbound marketing campaigns - primarily broad advertisement campaigns to those who opt to receive email communications.
    However, the company recognized great potential for much more targeted, user-driven marketing, but system performance issues and limited staff resources held it back from executing targeted campaigns.
    Uses predictive analytics to create customer profiles from their purchase histories and brand affinities, to identify which shops, product brands, and items customers are likely to be interested in, and to deliver the right offer in near-real time.
    Combines PureData System for Analytics and Unica to automate time-consuming customer segmentation and response process that is required for one-to-one marketing.
    Scotiabank – located in Halifax, Nova Scotia. Scotiabank is the wholesale banking arm of the Scotiabank Group, serving 21M clients in more than 55 countries.
    Changed 70 percent of counterparty exposure measurements by 20 percent or more with a centrally managed, consolidated, cross-asset view of counterparty credit risk
    Business Problem:
    The credit crisis in recent years demonstrated the speed at which credit defaults can cascade through the financial sector.
    Banks need greater visibility into and control over counterparty credit risk. Scotiabank lacked a consolidated view of these risks, which forced traders into an overly conservative stance and kept the bank from maximizing credit line use. The organization needed to replace guesswork with a more realistic assessment of counterparty credit risk.
    Scotiabank uses IBM’s Algorithmics® Integrated Market and Credit Risk to keep credit risk contained, calculating risk with unprecedented accuracy by accounting for all key factors, including shifting market conditions, portfolio diversification and collateral.
    Vaasan Group – one of the leading bakery operators in Northern Europe (Finland, Norway and Sweden) producing fresh bakery goods, bake-off products and crisp breads for sale in retail chains, restaurants and hotels.
    Saw a rapid 30 percent increase in sales orders in Sweden and achieved an on-time delivery target of 98.5 percent.
    Business Problem:
    Vaasan was experiencing exponential business growth, but couldn’t accurately forecast fluctuating sales orders across the Nordic region, which meant it couldn’t accurately plan its resources and production schedules.
    Out-of-stock situations and excess inventory are both costly, so food manufacturers constantly strive to prepare for fluctuations in order volume.
    Vaasan can now identify trends in customer demand and generate a rolling sales forecast, helping the company predict its production requirements and prepare for fluctuating customer orders.
    For example, when the company won a large-volume account, near-real-time analysis helped enable its sales team to know in minutes whether it was possible to fill the new order on time.
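    A rolling forecast of the kind described above can be illustrated with a very small sketch. The notes do not say which forecasting method Vaasan or Cognos uses, so this example assumes the simplest possible stand-in, a moving average over the most recent periods. The function name, the window size and the order figures are all hypothetical.

```python
def rolling_forecast(orders, window=3):
    """Forecast the next period as the mean of the last `window` periods.

    A plain moving average, used here only as an illustrative stand-in;
    the method actually used in the case study is not stated.
    """
    if window <= 0 or len(orders) < window:
        raise ValueError("need at least `window` periods of history")
    recent = orders[-window:]
    return sum(recent) / window


# Hypothetical weekly order volumes, oldest first.
weekly_orders = [100, 110, 120, 130]
forecast = rolling_forecast(weekly_orders, window=3)
```

    Each time a new period closes, the newest figure is appended and the forecast is recomputed, which is what makes it "rolling".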
    Cognos Business Intelligence; Cognos Controller; Cognos Planning; Cognos 10
    Camden Council – Camden London Borough Council, established in 1963, is one of 32 authority agencies across the United Kingdom charged with community governance.
    Achieved savings of up to 30 percent per year for some households in pilot program
    Expected reduction of 8,000 tonnes (8,818 tons) of CO2 for 2,500 homes over the lifetime of the project
    Business Problem:
    In housing developments that use district heating, residents often pay a fixed monthly heating fee. Whether residents turn down the heat when departing from the flat, or they leave the heat on and the windows open, they pay the same fee.
    Local governments are seeking a solution to this challenge, as more than USD300 billion is spent globally on district heating for block housing developments as well as on college campuses and in commercial and public buildings.
    A first-of-its-kind heat metering program is providing a cost-effective approach that rewards residents for energy efficiency.
    Camden Council used IBM Business Partner Hildebrand Technology, a London-based energy consulting firm, to develop a pilot program to deploy individual metering systems in approximately 1,500 properties to allow Camden Council, and the residents, to measure usage as if each flat had its own water heater. Residents are now accountable for their energy usage, so they adopt energy-saving practices.
    Meter readings are captured every six seconds and loaded into Informix TimeSeries DataBlade Module using Informix TimeSeries Real-Time Loader software for later analysis.
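    To give a feel for what "later analysis" of six-second meter readings might involve, here is a minimal sketch that rolls raw readings up into hourly totals per flat. It is plain Python and does not use the Informix TimeSeries API; the record layout (flat id, timestamp, kWh since the previous reading) and the sample values are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hourly_usage(readings):
    """Sum per-reading usage into (flat_id, hour) buckets.

    `readings` is an iterable of (flat_id, timestamp, kwh) tuples, where kwh
    is the energy used since the previous reading (a hypothetical layout).
    """
    totals = defaultdict(float)
    for flat_id, ts, kwh in readings:
        # Truncate the timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        totals[(flat_id, hour)] += kwh
    return dict(totals)


# Illustrative data: one flat, one reading every six seconds for two hours.
start = datetime(2014, 1, 1, 8, 0, 0)
sample = [("flat-1", start + timedelta(seconds=6 * i), 0.001) for i in range(1200)]
usage = hourly_usage(sample)
```

    At one reading every six seconds, each flat produces 600 readings per hour, so some aggregation step like this is usually needed before usage can be shown to residents or billed.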
    Transition: So what are the reasons why organizations are compelled to act now?
  • 1) Outperform in your industry – 75% of Leaders cite growth as the key source of value from analytics. Source: IBM IBV Study: Analytics: A blueprint for value, October 2013
    Make speed a differentiator – reduce the latency of decisions, business processes and every aspect of improving a customer’s experience or optimizing a company’s infrastructure.
    Monetize their data – data has its own value for creating new products or services. For example, telcos are monetizing their location data, manufacturers are monetizing their maintenance data, and healthcare organizations are leveraging their treatment data.
    Be more right, more often – big data & analytics enables the use of more sources of data, and new forms of analytics that increase the understanding about what you are trying to analyze, and in turn, build deeper confidence to act faster.
    2) Guard against risk – 46% of respondents were impacted by a cyber security breach over the past 24 months. Source: IBM Global Study on the Economic Impact of IT Risk, 2013
    Guard against poor decision making – this equates to building confidence by ensuring the veracity, i.e., quality, timeliness and consistency of the information you are trying to analyze.
    Protect the security and privacy of their data – many organizations are acting now to put in place stronger security and privacy measures and governance policies necessary to protect the organization from internal and external threats.
    Get the risk-opportunity equation right – proactively identify and manage your risk exposure from data breaches to compliance with regulations.
    3) Change the economics of business and IT – One in Five Organizations allocate more than 50% of IT budget to new projects. IBM Global Data Center Study, 2012
    Relieve the pressure on IT infrastructure – capitalize on new approaches to IT infrastructure that appropriately leverage optimized systems and cloud for analytical workloads as a way to respond dynamically to demand.
    Adopt a new approach to the onslaught of data – analyze data in its native form, in-motion or at rest, and only store what they need and dispose of the rest to lower storage costs and risk.
    Eliminate hidden costs – unify your data and analytics initiatives to eliminate piecemeal approaches that in the long run will cost more than an orchestrated roadmap.
    Transition: So what are the three things you must get right to succeed with your business imperative using Big Data & Analytics?
  • According to a brand new IBM IBV study called “Analytics: A blueprint for value” that came out in October 2013, there are 9 levers that represent the sets of capabilities that most differentiated leaders from other respondents in the survey.
    We’ve synthesized them into three things you must get right:
    First – a culture that infuses analytics everywhere. Develop a curiosity-driven and evidence-inspired workforce. And, infuse analytics into everything employees touch – we call this piece “Imagine It.”
    Second - Invest in a big data & analytics platform. Build against a master plan: All types of data. All types of analytics. A full range of business outcomes. “Realize It.”
    Third – be proactive about privacy, security and governance. Forge forward-thinking approaches to maximize impact while balancing risk – and lastly “Trust It.”
    Transition: Let’s start with Imagine It.
  • Provides big data and analytics capabilities to fuel Watson and our clients’ journey to Cognitive
    Enables our clients to gain fresh insights in real-time, and act upon those insights with confidence
    Sets the standard in market with the breadth and depth of capabilities required for any data and analytics initiative
    Uniquely delivers innovative capabilities such as stream processing, in-memory computing, advanced predictive and exploration capabilities that our clients need – and the security, privacy and governance they, and their clients require.
    Packaged so that clients can address their immediate need, build on what they have, and realize value at every step
  • Helps me discover (fresh insights)
    Find patterns that I don’t even know to look for
    Freedom to explore and follow my train of thought
    Operates in timely fashion (real-time)
    Real-time analytics as data flows through an organization
    Enterprise-class Hadoop that runs 4x faster
    Speed of thought analytics
    Establishes trust (act with confidence)
    Governance across complete data lifecycle inc. Hadoop
    Security and privacy with compliance
    Transparency and context to decision-making process
  • Key Points
    BigInsights builds on top of open source Hadoop and augments it with mandatory capabilities needed by enterprises:
    Optimizations that automatically tune Hadoop workloads and resources for faster performance
    Intuitive spread-sheet style UI for data scientists to quickly examine/explore/discover data relationships
    Development tooling that makes it easier for your technical team to create applications without first needing to go thru exhaustive training to become Hadoop experts
    Packaging that makes it simpler to install, deploy, and manage
    Accelerators with pre-packaged analytics patterns and best practice knowledge to solve generalized and industry big data problems
    High-speed integration connectors to access any data type and source as well as share analyzed data with other applications and storage
    Security and governance to ensure sensitive data is protected and secure
  • What makes BigInsights different than other Hadoop distributions?
    It boils down to 3 main things: Enterprise Performance, (meaning enterprise-ready Hadoop features), and integration.
    Analytics – BigInsights comes with a powerful text analytics engine, as well as a social and machine data analytics accelerator
    Usability – with enhancements like Big SQL, which gives you SQL access to all of your data in Hadoop, Hive and HBase, and BigSheets, which lets you visualize your data in a familiar spreadsheet-like interface.
  • Key Points
    - Integrate v3 – the point is to have one platform to manage all of the data – there’s no point in having separate silos of data, each creating separate silos of insight. From the customer POV (a solution POV) big data has to be bigger than just one technology
    Analyze v3 – very important point – we see big data as a viable place to analyze and store data. New technology is not just a pre-processor to get data into a structured DW for analysis. Significant area of value add by IBM – and the game has changed – unlike DBs/SQL, the market is asking who gets the better answer and therefore sophistication and accuracy of the analytics matters
    Visualization – need to bring big data to the users – spreadsheet metaphor is the key to doing so
    Development – need sophisticated development tools for the engines and across them to enable the market to develop analytic applications
    Workload optimization – improvements upon open source for efficient processing and storage
    Security and Governance – many are rushing into big data like the wild west. But there is sensitive data that needs to be protected, retention policies need to be determined – all of the maturity of governance for the structured world can benefit the big data world
  • I’ll give you a quick summary of BigInsights now, and then we’ll dive into details in the next several charts. BigInsights is IBM’s strategic platform for managing and analyzing persistent Big Data. As you’ll see, it’s based on open source and IBM technologies. Internally, the BigInsights project is being run like a start-up. By that, I mean that IBM is engaging deeply with a number of early customers to shape the future direction of our product. We’re purposefully keeping our plans flexible to accommodate rapidly changing requirements in this emerging technology area.
    Some of the characteristics that distinguish BigInsights include its built-in support for analytics, its integration with other enterprise software, and its production readiness. We’ll talk more about these topics shortly. But before I leave this chart, I want to point out that IBM is uniquely positioned to provide customers with the necessary software, hardware, services, and research advances in the world of Big Data.
    The Standard and Enterprise Editions are IBM’s supported production offerings. They contain IBM-unique technologies in addition to open source technologies. For those who want to work with a free, non-production version of BigInsights, we also offer our Quick Start Edition. It’s similar in content to Standard, plus it lets you experiment with Big R and text analytics as well. Details on the different editions are available as supplemental slides in this deck and in the announcement materials.
  • You saw this slide earlier, the next generation enterprise data warehouse. On this slide, you see the integration points highlighted.
    BigInsights integrates with the Big Data Platform components, data integration and application integration
  • Constant Contact
    Constant Contact, Inc., launched in 1998 and headquartered in Waltham, Massachusetts, wrote the book on Engagement Marketing™ – the new marketing success formula that helps small organizations create and grow customer relationships in today’s socially connected world.
    More than half a million small businesses, nonprofits and associations worldwide use the company’s online marketing tools to generate new customers, repeat business, and referrals through email marketing, social media marketing, event marketing, local deals, digital storefronts, and online surveys.  Only Constant Contact offers the proven combination of affordable tools and free education, including local seminars, personal coaching and award-winning product support.
    The company has offices in Waltham, Massachusetts; Loveland, Colorado; Delray Beach, Florida; San Francisco, California; New York City, New York; and London, England.
    Constant Contact was looking for help analyzing the 35 billion emails its customers send every year so that it can provide guidance on the best day or time to send email campaigns to have the greatest impact (defined by opens, clicks, etc.). To accomplish this task, the company partnered with IBM and Persistent Systems on a cutting-edge solution.
    IBM and Persistent Systems recommended IBM BigInsights to support Constant Contact's analysis of 35 billion emails. IBM BigInsights, IBM PureData for Analytics (powered by Netezza technology) and Cognos work together to form the Big Data solution.
    By implementing IBM BigInsights, Constant Contact achieved dramatic performance improvements:
    - Performance improved by more than 40 times
    - Customers’ email campaign performance increased by 15 to 25% based on big data analytics
    - Analysis time reduced from hours to seconds
    The impact BigInsights has had on Constant Contact's business has been so dramatic the company is now expanding use of BigData to analyzing the content of emails to help its customers be even more successful.
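    The send-time analysis described above comes down to comparing engagement across time slots. The sketch below shows one simple way to do that: group sends by weekday and hour, compute the open rate per slot, and pick the best one. It is an illustration only and not Constant Contact's or BigInsights' actual method; the record layout and the sample data are hypothetical.

```python
from collections import defaultdict


def best_send_slot(sends):
    """Return the (weekday, hour) slot with the highest open rate, and that rate.

    `sends` is an iterable of (weekday, hour, was_opened) tuples
    (a hypothetical layout); was_opened is True if the email was opened.
    """
    sent = defaultdict(int)
    opened = defaultdict(int)
    for weekday, hour, was_opened in sends:
        sent[(weekday, hour)] += 1
        if was_opened:
            opened[(weekday, hour)] += 1
    # Open rate per slot = opened emails / sent emails in that slot.
    rates = {slot: opened[slot] / sent[slot] for slot in sent}
    best = max(rates, key=rates.get)
    return best, rates[best]


# Illustrative data: two slots with three sends each.
sample = [
    ("Tue", 10, True), ("Tue", 10, True), ("Tue", 10, False),
    ("Fri", 16, True), ("Fri", 16, False), ("Fri", 16, False),
]
slot, rate = best_send_slot(sample)
```

    At 35 billion emails a year the same grouping would run as a distributed job rather than in a single process, but the per-slot arithmetic is the same.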
    Solutions Components
    IBM InfoSphere BigInsights
    IBM PureData for Analytics – powered by Netezza technology
    IBM Cognos BI
    Case Study PDF: TBD – In development
    “Our customers send roughly 35 billion emails every year, and with every email they send, we have more data that we can analyze and feed back to them to help improve their success. Our work analyzing email delivery times has already given our customers a 15-25% lift in their email campaign performance – and that means more customers in their doors and increased revenue.” — Jesse Harriott, Chief Analytics Officer, Constant Contact
  • Client Name: Teikoku Databank
    Case Study Link:
    Pull quote: “With IBM InfoSphere BigInsights it has become possible to process billions of items of textual data in 30 minutes”
    — Mr. Satoshi Kitajima, an MBA Statistician in the SPECIA Team of the Business Analytics Division of the Market and Business Intelligence Department, Teikoku Databank
    Company background:
    Teikoku Databank’s history goes back more than 100 years to the establishment of Teikoku Koshinsha in 1900. Based on the corporate philosophy of “supporting economic activities and contributing to the development of society as a reliable information partner”, they are developing their business in areas such as corporate credit research, credit risk management services, database services, marketing services and e-commerce support services.
    Solution components:
    • IBM® InfoSphere® BigInsights™
    Business challenge:
    Teikoku Databank has been providing its customers with reliable corporate information based on their credit research for more than 100 years, and owns a huge amount of corporate data such as corporate credit report files of 1.6 million companies, “COSMOS1” financial statements of 4.4 million terms worth of information gathered from 680,000 companies, a “COSMOS2” corporate profile database of 1.42 million companies, and other corporate data for 4.1 million companies. Recently, however, information published on the Internet has been starting to have a significant effect on company business, so responding to this situation has become an urgent task. To stay competitive, the company wanted to analyze its proprietary information in combination with Big Data gathered from the Internet.
    The benefits:
    Enables processing billions of items of textual data in 30 minutes, a process that used to take several days
    Analyzes 4.75-fold more data for customers and enables faster response to customer requests
    Enables increasing the number of enhanced offerings to customers as a key differentiator in the market
  • Vestas Wind Systems offers its wind turbine products as alternative energy solutions in a competitive market that is exploding in terms of demand, and characterized by extremely competitive pricing.
    Wind turbines are a multi-million dollar investment with a lifespan of 20 to 30 years. The location chosen to install and operate a turbine can greatly impact the amount of power generated by the unit, as well as how long it is able to remain in operation.
    In order to determine the optimal placement for a turbine, a large number of location-dependent factors must be considered including temperature, precipitation, wind velocity, humidity, and atmospheric pressure.
    The prior state of the art for location determination took weeks of data analysis and could leverage only a fraction of the data.
    Vestas is working with us to build a Big Data computing system that will start by analyzing about 2.6 PB of data with the expectation that it will grow to 6 PB over the next few years.
    Using more, in fact all, of the available data will improve the effectiveness of the placement process, but they also expect the analysis process to go from weeks to days.
    Create weather models for optimal placement and operation
    One of our early customers has gone from being a manufacturer of turbines to being an operator of turbines. This change in business model has wide-reaching implications.
    To maximize profit, they must understand what design/technology investments have a compelling ROI and which ones don’t.
    They need to understand what makes an ideal location for turbine placement. This is an important decision since most windmills stay in production for 20 to 25 years and the capital investment in the turbines is quite large.
    How to operate the turbine in a way to maximize energy production. Ex: How should the blade be angled based on different weather conditions.
    What kind of maintenance model will optimize costs and energy output? For example, in a dry environment you don’t have to perform maintenance on the equipment as frequently. What are the ideal intervals for maintenance based on location? Based on seasonal conditions?
    The opportunity
    Model weather for a given turbine location to optimize power generation and longevity of turbine.
    The data for creating these models exists, but the task of building the models is non-trivial. . . Initial data sets are approaching 3 Petabytes, not including the sensor data from the installed units.
    Total data volumes will exceed 6 PB quickly, and will include both highly structured and semi-structured information flows
    The Solution:
    InfoSphere BigInsights is capable of handling this massive volume and variety of data.
    Build models to cover both forecasting and optimal, in-the-moment, operation of the power generation units.
    The solution is flexible enough that the customers can use it to answer other questions of the data without redesigning the system.
    That way, the customer can instrument everything, and use that data when they need it, how they need it.
    Ultimately, InfoSphere BigInsights will allow the customers to use their data to make sure they're making the right decisions, and continue to make right decisions as needs evolve.
  • Client
    A large European university
    Business need
    With 145,000 students, more than 4,500 professors and almost 5,000 people working as administrative and technical staff, this university is one of the largest in Europe. It looks and functions very much like a city, comprising one million cu m of building space on 2.5 hectares and containing the amenities and services its 100,000-plus campus members need on a daily basis. Its physical size and population in addition to its EUR10 million annual electricity bill made the university the ideal location for a pilot study involving micro grids and observation of energy consumption patterns.
    Phase 1 of the project involved dividing the campus into nine “energy islands,” each of which includes one or more buildings as well as energy-generating machines and loads, such as lighting, refrigeration and data centers. Some islands feature generation power plants, such as a tri-generation site, photovoltaic roofs and district heating systems. Data collection and reporting are already underway. A dashboard is available to allow the energy manager to understand consumption needs, production potentials and the relations between the two. Project activities included:
    Monitoring the energy balance for each energy island as well as for the campus as a whole
    Analyzing the temporal profiles and identifying the priorities for improving energy efficiency without reducing service levels
    Identifying the management rules in place to optimize the university’s use of energy
    The university is now launching Phase 2 of its micro grid project with an aim toward better understanding and exploiting field data to make more-informed decisions about energy usage, planning and investments. Project directors aim to answer questions such as:
    Can the university campus sustain itself from an energy perspective?
    Can it sustain itself at least during daylight hours when photovoltaic roofs significantly contribute to the supply side?
    Can the neighboring hospital campus with its additional generation and demand capacity be linked to the university? Would this enable the university to improve its energy balance, or would optimizing balance require accepting contributions from the public electricity network?
    By recording local energy consumption patterns in combination with the generation capacities available from renewable, low-carbon sources, the university hopes to raise public awareness about harmful carbon emissions.
    Solution implementation
    The micro grid project at the university is part of the master plan to transition the school’s city into the world’s first postcarbon biosphere city. Currently in Phase 2, the university is gathering and analyzing data to understand the university’s energy consumption needs and patterns. Throughout the process, IBM has contributed a host of solutions, including:
    IBM® Intelligent Operations Center technology
    IBM InfoSphere® BigInsights ™ Enterprise Edition software
    IBM InfoSphere Warehouse Enterprise Edition software
    IBM Cognos® 8 Business Intelligence V8.4 software
    IBM SPSS® Statistics Professional software
    IBM Cognos Business Intelligence Enhanced Consumer software
    IBM Cognos Business Intelligence Professional software
    IBM Tivoli® Service Request Manager software
    IBM Tivoli Monitoring for Energy Management Basic Device software
    IBM Tivoli Monitoring for Energy Management software
    IBM Tivoli Netcool® Omnibus Base
    IBM Tivoli Business Service Manager software
    IBM Tivoli Netcool Performance Manager software
    IBM ILOG® CPLEX® Optimization Studio Developer Edition software
    IBM ILOG CPLEX Optimizer Deployment Edition software
    Researchers are using IBM software as the foundation for embedding alarms, rules, algorithms and automatic processes into the grid system as a way to maximize the energy efficiency of the university. The overall application platform, which consists of IBM solutions plus custom code and models developed by the university, has or will have the following capabilities, functions and systems:
    Asset management. Supervisory control and data acquisition (SCADA) sensors, actuators and other devices currently capture and record basic data, such as operational status and intervention history. To date, there is no intelligent integration among field devices; however, this is planned later in Phase 2. In the future, the university also plans to deploy IBM Maximo® Asset Management software for its asset management functionalities.
    Monitoring system. Devices placed within the nine energy islands will collect and deliver data in real time or near-real time. In the case of exceptional behavior, synchronized alerts will provide additional information to the advanced analytics and business modeling.
    Operational rules. The system will make some decisions automatically based on predefined rules. It will also allow researchers to test how different scenarios—such as changing sets or the quantities of field devices or integrating a new cogeneration plant installed at a remote site of the university—might affect energy consumption.
    Advanced reporting. Comprehensive reporting of basic data as well as sophisticated analysis will provide students with research information.
    Modeling. Information gathered from the system will be used as input to simulate different scenarios concerning the predictive energy behavior of each island. Scenarios will be based on standard consumption patterns and expected environmental parameters, the optimized energy flow among the islands, and the recommended investments or changes in consumption behaviors that must take place to get closer to the optimal self-sustained energy balance.
    The micro grid project is just getting underway, and researchers are in the process of collecting and analyzing data. As data analyses and modeling results become available, the university will have more insight into how and where energy is consumed and what changes are needed to curb greenhouse gas emissions. Researchers will also have greater visibility into how best to revise, update or replace the heating, refrigeration, lighting and thermal insulation in each of its buildings and how to optimize energy production. Ultimately, the university is confident that the solution will play a key role in reducing consumption levels and lowering costs. In the meantime, the program is gaining international attention and IBM can use the campus as a demonstration center for its Intelligent Operations Center technology as well as for organizing conferences, workshops and events about energy and buildings management.
    The solution provides the university’s energy manager, students and researchers with comprehensive and detailed energy patterns for each of the nine energy islands and the campus as a whole. Researchers can conduct what-if analyses for different investment scenarios and ascertain the self-sustainability capabilities of each island. For instance, analyses can be run to understand the financial and energy savings impact and implications of generating additional electrical power through photovoltaics. The university will also be able to determine the actual and potential levels of energy consumed for each island and determine the optimal usage and generation patterns to lower consumption and the consequent network needs.
    Instrumented - The grid uses a wide variety of heterogeneous devices to acquire and collect data, all of which can be easily extended to include new and innovative data acquisition instruments as they become of interest.
    Interconnected - The solution depends on both wired and wireless networks to connect the nine energy islands, data acquisition devices and the grid platform. In addition, field devices feed data to analytics and modeling software.
    Intelligent - Together with smart grid technology, the system employs analytics, rules, algorithms and automatic processes to track and monitor energy production and consumption levels, providing researchers with an unprecedented and comprehensive view of the university’s energy needs and potential energy production capabilities. Data related to voltage, currents, power and temperature is analyzed and tracked, enabling the university to track consumption patterns in each of its islands and buildings and determine where energy is inefficiently used. For instance, by monitoring the consumption levels of appliances such as heating and refrigeration units, researchers can identify the most egregious consumers and make better decisions about which assets to replace or update. The data also contributes to insight about supply and demand, enabling the university to model and optimize the micro grid to meet future energy needs.
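    The per-island energy balance and the photovoltaic what-if analysis described in this case study can be illustrated with a minimal Python sketch. It is not the university's or IBM's implementation: the `IslandReading` schema, the island names and all figures are hypothetical, and the balance is simply taken as generation minus consumption.

```python
from dataclasses import dataclass


@dataclass
class IslandReading:
    """One metering interval for one energy island (hypothetical schema)."""
    island: str
    generated_kwh: float
    consumed_kwh: float


def energy_balance(readings):
    """Return net energy (generation minus consumption) per island, in kWh."""
    balance = {}
    for r in readings:
        balance[r.island] = balance.get(r.island, 0.0) + r.generated_kwh - r.consumed_kwh
    return balance


def campus_balance(balance):
    """Net energy for the campus as a whole: the sum over all islands."""
    return sum(balance.values())


def with_extra_generation(readings, island, extra_kwh):
    """What-if scenario: add extra_kwh of generation to one island per interval."""
    return [
        IslandReading(
            r.island,
            r.generated_kwh + (extra_kwh if r.island == island else 0.0),
            r.consumed_kwh,
        )
        for r in readings
    ]


# Illustrative data only; these numbers do not come from the case study.
readings = [
    IslandReading("island-1", 120.0, 200.0),
    IslandReading("island-2", 300.0, 250.0),
    IslandReading("island-1", 80.0, 100.0),
]
baseline = energy_balance(readings)
scenario = energy_balance(with_extra_generation(readings, "island-1", 50.0))
```

    A negative balance means the island draws more than it generates over the period; comparing `baseline` with `scenario` shows how much an investment such as additional photovoltaic capacity would close that gap.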
  • Key Points
    As you’ve seen, we’ve made a significant investment in building the broadest and most complete big data platform in the industry. If we had to summarize our differentiators into a few categories, this would be it.
    InfoSphere BigInsights has the complete set of capabilities to analyze large volumes and variety of data. We’ve taken open source Hadoop and added the following enhancements to make it enterprise-grade:
    Performance optimizations that optimize Hadoop workloads resulting in faster answers
    Leading-edge text analytics capabilities that deliver more accurate results than other approaches
    Professional-grade developer tooling and administration consoles to develop and manage big data applications and environments using existing skills
    Enterprise-class security to protect confidential data and ensure data privacy
    Built-in high-speed connectors to connect to new data sources and types as well as your existing enterprise systems
    As you can see, InfoSphere BigInsights is the Clear Choice to analyze your large volumes and varieties of data
  • Key Points
    New paradigm is required to analyze data in motion – some big data problems simply don’t allow you to persist and then analyze data
    Can process multiple streams of data at the same time
    Modular design that has unlimited scalability – millions of events per day
    Designed for variety – to analyze many data types simultaneously
    Video, audio, text, social media, devices (smart meters, RFID, instruments) as well as structured data
    Can perform complex calculations on the data in real-time
    Built-in integration with the other capabilities in the IBM big data platform
    Data Warehousing (Netezza, InfoSphere Warehouse, Smart Analytics System)
    Hadoop (BigInsights)
  • Key Points
    1) Streams is the right capability when the primary big data challenge is to analyze data in motion (Velocity) – either because the business imperative requires a real-time response or action based on analyzing the data, or because the data is very large and you want to filter and remove data more cost-effectively before moving it into your data warehouse or Hadoop system. It can handle continuous or bursty streams of data – millions of events per second with microsecond latency.
    2) Streams can process any type of data (Variety)– audio, video, network logs, sensors, social media such as Twitter, in addition to structured data.
    3) And Streams is designed to scale to process any size of data, from terabytes to zettabytes per day.
    Streams is a platform to build many applications for many industries. It can handle huge amounts of data, up to terabytes per second, or Petabytes per day.
    It can run a large variety of analytics – from historic analysis like data mining to predictive analytics, as well as custom analytics such as image analysis, voice recognition, etc.
    Since Streams is all done in memory, it has high velocity – it can respond to events in microseconds, 1/1000 of a millisecond. So, it is orders of magnitude faster than databases, which must first store data on disk drives.
    Streams also provides tremendous agility to businesses. With the ability to dynamically add new applications that can tap into existing data streams and applications, businesses can respond more quickly to a changing world.
    And the powerful developer and debugging tools we provide speed application development.
  • Key Points
    InfoSphere Streams <Focus on bottom portion of the graphic> :
    Manages multiple stream inputs
    Analyzes and joins streams together for joint analysis
    Can join or loop streams – perform multiple analytics on a stream
    Output may be visualization or systematic action (a notification)
    Other portion of InfoSphere Streams <top half of graphic> is a development environment
    Need tools / IDE to develop streaming applications
    Automatically optimizes deployment (e.g., co-locating or fusing operators on a single node if they are used together, etc)
    The Streams programming model is to define a data flow graph that defines the connections among data sources (inputs), operators, and data sinks (outputs). Operators, and the streams that connect them, are the building blocks of the logical program.
    The deployment model is to group, or “fuse”, operators into units called Processing Elements, or PEs. PEs are separate executables that form the building blocks of the distributed application.
    A PE contains one or more operators
    A program consisting of many operators may be fused into a single PE; at the other extreme, each operator may run in its own PE
    Operators fused into a single PE are tightly coupled; data transfer is local and fast
    Communication between PEs uses the network protocol and is therefore slower; but these more loosely coupled components can be flexibly distributed over available processing nodes and are more easily reusable
    At the physical level, a Streams job (a running application) consists of multiple intercommunicating PEs.
    This lets you place resource-hungry analytics on appropriately-sized nodes
    Reuse of generic components is a side-benefit, but choosing the optimum level of component granularity and performance is currently more art than science
    Streams provides the infrastructure to support the decomposition of applications into scalable chunks (PEs), and the deployment and operation of these PEs across stream-connected processing nodes.
    Note: If asked, Streams currently only supports nodes with x86 architecture, running a Linux (Red Hat) operating system.
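    The relationship between the logical data flow graph and its deployment into fused Processing Elements can be illustrated with a toy model in plain Python. This is not SPL and does not use any Streams API; the `Operator`, `ProcessingElement`, `deploy` and `run_job` names are hypothetical, the graph is restricted to a linear chain, and a `deque` merely stands in for the network link between PEs.

```python
from collections import deque


class Operator:
    """A named processing step; fn maps one tuple to a tuple, or None to drop it."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn


class ProcessingElement:
    """A group of fused operators; data moves between them by direct call."""

    def __init__(self, operators):
        self.operators = operators
        self.inbox = deque()  # stands in for the network link between PEs

    def run(self, downstream=None):
        outputs = []
        while self.inbox:
            item = self.inbox.popleft()
            for op in self.operators:
                item = op.fn(item)
                if item is None:  # an operator filtered the tuple out
                    break
            if item is not None:
                if downstream is not None:
                    downstream.inbox.append(item)
                else:
                    outputs.append(item)
        return outputs


def deploy(operators, fusion):
    """Group a linear chain of operators into PEs; fusion gives each PE's size."""
    pes, start = [], 0
    for size in fusion:
        pes.append(ProcessingElement(operators[start:start + size]))
        start += size
    return pes


def run_job(pes, source):
    """Feed the source into the first PE and run each PE in order."""
    pes[0].inbox.extend(source)
    result = []
    for i, pe in enumerate(pes):
        downstream = pes[i + 1] if i + 1 < len(pes) else None
        result = pe.run(downstream)
    return result


ops = [
    Operator("parse", int),
    Operator("filter", lambda x: x if x > 10 else None),
    Operator("annotate", lambda x: {"value": x, "high": x > 100}),
]
source = ["5", "42", "250"]
fused = run_job(deploy(ops, [3]), source)        # all operators in one PE
split = run_job(deploy(ops, [1, 1, 1]), source)  # one PE per operator
```

    Both deployments produce the same results; only the placement differs. That is the point of separating the logical program from the fusion decision: the same operators can be packed together for fast local transfer or spread out for flexible distribution.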
  • Key Points
    Here’s an animation that helps show how InfoSphere Streams works and what you can do with it.
    Each of these balls represents an operator. The data passes through each operator where some action is being performed on the data.
    You can fuse data from multiple streams, modify it, annotate it, classify it or perform an analytics operation on it. All of this can happen in less than a millisecond.
    The InfoSphere Streams analysis results (events/data) can be directly output and viewed in a monitoring dashboard, stored in a data warehouse or BigInsights, passed to a predictive analytics system or business process management system to trigger a response/action or additional analytics.
  • Ease of use
    Up & Running Faster with First Steps - guides users through post install setup steps and gets from install to running in just a few clicks
    Drag & Drop graphical editor - allows users to build applications by dragging & dropping operators while automatically synching graphical and SPL source code views
    Improved Visual Application Monitoring – provides an instance graph that displays the application health and metrics and allows users to quickly identify issues
    Streams data visualization - allows users to dynamically add new views to running applications with charts provided out of the box
    Enterprise Integration with:
    Visualization integration
    BigInsights integration (Enhanced)- enables user to visualize Streams data in BigInsights Console
    Vivisimo integration (New)– enables user to visualize Streams data in Vivisimo CXO and stream data to Vivisimo index with a Vivisimo adapter
    InfoSphere DataStage integration - allows users to perform deeper analysis on data as part of the info integration flow and get more timely results; a Streams ETL toolkit provides adapters that exchange data between Streams and DataStage
    Netezza integration – Netezza Adapters use Netezza Native interfaces for optimized performance and allow separation of data preparation and load for flexibility and performance
    Advanced Analytics Toolkits
    Geospatial - high performance analysis and processing of geospatial data enables location based services by supporting GeoSpatial data types and functions
    Time Series – rich set of functionality that includes generation (synthesizing or extracting), preprocessing (preparation and conditioning), analysis (statistics, correlations, decomposition and transformation), modeling (prediction, regression and tracking)
    SPSS – uses IBM SPSS Modeler for developing & building predictive models and deploys models to Streams via the SPSS Scoring Operator; SPSS models are refreshed in Streams without suspending InfoSphere Streams
    CEP - Uses patterns to detect composite events in streams of simple events, integration in Streams allows CEP style processing with High Performance and Rich analytics
    XML Support
    New support for XML – allows developers to fuse a broader range of traditional and non-traditional data
  • Key points: Streams analyzes a variety of data types. Many people think of Analytics as possible only using BI on warehouses, but Streams enables many different kinds of analytics as well.
    The blue items are included in the Streams product (Mining in Microseconds, Statistics, Text Analysis, Geospatial, Predictive, and Advanced Mathematics)
    The red items have been built for Streams (Acoustic, Image & Video) and have been used in various projects and engagements, but they have not been made a part of the product. Contact development if you think any of them could help you in an opportunity.
    More later in this presentation on the toolkits available in the Streams product.
    A few notes:
    The “simple” in “Simple & Advanced Text” refers to basic functions for regular expression matching (and replacement) in string values; these are built into the Standard Toolkit, meaning they’re basically part of the language. “Advanced” refers to real text analysis using the same System T code used in BigInsights. This is new in Fix Pack 3 (November 2011).
    Statistics included with Streams range from simple aggregate functions (average, sum, etc.) to the more advanced metrics included in the Financial Services Toolkit.
    For true time series analysis, we have an advanced toolkit in development; this is covered under Advanced Mathematical Models in this slide.
  • Customer Reference Database Link:
    Dublin City Council, Ireland, and IBM Research implemented an Intelligent Transportation System designed to provide updated speed and traffic flow measurements, travel time estimates and statistical aggregations of current traffic conditions in real time.
    Built on IBM’s Big Data Platform using InfoSphere Streams.
    Solution provides Dublin City Council’s Roads & Traffic Department real-time visualization and visibility into the arrival times of their 1,000 buses on 150 routes and 5,000 stops daily.
    Having this information has enabled the department to optimize bus routes and stop locations.
    Benefits: using GPS positions of every bus enables analytics and visualization of:
    Location of the buses along their respective routes, including buses that do not follow their assigned route; average speed of individual vehicles; and aggregate speeds measured within a given time window on shared sections of route.
    Estimated times of arrival of the buses at their next stops along their route, and the probability of delays at stops or travel times at different times of the day or different days of the week, derived from historical data in real time.
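    Aggregating the speeds measured within a given time window on a shared section of route can be sketched as follows. This is an illustrative Python sketch, not the Dublin implementation: the `WindowedSpeed` class and the section name are hypothetical, and readings are assumed to arrive in timestamp order.

```python
from collections import defaultdict, deque


class WindowedSpeed:
    """Average speed per route section over the last window_s seconds."""

    def __init__(self, window_s):
        self.window_s = window_s
        self.samples = defaultdict(deque)  # section -> deque of (timestamp, speed)

    def add(self, section, timestamp, speed_kmh):
        q = self.samples[section]
        q.append((timestamp, speed_kmh))
        # Evict samples that have fallen out of the time window.
        while q and q[0][0] <= timestamp - self.window_s:
            q.popleft()

    def average(self, section):
        q = self.samples.get(section)
        if not q:
            return None  # no recent readings for this section
        return sum(speed for _, speed in q) / len(q)


w = WindowedSpeed(window_s=60)
w.add("section-A", 0, 30.0)
w.add("section-A", 30, 20.0)
w.add("section-A", 70, 10.0)  # the reading at t=0 is now outside the window
```

    Because old samples are evicted as new ones arrive, memory stays proportional to the number of readings in the window rather than to the whole history.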
  • Customer Reference Database Link:
    Client name: University of Ontario Institute of Technology
    Subtitle: Leveraging key data to provide proactive patient care
    The need: Today, patients are routinely connected to equipment that continuously monitors vital signs such as blood pressure, heart rate and temperature. The equipment issues an alert when any vital sign goes out of the normal range, prompting hospital staff to take action immediately, but many life-threatening conditions do not reach critical level right away. Often, signs that something is wrong begin to appear long before the situation becomes serious, and even a skilled and experienced nurse or physician might not be able to spot and interpret these trends in time to avoid serious complications. One example of such a hard-to-detect problem is nosocomial infection, which is contracted at the hospital and is life threatening to fragile patients such as premature infants. The indication is a pulse that is within acceptable limits, but not varying as it should. So, while the information needed to detect the infection is present, the indication is very subtle; rather than being a single warning sign, it is a trend over time that can be difficult to spot.
    The solution/benefit: With a shared interest in providing better patient care, Dr. Carolyn McGregor, Canada Research Chair in Health Informatics at the University of Ontario Institute of Technology (UOIT), and Dr. Andrew James, staff neonatologist at The Hospital for Sick Children (SickKids) in Toronto, partnered to find a way to make better use of the information produced by monitoring devices. Dr. McGregor visited researchers at the IBM T.J. Watson Research Center’s Industry Solutions Lab (ISL), who were extending a new stream-computing platform to support healthcare analytics. A three-way collaboration was established, with each group bringing a unique perspective: the hospital focus on patient care, the university’s ideas for using the data stream, and IBM providing the advanced analysis software and information technology expertise needed to turn this vision into reality. 
The result was Project Artemis, a highly flexible platform that aims to help physicians make better, faster decisions regarding patient care for a wide range of conditions. The earliest iteration of the project is focused on early detection of nosocomial infection by watching for reduced heart rate variability along with other indications. For safety reasons, in this development phase the information is being collected in parallel with established clinical practice and is not being made available to clinicians. The early indications of its efficacy are very promising. Project Artemis is based on IBM InfoSphere Streams. The IBM DB2 relational database provides the data management required to support future retrospective analysis of the collected data.
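    The kind of signal described here, a pulse that stays within acceptable limits but does not vary as it should, can be illustrated with a rolling-window check. This is only a sketch in Python and not the Project Artemis algorithm: the `VariabilityMonitor` class is hypothetical, and the window size and thresholds are arbitrary placeholders with no clinical meaning.

```python
from collections import deque
from statistics import mean, pstdev


class VariabilityMonitor:
    """Flag a window whose mean is in range but whose spread is unusually low."""

    def __init__(self, window, low_bpm, high_bpm, min_stdev):
        self.readings = deque(maxlen=window)
        self.low_bpm = low_bpm
        self.high_bpm = high_bpm
        self.min_stdev = min_stdev

    def observe(self, bpm):
        """Add a reading; return True when the recent trend looks suspicious."""
        self.readings.append(bpm)
        if len(self.readings) < self.readings.maxlen:
            return False  # not enough history yet
        in_range = self.low_bpm <= mean(self.readings) <= self.high_bpm
        flat = pstdev(self.readings) < self.min_stdev
        return in_range and flat


# Placeholder thresholds for illustration only; not clinical values.
monitor = VariabilityMonitor(window=5, low_bpm=100, high_bpm=160, min_stdev=2.0)
varied_flags = [monitor.observe(b) for b in [120, 135, 118, 140, 125]]
flat_flags = [monitor.observe(b) for b in [130, 130, 131, 130, 130]]
```

    A simple out-of-range alarm would stay silent on both sequences, since every reading is within limits; only the check on variability over time separates them.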
  • Customer Reference Database Link:
    We have been working with an Indian Telco client for some time now to help reduce their billing costs and improve customer satisfaction.
    Call Detail Record (CDR) processing within their data warehouse was sub-optimal,
    Could not achieve real time billing which required handling billions of CDRs per day and de-duplication against 15 days worth of CDR data
    Unable to support future IT and business needs with real-time analytics
    Single platform for mediation and real time analytics reduces IT complexity
    The PMML standard is used to import data mining models from InfoSphere Warehouse. Offloaded the CDR processing to InfoSphere Streams resulting in enhanced data warehouse performance and improved TCO
    Each incoming CDR is analyzed using these data mining models, allowing immediate detection of events (ex: dropped calls) that might create customer satisfaction issues.
    Business Benefit:
    Data now processed at the speed of Business - from 12 hours to 1 minute
    HW Costs reduced to 1/8th
    Support for future growth without the need to re-architect, more data, more analysis
    Platform in-place for real-time analytics to drive revenue
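    De-duplicating each incoming CDR against a 15-day history can be sketched as a keyed lookup with time-based expiry. This Python sketch is illustrative only and is not the client's implementation: the `CdrDeduplicator` class and the record keys are hypothetical, timestamps are assumed to be non-decreasing, and an in-memory dictionary would not be expected to hold billions of records per day without partitioning or a more compact structure.

```python
from collections import OrderedDict

RETENTION_S = 15 * 24 * 3600  # 15 days, as in the case study


class CdrDeduplicator:
    """Remember record keys for a retention window and reject repeats."""

    def __init__(self, retention_s=RETENTION_S):
        self.retention_s = retention_s
        self.seen = OrderedDict()  # record key -> timestamp first seen, oldest first

    def is_new(self, key, timestamp):
        """Return True the first time a key is seen within the retention window."""
        # Expire keys that are older than the retention window.
        while self.seen:
            oldest_key, oldest_ts = next(iter(self.seen.items()))
            if oldest_ts > timestamp - self.retention_s:
                break
            del self.seen[oldest_key]
        if key in self.seen:
            return False
        self.seen[key] = timestamp
        return True


# Short retention so the expiry behaviour is visible in a small example.
dedup = CdrDeduplicator(retention_s=100)
results = [
    dedup.is_new("cdr-a", 0),    # first sighting
    dedup.is_new("cdr-a", 50),   # duplicate inside the window
    dedup.is_new("cdr-b", 60),   # different record
    dedup.is_new("cdr-a", 100),  # original has expired, so accepted again
]
```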
  • The product management experts on this call are going to give you a lot of really good detailed information on the new releases of these products, but before they do that I want to just summarize for you as sales people the bottom line on what this announcement means - in terms of what you have to sell and what kind of deals and opportunities you should pursue.
    #1 This release includes three new analytic accelerators. Analytic accelerators are pre-packaged, pre-developed sets of software tools that allow a customer to very quickly deploy one of our big data products to solve a specific business problem. These analytic accelerators are built on top of BigInsights and Streams and are part of the core product - once they GA, they will automatically ship with each order, and they don't cost extra. They are part of the base pricing model.
    As a salesperson, because you now have these analytic accelerators "in your bag," you can call on customers and talk to them about big data and how it can solve specific business problems, knowing that the solution you have to offer is more complete than any of our competitors' and will help that customer deploy and get value more quickly than would be possible by building it themselves or using a competitor's products. The three analytic accelerators are......
  • Key Points:
    Without proper development tools/environments, you may have to code many things on your own – in many different tools
    It isn’t optimized, requires vastly different skill sets, and is therefore risky and/or time-consuming
    Take one example with stream computing – you’ll see all of the various things that would have to be coded separately in order to do stream computing – and the impact to a customer (45% faster delivery)
  • Key Points
    The IBM big data capabilities are all designed to work together and with existing analytics applications such as BI and predictive analytics. Here’s an example scenario:
    1) Historic data is stored in the warehouse, where interesting patterns are detected, such as the pattern of credit card transactions that would indicate possible fraud.
    2) These models can be defined in tools like IBM SPSS that create the PMML models.
    3) The PMML models are then imported into InfoSphere Streams Studio to generate Streams programs that are executed to score the incoming records in real time.
    4) Additional data sources such as RFID tags, blogs, or other information might be used to improve the confidence levels of the scoring algorithms.
    5) These measures can be sent to Dashboards like Cognos Real Time Monitoring or business process management systems to trigger business processes to take immediate action as required.
    6) Streams can also detect model drift and start a closed-loop process – as the models drift, they can be updated and incorporated to provide continuous improvement.
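    Steps 3 and 6 of this scenario, scoring incoming records in real time and watching for model drift, can be sketched in Python. This is a simplified illustration and not Streams or SPSS code: the `score` function is a hard-coded stand-in for a model that would be imported from PMML, the record fields are hypothetical, and drift is approximated by comparing the recent flag rate with the rate expected from historic data.

```python
from collections import deque


def score(record, threshold=1000.0):
    """Stand-in for a model imported from PMML: flag large transactions."""
    return record["amount"] > threshold


class DriftMonitor:
    """Compare the recent flag rate with the rate expected from historic data."""

    def __init__(self, expected_rate, window, tolerance):
        self.expected_rate = expected_rate
        self.recent = deque(maxlen=window)
        self.tolerance = tolerance

    def update(self, flagged):
        """Record one scoring result; return True if the rate has drifted."""
        self.recent.append(1 if flagged else 0)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough recent results yet
        rate = sum(self.recent) / len(self.recent)
        return abs(rate - self.expected_rate) > self.tolerance


def process(records, monitor):
    """Score each record; return (flags, True if drift was ever signalled)."""
    flags, drift = [], False
    for record in records:
        flagged = score(record)
        flags.append(flagged)
        drift = monitor.update(flagged) or drift
    return flags, drift


# Illustrative amounts only.
normal = [{"amount": a} for a in [50, 2000, 70, 90]]
shifted = normal + [{"amount": a} for a in [5000, 3000, 4000]]

flags_normal, drift_normal = process(normal, DriftMonitor(0.25, window=4, tolerance=0.3))
flags_shifted, drift_shifted = process(shifted, DriftMonitor(0.25, window=4, tolerance=0.3))
```

    In a closed-loop setup, a drift signal like `drift_shifted` would be the trigger for rebuilding the model from fresh warehouse data and swapping it into the scoring path.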
  • IBM enables you to get started with your education or project in multiple ways:
    Accelerated Discovery Lab - collaborative workspace & infrastructure at Almaden Research Facility for organizations that want to use some of the best minds in computer science to gain insights from their information sources.
    Analytic Solution Centers - 9 centers around the world to give quick and easy access to a range of advanced analytics solutions, resources and IBM expertise.
    IBM Experts - 9000+ consultants and the experience from 30,000+ analytics-driven client engagements across seventeen industries and over 170 countries.
    Academic Initiative - IBM has forged 1,000+ academic partnerships with leading universities to prepare students for the expanding scope of careers. Numerous Big Data & Analytics books have been written by IBM thought leaders
    Ecosystem - IBM has 2500+ business partners across Big Data & Analytics. They are extending the reach of our platform with unique value-add solutions.
    IBM AnalyticsZone - technology downloads, information sharing and other resources to help in the analytics journey. With 40K+ members, it’s the world’s leading social network dedicated to business analytics.
    IBM Big Data & Analytics Hub - content from 100+ contributors (analysts, industry luminaries and IBM SMEs.) 35K monthly visitors for live shows, videos, animations, blogs, infographics and more.
    IBM BigData University – 120K people have registered, and it is one of the fastest-growing free education sites on big data.
    Transition: Big Data & Analytics is one of the foremost ways an organization can become a Smarter Enterprise.

    1. © 2011 IBM Corporation IBM Big Data Value Proposition for ISV Partners. Niu Bai, Ph.D., Worldwide Big Data Partner Ecosystem Development Lead
    2. © 2011 BW @ IBM Corporation Acknowledgements and Disclaimers © Copyright IBM Corporation 2012. All rights reserved. U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM This material is copyrighted and should not be used without direct permission from IBM or the author of this content. IBM, the IBM logo,, InfoSphere Brand, InfoSphere Streams and InfoSphere BigInsights are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at Other company, product, or service names may be trademarks or service marks of others. Availability. References in this presentation to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. The workshops, sessions and materials have been prepared by IBM or the session speakers and reflect their own views. They are provided for informational purposes only, and are neither intended to, nor shall have the effect of being, legal or other guidance or advice to any participant. While efforts were made to verify the completeness and accuracy of the information contained in this presentation, it is provided AS-IS without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this presentation or any other materials. 
Nothing contained in this presentation is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software. All customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics may vary by customer. Nothing contained in these materials is intended to, nor shall have the effect of, stating or implying that any activities undertaken by you will result in any specific sales, revenue growth or other results.
    3. Clients Need Both to Be Successful “…Every two days, we now generate the equivalent of all of the data that existed up to 2003. And thanks to advanced computation and analytics, we have the tools to turn that data into insight, knowledge and better decisions…” “Competitive Advantage in the Era of Smart,” Ginni Rometty, Chairman, President and CEO, IBM, Council on Foreign Relations, NYC, March 7, 2013. Analytics: • inform decisions • optimize processes • drive new business models. Big Data: • explosion of data • new types of sources • moving at high speed
    4. Big Data Is All Data • Volume: Data at Scale • Variety: Data in Many Forms • Velocity: Data in Motion • Veracity: Data Uncertainty
    5. The Opportunities from Big Data & Analytics Are Infinite • 30% reduction in heating bills • 15 min response time to requests • 150% revenue growth rate • 95% accuracy of 18+ month sales forecasts • 80% less time required to open an account • 98.5% on-time delivery target achieved • 70% counterparty measurements changed • 80% reduction in serious accidents
    6. 6. © 2011 BW @ IBM Corporation Why Act Now? Create IT AgilityManage RiskOutperform Only 1 in 5 organizations allocate more than 50% of IT budget to new projects Of leaders cite growth as the key source of value from analytics Source: 1 - IBM IBV Study: Analytics: A blueprint for value, October 2013 2 - IBM Global Study on the Economic Impact of IT Risk, 2013 3 - IBM Global Data Center Study, 2012 Of respondents were impacted by a cyber security breach over the past 24 months 46%75% 1in5
    7. 7. © 2011 BW @ IBM Corporation Three Key Imperatives for Big Data & Analytics Success • Invest in a big data & analytics platform • Be proactive about privacy, security and governance • Build a culture that infuses analytics everywhere Imagine It. Realize It. Trust It.
    8. 8. © 2011 BW @ IBM Corporation Watson Foundations • IBM Decision Management • IBM Content Analytics • IBM Planning & Forecasting • IBM Discovery & Exploration • IBM BI & Predictive Analytics • IBM Content Management • IBM Hadoop System • IBM Stream Computing • IBM Data Management & Data Warehouse • IBM Information Integration & Governance Leading the new era of big data and analytics
    9. 9. © 2011 BW @ IBM Corporation Systems Security On premise, Cloud, As a service Storage IBM Watson Foundations IBM Big Data & Analytics Infrastructure New/Enhanced Applications All Data Real-time analytics zone Enterprise warehouse data mart and analytic appliances zone Information governance zone Exploration, landing and archive zone Information ingestion and operational information zone What could happen? Predictive analytics and modeling What action should I take? Decision management What is happening? Discovery and exploration Why did it happen? Reporting, analysis, content analytics Cognitive Fabric A New Foundation for leveraging all analytics and harnessing for data
    10. 10. © 2011 BW @ IBM Corporation The new era of IBM Big Data & Analytics with Watson Foundations • Unique – fuels journey to Cognitive • Innovative – easy to consume • Complete – enterprise-ready • Fast – start anywhere and grow SOLUTIONS: Sales, Marketing, Finance, Operations, HR, Risk, IT, Fraud, Industry Solutions CONSULTING AND IMPLEMENTATION SERVICES WATSON FOUNDATIONS: Decision Management, Planning & Forecasting, Discovery & Exploration, Business Intelligence & Predictive Analytics, Content Analytics, Information Integration & Governance, Data Mgmt & Warehouse, Hadoop System, Stream Computing, Content Management BIG DATA & ANALYTICS INFRASTRUCTURE
    11. 11. © 2011 BW @ IBM Corporation Watson Foundations uniquely… Helps me discover fresh insights • Predictive and content analytics to uncover patterns not yet known • Interactive exploration across all data Operates in a timely fashion • Real-time analytics as data flows through an organization • Enterprise-class Hadoop that runs 4x faster • In-memory computing for speed of thought analytics Establishes trust so I can act with confidence • Governance across complete data lifecycle including Hadoop • Security and privacy with compliance • Transparency and context to decision-making process WATSON FOUNDATIONS: Decision Management, Planning & Forecasting, Discovery & Exploration, Business Intelligence & Predictive Analytics, Content Analytics, Information Integration & Governance, Data Mgmt & Warehouse, Hadoop System, Stream Computing, Content Management
    12. 12. © 2011 BW @ IBM Corporation InfoSphere BigInsights: Provides Enterprise Grade Hadoop analytics  Manages a wide variety and huge volume of data  Augments open source Hadoop with enterprise capabilities – Visualization & Exploration – Development tools – Advanced Engines – Connectors – Workload Optimization – Enterprise integration – Analytic Accelerators – Application and industry accelerators – Administration & Security Accelerators Information Integration & Governance Data Warehouse Stream Computing Hadoop System DiscoveryApplication Development Systems Management Data Media Content Machine Social BIG DATA PLATFORM © 2013 IBM Corporation
    13. 13. © 2011 BW @ IBM Corporation Key Differentiators for BigInsights: It is much more than just Hadoop! Enterprise Performance & Integration: • Workload / performance optimization • GPFS • Security • Key integrations & connectors with the enterprise ecosystem Analytics: • Text analytics • Social Data Analytics Accelerators • Machine Data Analytics Accelerators • Execute R in an integrated application Usability: • Big SQL • BigSheets • Development Tools • Web Console
    14. 14. © 2011 BW @ IBM Corporation InfoSphere BigInsights InfoSphere BigInsights Administration & Security Workload Optimization (MapReduce/SQL) Connectors Development Tools IBM tested & supported open source components Accelerators Open source based components Workload Management Security Development Environment Analytics/Extractors Analytics Extraction engine (System T) Visualization & Exploration Extractors and APIs SQL API
    15. 15. © 2011 BW @ IBM Corporation IBM InfoSphere BigInsights for Hadoop Layers: Runtime, File System, Data Store, Resource Management & Administration, Security, Data Access, Advanced Analytics, Visualization & Ad Hoc Analytics, Applications & Development, Governance, Stream Computing, Search Components (legend: IBM / Open Source): MapReduce, HBase, HDFS, Text Analytics, R, Big R, Kerberos, Audit & History, GPFS FPO, Adaptive MapReduce, Console, Monitoring, LDAP, Data Security for Hadoop, Data Privacy for Hadoop, Data Matching, Data Masking, Streams, Enterprise Search, Solr/Lucene, Jaql, Pig, Hive, ZooKeeper, Oozie, Big SQL, Flexible Scheduler, ETL, BigSheets, Dashboard, Charting, Eclipse Tooling (MapReduce, Hive, Jaql, Pig, Big SQL, AQL), BigSheets Reader and Macro, Text Analytics Extractors, Flume, Sqoop, HCatalog, YARN* (* In Beta) InfoSphere BigInsights for Hadoop includes the latest Open Source components, enhanced by enterprise components
    16. 16. © 2011 BW @ IBM Corporation From Getting Started to Enterprise Deployment: Different BigInsights Editions for Varying Needs (axes: Breadth of capabilities, Enterprise class) • Apache Hadoop • Quick Start: Free. Non-production. Same features as Standard Edition plus text analytics and Big R • Standard Edition: Spreadsheet-style tool, Web console, Dashboards, Pre-built applications, Eclipse tooling, RDBMS connectivity, Big SQL, Monitoring and alerts, Platform enhancements, . . . • Enterprise Edition: Accelerators, GPFS – FPO, Adaptive MapReduce, Text analytics, Enterprise Integration, Big R, InfoSphere Streams*, Watson Explorer*, Cognos BI*, Data Click*, . . . * Limited use license
    17. 17. © 2011 BW @ IBM Corporation Enterprise Integration with Multiple Products Brings the Power of the Big Data Platform to BigInsights Predictive Analytics, BI & Reporting, Visualization & Discovery, Operational Warehouse Zone, Analytics Warehouse Zone, Hadoop Zone (Preprocessing, Queryable Archive, Ad Hoc Analysis), Information Integration and Governance (Integration, Master Data, Governance), Custom Applications, Structured, Semi-Structured, Unstructured, Hadoop Analytics & Visualization, Real-time Analytics Zone • InfoSphere Data Explorer (BUNDLED): Indexing and “on the glass” integration • InfoSphere Streams (BUNDLED): Enables real-time, continuous analysis of data on the fly • InfoSphere Guardium: Auditing • InfoSphere DataStage: Data collection and integration • Cognos BI (BUNDLED): Support for Hive; BI capabilities • Netezza: Query & join data using UDFs • DB2 and JDBC: Hi-speed parallel read-write for DB2, JDBC connectivity • R: Execute R jobs directly from BigInsights web console • InfoSphere Optim: Data archiving, data masking
    18. 18. © 2011 BW @ IBM Corporation What’s New in BigInsights v3.0 – features and benefits • Rich SQL language support over Hadoop: Comprehensive SQL language support that runs ALL 99 TPC-DS queries and ALL 22 TPC-H queries without modification • Industry-leading SQL processing performance over Hadoop data: With Big SQL, execute queries 20 times faster, on average, than Apache Hive 0.12, with performance improvements ranging up to 70 times faster • Federated query: Query multiple sources at once by combining data from many data sources, including DB2 for Linux, UNIX and Windows database software, PDA, PDOA, Teradata, Oracle, etc. • Enhanced BigSheets analysis and visualization tool: Integration with SQL and Hive for consistency between tables and sheets; D3 (Data Driven Documents) charts for quick visualization of ad hoc analytics, enabling business analysts and data scientists to get insight from big data without coding • Secure and protect sensitive information in Hadoop: Enable Kerberos security in Hadoop to establish service-to-service authentication that increases security strength to prevent man-in-the-middle attacks
    19. 19. © 2011 BW @ IBM Corporation Big SQL 3.0 – At a Glance
    20. 20. © 2011 BW @ IBM Corporation Transforming Email Marketing Campaign Effectiveness with IBM Big Data Capabilities • InfoSphere BigInsights, IBM PureData for Analytics – powered by Netezza technology, Cognos BI Need • Constant Contact needed to analyze 35 billion annual emails to guide customers on best dates and times to send emails for maximum response Benefits • 40 times improvement in analysis performance • 15-25% performance increase in customer email campaigns • Analysis time reduced from hours to seconds
    21. 21. © 2011 BW @ IBM Corporation Teikoku Databank (Banking, Financial Markets) – Cuts time needed to process billions of textual data items from several days to 30 minutes • Analyzes 4.75-fold more data, enhancing corporate credit offering to customers • Accelerates processing of textual data, speeding delivery of information to clients • Differentiates the company, providing a significant competitive advantage “With IBM InfoSphere BigInsights, it has become possible to process billions of items of textual data in 30 minutes.” (Mr. Satoshi Kitajima, an MBA Statistician in the SPECIA Team of the Business Analytics Division of the Market and Business Intelligence Department, Teikoku Databank) The transformation: By effectively combining proprietary data with big data from the Internet, the company maximizes utilization of the available data sets to deliver more detailed information to customers. Solution components Software • IBM® InfoSphere® BigInsights™
    22. 22. © 2011 BW @ IBM Corporation Vestas optimizes capital investments based on 2.5 Petabytes of information. • Model the weather to optimize placement of turbines, maximizing power generation and longevity. • Reduce time required to identify placement of turbine from weeks to hours. • Incorporate 2.5 PB of structured and semi-structured information flows. • Data volume expected to grow to 6 PB. Capabilities Utilized: InfoSphere BigInsights InfoSphere Warehouse
    23. 23. © 2011 BW @ IBM Corporation Large European University generates own energy and uses analytics to monitor and manage consumption Need • After years of 8-digit electric bills, the university deployed an independent on-campus power generation system, but it lacked a solution to monitor, analyze, and manage production and consumption. Benefits • Anticipate lower energy consumption levels and costs • Ability to identify energy inefficient areas of campus and take corrective action • Improved understanding of how changes in power grid model affect energy efficiency Capabilities Utilized: Cognos BI, SPSS, InfoSphere BigInsights, InfoSphere Warehouse, Tivoli Energy Management
    24. 24. © 2011 BW @ IBM Corporation For Enterprise Big Data, InfoSphere BigInsights is the Clear Choice • Faster: Performance enhancements & workload optimizations resulting in faster answers • Smarter: Unique analytic engines that get more accurate results • Easier: Built-in development environment and administration consoles let your existing staff and skills make use of Hadoop • Secure: Enterprise-class security to protect your big data • Plugged-in: Pre-integrated with your existing enterprise IT systems, ensuring that big data doesn't become a silo
    25. 25. © 2011 BW @ IBM Corporation 25 InfoSphere Streams - Streaming Analytics for Big Data • Built to analyze data in motion – Multiple concurrent input streams – Massive scalability • Process and analyze a variety of data – Structured, unstructured, video, audio – Advanced analytic operators • Enables Adaptive Real-Time Analytics – With Data Warehouses – With Hadoop Systems Accelerators Information Integration & Governance Data Warehouse Stream Computing Hadoop System DiscoveryApplication Development Systems Management Data Media Content Machine Social BIG DATA PLATFORM
    26. 26. © 2011 BW @ IBM Corporation 26 Millions of events per second Microsecond Latency Traditional / Non-traditional data sources Real time delivery Powerful Analytics Algorithmic Trading Telco Churn Prediction Smart Grid Cyber Security Government / Law enforcement ICU Monitoring Environment Monitoring Volume Terabytes per second Petabytes per day Variety All kinds of data All kinds of analytics Velocity Insights in microseconds InfoSphere Streams Delivers Analytics for Big Data In-Motion Example Streaming Data Sources: Video, audio, networks, social media
    27. 27. © 2011 BW @ IBM Corporation 27 Massively scalable stream analytics Linear Scalability • Clustered deployments – unlimited scalability Automated Deployment • Automatically optimize operator deployment across nodes Performance Optimization • Parallel & pipeline operations • Efficient multi-threading Analytics on Streaming Data • Analytic accelerators for a variety of data types • Optimized for real-time performance Visualization Streams Runtime Deployments Sink Adapters Analytic Operators Source Adapters Automated and Optimized Deployment Streaming Data Sources Streams Studio IDE
    28. 28. © 2011 BW @ IBM Corporation Big Data in Real Time with InfoSphere Streams • Modify • Filter / Sample • Classify • Fuse • Annotate • Score • Windowed Aggregates • Analyze
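Two of the operations named on this slide, Filter and Windowed Aggregates, can be illustrated with a minimal sketch. This is plain Python over an in-memory list, not InfoSphere Streams or its SPL language; the function names `filter_readings` and `sliding_average`, the thresholds, and the sample values are all hypothetical and chosen only for illustration.

```python
from collections import deque
from statistics import mean
from typing import Iterable, Iterator


def filter_readings(readings: Iterable[float], low: float, high: float) -> Iterator[float]:
    """Filter step: pass through only readings inside [low, high]."""
    for value in readings:
        if low <= value <= high:
            yield value


def sliding_average(readings: Iterable[float], size: int) -> Iterator[float]:
    """Windowed aggregate: once `size` readings have arrived, emit the mean
    of the most recent `size` readings, sliding forward one reading at a time."""
    window: deque = deque(maxlen=size)  # oldest reading drops out automatically
    for value in readings:
        window.append(value)
        if len(window) == size:
            yield mean(window)


if __name__ == "__main__":
    # Hypothetical sensor feed containing two out-of-range values.
    raw = [10.0, 12.0, 999.0, 14.0, 16.0, -5.0, 18.0]
    cleaned = filter_readings(raw, low=0.0, high=100.0)
    print(list(sliding_average(cleaned, size=3)))
```

Both steps are generators, so each reading is handled as it arrives and only the current window is kept in memory, which is the general idea behind analyzing data in motion rather than at rest.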
    29. 29. © 2011 BW @ IBM Corporation IBM InfoSphere Streams Scale-out Architecture: • Clustered runtime for near-limitless capacity • Large scale deployment • RHEL v5.3 and above • CentOS v6.0 and above • X86 & Power multicore HW • SUSE Linux Enterprise Server 11.2 and above • InfiniBand support • Ethernet support Comprehensive Development Tools: • Eclipse IDE • Web console • Drag & drop editor • Instance graph • Streams visualization • Streams debugger • Java improvements • Mapped operators Sophisticated Analytics with Toolkits & Accelerators: • CEP, Database, Data Explorer, DataStage, Finance, SPSS, R, Geospatial, Internet, Mining, Messaging with JMS adapter, Standard, Text, Time Series Toolkits • Telco & Social Data Accelerators Front Office 3.0 Items in blue are new in V3.1
    30. 30. © 2011 BW @ IBM Corporation 30 © 2013 IBM Corporation30 What’s new in v3.1 Analytics! Performance! Integration! R analytics • Data analysis with statistical, mining and modeling capabilities • Extends family of analytics options with Streams • Opens new market & fosters BigInsights/Netezza Synergy New Time Series Toolkit operators • Incremental interpolation: replace missing values • Distribution: compute quartiles, median Improved Performance • Java enhancements yield faster throughput and reduced resources for fused Java operators • For operations with lists and maps, 2 to 10x improvements Eliminate barriers to integration and adoption with new support for: • SUSE Linux Enterprise Server v11.2 • JMS (Java Messaging Services) • Teradata & Asterdata
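The two Time Series Toolkit additions named on this slide, replacing missing values by interpolation and computing quartiles and the median, can be sketched in plain Python. This is not the toolkit's API: `fill_missing` is a hypothetical helper, linear interpolation is an assumed reading of "incremental interpolation", and the quartile method (`inclusive`) is one of several common conventions.

```python
from statistics import median, quantiles
from typing import Iterable, Iterator, Optional


def fill_missing(samples: Iterable[Optional[float]]) -> Iterator[float]:
    """Replace None gaps by linear interpolation between known neighbours.

    Gaps are held back until the next known value arrives. A leading gap is
    back-filled with the first known value; a trailing gap repeats the last one.
    """
    last: Optional[float] = None
    gap = 0
    for value in samples:
        if value is None:
            gap += 1
            continue
        if last is None:
            for _ in range(gap):          # leading gap: no earlier neighbour
                yield value
        else:
            step = (value - last) / (gap + 1)
            for i in range(1, gap + 1):   # evenly spaced fill-in values
                yield last + step * i
        yield value
        last, gap = value, 0
    if last is not None:
        for _ in range(gap):              # trailing gap: no later neighbour
            yield last


if __name__ == "__main__":
    series = list(fill_missing([1.0, None, None, 4.0, None, 6.0]))
    print(series)
    print(quantiles(series, n=4, method="inclusive"))  # three quartile cut points
    print(median(series))
```

A streaming implementation would typically bound how long it waits for the next known value; this sketch leaves that out to keep the interpolation logic visible.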
    31. 31. © 2011 BW @ IBM Corporation Streams Analyzes All Varieties of Data • Mining in Microseconds (included with Streams) • Image & Video (Open Source) • Simple & Advanced Text (included with Streams), e.g. (listen, verb), (radio, noun) • Acoustic (IBM Research) (Open Source) • Geospatial (included with Streams) • Predictive (included with Streams) • Advanced Mathematical Models (included with Streams) • Statistics (included with Streams) Blue = included with the product; Red = built for Streams and used in projects but not yet part of the product
    32. 32. © 2011 BW @ IBM Corporation 32 Dublin City Centre Increases Bus Transportation Performance • Public transportation awareness solution improves on-time performance and provides real-time bus arrival info to riders • Continuously analyzes bus location data to infer traffic conditions and predict arrivals • Collects, processes, and visualizes location data of all bus vehicles • Automatically generates transportation routes and stop locations Results: • Monitoring 600 buses across 150 routes • Analyzing 50 bus locations per second • Anticipated to Increase bus ridership Capabilities Utilized: Stream Computing
    33. 33. © 2011 BW @ IBM Corporation 33 University of Ontario Institute of Technology (UOIT) Detects Neonatal Patient Symptoms Sooner • Performing real-time analytics using physiological data from neonatal babies • Continuously correlates data from medical monitors to detect subtle changes and alert hospital staff sooner • Early warning gives caregivers the ability to proactively deal with complications Significant benefits: • Helps detect life threatening conditions up to 24 hours sooner • Lower morbidity and improved patient care Capabilities Utilized: Stream Computing “Helps detect life threatening conditions up to 24 hours sooner”
    34. 34. © 2011 BW @ IBM Corporation 34 Asian Telco reduces billing costs and improves customer satisfaction Capabilities: Stream Computing Analytic Accelerators Real-time mediation and analysis of 5B CDRs per day Data processing time reduced from 12 hrs to 1 min Hardware cost reduced to 1/8th Proactively address issues (e.g. dropped calls) impacting customer satisfaction.
    35. 35. © 2011 BW @ IBM Corporation BigInsights and Streams Are Faster to Deploy: 3 Accelerators IBM Accelerator for Telco Event Data Analytics • Telcos • Campaign management, real-time promotion, fraud detection, service assurance and network monitoring • Ships with Streams v3, but works with BigInsights or PureData for Analytics (a.k.a. Netezza) IBM Accelerator for Social Data Analytics • B2C businesses • Sample applications: Customer acquisition / retention, Customer Segmentation or Micro Segmentation, Marketing Campaign Optimization, Lead generation, Brand Management or Surveillance • Ships with BigInsights v2 and Streams v3 IBM Accelerator for Machine Data Analytics • Cross-industry: manufacturing, oil & gas, energy and utility, healthcare, travel and transportation, CPG, Retail, etc. • Operational efficiency monitoring, security incident investigation, proactive maintenance, troubleshooting, outage prevention, efficiency tracking, etc. • Ships with BigInsights v2
    36. 36. © 2011 BW @ IBM Corporation36 Without a Big Data Platform You Code… IBM Big Data Platform Multithreading Custom SQL and Scripts Performance Optimization Debug Application Management Event Handling Connectors Check Pointing Security HA Accelerators and Toolkits
    37. 37. © 2011 BW @ IBM Corporation 37 Putting it All Together …End-to-End Big Data Solution PureData System InfoSphere BigInsights IBM Cognos IBM SPSS Streaming Data Sources Discover Model Visualize & Publish Score Measure InfoSphere Streams InfoSphere Warehouse InfoSphere Data Explorer
    38. 38. © 2011 BW @ IBM Corporation Go Further and Faster with IBM Resources Accelerated Discovery Lab Ecosystem Analytics Solution Centers Expertise 9K Consultants 30K Engagements 2,500 Academic Initiative 1,000 Partnerships Business Partners
    39. 39. © 2011 BW @ IBM Corporation Value Proposition for ISVs • Obtain a solid, enterprise-ready platform to integrate with their applications or solutions • Enhance and expand their solutions by leveraging the IBM Big Data Platform, opening up new business and revenue opportunities • Joint go-to-market opportunities → access to IBM accounts they wouldn’t normally have • Make profit via an ASL or OEM agreement • Take advantage of the capabilities of Streams and BigInsights for scalability, security, analytics, and integration with the rest of the information agenda infrastructure • Leverage IBM’s partner ecosystem of ISV and System Integration partners to quickly expand business globally
    40. 40. © 2011 BW @ IBM Corporation40 Big Data Micro Site – Partner Profiles  Main Microsite Link –  Microsite Click Through To BP Detailed Page –
    41. 41. © 2011 BW @ IBM Corporation IBM Big Data Partner Program Enrollment Process 1) Become a member of PartnerWorld, if you are not already – 2) Send a “letter of intent” via email to Niu Bai, stating that you intend to integrate your solution with Streams and/or BigInsights. 3) Download Streams and/or BigInsights Quick Start Editions – – 4) Please fill out the partner information template and send it back to once completed → Then get exposure on the dedicated IBM Big Data Partner Website, connections to the IBM Big Data Sales and Ecosystem Development Teams, joint marketing, etc. 5) Attend the free Big Data bootcamps lang=en#!/wiki/Information+Management/page/Big+Data+Fundamentals+Bootcamp 6) Integrate your software solution with Streams and/or BigInsights and/or develop an offering around Streams/BigInsights –
    42. 42. © 2011 BW @ IBM Corporation42 Key Web Site Links  IBM big data Microsite –  IBM big data Linkedin Group –  IBM big data channel on YouTube –  Training/Education Link –  Big Data University –  Big Data Hub –  Key Blogs – Bruce Weed’s Blog on big data • – IBM Smarter Computing Blog •  Join the conversation on our Information Management Business Partner Blog  Follow us on Twitter: InfoMgmt Partner IBM Big Data
    43. 43. © 2011 BW @ IBM Corporation43 Thank You Merci Grazie Gracias Obrigado Danke Japanese English French Russian German Italian Spanish Brazilian Portuguese Arabic Traditional Chinese Simplified Chinese Hindi Tamil Thai Korean