"Big data" is a popular term generally used to acknowledge the exponential growth, availability and use of information (structured and unstructured). A lot has been written lately on big data trend and how it will become a key basis of competition, innovation, and growth.How does SAS define or view the term “Big Data”?Big data is a relative term (and not an absolute term) - when an organization’s ability to handle, store and analyze data (from a volume, variety and velocity perspective) exceeds its current capacity (i.e. beyond your comfort zone) then it would qualify of having a “big data” problem.
Big Data constitutes:
Volume - growing volumes of data, and how much data needs to be processed within a time window.
Variety - includes structured tables, documents, e-mail, metering data, video, image, audio, stock ticker data, and more.
Velocity - how fast data is produced and must be processed to meet demand, and the ability to respond once a problem or opportunity is detected.

A data environment can become extreme along any one of these dimensions, or along two or all three of them at once. Hence it is important to determine and evaluate the "relevant" data to answer your complex set of questions before those questions become obsolete. SAS's role here: help determine what is relevant and what is not! In any given situation - whether you are looking at pole-top transformers, coal-fired turbines, oil drilling equipment 6,000 feet below sea level, or coupon redemption rates at the local grocery store - big data in and of itself is not that interesting. Good data management practices are the answer to managing big data. But the only way you can leverage "big data" for valuable insights is by using game-changing analytics from SAS. It is an opportunity for you to excel in your market, or a ball and chain to hold you back if you don't embrace it effectively.

** Note how the values are relative and vary by customer **
** Share examples of yours: global oil and gas company; marketing analytics; service provider to manufacturing, CPG, and retail **

TRANSITION - so how do you thrive in big data? A solid process, and leveraging the right technology: ANALYTICS, ANALYTICS, ANALYTICS.

Text below is from the Jim Davis analytics video on YouTube:
There is an overwhelming amount of data today, and different types of data - data structured in databases, and then unstructured data like voice, video and text. Some call it the data deluge; others say we are drowning in data.
We don't need to look at it that way. Look at data as opportunity. Now, we may be comfortable making decisions based on gut feel, but that's not going to cut it [in the smart grid era]. The stakes are much higher now. We've got to make decisions based on facts. How do we do that? Easy: analytics can and should be the differentiator.
1.0 Fundamental set or types of analytics - these are core to our business and our analytical applications.
2.0 Customers use a combination of analytical techniques - for example, data mining and text mining.
3.0 On the front end, data management is important because end users spend a lot of time and effort preparing data for analytics.
4.0 On the downstream end, sharing analytical insights through easy-to-use visualization/BI tools is important.
5.0 Integrated set of components.
SAS High-Performance Analytics is delivered as a pre-configured analytics appliance. It includes analytical capabilities spanning data exploration, modeling and scoring from SAS, delivered on either a Teradata or EMC Greenplum database appliance, to solve complex problems in a highly scalable, distributed environment using in-memory analytics processing. It lets customers develop and deploy analytical models using complete data - not just a subset or aggregate - to get accurate and timely insights and make well-informed decisions. It does not limit analytic professionals to simplified analytical methods for solving complex problems. It compresses the time from model inception to model deployment, so organizations can derive rapid insights and make well-informed decisions before the questions become obsolete!

SAS High-Performance Analytics will include a select set of procedures from the following SAS products: Base SAS, SAS/STAT, SAS/ETS, and SAS Enterprise Miner. A SAS 9.3 client interface manages the submission of high-performance-enabled problems to the compute grid (appliance) for execution.
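The distributed pattern described above - hold the complete data set in memory, split it across nodes, score each partition in parallel, then combine the partial results - can be sketched generically. This is a minimal Python illustration of the idea, not SAS code; the threshold "model" and the four-way split are assumptions for the example:

```python
from multiprocessing import Pool


def score_partition(rows):
    """Score one in-memory partition of records.

    Toy stand-in for a real model: count the records whose
    value exceeds a fixed threshold.
    """
    return sum(1 for value in rows if value > 0.5)


if __name__ == "__main__":
    # Full data set held in memory, split across 4 workers --
    # a stand-in for distributing work across appliance nodes.
    data = [i / 1000 for i in range(1000)]
    partitions = [data[i::4] for i in range(4)]
    with Pool(4) as pool:
        partial_counts = pool.map(score_partition, partitions)
    # Combining partial results gives the same answer as
    # scoring the complete data set in one pass.
    print(sum(partial_counts))
```

The key property the sketch shows is that partition-and-combine yields the same result as scoring the whole data set at once, which is what lets the work fan out across nodes without sampling or aggregating first.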
Credit risk decisions are sensitive but fundamental to banks and lending institutions. Too much credit exposure can lead to high default rates and charge-offs; not enough often means lost business and revenue. Accurate and timely decisions on accepting an applicant (application scoring), the likelihood of default among customers who have already been accepted (behavioral scoring), and the likely amount of debt the lender can expect to recover (collection scoring) can easily differentiate a bank as a leader or a laggard in the market. Early detection of high-risk accounts (e.g., for cards, residential mortgages, commercial loans) is critical to perform targeted interventions, reduce bad debt and reduce overall losses. The need to understand behavior and credit risk exposure at the "customer" level - across all touch points the customer has with the bank and across all lifestyle changes (i.e., the time dimension) - takes a toll on analytic professionals. Banks and lending organizations need to build a greater number of segment-specific models for a variety of purposes, calculate the probability of default (PD), as an example, on the loans they service, and determine when and whether a borrower is migrating to a riskier pool.

Benefits: SAS High-Performance Analytics helps incorporate large volumes of data - with no limits on the number of observations, variables or attributes - for accurate determination of the likelihood of default and for loss forecasting. In addition, it allows banks to adjust historical transition probabilities based on changes in interest rates (or other macroeconomic factors) and hedge these risks effectively. SAS High-Performance Analytics compresses the entire model lifecycle from days or hours to minutes or seconds. Banks will be able to enhance their credit risk models to reflect real-world assumptions and include variables across dimensions.
It offers analytic professionals the flexibility to test multiple scenarios or new ideas, use the best modeling techniques, and perform model iterations more frequently to accurately and quickly identify risks at the individual portfolio level and take targeted actions, so the bank stays ahead of the market.
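To make the scoring idea above concrete: application or behavioral scoring often boils down to estimating a probability of default (PD) from applicant attributes, commonly with a logistic model. The sketch below is a generic Python illustration, not a SAS procedure; the coefficients, intercept, and variable names are hypothetical, standing in for values a fitted model would produce:

```python
import math


def pd_score(features, coefficients, intercept):
    """Probability of default from a logistic model:
    PD = 1 / (1 + exp(-(intercept + sum(coef_i * x_i)))).
    """
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical fitted coefficients for three illustrative attributes:
# credit utilization, years on book, and a past-delinquency flag.
coefs = [0.8, -0.05, 1.2]
intercept = -3.0

# An applicant with high utilization, a short history, and a past delinquency.
applicant = [0.9, 2.0, 1.0]
pd_value = pd_score(applicant, coefs, intercept)
print(f"Estimated PD: {pd_value:.3f}")  # Estimated PD: 0.235
```

A threshold on the resulting PD (or a score derived from it) then drives the accept/reject or intervention decision for each account.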
The traditional analytical process is time consuming and inefficient. It simply takes a long time (days) to finish the data preparation/exploration, model development and model deployment steps. More specifically, what if you:
- Have problems that you can't solve because your data volumes are too big or beyond the capacity of your existing systems.
- Have too many records in your data, or too many attributes or variables that need to be incorporated in the modeling process.
- Need to apply predictive analytics at a more granular level - for example, modeling churn at the customer level instead of the branch level, predicting parts failure for an entire product line, or modeling propensity to buy at the account level.
- Require a massive variable selection step, which necessitates sorting through thousands of variables to determine which are the most predictive.
- Do not want to compromise by using sub-optimal modeling techniques.
- Cannot quickly test or experiment with different modeling techniques to find the best fit and improve accuracy.
- Fail to get the desired modeling results and want to modify the model with new attributes, but do not have time to wait.
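The "massive variable selection" step described above can be sketched as a simple univariate filter: score each candidate variable by the strength of its relationship with the target and keep only the top few. This is a generic Python illustration (using plain Pearson correlation on toy data), not the SAS selection procedure itself; the variable names are invented for the example:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient, stdlib only."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def select_top_variables(candidates, target, k):
    """Rank candidate variables by |correlation| with the target
    and return the k most predictive ones."""
    scores = {name: abs(pearson_r(values, target))
              for name, values in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data: 'util' and 'tenure' track the default flag; 'noise' does not.
target = [0, 0, 1, 1, 1, 0]
candidates = {
    "util":   [0.1, 0.2, 0.9, 0.8, 0.7, 0.3],
    "tenure": [5, 4, 1, 2, 1, 6],
    "noise":  [0.5, 0.1, 0.4, 0.5, 0.2, 0.3],
}
print(select_top_variables(candidates, target, 2))  # ['util', 'tenure']
```

With thousands of variables instead of three, this filtering pass is exactly the kind of embarrassingly parallel work that benefits from being distributed, which is the pain point high-performance analytics targets.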