Architecting Big Data for the Enterprise
 

  • You heard in the earlier presentations about the power of bringing together the relational and non-relational worlds into a single analytic environment. What we’re going to focus on in this presentation is how to build a data management layer to support that, so that you have a way to bring in all the data you need and make it available within your enterprise.
  • Todd Laurence from Cloudera…
  • Specifically, we’re going to talk about how to take the data warehouse that most, if not all, of you are already using to run your business, and then show how to add to it an integrated data reservoir that will enable you to bring in and manage all of the new variety of data that people think of as big data. Why not just stick it all in your data warehouse? The simplest way I’ve heard to illustrate that is to think of the airline industry, but instead of passengers, we’ll talk about data. Just like passengers, all data wants to fly first class. That’s the data warehouse, with all that it offers. But realistically, there’s lots of data that will do just fine back there in coach. Some of it will get an upgrade, but most of it won’t. We can also go back to those examples of data reservoirs in the earlier presentations. There was the financial institution that needed access to all of their data, which was held in a variety of different sources with varying degrees of accessibility. They needed to run stress tests required by financial regulators, but they just didn’t have enough data to hand to do that properly and efficiently. By building a data reservoir and linking it to their existing Exadata data warehouse, which is where the stress tests are effectively run, they can now meet current needs, and they are also well positioned for the future. And then there was the mobile provider that Cloudera just talked about. They wanted to combine their existing customer information with a flood of new data about mobile device usage. Both of them needed to enhance their existing information architecture to incorporate Hadoop. In only half an hour we don’t have time for an in-depth look at everything you might need, but we want to give some insight into the things these customers had to take account of, so that you can use it as you consider your own data reservoir.
  • We need an integrated data reservoir and data warehouse in order to make more productive use of all the data that’s available inside, or in some cases outside, the organization. In other words, we need to shrink the gap mentioned in the keynote between the data we produce and the data we’re actually able to use productively. There are three main problems we want to tackle. We’ll start with how to capture all of that data effectively, so that we stand a fighting chance of being able to use it all; we’ll look at what’s behind building a Hadoop cluster and getting one up and running quickly and cost-effectively. Then we’ll move on to how to do efficient analysis of it all. Ultimately, you gather data because you are looking to gain some insight from it, now or at some point in the future. Remember that both of those other customers needed to do an integrated analysis. It’s not enough to analyze things in their silos. You need, for example, to look at customer clicks on a website (a classic use case for Hadoop) alongside their purchasing records, which would be kept in your existing data warehouse. So “analyze all the data” really means analyze it all together, not in isolated or separate chunks. This is where we’ll talk about integrating Hadoop and your data warehouse so that they start to look more like a single platform and less like chalk and cheese. And finally, we’ll talk about securing the overall platform. That hasn’t always been a concern with Hadoop, or even a focus, but it’s growing increasingly important. Even if the data’s flying in coach, you still need to keep it secure. So let’s get started with building a Hadoop cluster.
  • Hadoop is open-source software designed to run on commodity servers, so many people assume that building a Hadoop cluster is as simple as finding a few spare servers and downloading some free software. And it’s true that may give you a Hadoop cluster, but it’s probably not one that you would want to run in a production environment in your organization. Building an enterprise-grade Hadoop cluster is a little more complex than that. We asked an analyst firm called Enterprise Strategy Group to look at the cost and complexity of building a production-grade, enterprise-ready Hadoop cluster. Here’s what they found. You have to spend time on hardware design: getting the right servers, the right balance between disk and RAM, and of course a network that is fast enough. You have to find the right software, and there are many components to the Hadoop ecosystem, and then install it. Then comes what for most people is the hardest part: optimizing it, making sure Hadoop runs optimally on your specific hardware and network configuration. Oh yes, and you need to plan to support this combination of components from multiple vendors. In the early days, teams of developers, hardware experts, and network engineers would design the system: identify the best CPU/disk/memory ratios, engineer redundancy across the key components, procure the components from their server and networking vendors, identify the key software (the OS, JVM, Hadoop distribution, and NoSQL database), rack, stack, and network the servers once they arrived, install and configure the software across the cluster, and finally tune the hundreds of configuration settings in Hadoop, Java, and the OS to ensure their workloads ran in a performant way. So on the left-hand side you can see the costs associated with building a single-rack Hadoop cluster. It’s a little over US$700,000, and that number includes everything you need: design, hardware, software, installation, and of course support. It’s a lot more complex than most people assume. If this is something you’re looking at doing, please take a look at the white paper to make sure you properly estimate the magnitude of the task.
  • We also asked Enterprise Strategy Group to contrast Oracle Big Data Appliance with a DIY cluster, and you can see the results. The appliance comes as a fully built rack, with all software installed and tuned, hardware and network properly balanced, everything fully cabled and working, and of course fully supported as an integrated system. The purchase cost for this integrated appliance came out significantly lower than building it yourself, and they also found big improvements in time to value, operations, and overall performance. I should note that since the white paper was published, specifications and pricing for both a DIY cluster and Oracle Big Data Appliance have changed, but the overall model comes to the same conclusion. Now, one word of caution. I’ve been to lots of trade shows, and at one show in particular almost every person who walked past the booth and engaged in conversation said something like this: “Yes, I know about your appliance. But I can build a cluster for half that price, so I’m not interested.” We’d done the investigation, so we knew that wasn’t correct, but everybody had the same idea. After asking some questions, it turned out that, for that event at least, there was a common misconception. For everybody there, a Hadoop cluster meant “let’s find the cheapest pizza box we can get our hands on and stick 40 of them in a rack”. Setting aside for the moment the small issue of switches (they were all assuming switch costs are just in the noise), the main issue here is the capacity of the cluster. Today, most Hadoop systems are constrained more by disk space than by CPU power. There are a few algorithms that go the other way, but for the most part the key characteristic is “how much disk space”. And when I inquired more carefully, it turned out that those cheap pizza boxes just didn’t have a lot of disk in them; that was one of the reasons they were so cheap, I suppose. So to build a cluster with the same amount of disk space, they would have needed nearly two racks, and in one case two and a half racks, of kit. All of a sudden, their DIY clusters weren’t so cost effective. The main takeaway, then, is that as you look at building a Hadoop cluster, be sure to compare like with like. The Big Data Appliance has 864 TB of raw disk space; if you want to do your own comparison, make sure your cluster has similar capacity. The quick arithmetic below makes the point.
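    To make the like-for-like comparison concrete, using assumed, purely illustrative figures for the commodity nodes: at 8 TB of raw disk per node, matching 864 TB takes 864 / 8 = 108 nodes, roughly 2.7 racks of 40; even at 12 TB per node it is 864 / 12 = 72 nodes, still nearly two full racks. The exact numbers depend on the boxes you price, but the shape of the calculation is the same.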
  • So cost savings up to 40%, and time to value reduced by a third. Your results may vary depending on discounts, existing skills, pre-paid licenses and similar factors. But overall this report shows that an integrated appliance is quite likely to be a much better alternative than building it yourself. If you want to understand how to apply this kind of model to your specific situation we would be happy to take you through the numbers to help you build a business case that works in your organization. We usually find that the BDA is a more cost effective option – it’s certainly much more cost effective than most people assume. So please keep it on your shortlist.
  • So in summary for the first problem, capturing massive volumes of new data, building a Hadoop cluster using an integrated appliance can lower the total cost of ownership for Hadoop and deliver business value faster. Let’s move on now and see how to analyze all of that data that you now have in a data reservoir alongside your data warehouse.
  • Remember my first slide: I re-emphasized the importance of finding a way to integrate non-relational and relational data in your enterprise. Our vision is to integrate these two environments as closely and efficiently as we can. We’ll start by looking at how to connect a Hadoop cluster with an Oracle data warehouse. Oracle Big Data Connectors are used to link an Oracle data warehouse with a Hadoop cluster. In this case we’re talking about Oracle Big Data Appliance, but the connectors work with other distributions of Hadoop as well; in fact, we’ve just recently announced that Intel has certified IDH to work with Big Data Connectors. The two basic capabilities we need are to be able to see data in Hadoop from the data warehouse, and to be able to move data, when needed, from Hadoop into the data warehouse. Oracle Big Data Connectors let you set up an external table, which gives Oracle Database access to Hive data in Hadoop. So you can initiate and run queries in your data warehouse that make use of data stored in Hadoop; there is a small example of that kind of query below. And should you need to move any of that data out of Hadoop, the connectors enable a transfer rate of up to 15 TB an hour, taking advantage of the InfiniBand network in both engineered systems. The connectors also work on standard hardware without an InfiniBand network; plain old Ethernet will do just fine, but you won’t get quite the same transfer rate on the slower network. The rest of the stack is big-data-enabled too: Exalytics can access data in Hive, and Endeca provides information discovery over data in Hadoop. So the connectors provide some basic building blocks. Let’s move on now to analyzing the data in an integrated fashion.
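    As a concrete illustration of what that external-table access enables, here is a minimal sketch of the kind of query you could then run in the warehouse. It assumes the connector has already generated an external table over Hive click data (hypothetically named CLICKSTREAM_EXT) and that the warehouse has an ordinary SALES table; all table and column names are illustrative, not taken from the deck.

        -- Hypothetical sketch: CLICKSTREAM_EXT is an external table generated by the
        -- connector over data held in Hadoop/Hive; SALES is a normal warehouse table.
        SELECT   s.customer_id,
                 COUNT(DISTINCT c.session_id) AS web_sessions,
                 SUM(s.order_amount)          AS revenue
        FROM     sales s
        JOIN     clickstream_ext c ON c.customer_id = s.customer_id
        WHERE    s.order_date >= DATE '2013-01-01'
        GROUP BY s.customer_id
        ORDER BY revenue DESC;

    The point is simply that the data sitting in Hadoop becomes one more table you can join to, without first staging it into the warehouse.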
  • Analytics is critical; it’s almost always the reason you implement big data. And with the big data platform you are not limited to analyzing data samples: you can analyze all your data. Simpler algorithms over all your data have proven easier to implement and, frankly, more effective than more complex algorithms over samples. It’s also important to be able to analyze your data in place, or, if you prefer, to analyze all your data while minimizing the amount of data movement required. If there is one constant about big data, it’s the size, and it simply takes a lot of time to move it around. Much better to do the analysis where the data is already located. We have a rich set of analytics in both Hadoop and Oracle Database. You might be familiar with the analytics capabilities already built into Oracle Database, including text, graph, and spatial analytics. A newer addition to the database analytics portfolio is Oracle Advanced Analytics, which allows users to perform powerful predictive or statistical analysis using either SQL or the R language. SQL, of course, is what most data warehouse users are very familiar with; R is newer but growing increasingly popular in the analytics and data science communities. By supporting both languages for in-database analytics, it is easier for your analytics team to work with data that is managed in your data warehouse: they can use their language of choice (there is a small SQL example below). For some data scientists, the data warehouse is not an environment they are familiar with, and their relationship with the IT team managing that system is often adversarial rather than cooperative. Done right, Oracle Advanced Analytics solves that problem. StubHub, the online ticket marketplace, uses OAA for its analysis, and there both groups work in harmony: IT manages the data and makes it available to the data scientists, who can use R, a language they are familiar with, and no longer have to worry about moving data around and managing it, which is not their area of expertise. On Oracle Big Data Appliance we include and support a distribution of R, and another component of Oracle Big Data Connectors provides native R access to data in Hadoop and the MapReduce programming framework. So analysts familiar with R can also write their analytics for Hadoop without having to learn Java or understand the details of HDFS. This enables powerful analytics of data in Hadoop without moving the data elsewhere to a custom analytics cluster. Combined with the external table access I mentioned earlier, Oracle offers a good solution for analyzing big data while minimizing data movement.
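    To give a flavour of the SQL-facing side of in-database analytics, here is a minimal sketch of scoring customers with an already-built mining model using Oracle’s SQL prediction functions. The model name (churn_model) and the CUSTOMERS table are hypothetical; the model is assumed to have been created earlier with Oracle Advanced Analytics.

        -- Hypothetical sketch: churn_model is assumed to exist already (built with
        -- Oracle Advanced Analytics); CUSTOMERS is an ordinary warehouse table.
        SELECT cust_id,
               PREDICTION(churn_model USING *)             AS predicted_churn,
               PREDICTION_PROBABILITY(churn_model USING *) AS churn_prob
        FROM   customers
        WHERE  PREDICTION_PROBABILITY(churn_model USING *) > 0.8
        ORDER  BY churn_prob DESC;

    The same model could equally be driven from R through Oracle R Enterprise, which is the point of supporting both languages over the same data.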
  • As you run analytics in your data warehouse and Hadoop clusters, the next step is to ensure that the results of that analysis are available to all your key stakeholders and decision makers. What this means is that your BI reports and dashboards need to be able to access that new analysis. The new capabilities of Oracle Advanced Analytics in the data warehouse are of course available to BI software running on Exalytics, and OBIEE can also connect to Hadoop and pick up the results of any analysis done using R. So we have a powerful analytic environment that can handle both relational and non-relational data and make the results of that analysis available to the rest of your organization.
  • So, to summarize the data analysis problem: we have an integrated analytic environment that can handle both kinds of data; it does so efficiently by minimizing data movement; and because it relies on languages like SQL and R, skills your organization often already has, it helps expand the use of analytics within the organization. That is one of the keys to getting business value from big data: analytics need to become much more widely used within organizations. Let’s move on to the third issue we want to discuss. Hadoop has a reputation for lacking a strong security model. We want to tell you what we’re doing to address that, so you can be confident about incorporating it into your existing enterprise architecture.
  • Let’s step back a little and look at the strengths of your data warehouse environment, particularly if it is running on Oracle Exadata. You can almost take for granted high performance, the availability of SQL, a rich toolset, and lots of existing expertise in your organization. And of course, given the sensitive nature of much of the information you have to handle, there are many security features built into Oracle Database and the related database options. Hadoop, on the other hand, comes with a different set of strengths. It scales out very well, and it can store any kind of data because the schema doesn’t really matter until you read the data. It’s also very open and rapidly evolving. All of this is part of its appeal, but you’ll notice I didn’t say anything about security. Until recently, Hadoop security was something of an oxymoron.
  • We still need both platforms because they offer complementary strengths, but we really do need to address security. Let’s look at how to bring some of the security strengths of your data warehouse, and of Oracle more broadly, to bear on securing big data.
  • Big data is sometimes thought of as low-value information: things like web clicks are useful to have, but not particularly sensitive in the way that, say, the social security or credit card numbers in your data warehouse are. That is beginning to change. If you are using Hadoop to store, for example, a large volume of health insurance claims to look for trends, you are inevitably also storing information that could be used to identify patients and their medical conditions. This kind of information is highly sensitive and must be protected and audited. In this respect, the data you capture in Hadoop is often no different from the critical data you would store in your data warehouse, at least from a regulatory and security standpoint. Medical records are an obvious example, but from a privacy standpoint lots of other information, from email to location data, needs to be kept secure. Even something as basic as a thermostat is capable of giving out information about when you are not in the house, and there are doubtless some people who would like to know when a house is empty. The consequences of that data getting out may not be as painful or immediate as, say, the recent credit card breaches at some major retailers, but they still matter for reasons of law, liability, and plain good business. So you need strong security capabilities and policies in place to protect this information.
  • Here we have the four pillars of security that Oracle has been investing in to ensure that the data you store in Hadoop can be managed with appropriate security controls. These were announced last year as we released several enhancements to the Oracle BDA software that dramatically improve the security story for Hadoop. As it has been, and still is, with Oracle Database, security is an important component of the architecture. First off, we start with authentication, based on Kerberos. Basically this means we can control who has access to the cluster in the first place. Once access to the cluster is granted, we can then authorize access to specific data. Along with Cloudera engineers, Oracle engineers are founding members of and contributors to the new Apache Sentry project. Thanks to authentication we know who you are, which means we can take your identity and make sure you only have access to the data you should see; you might get access to one set of data that is completely different from your colleague’s on the other side of the room (a small illustration of that kind of grant follows below). The next step, of course, is to audit that. We have expanded the capabilities of Oracle Audit Vault and Database Firewall, so if your security policy calls for it, you now have the ability to demonstrate who accessed what data and when. Finally, we want to encrypt data as it flows through your system. Even with the overall goal of minimizing data movement, it still has to move around, and we encrypt it before it moves.
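    To make the authorization pillar a little more tangible, here is a minimal sketch of the kind of role-based grants Apache Sentry enforces through Hive. The database, table, role, and group names are illustrative only and are not taken from the deck.

        -- Hypothetical sketch of Sentry-style role-based authorization in Hive:
        -- members of the "analysts" group may read the claims table and nothing else.
        USE insurance;
        CREATE ROLE claims_analyst;
        GRANT SELECT ON TABLE claims TO ROLE claims_analyst;
        GRANT ROLE claims_analyst TO GROUP analysts;

    Because Kerberos has already established who the user is, Sentry can map that identity to a group, and the group to a role, before any data is returned.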
  • So we can now deliver enterprise grade security for all your data, not just the data that resides in your data warehouse. Of course, security is not something you do once and then you’re done. This will remain an ongoing focus. When we combine the security with the ability to capture data in the first place, and analyze it all, we have a powerful platform for bringing together relational and non-relational data.
  • What I want to finish with is a quick overview of Oracle’s overall solution.
  • [These next two slides should be thought of as one slide with a build.] As we went through those points earlier, we touched on almost all the products here. Let’s start on the left-hand side with the data reservoir running on Oracle Big Data Appliance. It comes with the Cloudera distribution and Oracle’s R distribution, as I mentioned. I didn’t cover Oracle NoSQL Database earlier, but that product is also available on the BDA; there are some use cases for new data where Hadoop is not the best choice. I did talk about Oracle Big Data Connectors, but this is the first mention of Oracle Data Integrator. There’s a specific connector for ODI, so it can see and manipulate data in HDFS, generating the underlying MapReduce code to transform and organize it. ODI can also drive the Oracle Loader for Hadoop, so you can run the “T” part of your ETL on Hadoop and then load the result into the data warehouse, all under the control of ODI; a small sketch of that Hadoop-side transform follows below. In fact, doing as much ETL work as possible on Hadoop rather than in your data warehouse is often a good division of labor. On the right-hand side there’s Oracle Exadata running Oracle Database and the database options. But you’ve also got to get the data into these systems in the first place...
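    As a small illustration of pushing the “T” down into Hadoop, a transformation like the following could run as HiveQL on the cluster (ODI can generate and orchestrate this kind of code), with the aggregated result then loaded into the warehouse by Oracle Loader for Hadoop. The table names are hypothetical, and the target table is assumed to have been created already.

        -- Hypothetical HiveQL sketch: sessionize raw web clicks on the Hadoop side
        -- before the much smaller result set is loaded into the warehouse.
        INSERT OVERWRITE TABLE click_sessions
        SELECT  customer_id,
                session_id,
                MIN(click_time) AS session_start,
                MAX(click_time) AS session_end,
                COUNT(*)        AS page_views
        FROM    raw_clicks
        GROUP BY customer_id, session_id;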
  • So I’ve added the layer that helps bring in data from external sources. I’ll highlight one of them, because Oracle Event Processing is used in situations where your data reservoir should include both Hadoop and Oracle NoSQL Database. If your new stream of data comes from sensors used to instrument a device, or to measure flow and pressure in oil pipes, then that data should pass through Oracle Event Processing and into Oracle NoSQL Database rather than Hadoop, because OEP will want to look not just at the current reading but also at the recent history before determining whether there’s a pending problem or just a one-off anomaly. And for that fast lookup you need a database, not Hadoop; a small sketch of that kind of continuous query follows below. Of course, once that data goes cold and forms part of your historical record, transferring it to Hadoop for long-term storage and analysis will probably be the best option. We can cover NoSQL in the Q&A, or we can discuss it in more detail on another occasion.
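    For the sensor scenario, the event-processing step might look conceptually like the following continuous query. Oracle Event Processing uses a SQL-like continuous query language with window clauses; the stream name, window size, and threshold here are purely illustrative assumptions, and the recent readings it draws on would be kept in Oracle NoSQL Database for fast lookup.

        -- Hypothetical CQL-style sketch: flag pipes whose average pressure over the
        -- last 10 minutes exceeds a threshold, rather than reacting to a single reading.
        SELECT   pipe_id, AVG(pressure) AS avg_pressure
        FROM     pressure_readings [RANGE 10 MINUTES]
        GROUP BY pipe_id
        HAVING   AVG(pressure) > 150;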
  • Time doesn’t permit an in-depth discussion of all of this, but the big data management layer is part of a larger overall architecture including business analytics and a growing portfolio of Oracle and partner applications that build on our foundation.
  • We probably have the most comprehensive big data portfolio in the world. Engineered systems provide the backbone of the portfolio, and we have best-of-breed technologies across the entire offering, providing a one-stop shop from data acquisition to sophisticated visualization and analytics. This slide is intended to show the massive capability of the Oracle big data portfolio. Whatever your requirements, be it acquiring more compliance-related data or improving your analytics capability for the unstructured data that forms part of your compliance requirements, you can start your journey anywhere across the portfolio and be confident that, as your requirements evolve and grow, Oracle is highly likely to provide the capabilities you need to meet your compliance challenges. Our solutions will be tailored to your requirements and budget.
  • De Persgroep is a major media group operating in Belgium and the Netherlands, with annual revenues of approximately EUR 1 billion. This was a recent BDA and Connectors win, in conjunction with a new ISV partner, NG Data, and an established Oracle implementation partner, Uptime/Cronos. We are hoping for an additional sale of a second Starter BDA over the next few months, as well as a sale of Exadata. The customer operates in the highly competitive media business and is looking for ways to gain better customer insight for competitive advantage.

Architecting Big Data for the Enterprise: Presentation Transcript

  • Big Data Enterprise Architecture Marcin Chwin Big Data at WorkIn association with
  • Constant Innovation: Big Data at Work
  • The typical BI/DW architecture in use today
  • Tomorrow's BI/DW architecture: Hadoop + the data warehouse
  • Oracle Big Data architecture: data reservoir + data warehouse
  • Closing the information gap (data available vs. data in use): capture and process large volumes of data; analyze all the data; information security
  • Initial infrastructure cost of a do-it-yourself ("home-made") Hadoop cluster, on a scale of $0k to roughly $700k: hardware design, software selection, purchase and delivery, rack and stack, software installation, Hadoop optimization; the cost spans design, hardware, software, installation, and support.¹ ¹ http://www.oracle.com/us/corporate/analystreports/industries/esg-big-data-wp-1914112.pdf
  • Initial infrastructure cost of Oracle Big Data Appliance compared with the DIY cluster.¹ Advantages over DIY clusters: initial cost, time to value, operations, performance. ¹ http://www.oracle.com/us/corporate/analystreports/industries/esg-big-data-wp-1914112.pdf
  • Oracle Big Data Appliance: 40% cost savings, 33% faster time to value. Engineered to Perform.
  • Capture and process large volumes of data: lower TCO for a Hadoop installation, faster time to value. Next: analyze all the data
  • Big Data Connectors and Data Integrator: Big Data Appliance + Hadoop to Exadata + Oracle Database, at up to 15 TB/hour (10x faster)
  • Analyzing all the data in a single environment: Big Data Appliance + Hadoop, Exadata + Oracle Database, Advanced Analytics, R
  • Making all the data available to users: Big Data Appliance + Hadoop, Exadata + Oracle Database, Endeca, OBI EE
  • Analyze all the data: minimized data movement, broader use of analytics. Next: information security
  • Strengths of the Oracle platform. Big Data Appliance + Hadoop: low-cost scalability, flexible "schema on read" data model, abstract storage model, openness, ability to change quickly. Exadata + Oracle Database: high performance, high level of security, analytics via SQL, rich toolset, familiar environment
  • How do we leverage the strengths of both platforms?
  • Securing big data: big data solutions often hold sensitive data that requires auditing and protection; in that respect it is no different from the sensitive data held in "traditional" RDBMSs
  • Enhanced big data security: user authentication with Kerberos; data access authorization with Apache Sentry; access and activity auditing with Oracle Audit Vault and Database Firewall; encryption of data as it flows through the system
  • Securing the big data platform: enterprise-grade security for all the data
  • Big Data architecture, the complete solution: the Oracle big data platform
  • Oracle Big Data architecture: data reservoir + data warehouse. Components: Cloudera Hadoop, Oracle NoSQL Database, Oracle R Distribution, Oracle Big Data Connectors, Oracle Data Integrator, Oracle Database, Oracle Advanced Analytics, Oracle Spatial & Graph, Oracle Industry Models
  • Oracle Big Data architecture with data acquisition added: the components above plus Oracle GoldenGate, Oracle Data Integrator, Oracle Event Processing, and Apache Flume feeding the reservoir and the warehouse
  • The Oracle big data platform: big data applications for industries and lines of business; business analytics (BI + discovery); big data management (data reservoir + data warehouse)
  • The Oracle big data platform (stream, acquire, organize, discover & analyze): Oracle Big Data Appliance, optimized for Hadoop, R, and NoSQL processing (Hadoop, open-source R, applications, Oracle NoSQL Database, Oracle Event Processing); Oracle Big Data Connectors and Oracle Data Integrator; Oracle Exadata, the "system of record" optimized for DW/OLTP (Oracle Database, data warehouse, in-database analytics, Oracle Advanced Analytics, Real-Time Decisions); Oracle Exalytics, optimized for analytics and in-memory workloads and embedding TimesTen (Oracle Business Intelligence Tools, Oracle Business Intelligence Applications, Oracle Enterprise Performance Management, Oracle Endeca Information Discovery)
  • Example: Customer & Brand 360. Revenue growth through improved content personalization: better customer knowledge; a unified, complete view of the customer integrating information from many sources, including social networks; marketing campaign management; personalization of delivered content; Next Best Action recommendations. A solution based on Oracle Big Data Appliance; solution components: Real-Time Decisions, Business Intelligence, Siebel CRM, Eloqua
  • Example: Customer & Brand 360. Challenges: optimize customer value through data-driven optimization of marketing activities; combine online and offline data into an integrated customer knowledge base. Benefits: six weeks from order to production deployment; capture of all digital customer interactions; an integrated customer view that enables campaigns and recommendations across multiple communication channels. Before: customer data and newspaper subscription data in the data warehouse, with online activity data unused. After: the data warehouse (customer data, newspaper subscriptions) plus Big Data Appliance (social, geo, online activity, and other new data) feeding customer insight and recommendations, campaign management, and reports and dashboards
  • Regions Bank: lowering costs by simplifying the IT infrastructure. Goals: regulatory compliance requires more data for stress tests; reduce IT costs and the redundancy of data across many repositories. Solution: a single, reliable operational data store based on BDA/Exadata serving all source systems; a repository for current and historical structured and unstructured data; an agile business model. Architecture: sources (mainframe, RDBMS, and more) feed the BDA (all data, de-normalized and partially normalized) and Exadata (normalized and aggregate data, the EDW), managed with Oracle Enterprise Manager and Oracle Data Integrator, with data delivery to SOA/API, CRM, and other consumers. Benefits: fast access to the remaining 85% of the data; lower costs, a simplified architecture, and fast time to value from the data
  • Regions Bank: lowering costs by simplifying the IT infrastructure. Challenges: reduce IT costs; comply with regulations that require more data for stress tests; bring order to data processing. Benefits: faster access to 6x more data; lower cost and a simplified architecture; delivered within a few months. Before: mainframe only. After: mainframe plus Big Data Appliance and Exadata