Splunk, the Big Data Analytics Platform!



Published in: Technology
  • Just as "googling" means searching, "splunking" is understood among IT people to mean searching IT data.
  • Splunk is not a monitoring solution but a next-generation data processing engine.
  • Many of you are familiar with SQL and writing SQL queries to relational databases. How is Splunk different from using a relational database to query for insights?
    Basically, you need to create a good schema for the data at design time (and hopefully not have to change the schema later)
    This means you also need to know what types of queries you’re likely to run against the data to design a good schema
    As for the data, you need to make sure you figure out what the tables should look like and how to convert the data to fit into the tables
  • With Splunk, there’s no need to specify a schema.
    We create structure at search time so the queries and searches can be totally ad-hoc.
    The data can come from a variety of text based sources and can continuously evolve without playing with data formats and such
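The idea of creating structure at search time can be sketched in a few lines of Python. This is a minimal illustration, not Splunk's implementation: the raw event format, the `extract_fields` helper and the `top_values` helper are all hypothetical names invented for this example. The query loosely mirrors the `eventtype=firewall accept OR allow | top src_port` search that appears in the slide transcript.

```python
import re
from collections import Counter

# Hypothetical raw firewall events; the line format is invented for illustration.
RAW_EVENTS = [
    "2012-03-01T10:00:01 action=accept src_ip=10.0.0.5 src_port=443",
    "2012-03-01T10:00:02 action=deny src_ip=10.0.0.9 src_port=22",
    "2012-03-01T10:00:03 action=allow src_ip=10.0.0.5 src_port=443",
    "2012-03-01T10:00:04 action=accept src_ip=10.0.0.7 src_port=8080",
]

KEY_VALUE = re.compile(r"(\w+)=(\S+)")


def extract_fields(raw_line):
    """Derive key=value fields from a raw line at query time (no stored schema)."""
    return dict(KEY_VALUE.findall(raw_line))


def top_values(events, field, predicate):
    """Count values of `field` over events whose extracted fields satisfy `predicate`."""
    counts = Counter()
    for line in events:
        fields = extract_fields(line)
        if predicate(fields) and field in fields:
            counts[fields[field]] += 1
    return counts.most_common()


# Ad hoc question decided at search time: which source ports were accepted or allowed?
top_ports = top_values(
    RAW_EVENTS, "src_port", lambda f: f.get("action") in {"accept", "allow"}
)
print(top_ports)
```

Because the fields are extracted only when the question is asked, a new key appearing in later events needs no migration; the same raw lines can be queried by any field they happen to contain.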
  • API's are the basic components that allow developers to use the languages and tools they already know, lowering the barrier to entry for developers.
    This allows developers to build part or all of an application in an environment they are already comfortable with.
    From API's to tools to entire development environments, Splunk will integrate into the existing framework.
  • We’re making it easy for analysts, not just developers, to use Splunk...
    By providing a way to integrate with a variety of analytical tools, Splunk will become a key component of an analyst's toolkit.
    Anyone doing data analysis will be able to start with tools they're familiar with, and get real value from Splunk.
    These analysts are, by and large, the ones who will be using Splunk to drive a new level of business insight, bringing out the value of Operational Intelligence.
  • Splunk is announcing the planned availability of a new software package called Splunk Enterprise™ with Hadoop. This new offering will include Splunk Enterprise™, the Splunk Hadoop integration layer and Apache™ Hadoop™.
    The Splunk Hadoop integration layer will provide more than just point-to-point connectivity and is planned to support the following operations:
    • Issuing MapReduce queries, or higher-level queries (using Pig or Hive, for example), from the Splunk search language and pulling the resulting data sets back into Splunk
    • Indexing the output of Hadoop jobs in Splunk
    • Indexing data stored in HDFS in Splunk
    • Delivering data from Splunk to HDFS
    • Calling Splunk APIs directly from Hadoop jobs
  • According to IDC, unstructured data, much of it generated by machines, accounts for more than 90% of the data in today’s organizations.
    All websites, communications, networking and complex IT infrastructures generate massive streams of machine data every second of every day, in an array of unpredictable formats that are difficult to process and analyze by traditional methods or in a timely manner.
  • Machine data is one of the fastest growing and most pervasive segments of “big data”—generated by websites, applications, servers, networks, mobile devices and all the sensors and RFID assets that produce data every second of every day.
    It’s also one of the most valuable, containing a definitive record of user transactions, customer behavior, sensor activity, machine behavior, security threats, fraudulent activity and more.
    Traditional technologies predominantly built on relational databases cannot handle the complexity or massive scale of today’s machine data. Nor do they allow the flexibility to ask any question or get questions answered in real time—which is now an expectation of users.
    By monitoring and analyzing everything from customer clickstreams and transactions to network activity and call records —and more—Splunk software turns machine data into valuable insights no matter what business you’re in. It’s what we call operational intelligence.
  • Individual components in your infrastructure generate hundreds of events per second. A datacenter can log many terabytes of data per day.
    Making use of this data, however, presents real challenges. Existing data analysis, management and monitoring solutions are simply not engineered for this high-volume, high-velocity and highly diverse data.
    Consider traditional information management systems, such as business intelligence and data warehouse tools. These systems are batch-oriented and designed for structured data with rigid schemas. IT management and security information and event management tools on the other hand, provide a very narrow view of the underlying data and are hard-wired for specific data types and sources. They also don’t provide historical context.
    Finding a better way to sift, distill and understand the vast amounts of machine data can transform how IT organizations manage, secure and audit IT. It can also provide valuable insights for the business on how to innovate and offer new services, as well as trends and customer behaviors.
  • Machine data complexity – getting to the data – is a real challenge. Let’s take the example of a customer calling a service desk.
    We have a customer in Boston who used to have 36 people on the phone for up to 8 hours while they tried to figure out why the core website was down
    And it’s not just a problem for IT, it can harm the business.
    Customer calls service desk – service desk logs calls and escalates (red light/green light, everything looks green)
    Escalated to App support – they look at Java monitoring tools and everything looks fine, because those tools rely on instrumentation; but they have no access to logs!
    Developer gets pulled in and has to stop working on new code
    Needs to ask sysadmin for logs
    Developer establishes not his problem, escalate to DB guy
    DB guy looks at audit logs and points to bad query
    We call this “human latency” and customers we talk to say it can consume hours or sometimes days of precious time when issues occur!
  • Unlike traditional structured data or multi-dimensional data– for example data stored in a traditional relational database for batch reporting – machine data is non-standard, highly diverse, dynamic and high volume. You will notice that machine data events are also typically time-stamped – it is time-series data.
    Take the example of purchasing a product on your tablet or smartphone: the purchase transaction fails, you call the call center and then tweet about your experience. All these events are captured - as they occur - in the machine data generated by the different systems supporting these different interactions.
    Each of the underlying systems can generate millions of machine data events daily. Here we see small excerpts from just some of them.
  • When we look more closely at the data we see that it contains valuable information – customer id, order id, time waiting on hold, twitter id … what was tweeted.
    What’s important is first of all the ability to actually see across all these disparate data sources, but then to correlate related events across disparate sources, to deliver meaningful insight.
  • If you can correlate and visualize related events across these disparate sources, you can build a picture of activity, behavior and experience. And what if you can do all of this in real-time? You can respond more quickly to events that matter.
    You can extrapolate this example to a wide range of use cases – security and fraud, transaction monitoring and analysis, web analytics, IT operations and so on.
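Correlating related events across the sources named in these notes (order processing, care IVR, Twitter) can be sketched as follows. This is a hedged illustration only: the event dictionaries, the `TWITTER_TO_CUSTOMER` lookup and the `correlate` function are hypothetical and assume the events have already been parsed into fields.

```python
from collections import defaultdict

# Hypothetical, already-parsed events from three different systems.
ORDER_EVENTS = [
    {"time": 1, "source": "order_processing", "customer_id": "C42", "status": "failed"},
    {"time": 2, "source": "order_processing", "customer_id": "C77", "status": "ok"},
]
IVR_EVENTS = [
    {"time": 5, "source": "care_ivr", "customer_id": "C42", "seconds_on_hold": 600},
]
TWITTER_EVENTS = [
    {"time": 9, "source": "twitter", "twitter_id": "@alice", "text": "order failed, long hold"},
]

# Assumed lookup that links a Twitter ID to a customer ID.
TWITTER_TO_CUSTOMER = {"@alice": "C42"}


def correlate(*sources):
    """Group events from all sources by customer and order each group by time."""
    timelines = defaultdict(list)
    for events in sources:
        for event in events:
            customer = event.get("customer_id")
            if customer is None:
                customer = TWITTER_TO_CUSTOMER.get(event.get("twitter_id"))
            if customer is not None:
                timelines[customer].append(event)
    for events in timelines.values():
        events.sort(key=lambda e: e["time"])
    return dict(timelines)


timelines = correlate(ORDER_EVENTS, IVR_EVENTS, TWITTER_EVENTS)
sources_for_c42 = [event["source"] for event in timelines["C42"]]
print(sources_for_c42)
```

The shared key (here a customer ID, reached directly or through a lookup) is what turns three unrelated logs into one picture of a single customer's experience.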
  • Using Splunk, organizations identify and resolve issues up to 70% faster and reduce costly escalations by up to 90%. Splunk is one place to find and fix problems, and investigate incidents across all your IT systems and infrastructure - your applications, websites, servers, networks, virtual machines, security devices, and more. This alone eliminates much of the "human latency" experienced in the trenches.
  • It’s fair to ask “what’s so different about this new generation of data?” After all, haven’t data volumes always been growing?
    The answer is yes, data is always growing. Some types of data are more mature. For example, business application data that comes from accounting systems, databases, and the like. This data is well understood, highly structured, and is usually managed by relational databases and OLAP systems. This data is growing more slowly – and the technologies to manage it are quite capable.
    There is also human-generated data, such as documents, text messages, and video. Technologies like Google are doing a great job of harvesting, indexing, and managing human-generated data. Document management systems handle some of this information, and those technologies are well known and mature.
    What’s new about machine data are the massive volumes of data that are being generated by devices, like servers, web streams, and mobile technologies. This data has highly diverse formats, and time is a critical dimension. It also contains human-generated data. This is the data that Splunk manages – this is the world of machine data.
    Splunk is as important to the world of machine data as the relational database is to structured data, or as Google is to text data.
  • Splunk’s flagship product is Splunk Enterprise. Splunk Enterprise is a fully featured, powerful platform for collecting, searching, monitoring and analyzing machine data.
    Splunk collects machine data securely and reliably from wherever it’s generated. It stores and indexes the data in real time in a centralized location and protects it with role-based access controls. You can even leverage other data stores. Splunk lets you search, monitor, report and analyze your real-time and historical data. Now you have the ability to quickly visualize and share your data, no matter how unstructured, large or diverse it may be.
    Troubleshoot problems and investigate security incidents in minutes (not hours or days). Monitor your end-to-end infrastructure to avoid service degradation or outages. Gain real-time visibility and critical insights into customer experience, transactions and behavior. Use Splunk and make your data accessible, usable and valuable across the enterprise.
  • Splunk collects and indexes any machine data from virtually any source, format or location in real time. This includes data streaming from packaged and custom applications, app servers, web servers, databases, networks, virtual machines, telecoms equipment, OS’s, sensors, and much more. There’s no requirement to “understand” the data upfront. Just point Splunk at your data or deploy Splunk forwarders to reliably stream data from remote systems at scale. Splunk immediately starts collecting and indexing, so you can start searching and analyzing.
    No more armies of consultants, or a DBA to make it work.
  • Here's how using Splunk and your machine data can drive significant benefits for your organization.
    Search and investigation. Using Splunk, organizations identify and resolve issues up to 70% faster and reduce costly escalations by up to 90%. Splunk is one place to find and fix problems, and investigate incidents across all your IT systems and infrastructure.
    Proactive monitoring. Monitor IT systems in real time to identify issues, problems and attacks before they impact your customers, services and revenue. Splunk keeps watch of specific patterns, trends and thresholds in your machine data so you don't have to. Trigger notifications in real-time via email or RSS, execute a script to take remedial actions, send an SNMP trap to your system management console or generate a service desk ticket.
    Operational visibility. See the whole picture, track performance and make better decisions. Visualize usage trends to better plan for capacity; spot SLA infractions, track how you are being measured by the business. Do all of this using your existing machine data without spending millions of dollars instrumenting your IT infrastructure.
    Real-time business insight. Make better-informed business decisions by understanding trends, patterns and gaining Operational Intelligence from your machine data. See the success of new online services by channel or demographic, reconcile 3rd-party service provider fees against actual use, find your heaviest users and heaviest abusers, and more. Because machine data captures every behavior, the possibilities are game changing. You'll find the lead times to get to this intelligence dramatically less than other solutions - measured in minutes/hours instead of months.
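The proactive-monitoring idea described above (watch for a pattern, compare against a threshold, trigger an action) can be sketched like this. It is a simplified illustration under stated assumptions, not how Splunk alerting is configured: the `ThresholdAlert` class, its parameters and the sample stream are all hypothetical.

```python
from collections import deque


class ThresholdAlert:
    """Fire an action when a pattern appears too often within a sliding time window."""

    def __init__(self, pattern, threshold, window_seconds, action):
        self.pattern = pattern
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.action = action  # callback standing in for email, script, SNMP trap, ticket
        self.hits = deque()

    def observe(self, timestamp, line):
        """Feed one event; return True if the alert fired for this event."""
        if self.pattern not in line:
            return False
        self.hits.append(timestamp)
        # Drop hits that have fallen out of the window.
        while self.hits and self.hits[0] <= timestamp - self.window_seconds:
            self.hits.popleft()
        if len(self.hits) >= self.threshold:
            self.action(timestamp, len(self.hits))
            return True
        return False


fired = []
alert = ThresholdAlert(
    "ERROR", threshold=3, window_seconds=60,
    action=lambda ts, count: fired.append((ts, count)),
)
stream = [
    (0, "ERROR db timeout"),
    (10, "INFO ok"),
    (20, "ERROR db timeout"),
    (30, "ERROR db timeout"),
    (200, "ERROR db timeout"),
]
for ts, line in stream:
    alert.observe(ts, line)
print(fired)
```

Note that this sketch fires on every matching event while the count stays at or above the threshold; a real alerting setup would usually add throttling so that one incident produces one notification.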
  • Both IT and business professionals can analyze machine data to get real-time visibility and operational intelligence.
    With our data engine and our customers' machine data, organizations can meaningfully improve their performance in a wide range of areas e.g. meet service levels, reduce costs, mitigate security risks, maintain compliance and gain insights.
  • We have been seeing innovation with Splunk outside of IT in a range of exciting new areas.
    Personal Activity Monitoring
    Devices like Fitbit tell me how active a person is in a given day. It has an open API that allows me to track my offline movements and analyze them online. I can correlate my daily activity with all sorts of other measurements, calorie intake, blood pressure and maybe even number of unread emails in my inbox on a given day and start to correlate health related activities to work productivity.
    'Building Power Consumption’
    Splunk indexes data from 'power-taps' in buildings and correlates it with power tap-location information to provide real-time insight and analysis of power consumption per floor/area/room. They also have the ability to drill-down to identify the reason for any excessive power consumption and trigger automatic remote shut-off to save energy (weekends, based on power levels, etc.).
    Several organizations are Splunking power consumption to look for cost savings and environmental benefits.
    'Flood Monitoring Warning’
    Developed by a partner in Thailand in conjunction with the Thai govt. Splunk collects, indexes and monitors water level sensor data in real-time and alerts subscribers in advance of any future impending flood situations.
  • You can share and reuse Apps within your organization and the rest of the Splunk community. There are a growing number of Apps available on our community site www.splunkbase.com, built by our community, partners and Splunk.
    You can find Apps that help visualize data geographically, or that support specific use cases, such as enterprise security or PCI compliance. There are also Apps for different operating systems and third-party technologies, such as Windows, Linux, Blue Coat, Cisco, WebSphere and F5 Networks.
    Apps are being created all the time, so bookmark the site and check in frequently.
    Examples on this page include Apps for Cisco, F5, for BlueCoat, an award winning “Google Maps” App, Apps to gauge Twitter sentiment, external ‘WHOIS’ lookups, license usage, and more.
  • As of 12/29/10:
    Overall Progress, 5/15 References Targets
    1TB (6)Citrix
  • Since June 2006, more than 1,600 users have purchased the enterprise license (Feb 2010)
    These enterprise customers now use Splunk across a balanced and wide range of industries from telecommunications, financial services, government and large consumer facing internet services. Last year, 2009, over 650 new customers started using Splunk.
    Like the customer examples we just saw, these customers have transformed the management, security and compliance of their IT infrastructures with IT Search.
  • samples
  • Splunk, the Big Data Analytics Platform!

    1. The big data engine! Splunk
    2. We are pleased to now find the answers for you. Up-to-the-minute dashboards. Searching your way through the dark. Copyright © 2012, Splunk Inc. Listen to your data.
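    4. Data types. RDBMS: 1. customer information; 2. logistics and manufacturing information; 3. financial and credit information; 4. sales and transaction records. Human-generated data: user transaction history; user behavior. Machine data: machine activity information; security threats; authorized/unauthorized and illegal activity records. IT data engine, data characteristics: time-based data; unquantified data whose structure or format cannot be predicted; every kind of data generated by all IT systems from multiple vendors; massive data volume; requires fast retrieval, analysis and correlation.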
    5. RDBMS/SQL: Early-structure Binding. Structure: Schema (system construction and design); Queries (optimal queries are defined from an accurate understanding at design time). Data: single type (must be entered in, or converted to, the defined structure); the structural requirements of multiple DBs must be met. SELECT customers.* FROM customers WHERE customers.customer_id NOT IN (SELECT customer_id FROM orders WHERE year(orders.order_date) = 2004)
    6. Late-structure Binding. Structure: no schema required; data attributes are defined together with the search; queries and searches are composed dynamically as needed. Data: accommodates many heterogeneous data types and raw data of every kind; accommodates continuous change; no constraints from conversion or data formats. eventtype=firewall accept OR allow | top src_port
    7. Splunk patents. Integrated Real-time and Historical Search (both real-time search and search over a given time range are possible). 10X-100X Faster "Rare" Search (Bloom filter: finds the desired data without scanning the data).
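The slide names a Bloom filter as the mechanism behind the faster "rare" search but gives no detail. The sketch below shows the general technique only; how Splunk applies it internally is not described in this deck, so the bucket layout, the `BloomFilter` class and the `search` function are assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Compact set summary: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, term):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{term}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, term):
        for pos in self._positions(term):
            self.bits |= 1 << pos

    def might_contain(self, term):
        return all((self.bits >> pos) & 1 for pos in self._positions(term))


def build_bucket(lines):
    """Hypothetical storage bucket: raw lines plus a filter over their terms."""
    bloom = BloomFilter()
    for line in lines:
        for term in line.split():
            bloom.add(term)
    return {"lines": lines, "bloom": bloom}


def search(buckets, term):
    """Scan only the buckets whose filter says the term might be present."""
    matches, scanned = [], 0
    for bucket in buckets:
        if not bucket["bloom"].might_contain(term):
            continue  # definitely absent: skip without reading the data
        scanned += 1
        matches.extend(line for line in bucket["lines"] if term in line.split())
    return matches, scanned


buckets = [
    build_bucket(["login ok user=a", "login ok user=b"]),
    build_bucket(["disk failure host=x"]),
]
matches, scanned = search(buckets, "failure")
print(matches, scanned)
```

The benefit grows with rarity: a term that occurs in few buckets lets the search skip most of the stored data, at the cost of an occasional unnecessary scan when the filter gives a false positive.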
    8. Correlation analysis engine for heterogeneous DBs and data. Sources: Oracle, Sybase, DB2, MS SQL, text, logs … Pulls data from different DBs and data sources and correlates it. Searches data in DB B when data in DB A meets a condition. Dramatically shortens reporting work that used to take many hours. Diagram labels: Virtualized & Shared Network; Very Likely Shared Storage.
    9. Splunk is a big data engine with a highly distinctive architecture.
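    10. Splunk provides intelligent IT management. Single data store, single interface, multiple uses. Three main functions. Real-time visibility: real-time dashboards; event correlation; monitoring and alerts; performance issue monitoring; transaction monitoring; SLA monitoring. Search: detailed data search; "needle in a haystack"; root cause analysis and troubleshooting; incident investigation and management. Historical analysis: baseline and threshold analysis; data trending analysis; operational post-mortem analysis; historical data pattern analysis; compliance analysis and reporting.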
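    11. Splunk is a groundbreaking data processing engine. Through its groundbreaking architecture, Splunk is used by a variety of organizations for a variety of purposes. Raw data (raw logs, configurations, metrics, messages, scripts; terabytes) flows into a Universal Index. Uses: conditional search; correlation analysis; statistics; reports; monitoring; alarms; integration with heterogeneous systems; other. • Raw data collection: absorbs raw data in diverse formats without rule sets. • Automatic recognition and indexing of user-defined patterns. • Stored on the file system with 10:1 compression.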
    12. Splunk runs on the following core technologies. Development: APIs, SDK. UI: web-based user interface; role-based access controls. Applications: IT Operations Management, Application Management, Security, Compliance, Business Analytics … (user-developed, Splunk-developed, community, partners). Core modules: search language; stats/analytics; alerts; reports; dashboards; correlate with other data sources. Real-time search engine: real-time monitoring; data drilldown; historical analytics; correlation. Real-time indexing and storage: high performance; real-time; no predefined schema; massive scale. IT data sources.
    13. An engine that collects and analyzes all data. No upfront schema, no agents, no DB and no filtering devices are required. Customer Facing Data: click-stream data; shopping cart data; online transaction data. Outside the Datacenter: manufacturing, logistics…; CDRs & IPDRs; power consumption; RFID data; GPS data. Windows: registry; event logs; file system; sysinternals. Linux/Unix: configurations; syslog; file system; ps, iostat, top. Virtualization & Cloud: hypervisor; guest OS, apps; cloud. Applications: web logs; Log4J, JMS, JMX; .NET events; code and scripts. Databases: configurations; audit/query logs; tables; schemas. Networking: configurations; syslog; SNMP; netflow. Data kinds: logfiles, configs, messages, traps, alerts, metrics, scripts, changes, tickets.
    14. IT data is collected in the following ways. Local File Monitoring: log files, config files, dumps and trace files (Unix, Linux and Windows hosts). syslog: TCP/UDP; syslog-compatible hosts and network devices. Mounted File Systems: hostnamemount. WMI: Event Logs, Performance, Active Directory (Windows hosts). Scripted Inputs: shell scripts, custom parsers, batch loading (custom apps and scripted API connections). Windows Inputs: event logs, performance counters, registry monitoring, Active Directory monitoring (Windows hosts). Input modes: Agent-less Data Input; Splunk Forwarder.
    15. IT data is collected by the following methods. Table of collection methods by source type, with supported combinations marked "O". Columns (methods): Splunk Agent; SSH/TELNET; FTP; NFS/SCP/RSYNC; TCP/UDP; DBI/SQL; Script; SNMP. Rows (category and type): Network: routers, switch, firewall. Servers: Linux, AIX, Solaris, Windows, Mac, Tandem, Tru64, AS400, mainframe. Database: Oracle, Informix, Sybase, MySQL, MS SQL. Applications: Apache, WebLogic, WebSphere, SAP, custom app.
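    16. Automatically recognizes and indexes all data. Collects data from all hardware and software that leaves data behind, such as applications, servers and network equipment. • Continuous real-time data collection and indexing. • Supports all data formats without custom adapters. • Highly efficient data storage. • No schema or RDBMS required. • Data integrity guaranteed. • High performance and unlimited scalability. • Automatically identifies single-line and multi-line data. • Indexes all events and logs across the whole system. Data Inputs.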
    17. Automatically recognizes the data collected in real time. Automatic event boundary identification. Automatic timestamp normalization. Enables accurate trend analysis and search across all IT data.
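Event boundary identification and timestamp normalization can be illustrated with a small sketch. The two timestamp formats, the sample lines and both helper functions are assumptions chosen for the example; they are not Splunk's actual recognition rules, which this deck does not describe.

```python
import re
from datetime import datetime

# Assumed leading-timestamp formats; a real system would recognize many more.
TIMESTAMP_FORMATS = [
    (re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"), "%Y-%m-%d %H:%M:%S"),
    (re.compile(r"^\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}"), "%d/%m/%Y %H:%M:%S"),
]


def parse_leading_timestamp(line):
    """Return a datetime if the line starts with a known timestamp, else None."""
    for pattern, fmt in TIMESTAMP_FORMATS:
        match = pattern.match(line)
        if match:
            return datetime.strptime(match.group(0), fmt)
    return None


def split_events(lines):
    """Start a new event at each timestamped line; attach other lines to the previous event."""
    events = []
    for line in lines:
        timestamp = parse_leading_timestamp(line)
        if timestamp is not None:
            # Normalize every recognized format to one ISO 8601 representation.
            events.append({"time": timestamp.isoformat(), "raw": line})
        elif events:
            events[-1]["raw"] += "\n" + line  # continuation, e.g. a stack trace
        # Lines before the first timestamp are dropped in this sketch.
    return events


LOG_LINES = [
    "2012-03-01 10:00:00 ERROR request failed",
    "  at Checkout.submit(Checkout.java:42)",
    "01/03/2012 10:00:05 INFO recovered",
]
events = split_events(LOG_LINES)
print([event["time"] for event in events])
```

Once every event carries a timestamp in one normalized form, events from sources with different native formats can be ordered and trended on a single time axis.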
    18. Collected data is broken down and indexed. Every item is segmented and precisely indexed. Fast search against the original events.
    19. Data can be searched under a variety of conditions. Unified search of the whole system from a single point. Correlated search across real-time streaming data and historical indexed data. Real-time, application-specific search. Search for the details of all data events within a specific time range. Unified search and analysis of data generated across multiple servers in different data centers and regions. Uses simple logical operators. Search by time and condition. Search for any term or string. Visualize results over time. Search for relationships within results.
    20. Running a search automatically discovers field values. Automatic field identification. User-defined fields. Accurate and fast search results.
    21. Results can be saved right away. Save search results as event types. Tag events, hosts and other field values. Enables regular reports, knowledge sharing and fine-grained access control.
    22. Searches and processes data in real time.
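    23. Builds reports on the spot from search and analysis results. • Correlation analysis of data from multiple hosts. • Analysis of multi-line data and specialized service program logs. • Fast, flexible analysis combined with interactive reporting. • Easy visualization of results from large volumes of IT data: summaries, statistics, trends; integration with business data (when linked to other systems). • Supports data mining in any field without complex schemas or data re-indexing. • Automatically sends reports on schedule via RSS or email.
    24. After reporting, set up real-time monitoring. Proactive monitoring, alerts and automated response. • Real-time alerts and scheduled searches. • Notifications and automated responses configured from search results and content. • Integration with other management consoles (NMS, SMS) via RSS, email or SNMP. • Routine automation, such as restarting a server or issuing a trouble ticket, configured with user-defined scripts.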
    25. Provides built-in security features. Authentication: authenticates access and system changes for all users. Directory integration: users, roles and organizations are reflected in Active Directory or LDAP. Flexible roles: role-based access management with more than 60 control points. Audit capability: monitoring and an audit trail of Splunk access. Data integrity: integrity of data collection and storage. Digital forensics.
    26. Multiple departments can manage knowledge at the same time. Sharing work knowledge collaboratively dramatically speeds up work. Event types: Weblogic shutdown, sshd login, etc. Field definitions: client IP, status, etc. Transaction labels: email delivery, checkout, etc. Event tags: compliance issue, event ok, restart server. Saved searches, alerts and reports. Knowledge spreads over time. Dashboards by permission and job function.
    27. Splunk provides massive scalability. Distributed search and reporting across multiple indexers makes it possible to process tens of terabytes of data per day. Offload search load to Splunk Search Heads. Auto load-balanced forwarding to as many Splunk Indexers as you need to index terabytes/day. Send data from 1000s of servers using a combination of Splunk Forwarders, syslog, WMI, message queues, or other remote protocols.
    28. Development integration tools. APIs: Java, REST, .NET, Python. Tools: contextual editing, debugging, validation/testing. IDEs: Eclipse, Visual Studio.
    29. 3rd-party integration. Protovis; Microsoft; Reports; Analyst Plug-ins; GoogleViz.
    30. Splunk embracing Hadoop™. Real-time data collection, indexing, search, analysis, reporting and dashboards: ad hoc search; add knowledge; monitor and alert; report and analyze; custom dashboards. Data archival and batch analytics: run MR queries; result sets; archive data; re-index data.
    31. Splunk offers the following key strengths: Any Data, Completely Flexible, Immediate Results. Processes data regardless of format or source. Supports analysis, monitoring and reporting across all areas of IT management. Architecture that scales as needed: from a laptop to the data center. Data integrity guaranteed, with access to all raw data. Flexible dashboards and information by job role. Data lifecycle management: collection, search, analysis, retention, disposal. Flexible handling of log and data changes caused by changes in devices, OSs, applications and so on. Usable immediately after installation. Fast ROI. Splunk: the IT Data Engine.
    32. What can Splunk be used for?
    33. Most enterprise data is machine data. Categories: Industrial Data + Additional Sources; Core IT; Customer-facing IT. Sources: Shipping, RFID, Web Services, Databases, Desktops, Developers, Applications, Telecoms, Online Shopping Carts, Security, GPS/Cellular, Energy, Manufacturing, Servers, Storage, Networking, Messaging, Cloud, Virtual, Physical, Social Media, Clickstream.
    34. Data volume and complexity are growing explosively. Fastest growing, most complex segment of big data. Generated continuously by websites, communications, networking and complex IT infrastructures. Contains categorical record of all activity and behavior. Value from data largely untapped: extremely difficult to process and analyze by traditional methods or in a timely manner. Diagram: same categories and sources as slide 33.
    35. Previously, each area was analyzed separately with its own tool. Tools: Transaction Monitoring, Trouble Ticket, SIEM, Security Incident, APM, Compliance Audit, Operational Insights, DW, ECA, Business Analytics. Sources: Shipping, RFID, Web Services, Databases, Desktops, Developers, Applications, Telecoms, Online Shopping Carts, Security, GPS/Cellular, Energy, Manufacturing, Servers, Storage, Networking, Messaging, Cloud, Virtual, Physical, Social Media, Clickstream.
    36. Root cause analysis and resolution take far too much time. Service Desk: Log call. The console says everything is green. ESCALATE. Application Support: Java monitoring tools don’t show anything either. Call the developer. ESCALATE. Application Developer: Stop working on new code to troubleshoot. Need production logs! ESCALATE. Systems Administrator: Stop current task to identify and gather production logs for developer. RESPOND. Application Developer: Manual investigation establishes not application problem. ESCALATE. Database Administrator: DBA analyzes audit logs which points to bad query. NOW WHAT?
    37. Multiple approaches through machine data. Sources: Order Processing; Middleware Error; Care IVR; Twitter.
    38. Precise analysis through machine data. Sources and fields. Order Processing: Customer ID, Order ID, Product ID. Middleware Error: Order ID, Customer ID. Care IVR: Time Waiting On Hold, Customer ID. Twitter: Twitter ID, Customer’s Tweet, Company’s Twitter ID.
    39. Precise analysis through machine data. Sources and fields. Order Processing: Customer ID, Order ID, Product ID. Middleware Error: Order ID, Customer ID. Care IVR: Time Waiting On Hold, Customer ID. Twitter: Twitter ID, Customer’s Tweet, Company’s Twitter ID.
    40. Data can now be searched and analyzed on the spot. Find and fix issues and incidents dramatically faster across your organization. Sources: Web Services, Shipping, RFID, Energy, Developers, Desktops, Servers, Security, Databases/DWH, Telecoms, GPS/Cellular, App Support, Social Media, Storage, Networking, Manufacturing, Online Shopping Carts, Messaging, Clickstream.
    41. A new approach to data processing. Business Application Data: relational data, highly structured, based on inflexible schema; financial records, multidimensional data, math computation; monthly reporting, not for real-time events. Human-generated Data: generated by human-to-human interaction; includes email, IM, voice, video, text; stored in centralized corporate servers, fileshares and desktops. Machine-generated Data: generated by all IT systems and technology devices; time series, diverse, unstructured, no predefined schema; encapsulates human-generated content (e.g. social data); massive volume; fast navigation and correlation paramount.
    42. A tool for easily collecting, searching and analyzing all data! Ad hoc search; monitor and alert; report and analyze; custom dashboards; developer platform. Data collection and indexing. Splunk storage. Other Big Data stores.
    43. A tool for easily collecting, searching and analyzing all data! Ad hoc search; monitor and alert; report and analyze; custom dashboards; developer platform. Data collection and indexing: any amount, any location, any source; no upfront schema; no custom connectors; no RDBMS; no need to filter/forward. Splunk storage. Other Big Data stores.
    44. Intelligent operations through machine data. From reactive to proactive: Search and Investigate; Proactive Monitoring and Alerting; Operational Visibility; Real-time Business Insight.
    45. Dashboards and reports for every type of user. A new level of infrastructure and business visibility. Marketing team; website administrators; mash-ups and web applications.
    46. Unified analysis of the business view and the IT operations view. Use cases: Web Intelligence; IT Operations Management; Application Management; Business Analytics; Security and Compliance. Users: Customer Support; LOB Owners/Executives; Operations Teams; Website/Business Analysts; System Administrator; Application Developers; Security Analysts; Auditors; IT Executives.
    47. Analysis that goes beyond typical IT equipment. Commercial Transport: supporting the next gen airliner; cars as telemetry sensors. Health and Safety: personal activity tracking; flood monitoring warning. Power and Energy: building power consumption; home energy management.
    48. A Growing Community of Apps. 185 Apps Currently Available for Customer Use. Weather BigFix Sendmail PDF Report Server F5 Radio Stations WebSphere XenDesktop NetScaler Ruby on Rails Google Maps Whois lookup PCI Compliance Puppet Conf. Mgt Python Mail NetFlow Audible Alerts Splunk Monitoring SNORT FireEye Malware YouTube Encrypt/Decrypt Splunk ESS AS/400 - iSeries JMS receiver Geo Location VMware Fin. Inf. eXchange Twitter Windows Nagios Unix and Linux Security Javamail BlueCoat ProxySG Solera DeepSee Security SCOM TCP/UDP Sending IronPort WSA IronPort WSA Sourcefire IMAP RSS Input Multicast Stock Quote POST/GET Rqsts
    49. Splunking big data. 1TB References. 500GB References.
    50. Recent topics of interest. IT failure prediction systems; big-data-based BI; cyber forensics; real-time production data analysis.
    51. XX Electronics. "Cloud monitoring and customer analysis through Splunk." "Internal data volume exceeds 1 TB; additional license purchase planned." Unified cloud and app monitoring. Comparison of device IP and MAC addresses. Monitoring on AWS.
    52. XX Electronics. "Smart TV log analysis, security analysis, system log analysis." "To be adopted and deployed separately by each business division." Initial application monitoring deployment. Root cause could not be analyzed before; analysis went from 20 days to 2 hours. Device service analysis.
    53. XX Financial. "Company-wide security monitoring is being run on Splunk." "Conditional searches and apps built in collaboration with the customer to implement internal-control patterns (scenarios)." Company-wide security monitoring. Personal data protection perspective. Internal control system.
    54. XXX Shopping. "Company-wide customer data analysis in operation." "First Splunk deployment for marketing and BI." Company-wide customer analysis. Tailored analysis that integrates all data.
    55. Amazon - Indexing. Collects about 20 TB of data per day. AWS: 30 Indexers (Amazon Extra Large Instances, 8 vCPU, 8 GB of RAM each); 2 Search Heads (8 vCPU, 8 GB of RAM each). Physical boxes in the in-house datacenter: 10 Indexers (4-way, 4 cores, 8 GB RAM); 2 Search Heads (4-way, 4 cores, 8 GB RAM). • Uses physical Splunk web servers together with Splunk web servers on the Amazon cloud for: system security and application troubleshooting; management and forensic analysis; business intelligence analysis. • Analyzes systems, networks, applications, security and user transactions across extensive physical and virtual infrastructure.
    56. Cisco: security analysis across worldwide IDCs. US San Francisco Data Center; Europe Data Center; India Data Center.
    57. Deployment cases in Korea.
    58. Loomis Creative Pitch Ideas. Thank You!