Plenty of Big Data, but When Do We Get to Security?
July 9, 2013
Juan Carlos Vázquez
Sales Systems Engineer, LTAM
"Personal data is the oil of the 21st century"

The Context of Massive Data Integration


Published on

http://sg.com.mx/sgce/2013/sessions/el-contexto-la-integraci%C3%B3n-masiva-datos

IT executives know for certain that the most important business information is hidden in billions of security events. The ability to integrate data into a clear picture of the current situation is essential to how stealthy attacks are detected today. Depending on how it is collected, managed, and analyzed, security data can be a great asset or an enormous headache.

Addressing the challenges of so-called “legacy SIEM” solutions with security intelligence methodologies can take your organization to the next level when internal and external attacks occur, while staying compliant in reporting and management and delivering exceptional value and return on investment. Learn how to respond to the demands of Big Data by integrating global threat intelligence (GTI).

Published in: Technology

The Context of Massive Data Integration

  1. Plenty of Big Data, but When Do We Get to Security? July 9, 2013. Juan Carlos Vázquez, Sales Systems Engineer, LTAM
  2. "Personal data is the oil of the 21st century"
  3. A mountain of data
  4. In 2015… greater demand on data centers
      • More than 1 billion additional netizens [1]
      • More than 15 billion connected devices [2]
      • More than 1 zettabyte (1,000 exabytes) of Internet traffic [3]
      Sources: [1] IDC, “Server Workloads Forecast,” 2009. [2] IDC, “The Internet Reaches Late Adolescence,” Dec 2009, extrapolation by Intel for 2015; ECG, “Worldwide Device Estimates Year 2020 - Intel One Smart Network Work” forecast. [3] http://www.cisco.com/assets/cdc_content_elements/networking_solutions/service_provider/visual_networking_ip_traffic_chart.html, extrapolated to 2015.
  5. An explosion of data
      • 2.7 ZB of data in 2012; 15 billion connected devices in 2015
      • Around 24 petabytes of data processed by Google* per day in 2011
      • 4 billion pieces of content shared on Facebook* every day (July 2011)
      • 250 million tweets per day in October 2011
      • 5.5 million (legitimate) emails per second in 2011
      Chart: Worldwide Enterprise Storage Consumption Capacity Shipped by Model, 2006–2015 (PB). Source: IDC, 2011 Worldwide Enterprise Storage Systems 2011–2015 Forecast Update.
  6. More data… In 2020, the volume of digital information will reach 35.2 zettabytes (1 ZB equals 1 trillion GB), up from 1.8 ZB in 2010. This exponential growth makes Big Data the driving force of the information age, according to estimates from Sogeti, a Capgemini Group company. For its part, the consultancy Gartner states that companies able to obtain, process, and manage more valuable information will achieve financial results 20% better than their competitors.
  7. One case. The New York Times used 100 Amazon EC2 instances and Hadoop to process 4 TB of TIFF image data into 11 million PDFs in 24 hours at a cost of US$240. http://en.wikipedia.org/wiki/Apache_Hadoop
  8. Another case. Yahoo!'s Hadoop clusters comprise 40,000 servers and store 40 petabytes of data; the largest cluster has 4,000 servers. http://www.aosabook.org/en/hdfs.html
  9. Just one more case. In 2010 Facebook stated that it had the largest Hadoop cluster in the world, at 21 PB. In 2011 it announced that the cluster had grown to 30 PB, and by mid-2012 it had reached 100 PB. On November 8, 2012, Facebook announced that its data warehouse grows by roughly half a PB per day. http://en.wikipedia.org/wiki/Apache_Hadoop
  10. Big Data. A term applied to data sets that exceed the capacity of commonly used software to capture, manage, and process them within a reasonable time. The size that counts as “Big Data” is a constantly moving target; as of 2012 it ranged from a dozen terabytes to several petabytes in a single data set. The challenges include capture, processing, storage, intelligence sharing, analysis, and visualization. Benefits for healthcare, finance, telcos, energy, traffic, marketing, manufacturing, security… who will ask the right question?
  11. The four Vs
      • Volume. When the term big data is used, data volume typically ranges from multiple terabytes to petabytes. This certainly fits the enterprise security model, as it is not uncommon for large organizations to collect tens of terabytes of security data on a monthly basis.
      • Velocity. This term is used with respect to real-time data analysis requirements. In cybersecurity, velocity can refer to the need for immediate anomaly or incident detection. Real-time data analysis is critical here to minimize damages associated with a cybersecurity attack.
      • Variety. Big data can be made up of multiple data types and feeds, including structured and unstructured data. From a security perspective, data variety could include log files, network flows, IP packet capture, external threat/vulnerability intelligence, click streams, network/physical access, and social networking activity. It is not unusual for enterprises to collect hundreds of different types of data feeds for security analysis.
      • Veracity. Big data must be trustworthy and accurate. From a security perspective, this means trusting the confidentiality, integrity, and availability of data sources like log files and external data feeds.
  12. The Big Security Data Challenge. Thousands of Events; BILLIONS OF EVENTS. Correlate Events. Consolidate Logs. Perimeter, APTs, Cloud, Data, Insider.
  13. The Security Dilemma. MONITORING TECHNIQUES MUST ADVANCE. VISIBILITY. INSTRUMENTATION. Instrumentation and data collection are still critical, but applying filters derived from intelligence is the path to achieving better security.
  14. Big Data vs. Big Security Data
      BIG DATA: Datasets whose size and variety are beyond the ability of typical database software to capture, store, manage, and analyze.
      BIG SECURITY DATA:
      • Size of security data doubling annually
      • Advanced threats demand collecting more data
      • Legacy data management approaches failing
      • SIEM use shifting from compliance to security
      Understanding Security Data as Big Data:
      • How do I gather security context?
      • How do I manage big security information?
      • How do I make security information management work?
      Security Big Data is about matching security intelligence with the right collected data.
  15. Gartner says…
      • The amount of data analyzed by enterprise information security organizations will double every year through 2016.
      • By 2016, 40% of enterprises will actively analyze at least 10 terabytes of data for information security intelligence, up from less than 3% in 2011.
      • By 2016, 40% of Type A enterprises will create and staff a security analytics role, up from less than 1% in 2011.
  16. Goal… One of the primary drivers of security analytics will be the need to identify when an advanced targeted attack has bypassed traditional preventative security controls and has penetrated the organization.
  17. Needle in a Datastack
      • Organizations are storing approximately 11-15 terabytes of security data a week.
      • The ability to detect data breaches within minutes is critical in preventing data loss, yet only 35 percent of firms stated that they have the ability to do this.
      • In fact, more than a fifth (22 percent) said they would need a day to identify a breach, and five percent said this process would take up to a week. On average, organizations reported that it takes 10 hours for a security breach to be recognized.
      • Nearly three quarters (73 percent) of respondents claimed they can assess their security status in real time, and they also expressed confidence in their ability to identify, in real time, insider threats (74 percent), perimeter threats (78 percent), zero-day malware (72 percent), and compliance controls (80 percent). However, of the 58 percent of organizations that said they had suffered a security breach in the last year, just a quarter (24 percent) had recognized it within minutes. In addition, when it came to actually finding the source of the breach, only 14 percent could do so in minutes, while 33 percent said it took a day and 16 percent said a week.
      The study, conducted by research firm Vanson Bourne, interviewed 500 senior IT decision makers in January 2013, including 200 in the USA and 100 each in the UK, Germany, and Australia.
  18. Useful data… from Verizon 2012
      • “84% of security incidents (successful intrusions) were reflected in the logs”
      • “Only 8% of the security incidents detected by companies were found by mining their logs”
  19. Normalization
  20. SIEM is Still Evolving… Beyond Logs
      • What else happened at this time? Near this time? What is the time zone?
      • What is this service? What other messages did it produce? What other systems does it run on?
      • What is the host's IP address? Other names? Location on the network/datacenter? Who is the admin? Is this system vulnerable to exploits?
      • What does this number mean? Is this documented somewhere?
      • Who is this user? What is the user's access level? What is the user's real name, department, location? What other events from this user?
      • What is this port? Is this a normal port for this service? What else is this service being used for?
      • DNS name, Windows name, other names? Whois info? Organization owner? Where does the IP originate from (geolocation info)? What else happened on this host? Which other hosts did this IP communicate with?
  21. SEM + SIM = SIEM. SIEM is the Evolution and Integration of Two Distinct Technologies
      • Security Event Management (SEM): primarily focused on collecting and aggregating security events
      • Security Information Management (SIM): primarily focused on the enrichment, normalization, and correlation of security events
      Security Information & Event Management (SIEM) is a set of technologies for:
      • Log Data Collection
      • Correlation
      • Aggregation
      • Normalization
      • Retention
      • Analysis and Workflow
      Three Major Factors Driving the Majority of SIEM Implementations:
      1. Real-Time Threat Visibility
      2. Security Operational Efficiency
      3. Compliance and/or Log Management Requirements
  22. The State of SIEM
      SIEM Promise:
      • Turns Security Data Into Actionable Information
      • Provides an Intelligent Investigation Platform
      • Supports Management and Demonstration of Compliance
      VS Legacy SIEM REALITY:
      • Antiquated Architectures Force Choices Between Time-to-Data and Intelligence
      • Events Alone Do Not Provide Enough Context to Combat Today’s Threats
      • Complex Usability and Implementation Have Caused Costs To Skyrocket
  23. Shifting from Compliance to Security. Source: InformationWeek 2012 Security Information and Event Management Vendor Evaluation Survey of 322 business technology professionals, April 2012
  24. SIEM as a Solution to Detect Cyberattacks
  25. Global Threat Intelligence and SIEM. McAfee Labs IP Reputation Updates. IP REPUTATION CHECK: GOOD, SUSPECT, BAD. Medium Risk, High Risk. Botnet/DDoS, Mail/Spam Sending, Web Access, Malware Hosting, Network Probing, Presence of Malware, DNS Hosting Activity, Intrusion Attacks. AUTOMATIC IDENTIFICATION. AUTOMATIC RISK ANALYSIS VIA ADVANCED CORRELATION ENGINE.
  26. GTI with SIEM Delivers Even Greater Value. Sorting Through a Sea of Events…
      • 200M events and logs: Have I Been Communicating With Bad Actors?
      • 18,000 alerts: Which Communication Was Not Blocked?
      • Dozens of endpoints: What Specific Servers/Endpoints/Devices Were Breached?
      • Handful of users: Which User Accounts Were Compromised?
      • Specific files breached (if any): What Occurred With Those Accounts?
      • Optimized response (RESPOND): How Should I Respond?
  27. Event handling…
  28. Prioritizing security events
  29. From the top down…
  30. OK, fine, but who do I talk to?
  31. Knowing my environment…
  32. McAfee ESM. McAfee Starts at the Core
      McAfee DB:
      • Real-time, complex analysis
      • Indexing purpose-built for SIEM
      • Massive context feeds with enrichment
      • Historical retrieval and analytics
      • Integrated log and event management
      • No DBA required
      SMART, FAST: Scale, Analytical flexibility, Performance
  33. Malicious websites…
  34. The malware is here…
  35. Spam and bots in decline…
  36. Conclusions…
      • Use your logs, and turn them on
      • Log management first, then a SIEM
      • There are no “silver bullets”
      • Thinking beats technology
      • Less is more
      • Windows Event Logs
      • Syslogs
      • DNS
      • App logs
      • Context awareness (geolocation, users, VM, asset management, etc.)
      • Use cases, use cases, use cases!
      • Big Data architectures
      • High speed (I/O): hours to see a report, or minutes to get a view?
      • Security feeds (reputation systems)
      • Interconnected security
      • An IP address with a bad reputation is automatically blocked by the IPS
      • A host that had contact with a malicious IP is analyzed from the SIEM
  37. “If you’re in a fight, you need to know that while it’s happening, not after the fact”
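
The deck asks for events normalized to a common time base (slide 20), checked against an IP reputation scale of GOOD, SUSPECT, and BAD (slide 25), and for any host that contacted a malicious IP to be analyzed from the SIEM (slide 36). What follows is a minimal Python sketch of that flow, not the McAfee ESM or GTI implementation. The record fields (`time`, `host`, `dst`), the `REPUTATION` table, the `UNRATED` fallback label, and both function names are hypothetical, chosen only for illustration; the addresses come from documentation-reserved ranges.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical reputation table. A real deployment would pull ratings from a
# threat-intelligence feed; the labels follow the deck's GOOD/SUSPECT/BAD scale.
REPUTATION = {
    "203.0.113.7": "BAD",
    "198.51.100.20": "SUSPECT",
}


@dataclass(frozen=True)
class Event:
    timestamp: datetime  # stored in UTC after normalization
    host: str            # internal host that produced the log record
    remote_ip: str       # external address the host talked to
    reputation: str      # GOOD, SUSPECT, BAD, or UNRATED (assumed fallback)


def normalize(raw: dict) -> Event:
    """Turn one raw log record into a normalized, enriched event."""
    # Timestamps arrive as ISO 8601 strings with an explicit offset; converting
    # to UTC lets events from different sources be compared on one timeline.
    ts = datetime.fromisoformat(raw["time"]).astimezone(timezone.utc)
    ip = raw["dst"]
    # Host names are lower-cased so the same machine is not counted twice.
    return Event(ts, raw["host"].lower(), ip, REPUTATION.get(ip, "UNRATED"))


def hosts_to_investigate(events):
    """Return the sorted, de-duplicated hosts that contacted a BAD address."""
    return sorted({e.host for e in events if e.reputation == "BAD"})


# Illustrative records only; these are not data from the presentation.
RAW_LOGS = [
    {"time": "2013-07-09T10:15:00-05:00", "host": "WS-014", "dst": "203.0.113.7"},
    {"time": "2013-07-09T15:16:30+00:00", "host": "ws-022", "dst": "192.0.2.10"},
    {"time": "2013-07-09T17:20:00+02:00", "host": "srv-db1", "dst": "198.51.100.20"},
    {"time": "2013-07-09T15:45:00+00:00", "host": "ws-014", "dst": "203.0.113.7"},
]

events = [normalize(r) for r in RAW_LOGS]
flagged = hosts_to_investigate(events)
print(flagged)
```

With these sample records only `ws-014` is flagged, even though it appears under two spellings and in two time zones. That is the reason the sketch normalizes before correlating: matching on raw strings would treat `WS-014` and `ws-014` as different hosts.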
