In this two-part presentation we will explore log analysis and log visualization. We will look at the history of log analysis: where log analysis stands today, what tools are available to process logs, what is working today, and, more importantly, what is not working in log analysis. What will the future bring? Do our current approaches hold up under future requirements? We will discuss a number of issues and try to figure out how we can address them.
By looking at various log analysis challenges, we will explore how visualization can help address a number of them, keeping in mind that log visualization is not just a science, but also an art. We will apply a security lens to look at a number of use-cases in the area of security visualization. From there we will discuss what else is needed in the area of visualization, where the challenges lie, and where we should continue putting our research and development efforts.
Cloud computing has changed the way businesses operate, the way businesses make money, and the way businesses have to protect their assets and information. More and more software applications are moving into the cloud. People are running their proxies in the cloud, and soon you will be collecting your logs in the cloud. You shouldn't have to deal with log collection and log management. You should be able to focus your time on getting value out of the logs: log analysis and visualization.
In this presentation we will explore how we can leverage the cloud to build security visualization tools. We will discuss some common visualization libraries and have a look at how they can be deployed to solve security problems. We will see how easy it is to quickly stand up such an application. To close the presentation, we will look at a number of security visualization examples that show how security data benefits from visual representations. For example, how can network traffic, firewall data, or IDS data be visualized effectively?
DAVIX - Data Analysis and Visualization Linux - Raffael Marty
DAVIX, a live CD for data analysis and visualization, brings the most important free tools for data processing and visualization to your desk. There is no hassle with installing an operating system or struggle to build the necessary tools to get started with visualization. You can completely dedicate your time to data analysis.
Workshop: Big Data Visualization for Security - Raffael Marty
Big Data is the latest hype in the security industry. We will take a closer look at what big data is made up of: Hadoop, Spark, ElasticSearch, Hive, MongoDB, etc. We will learn how to best manage security data in a small Hadoop cluster for different types of use-cases. Along the way, we will encounter a number of big-data open source tools, such as LogStash and Moloch, that help with managing log files and packet captures.
As a second topic we will look at visualization and how we can leverage visualization to learn more about our data. In the hands-on part, we will use some of the big data tools, as well as a number of visualization tools to actively investigate a sample data set.
Managing Your Black Friday Logs (Voxxed Luxembourg) - David Pilato
Monitoring a complex application is not an easy task, but with the right tools it is not rocket science either. However, peak periods such as "Black Friday" sales or the Christmas season can push your application to the limits of what it can handle, or worse, crash it. Because the system is under heavy load, it generates even more logs, which can in turn overwhelm your monitoring system.
In this session, I will cover best practices for using the Elastic Stack to centralize and monitor your logs. I will also share a few tips and tricks to help you sail through your Black Fridays!
We will cover:
* Monitoring architectures
* Finding the optimal size for the _bulk API
* Distributing the load
* Index and shard sizing
* Optimizing disk I/O
You will leave the session with: best practices for building a monitoring system with the Elastic Stack, and advanced tuning to optimize ingestion and search performance.
Managing Your Black Friday Logs (NDC Oslo) - David Pilato
Monitoring an entire application is not a simple task, but with the right tools it is not a hard task either. However, events like Black Friday can push your application to the limit, and even cause crashes. As the system is stressed, it generates a lot more logs, which may crash the monitoring system as well. In this talk I will walk through best practices for using the Elastic Stack to centralize and monitor your logs. I will also share some tricks to help you with the huge increase of traffic typical of Black Friday.
Topics include:
* monitoring architectures
* optimal bulk size
* distributing the load
* index and shard size
* optimizing disk IO
Takeaway: best practices for building a monitoring system with the Elastic Stack, and advanced tuning to increase event ingestion performance.
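As a rough illustration of the "optimal bulk size" point above, the batching logic can be sketched like this. This is a minimal sketch, not code from the talk: the helper name and the 5 MB default are illustrative assumptions (the right batch size has to be found experimentally per cluster).

```python
import json

def chunk_for_bulk(docs, max_bytes=5 * 1024 * 1024):
    """Group documents into _bulk request bodies capped at max_bytes.

    Each document becomes an action line plus a source line, the
    newline-delimited JSON shape the Elasticsearch _bulk API expects.
    """
    batch, batch_size, batches = [], 0, []
    for doc in docs:
        lines = json.dumps({"index": {}}) + "\n" + json.dumps(doc) + "\n"
        size = len(lines.encode("utf-8"))
        # Flush the current batch before it would exceed the cap.
        if batch and batch_size + size > max_bytes:
            batches.append("".join(batch))
            batch, batch_size = [], 0
        batch.append(lines)
        batch_size += size
    if batch:
        batches.append("".join(batch))
    return batches

# Each returned string can be POSTed to /<index>/_bulk as-is;
# tuning means measuring ingestion throughput at different caps.
```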
An introduction to Big Data and how FIWARE manages it through different approaches: the differences between the Apache Flink and Spark approaches, an introduction to the FIWARE connectors for managing NGSI context information, and a brief introduction to machine learning with FIWARE technology.
Hermes: Free the Data! Distributed Computing with MongoDB - MongoDB
Moving data throughout an organization is an art form. Whether mastering the art of ETL or building micro services, we are often left with either business logic embedded where it doesn't belong or monolithic apps that do too much. In this talk, we will show you how we built a persisted messaging bus to ‘Free the Data’ from the apps, making it available across the organization without having to write custom ETL code. This in turn makes it possible for business apps to be standalone, testable and more reliable. We will discuss the basic architecture and how it works, go through some code samples (server side and client side), and present some statistics and visualizations.
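The core idea described above, an append-only, persisted message log where each consumer tracks its own read position so business apps stay decoupled, can be sketched in a few lines. This is a toy stand-in, not the talk's actual code: the class and method names are invented, and a plain Python list stands in for what would be a MongoDB capped collection read with a tailable cursor.

```python
class MessageBus:
    """Minimal persisted-log bus: producers append, and each consumer
    keeps its own offset, so no message is ever 'consumed away' and
    new consumers can replay the full stream."""

    def __init__(self):
        self.log = []       # append-only message log (stand-in for MongoDB)
        self.offsets = {}   # consumer name -> next index to read

    def publish(self, message):
        self.log.append(message)

    def poll(self, consumer, max_messages=10):
        """Return up to max_messages unread messages for this consumer."""
        start = self.offsets.get(consumer, 0)
        batch = self.log[start:start + max_messages]
        self.offsets[consumer] = start + len(batch)
        return batch

bus = MessageBus()
bus.publish({"event": "order_created", "id": 1})
bus.publish({"event": "order_paid", "id": 1})

# Two independent consumers each see the full stream.
billing = bus.poll("billing")
audit = bus.poll("audit")
```

The design point is that the bus, not bespoke ETL code, owns data movement: any team can attach a new consumer without touching the producing application.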
In this training session, two leading security experts review how adversaries use DNS to achieve their mission, how to use DNS data as a starting point for launching an investigation, the data science behind automated detection of DNS-based malicious techniques and how DNS tunneling and DGA machine learning algorithms work.
Watch the presentation with audio here: http://info.sqrrl.com/leveraging-dns-for-proactive-investigations
Adam Fuchs' presentation slides on what's next in the evolution of BigTable implementations (transactions, indexing, etc.) and what these advances could mean for the massive database that gave rise to Google.
Our journey with druid - from initial research to full production scale - Itai Yaffe
Here at the Nielsen Marketing Cloud we use druid.io (http://druid.io/) as one of our main data stores, both for simple counts and for approximate count-distinct (DataSketches).
It’s been more than a year since we started using it, ingesting billions of events each day into multiple druid clusters for different use-cases.
In this meet-up, we will share our journey, the challenges we had, the way we overcame them (at least most of them), and the steps we took to optimize the process around Druid to keep the solution cost effective.
Before diving into Druid, we will briefly present our data pipeline architecture, starting from the front-end serving system, deployed in a number of geo-locations, to a centralized Kafka cluster in the cloud, and give some examples of the different processes that consume from Kafka and feed our different data sources.
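The approximate count-distinct mentioned above relies on the DataSketches library. The underlying idea, estimating cardinality from the smallest hash values observed, can be illustrated with a tiny K-Minimum-Values sketch. This is a teaching toy under my own assumptions, not the Theta sketch Druid actually uses:

```python
import hashlib

def kmv_distinct_estimate(items, k=256):
    """K-Minimum-Values cardinality estimate.

    Hash every item to a uniform value in [0, 1) and keep the k
    smallest distinct hashes. If the k-th smallest is h_k, then
    roughly k-1 distinct items landed in [0, h_k), so the total
    distinct count is approximately (k - 1) / h_k.
    """
    hashes = set()
    for item in items:
        digest = hashlib.md5(str(item).encode()).hexdigest()
        hashes.add(int(digest, 16) / 2.0**128)
    smallest = sorted(hashes)[:k]
    if len(smallest) < k:
        return len(smallest)  # fewer than k distinct items: count is exact
    return int((k - 1) / smallest[-1])
```

Unlike an exact distinct count, the sketch needs only k stored values regardless of stream size, which is what makes it practical at billions of events per day.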
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017 - Demi Ben-Ari
Once you start working with distributed Big Data systems, you start discovering a whole bunch of problems you won’t find in monolithic systems.
All of a sudden, monitoring all of the components becomes a big data problem in itself.
In the talk we'll cover the aspects you should take into consideration when monitoring a distributed system built with tools like:
Web Services, Apache Spark, Cassandra, MongoDB, Amazon Web Services.
Beyond the tools, what should you monitor about the actual data that flows through the system?
And we'll cover the simplest solution, built from your day-to-day open source tools; the surprising thing is that it doesn't come from an ops guy.
Deep Learning in Security—An Empirical Example in User and Entity Behavior An... - Databricks
Recently, deep learning has delivered groundbreaking advances in many industries. In this presentation, Dr. Wang will share empirical experiences of applying deep learning to solving some specific security problems with real-world customer attack detection examples. He will also discuss the challenges and guidelines for successfully deploying deep learning, or general machine learning, in broader security.
This session will feature two deep learning examples. The first example is a user-behavior anomaly detection solution using a Convolutional Neural Network (CNN). Since CNNs are most effective at image processing, Dr. Wang will introduce an innovative way to encode a user’s daily behavior into multi-channel images. He will also share experimental comparison results of CNN hyperparameter tuning. The second example is a stateful user risk scoring system using Long Short-Term Memory (LSTM). Most modern attacks happen in a multi-stage fashion, i.e., infection -> command & control -> lateral movement -> data infiltration -> data exfiltration. In this case, the company uses LSTM to monitor the temporal state transition of each user over these stages.
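The encoding step in the first example, turning a user's day of activity into an image-like grid a CNN can consume, can be sketched roughly as follows. The 24-hour layout, the event categories, and the per-channel normalization are illustrative assumptions on my part, not the scheme from the talk:

```python
def behavior_to_grid(events, categories=("login", "file_access", "network")):
    """Encode one user-day of events as a multi-channel 'image':
    one channel per event category, 24 cells per channel (one per
    hour), each cell holding a normalized event count. Stacks of
    such grids can then be fed to a CNN like image tensors."""
    grid = [[0.0] * 24 for _ in categories]
    for hour, category in events:
        if category in categories:
            grid[categories.index(category)][hour] += 1.0
    # Normalize each channel to [0, 1] so heavy users don't dominate.
    for channel in grid:
        peak = max(channel)
        if peak > 0:
            for h in range(24):
                channel[h] /= peak
    return grid

# A user who logs in at 9am and reads files through the morning:
day = [(9, "login"), (9, "file_access"), (10, "file_access"),
       (11, "file_access"), (11, "file_access")]
image = behavior_to_grid(day)
```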
Splunk, SIEMs, and Big Data - The Undercroft - November 2019 - Jonathan Singer
Guild members, join us on Thursday, November 14th at 6pm for our class on Splunk. Our Analyze Guild Master Jonathan Singer will be covering centralized logging, SIEM, big data, and much more.
This chapter is devoted to log mining, or log knowledge discovery: a different type of log analysis that does not rely on knowing what to look for. This takes the “high art” of log analysis to the next level by breaking the dependence on lists of strings or patterns to look for in the logs.
Information security lecture No. 2 at NSU.
The second lecture, which covers all the principles of information security, the life cycle of an information protection system, and the business structure of the protected asset, and explains the essence of a "security policy" document and its implementation in real business.
Troubleshooting common oslo.messaging and RabbitMQ issues - Michael Klishin
This talk focuses on troubleshooting of common oslo.messaging and RabbitMQ issues in OpenStack environments. Co-presented at the OpenStack Summit Austin in April 2016.
The presentation will describe methods for discovering interesting and actionable patterns in log files for security management without specifically knowing what you are looking for. This approach is different from "classic" log analysis and allows gaining insight into insider attacks and other advanced intrusions, which are extremely hard to discover with other methods. Specifically, I will demonstrate how data mining can be used as a source of ideas for designing future log analysis techniques that will help uncover coming threats. An important part of the presentation will be a demonstration of how the above methods worked in a real-life environment.
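A minimal version of that idea, surfacing log lines whose message template is rare without specifying any pattern up front, might look like this. The templating regexes and the rarity threshold are assumptions for illustration, not the methods from the presentation:

```python
import re
from collections import Counter

def rare_log_templates(lines, threshold=0.05):
    """Mine logs without a target pattern: collapse variable parts
    (IPs, hex values, numbers) into placeholders, count each
    resulting template, and flag templates that make up less than
    the threshold fraction of all lines."""
    def template(line):
        line = re.sub(r"\b\d{1,3}(\.\d{1,3}){3}\b", "<IP>", line)
        line = re.sub(r"\b0x[0-9a-fA-F]+\b", "<HEX>", line)
        line = re.sub(r"\b\d+\b", "<NUM>", line)
        return line
    counts = Counter(template(l) for l in lines)
    total = sum(counts.values())
    return [t for t, c in counts.items() if c / total < threshold]

logs = ["Accepted password for alice from 10.0.0.5 port 51122"] * 40 \
     + ["Accepted password for bob from 10.0.0.9 port 40211"] * 58 \
     + ["Failed su attempt for root from 10.0.0.66 port 31337"] * 2
suspicious = rare_log_templates(logs)
```

The point matches the abstract: nothing told the function to look for failed su attempts; they surfaced purely because their shape is rare relative to everything else.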
How to Troubleshoot OpenStack Without Losing Sleep - Sadique Puthen
The complex architecture and design of OpenStack, and the difficulties encountered while troubleshooting, amplify the effort of debugging a problem in an OpenStack environment. This can give administrators and support associates sleepless nights if OpenStack native and supporting components are not configured properly and tuned for optimum performance, especially with large deployments that involve high availability and load balancing.
Final Year Projects Computer Science (Information Security) - 2015 - Syed Ubaid Ali Jafri
Final-year project ideas for computer science students. These projects help students enhance their expertise in the area of information security and understand its core concepts.
Are you curious about KNIME Software?
Do you know the difference between KNIME Analytics Platform and KNIME Server?
Which data sources can KNIME connect to?
Can you run an R script from within a KNIME workflow? A Python script? Which other integrations are available?
How can KNIME help with ETL, data preparation, and general data manipulation? Which machine learning algorithms can KNIME offer?
This webinar answers all of these questions! There’s also information about connecting to big data clusters and how you can run all or part of your analysis on a big data platform. It also covers everything you need to know about Microsoft Azure and Amazon AWS.
SnapLogic provides a data integration platform that takes integration to another level by combining the power of dynamic programming languages with standard web interfaces to solve today's most pressing problems in application integration. SnapLogic has an intuitive visual designer that runs in your browser and connects to a highly scalable web-based integration server that you can run on premise or in the cloud.
Apache Drill [1] is a distributed system for interactive analysis of large-scale datasets, inspired by Google’s Dremel technology. It is a design goal to scale to 10,000 servers or more and to be able to process Petabytes of data and trillions of records in seconds. Since its inception in mid 2012, Apache Drill has gained widespread interest in the community. In this talk we focus on how Apache Drill enables interactive analysis and query at scale. First we walk through typical use cases and then delve into Drill's architecture, the data flow and query languages as well as data sources supported.
[1] http://incubator.apache.org/drill/
Playing in the Same Sandbox: MySQL and Oracle - lynnferrante
This SCaLE Linux presentation from January 2012, "Playing in the Same Sandbox: MySQL and Oracle", describes current and upcoming integrations between MySQL and other Oracle products like Oracle Database Firewall, Audit Vault, Secure Backup, GoldenGate, My Oracle Support, and MySQL Enterprise Monitor.
PayPal datalake journey | teradata - edge of next | san diego | 2017 october ... - Deepak Chandramouli
PayPal Data Lake Journey | 2017-Oct | San Diego | Teradata Edge of Next
Gimel [http://www.gimel.io] is a Big Data Processing Library, open sourced by PayPal.
https://www.youtube.com/watch?v=52PdNno_9cU&t=3s
Gimel empowers analysts, scientists, data engineers alike to access a variety of Big Data / Traditional Data Stores - with just SQL or a single line of code (Unified Data API).
This is possible via the Catalog of Technical properties abstracted from users, along with a rich collection of Data Store Connectors available in Gimel Library.
A Catalog provider can be Hive or User Supplied (runtime) or UDC.
In addition, PayPal recently open sourced UDC [Unified Data Catalog], which can host and serve the Technical Metadata of the Data Stores & Objects. Visit http://www.unifieddatacatalog.io to experience it first hand.
Splunk Conf2010: Corporate Express presents Splunk with SAP - Splunk
Corporate Express is in the midst of a business transformation program called “next-gen”, rolling out SAP across New Zealand and Australia. This session will detail how they’re using Splunk to plan for capacity and performance requirements, and how they’re combining multiple charts, graphs, tables and views from disparate systems into a single pane in Splunk. Learn more here: http://www.splunk.com/view/splunk-at-corporate-express/SP-CAAAFNR
How to protect, detect, and respond to your threats.
This is an MSP-centric talk exploring how to detect, protect against, and respond to cyber security threats. We first walk through the cyber defense matrix, explore what security intelligence needs to be, and emphasize the concepts with two BlackCat case studies.
Extended Detection and Response (XDR) - An Overhyped Product Category With Ulti... - Raffael Marty
Extended Detection and Response, or XDR for short, is one of the acronyms that are increasingly used by cybersecurity vendors to explain their approach to solving the cyber security problem. We have been spending trillions of dollars on approaches to secure our systems and data, with what success? Cybersecurity is still one of the biggest and most challenging areas that companies, small and large, are dealing with. XDR is another approach driven by security vendors to solve this problem. The challenge is that every vendor defines XDR slightly differently and makes it fit their own “challenge du jour” for marketing and selling their products.
In this presentation we will demystify the XDR acronym and put a working model behind it. Together, we will explore why XDR is a fabulous concept, but also discover that it’s nothing revolutionarily new. With an MSP lens, we will explore what the XDR benefits are for small and medium businesses and what it means to the security strategy of both MSPs and their clients. The audience will leave with a clear understanding of what XDR is, how the technology matters to them, and how XDR will ultimately help them secure their customers and enable trusted commerce.
Blog post: http://raffy.ch/blog - Video: https://youtu.be/nk5uz0VZrxM
In this video we talk about the world of security data, or log data. In the first section, we dive into a bit of a history lesson around log management, SIEM, and big data in security. We then shift to the present to discuss some of the challenges that we face today with managing all of that data, and also discuss some of the trends in the security analytics space. In the third section, we focus on the future. What does tomorrow hold in the SIEM / security data space? What are some of the key features we will see, and how does this matter to the users of these approaches?
Cyber Security Beyond 2020 – Will We Learn From Our Mistakes? - Raffael Marty
The cyber security industry has spent trillions of dollars to keep external attackers at bay. To what effect? We still don't see an end to the cat and mouse game between attackers and the security industry; zero day attacks, new vulnerabilities, ever increasingly sophisticated attacks, etc. We need a paradigm shift in security. A shift away from traditional threat intelligence and indicators of compromise (IOCs). We need to look at understanding behaviors. Those of devices and those of humans.
What are the security approaches and trends that will make an actual difference in protecting our critical data and intellectual property; not just from external attackers, but also from malicious insiders? We will explore topics from the 'all solving' artificial intelligence to risk-based security. We will look at what is happening within the security industry itself, where startups are placing their bets, and how human factors will play an increasingly important role in security, along with all of the potential challenges that will create.
Artificial Intelligence – Time Bomb or The Promised Land? - Raffael Marty
Companies have AI projects. Security products use AI to keep attackers out and insiders at bay. But what is this "AI" that everyone talks about? In this talk we will explore what artificial intelligence in cyber security is, where the limitations and dangers are, and in what areas we should invest more in AI. We will talk about some of the recent failures of AI in security and invite a conversation about how we verify artificially intelligent systems to understand how much trust we can place in them.
Alongside the AI conversation, we will discover that we need to make a shift in our traditional approach to cyber security. We need to augment our reactive approaches of studying adversary behaviors to understanding behaviors of users and machines to inform a risk-driven approach to security that prevents even zero day attacks.
In this presentation I explore the topic of artificial intelligence in cyber security: what AI is, and how we get to real intelligence in a cyber context. I outline some of the dangers of the way we are using algorithms (AI, ML) today and what that leads to. We then explore how we can add real intelligence, through expert knowledge, to the problem of finding attackers and anomalies in our applications and networks.
Presented at AI 4 Cyber in NYC on April 30, 2019
AI & ML in Cyber Security - Why Algorithms are Dangerous - Raffael Marty
Link to the video of the presentation: https://www.youtube.com/watch?v=WG1k-Xh1TqM
Every single security company is talking in some way or another about how they are applying machine learning. Companies go out of their way to make sure they mention machine learning and not statistics when they explain how they work. Recently, that's not enough anymore either. As a security company you have to claim artificial intelligence to be even part of the conversation.
Guess what. It's all baloney. We have entered a state in cyber security that is, in fact, dangerous. We are blindly relying on algorithms to do the right thing. We are letting deep learning algorithms detect anomalies in our data without having a clue what that algorithm just did. In academia, they call this the lack of explainability and verifiability. But rather than building systems with actual security knowledge, companies are using algorithms that nobody understands and in turn discover wrong insights.
In this talk, I will show the limitations of machine learning, outline the issues of explainability, and show where deep learning should never be applied. I will show examples of how the blind application of algorithms (including deep learning) actually leads to wrong results. Algorithms are dangerous. We need to revert back to experts and invest in systems that learn from, and absorb the knowledge, of experts.
Delivering Security Insights with Data Analytics and Visualization
Raffael Marty
It's an interesting exercise to look back to the year 2000 to see how we approached cyber security. We had just started to realize that data might be a useful currency, but for the most part, security pursued preventative avenues, such as firewalls, intrusion prevention systems, and anti-virus. With the advent of log management and security information and event management (SIEM) solutions, we started to gather gigabytes of sensor data and correlate data from different sensors to compensate for their weaknesses and amplify their strengths. But fundamentally, such solutions didn't scale well and struggled to deliver real security insight.
Today, cybersecurity wouldn't work anymore without large scale data analytics and machine learning approaches, especially in the realm of malware classification and threat intelligence. Nonetheless, we are still just scratching the surface and learning where the real challenges are in data analytics for security.
This talk will go on a journey of big data in cybersecurity, exploring where big data has been and where it must go to make a true difference. We will look at the potential of data mining, machine learning, and artificial intelligence, as well as the boundaries of these approaches. We will also look at both the shortcomings and potential of data visualization and the human computer interface. It is critical that today's systems take into account the human expert and, most importantly, provide the right data.
AI & ML in Cyber Security - Welcome Back to 1999 - Security Hasn't Changed
Raffael Marty
The year is 2017. Cyber security has been a discipline for many years, and thousands of security companies offer solutions to deter and block malicious actors in order to keep our businesses operating and our data confidential. But fundamentally, cyber security has not changed during the last two decades. We are still running Snort and Bro. Firewalls are fundamentally still the same. People get hacked because of poor passwords, and we collect logs that we don't know what to do with. In this talk I will paint a slightly provocative and dark picture of security. Fundamentally, nothing has really changed. We'll have a look at machine learning and artificial intelligence and see how those techniques are used today. Do they have the potential to change anything? How will the future look with those technologies? I will show some practical examples of machine learning and argue that simpler approaches generally win. Maybe we find some hope in visualization? Or maybe augmented reality? We still have a ways to go.
Ensuring the security of a company's data and infrastructure has largely become a data analytics challenge. It is about finding and understanding patterns and behaviors that are indicative of malicious activities or deviations from the norm. Data, analytics, and visualization are used to gain insights and discover those malicious activities. These three components play off of each other, but also have their inherent challenges. A few examples will be given to explore and illustrate some of these challenges.
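The "deviations from the norm" idea can be made concrete with a minimal statistical baseline. The sketch below flags hours whose event counts stray far from the mean; the hourly login counts and the two-sigma threshold are illustrative assumptions, not data from the talk.

```python
from statistics import mean, stdev

def anomalies(counts, k=2.0):
    """Return indices of counts deviating more than k standard deviations from the mean."""
    mu, sigma = mean(counts), stdev(counts)
    return [i for i, c in enumerate(counts) if sigma > 0 and abs(c - mu) > k * sigma]

# Hypothetical hourly login counts; the spike at index 5 stands out.
logins = [12, 15, 11, 14, 13, 160, 12, 14]
print(anomalies(logins))  # -> [5]
```

Real detection systems need far more nuance (seasonality, per-entity baselines, expert-tuned thresholds), which is exactly where the challenges discussed above come in.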
Creating Your Own Threat Intel Through Hunting & Visualization
Raffael Marty
The security industry is talking a lot about threat intelligence: external information that a company can leverage to understand where potential threats are knocking on the door and might have already penetrated the network boundary. Conversations with many CERTs have shown that we have to stop relying on knowledge of how attacks have been conducted in the past and start 'hunting' for signs of compromise and anomalies in our own environments.
In this presentation we explore how the decade-old field of security visualization has evolved. We show how we have applied advanced analytics and visualization to create our own threat intelligence and investigate lateral movement in a Fortune 50 company.
Visualization. Data science. No machine learning. But pretty pictures.
What is internal threat intelligence? Check out http://www.darkreading.com/analytics/creating-your-own-threat-intel-through-hunting-and-visualization/a/d-id/1321225
The extent and impact of recent security breaches show that current security approaches are just not working. But what can we do to protect our businesses? We have been advocating monitoring for a long time as a way to detect the subtle, advanced attacks that still make it through our defenses. However, products have failed to deliver on this promise.
Current solutions scale in neither data volume nor analytical insight. In this presentation we will explore what security monitoring is. Specifically, we are going to explore the question of how to visualize a billion log records. A number of security visualization examples will illustrate some of the challenges of big data visualization. They will also show how data mining and user experience design help us get a handle on the security visualization challenges - enabling us to gain deep insight for a number of security use-cases.
An overview of some methods and principles for big data visualization. The presentation briefly touches on dashboards and some of their cyber security uses. The big data lake is also briefly discussed in the context of a cyber security big data setup.
The Heatmap - Why is Security Visualization so Hard?
Raffael Marty
The extent and impact of recent security breaches show that current approaches are just not working. But what can we do to protect our businesses? We have been advocating monitoring for a long time as a way to detect subtle, advanced attacks. However, products have failed to deliver on this promise. Current solutions scale in neither data volume nor analytical insight. In this presentation we will explore why it is so hard to come up with a security monitoring (or shall we call it security intelligence) approach that helps find sophisticated attackers in all the data collected. We are going to explore the question of how to visualize a billion events, and we will look at a number of security visualization examples that illustrate the problem and some possible solutions. These examples will also show how data mining and user experience design help us get a handle on the security visualization challenges - enabling us to gain deep insight for a number of security use-cases.
Vision is a human’s dominant sense. It is the communication channel with the highest bandwidth into the human brain. Security tools and applications need to make better use of information visualization to enhance human computer interactions and information exchange.
In this talk we will explore a few basic principles of information visualization to see how they apply to cyber security. We will explore both visualization as a data presentation, as well as a data discovery tool. We will address questions like: What makes for effective visualizations? What are some core principles to follow when designing a dashboard? How do you go about visually exploring a terabyte of data? And what role do big data and data mining play in security visualization?
The presentation is filled with visualizations of security data to help translate the theoretical concepts into tangible applications.
The Heatmap - Why is Security Visualization so Hard?
Raffael Marty
This presentation explores why it is so hard to come up with a security monitoring (or shall we call it security intelligence) approach that helps find sophisticated attackers in all the data collected. It explores the question of how to visualize a billion events. To do so, the presentation dives deeply into heatmaps - matrices - as an example of a simple type of visualization. While heatmaps are very simple, they are incredibly versatile and help us think about the problem of security visualization. They help illustrate how data mining and user experience design help get a handle on the security visualization challenges - enabling us to gain deep insight for a number of security use-cases.
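To make the heatmap-as-matrix idea tangible, the sketch below aggregates events into a source-by-hour count matrix and renders it as a coarse text heatmap. The (source IP, hour) tuples are invented for the example; the talk itself uses real security data.

```python
from collections import Counter

# Hypothetical (source_ip, hour) tuples, e.g. parsed from firewall logs.
events = [("10.0.0.1", 2), ("10.0.0.1", 2), ("10.0.0.2", 2),
          ("10.0.0.1", 3), ("10.0.0.2", 14), ("10.0.0.2", 14),
          ("10.0.0.2", 14), ("10.0.0.1", 23)]

counts = Counter(events)          # each matrix cell = events per (source, hour)
sources = sorted({s for s, _ in events})
shades = " .:#"                   # denser character = more events in the cell

# One row per source, one column per hour of the day.
for src in sources:
    row = "".join(shades[min(counts[(src, h)], 3)] for h in range(24))
    print(f"{src:>10} |{row}|")
```

The aggregation step is where data mining does the heavy lifting: a billion raw events collapse into a small matrix of counts, and the eye can then pick out hot cells and unusual row patterns at a glance.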
22. Logging as a Service (LaaS)
• Economically advantageous - think about TCO
• Pay as you go
• Elastic infrastructure scales with your needs
• No installation needed
• No setup costs / time for logging solution
• Open platform with RESTful APIs
Logging as a Service 22
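As a sketch of the "open platform with RESTful APIs" bullet, the snippet below packages a structured log event as a JSON POST to a Loggly-style HTTP collection endpoint. The token and endpoint URL are placeholders, not real service details.

```python
import json
import urllib.request

# Placeholder values - a real LaaS account supplies its own token and endpoint.
TOKEN = "YOUR-CUSTOMER-TOKEN"
ENDPOINT = f"https://logs.example.com/inputs/{TOKEN}/tag/http/"

def build_log_request(event: dict) -> urllib.request.Request:
    """Package a structured log event as a JSON POST for a RESTful logging API."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT, data=body,
        headers={"Content-Type": "application/json"}, method="POST")

req = build_log_request({"severity": "warning", "service": "proxy",
                         "message": "upstream timeout"})
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req)  # uncomment to actually ship the event
```

Because the transport is plain HTTPS plus JSON, any agent, proxy, or script can feed the service - which is what makes the pay-as-you-go, no-installation model work.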
23. Loggly
[Architecture diagram: data sources (syslog, mobile clients) feed Loggly through data-collection proxies and an API; distributed indexers and search machines handle indexing and processing, backed by a distributed data store and a log archive; consumers access the data through the Loggly user interface, UI extensions, and the data-access API.]