The document discusses the Sqrrl Threat Hunting Platform, which collects security, network, and endpoint data to detect threats. It uses Apache Accumulo for distributed storage and processing. Behavioral analytics models adversary behavior based on the attack kill chain. Analytics run on the data to detect rare events and chain them together using graphs. Results are then collated for visualization and analysis to hunt, detect, and disrupt threats.
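As a small sketch of the event-chaining idea (not Sqrrl's actual implementation; the alert IDs, hosts, and users below are hypothetical), rare-event alerts can be linked through shared entities and grouped into candidate incidents with a graph library:

```python
import networkx as nx

# Hypothetical rare-event alerts, each tied to the entities (hosts, users) involved
alerts = [
    {"id": "a1", "entities": {"host:web01", "user:svc_web"}},
    {"id": "a2", "entities": {"user:svc_web", "host:db02"}},
    {"id": "a3", "entities": {"host:hr-laptop"}},
]

g = nx.Graph()
for alert in alerts:
    for entity in alert["entities"]:
        g.add_edge(alert["id"], entity)  # bipartite: alerts connect through shared entities

# Connected components chain related rare events into candidate incidents
for component in nx.connected_components(g):
    chain = sorted(node for node in component if node.startswith("a"))
    if len(chain) > 1:
        print("candidate incident chain:", chain)
```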
Threat Hunting for Command and Control Activity – Sqrrl
Sqrrl's Security Technologist Josh Liburdi provides an overview of how to detect C2 through a combination of automated detection and hunting.
Watch the presentation with audio here: http://info.sqrrl.com/threat-hunting-for-command-and-control-activity
Delivered a one-day Practical Threat Hunting workshop at sacon.io in Bangalore, India, for 100+ participants, balancing guidance on developing a threat hunting program in an organization (how and where to start) with hands-on labs and demos of what threat hunting looks like on the ground.
MITRE ATT&CKcon 2018: Hunters ATT&CKing with the Data, Roberto Rodriguez, Spe... – MITRE - ATT&CKcon
With the development of the MITRE ATT&CK framework and its categorization of adversary activity during the attack cycle, understanding what to hunt for has become easier and more efficient than ever. However, organizations are still struggling to understand how they can prioritize the development of hunt hypotheses, assess their current security posture, and develop the right analytics with the help of ATT&CK. Even though there are several ways to utilize ATT&CK to accomplish those goals, only a few focus primarily on the data that is currently being collected to drive the success of a hunt program.
This presentation shows how organizations can benefit from mapping their current visibility from a data perspective to the ATT&CK framework. It focuses on how to identify, document, standardize and model current available data to enhance a hunt program. It presents an updated ThreatHunter-Playbook, a Kibana ATT&CK dashboard, a new project named Open Source Security Events Metadata known as OSSEM and expands on the “data sources” section already provided by ATT&CK on most of the documented adversarial techniques.
Knowledge for the masses: Storytelling with ATT&CK – MITRE ATT&CK
From ATT&CKcon 3.0
By Ismael Valenzuela and Jose Luis Sanchez Martinez, Trellix
The Trellix team believes that creating and sharing compelling stories about cyber threats - with ATT&CK - is a powerful way to raise awareness and enable action against them.
In this talk the team will share their experiences leveraging ATT&CK to disseminate threat knowledge to different audiences (software development teams, managers, threat detection engineers, threat hunters, cyber threat analysts, support engineers, upper management, etc.).
They will show concrete examples and representations created with ATT&CK to describe threats at different levels, including: 1) an Attack Path graph that shows the overall flow of the attack; 2) tactic-specific TTP summary tables and graphs; 3) very detailed, step-by-step descriptions of the attacker's behaviors.
Traditional cybersecurity defenses focus on prevention, yet weapon systems face many vulnerabilities and potential attacks. Although weapon systems are more software-dependent and networked than ever before, cybersecurity has not always been prioritized in weapon systems acquisition.
Threat actors have advanced in their sophistication as they are well-resourced and highly skilled, oftentimes gathering detailed knowledge of the systems they want to attack. Ensuring stronger detection methods is imperative, but because these types of threats are very targeted and advanced, agencies need the capability to proactively hunt.
Threat hunting - Every day is hunting season – Ben Boyd
Breakout Presentation by Ben Boyd during the 2018 Nebraska Cybersecurity Conference.
Introduction to Threat Hunting and helpful steps for building a Threat Hunting Program of any size, from small to massive.
Automating the mundanity of technique IDs with ATT&CK Detections Collector – MITRE ATT&CK
From ATT&CKcon 3.0
By Marcus LaFerrera and Ryan Kovar, Splunk
Since the release of MITRE ATT&CK, vendors and governmental bodies have begun mapping their security blogs, whitepapers, and threat intel reports to ATT&CK TTPs, which is incredible! Vendors have then begun mapping their detections to those mapped TTPs, which is even more awesome! What is not awesome is dissecting a piece of prose for all of the specific embedded ATT&CK technique IDs and then mapping them to your detections to determine coverage. Over the last year, the team at Splunk has spent more time doing this than they would like to admit, so they wrote a tool to do it for them and want to share it with the world. Join the Splunk team as they tell the world about ATT&CK Detections Collector (ADC). ADC is an open-source Python tool that lets you extract MITRE technique IDs from third-party URLs and output them into a file. If you use Splunk, the team even maps them to their existing (previously mapped) detection corpus. They even added the ability to export them into a Navigator JSON for fun, profit, or (at least) better visualization!
From ATT&CKcon 3.0
By Matt Snyder, VMWare
Insider threats are among the most treacherous, and every organization is susceptible: it's estimated that theft of intellectual property alone exceeds $600 billion a year. Armed with intimate knowledge of your organization and masked as legitimate business, these attacks often go unnoticed until it's too late and the damage is done. To make matters worse, threat actors are now trying to lure employees with the promise of large paydays to help carry out attacks.
These advanced attacks require advanced solutions, and we are going to demonstrate how we are using the MITRE ATT&CK framework to proactively combat these threats. Armed with these tactics and techniques, we show you how to build intelligent detections to help secure even the toughest of environments.
Your adversaries continue to attack and get into companies. You can no longer rely solely on alerts from point solutions to secure your network. To identify and mitigate these advanced threats, analysts must become proactive in identifying not just indicators, but attack patterns and behavior. In this workshop we will walk through a hands-on exercise with a real-world attack scenario. The workshop will illustrate how advanced correlations from multiple data sources and machine learning can enhance security analysts' capability to detect and quickly mitigate advanced attacks.
After anomalous network traffic has been identified, there can still be an abundance of results for an analyst to process. This presentation is for data scientists and network security professionals who want to increase the signal-to-noise ratio.
Flare is a network analytic framework designed for data scientists, security researchers, and network professionals. Written in Python, it supports rapid prototyping and development of behavioral analytics and comes with a collection of pre-built utility functions for feature extraction.
Using flare, we'll walk through identifying Domain Generation Algorithms (DGA) commonly used in malware and how to reduce the dataset to a manageable amount for security professionals to process.
We'll also explore flare's beaconing detection which can be used with the output from popular Intrusion Detection System (IDS) frameworks.
More information on flare can be found at https://github.com/austin-taylor/flare
www.austintaylor.io
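As a rough illustration of the DGA-hunting idea described above (a generic sketch, not flare's actual API; the 3.5-bit threshold is an arbitrary assumption), algorithmically generated domains tend to have higher character entropy than human-chosen names:

```python
import math
from collections import Counter

def shannon_entropy(domain: str) -> float:
    """Character-level entropy in bits; DGA domains usually score higher."""
    counts = Counter(domain)
    total = len(domain)
    return -sum(n / total * math.log2(n / total) for n in counts.values())

# Hypothetical domains: two ordinary names, two DGA-looking strings
for d in ["google", "slideshare", "xkqjzhw3f9t2m8rp", "q7vd2k1zzjw0ay"]:
    score = shannon_entropy(d)
    print(f"{d:20s} {score:.2f} {'suspect' if score > 3.5 else 'ok'}")
```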
Threat Hunting Platforms (Collaboration with SANS Institute) – Sqrrl
Traditional security measures like firewalls, IDS, endpoint protection, and SIEMs are only part of the network security puzzle. Threat hunting is a proactive approach to uncovering threats that lie hidden in your network or system and can evade more traditional security tools. Go in-depth with Sqrrl and SANS Institute to learn how hunting platforms work.
Watch the recording with audio here: http://info.sqrrl.com/sans-sqrrl-threat-hunting-webcast
Evolution of Offensive Testing - ATT&CK-based Adversary Emulation Plans – Christopher Korban
Talk about the evolution of security posture assessments, solving red team problems with ATT&CK-based Adversary Emulation Plans.
Conference: Art into Science - A Conference on Defense 2018
https://www.enoinstitute.com/training-tutorials-courses/cyber-threat-hunting-training-ccthp/ Learn how to find, assess, and remove threats from your organization in our Certified Cyber Threat Hunting Training (CCTHP) designed to prepare you for the Certified Cyber Threat Hunting Professional (CCTHP) exam.
In this Cyber Threat Hunting Training (CCTHP) course, we will take a deep dive into threat hunting: searching for threats and mitigating them before the bad guys pounce. We will craft a series of attacks to test enterprise security and hunt for threats, covering an efficient threat hunting approach across network, web, cloud, IoT devices, command and control (C2) channels, web shells, memory, and operating systems. This will help you gain a new level of knowledge and carry out every task completely hands-on.
RESOURCES:
Cyber Threat Hunting: A Complete Guide – 2020 Edition, by Gerardus Blokdyk (vitalsource.com)
Cyber Threat Hunting: A Complete Guide – 2019 Edition, by Gerardus Blokdyk (vitalsource.com)
Hunting Cyber Criminals: A Hacker's Guide to Online Intelligence Gathering Tools and Techniques, 1st Edition, by Vinny Troia (Amazon.com)
Investigating the Cyber Breach: The Digital Forensics Guide for the Network Engineer, by Joseph Muniz and Aamir Lakhani (Amazon.com)
CUSTOMIZE It:
We can adapt this Cyber Threat Hunting Training (CCTHP) course to your group’s background and work requirements at little to no added cost.
If you are familiar with some aspects of this Cyber Threat Hunting (CCTHP) course, we can omit or shorten their discussion.
We can adjust the emphasis placed on the various topics or build the Cyber Threat Hunting Training (CCTHP) around the mix of technologies of interest to you (including technologies other than those included in this outline).
If your background is nontechnical, we can exclude the more technical topics, include the topics that may be of special interest to you (e.g., as a manager or policy-maker), and present the Cyber Threat Hunting Training (CCTHP) course in a manner understandable to lay audiences.
The New Pentest? Rise of the Compromise Assessment – Infocyte
If an attacker had a foothold in your network today, would you know it?
If they made it past your real-time defense measures (EDR, EPP, AV, UEBA, firewalls, etc.) or an analyst misinterpreted a critical alert, chances are they've entrenched themselves for the long haul. Skilled and organized attackers know long-term persistence in your network is the most critical component to meeting their goal of stealing information, causing damage, or pivoting attacks on other organizations.
Threat hunting is the proactive practice of finding attackers in your environment before they can cause damage (or at least stopping the bleeding from continued exposure). Unfortunately, effective threat hunting remains out of reach for most organizations due to a lack of security infrastructure and of qualified people to manage advanced endpoint security solutions.
One solution to this problem is to hire a third party to conduct a periodic assessment geared toward discovery of unauthorized access and compromised systems. This is called a "compromise assessment" and just recently compromise assessments have become one of the most requested services from top security service providers.
Customers don’t want to just know if they can be hacked (a good penetration tester will generally conclude “yes”); they want to know if they ARE hacked—right now—and if so, which endpoints/hosts/servers on their network are compromised.
In this presentation, which was originally prepared for Black Hat 2018, Chris Gerritz outlines the growing practice of compromise assessments and the best practices being utilized by some of the largest and most sophisticated managed security service providers (MSSPs) with this offering.
What approaches are most effective?
What data is being utilized?
What are some of the top challenges?
To request a free 100-node compromise assessment or to learn more about Infocyte HUNT — our comprehensive threat hunting platform — and start a free trial, please visit https://try.infocyte.com.
This talk will give an overview of the new features and improvements implemented for the Apache Accumulo 1.8.0 release, discussing some of these exciting changes with a focus on what matters most to users.
"Cyberhunting" actively looks for signs of compromise within an organization and seeks to control and minimize the overall damage. These rare, but essential, breed of enterprise cyber defenders give proactive security a whole new meaning.
Check out the accompanying webinar: http://www.hosting.com/resources/webinars/?commid=228353
Hunting: Defense Against The Dark Arts - BSides Philadelphia - 2016 – Danny Akacki
We can all agree that threat detection is an essential component of a functioning security monitoring program. Let's start thinking about how to take our tradecraft to the next level and hunt for ways for evil to do evil things. This talk will run through some of the observations gathered during hunting expeditions inside the networks of multiple Fortune ranked organizations. We hope to challenge you to expand your security operations, moving beyond traditional signature based detection.
2017 02-07 - elastic & spark. building a search geo locator – Alberto Paro
Using Elasticsearch in a big data environment is very simple. In this talk, we analyse what Big Data is and show how easy it is to integrate Elasticsearch with Apache Spark.
Smart Data Slides: Emerging Hardware Choices for Modern AI Data Management – DATAVERSITY
Leading edge AI applications have always been resource-intensive and known for stretching the limits of conventional (von Neumann architecture) computer performance. Specialized hardware, purpose built to optimize AI applications, is not new. In fact, it should be no surprise that the very first .com internet domain was registered to Symbolics - a company that built the Lisp Machine, a dedicated AI workstation - in 1985. In the last three decades, of course, the performance of conventional computers has improved dramatically with advances in chip density (Moore’s Law) leading to faster processor speeds, memory speeds, and massively parallel architectures. And yet, some applications - like machine vision for real time video analysis and deep machine learning - always need more power.
Participants in this webinar will learn the fundamentals of the three hardware approaches that are receiving significant investments and demonstrating significant promise for AI applications.
- neuromorphic/neurosynaptic architectures (brain-inspired hardware)
- GPUs (graphics processing units, optimized for AI algorithms), and
- quantum computers (based on principles and properties of quantum-mechanics rather than binary logic).
Note - This webinar requires no previous knowledge of hardware or computer architectures.
Sqrrl February Webinar: Breaking Down Data Silos – Sqrrl
In this talk, Adam Fuchs, the CTO of Sqrrl and co-founder of the Accumulo project, discusses some of the lessons learned for properly architecting, applying, and managing cell-level security labels in customer environments.
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel... – Timothy Spann
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipelines
https://www.meetup.com/futureofdata-newyork/events/298660453/
Unlocking Financial Data with Real-Time Pipelines
(Flink Analytics on Stocks with SQL )
By Timothy Spann
Financial institutions thrive on accurate and timely data to drive critical decision-making processes, risk assessments, and regulatory compliance. However, managing and processing vast amounts of financial data in real-time can be a daunting task. To overcome this challenge, modern data engineering solutions have emerged, combining powerful technologies like Apache Flink, Apache NiFi, Apache Kafka, and Iceberg to create efficient and reliable real-time data pipelines. In this talk, we will explore how this technology stack can unlock the full potential of financial data, enabling organizations to make data-driven decisions swiftly and with confidence.
Introduction: Financial institutions operate in a fast-paced environment where real-time access to accurate and reliable data is crucial. Traditional batch processing falls short when it comes to handling rapidly changing financial markets and responding to customer demands promptly. In this talk, we will delve into the power of real-time data pipelines, utilizing the strengths of Apache Flink, Apache NiFi, Apache Kafka, and Iceberg, to unlock the potential of financial data. I will be utilizing NiFi 2.0 with Python and Vector Databases.
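A minimal sketch of what such a pipeline could look like from Python, assuming PyFlink with the Kafka SQL connector jar on the classpath; the topic name and schema are hypothetical:

```python
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Hypothetical stock-tick topic; requires the flink-sql-connector-kafka jar
t_env.execute_sql("""
    CREATE TABLE stocks (
        symbol STRING,
        price DOUBLE,
        ts TIMESTAMP(3),
        WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'stocks',
        'properties.bootstrap.servers' = 'localhost:9092',
        'format' = 'json',
        'scan.startup.mode' = 'latest-offset'
    )
""")

# One-minute tumbling-window average price per symbol (runs until cancelled)
t_env.execute_sql("""
    SELECT symbol,
           TUMBLE_START(ts, INTERVAL '1' MINUTE) AS window_start,
           AVG(price) AS avg_price
    FROM stocks
    GROUP BY symbol, TUMBLE(ts, INTERVAL '1' MINUTE)
""").print()
```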
Timothy Spann
Principal Developer Advocate, Cloudera
Tim Spann is a Principal Developer Advocate in Data In Motion for Cloudera. He works with Apache NiFi, Apache Kafka, Apache Pulsar, Apache Flink, Flink SQL, Apache Pinot, Trino, Apache Iceberg, DeltaLake, Apache Spark, Big Data, IoT, Cloud, AI/DL, machine learning, and deep learning. Tim has over ten years of experience with the IoT, big data, distributed computing, messaging, streaming technologies, and Java programming. Previously, he was a Developer Advocate at StreamNative, Principal DataFlow Field Engineer at Cloudera, a Senior Solutions Engineer at Hortonworks, a Senior Solutions Architect at AirisData, a Senior Field Engineer at Pivotal and a Team Leader at HPE. He blogs for DZone, where he is the Big Data Zone leader, and runs a popular meetup in Princeton & NYC on Big Data, Cloud, IoT, deep learning, streaming, NiFi, the blockchain, and Spark. Tim is a frequent speaker at conferences such as ApacheCon, DeveloperWeek, Pulsar Summit and many more. He holds a BS and MS in computer science.
https://twitter.com/PaaSDev
https://www.linkedin.com/in/timothyspann/
https://medium.com/@tspann
https://github.com/tspannhw/FLiPStackWeekly/
Preparing for the Cybersecurity Renaissance – Cloudera, Inc.
We are in the midst of a fundamental shift in the way in which organizations protect themselves from the modern adversary.
Traditional rules-based cybersecurity applications of the past cannot protect organizations in the new mobile, social, and hyper-connected world they now operate in. However, the convergence of big data technology, analytic advancements, and a variety of other factors has sparked a cybersecurity renaissance that will forever change the way organizations protect themselves.
Join Rocky DeStefano, Cloudera's Cybersecurity subject matter expert, as he explores how modern organizations are protecting themselves from more frequent, sophisticated attacks.
During this webinar you will learn about:
The current challenges cybersecurity professionals are facing today
How big data technologies are extending the capabilities of cybersecurity applications
Cloudera customers that are future proofing their cybersecurity posture with Cloudera’s next generation data and analytics management system
What is Splunk? At the end of this session you’ll have a high-level understanding of the pieces that make up the Splunk Platform, how it works, and how it fits in the landscape of Big Data. You’ll see practical examples that differentiate Splunk while demonstrating how to gain quick time to value.
Anonymizing and Pseudonymizing Data in Splunk Enterprise – jenny_splunk
There are all kinds of reasons why machine data should be protected from unauthorized access. Internal and external compliance requirements, as well as "privacy by design" strategies to improve security or as part of a risk-minimization strategy, are becoming ever more important for companies in the big data space.
In this webinar you will learn how to protect your machine data at several levels:
- In motion: secure the connections to and from Splunk Enterprise
- Data integrity: ensure the integrity of the data stored in Splunk
- At rest: encrypt all data that Splunk writes to disk
- Anonymize/pseudonymize individual sensitive fields in your machine data
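As a generic illustration of the last point (a sketch of the concept only, not Splunk configuration; within Splunk this kind of masking is typically done at index time, for example with SEDCMD rules in props.conf), keyed hashing pseudonymizes a field consistently without being reversible:

```python
import hashlib
import hmac
import re

SECRET = b"rotate-me"  # keyed, so the same user always maps to the same token

def pseudonymize(value: str) -> str:
    return hmac.new(SECRET, value.encode(), hashlib.sha256).hexdigest()[:16]

event = "2024-05-01 10:00:01 user=jdoe src=10.1.2.3 action=login"
masked = re.sub(r"user=(\S+)", lambda m: "user=" + pseudonymize(m.group(1)), event)
print(masked)  # user field replaced by a stable pseudonym
```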
Many organizations currently process various types of data in many different formats, most often free-form. As the number of consumers of this data grows, it is imperative that this free-flowing data adhere to a schema: data consumers can then have an expectation about the type of data they are getting, and they can avoid immediate impact if the upstream source changes its format. Having a uniform schema representation also gives the data pipeline a really easy way to integrate and support various systems that use different data formats.
Schema Registry is a central repository for storing and evolving schemas. It provides an API and tooling to help developers and users register a schema and consume it without any impact if the schema changes. Users can tag different schemas and versions, register for notifications of schema changes with versions, and so on.
In this talk, we will go through the need for a schema registry and schema evolution and showcase the integration with Apache NiFi, Apache Kafka, Apache Storm.
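As a small sketch of why schema evolution protects consumers (using Avro via fastavro for illustration rather than the Schema Registry API itself; the record layout is hypothetical), a reader schema that adds a defaulted field can still consume records written with the old schema:

```python
import io
import fastavro

schema_v1 = {"type": "record", "name": "Trade", "fields": [
    {"name": "symbol", "type": "string"},
    {"name": "price", "type": "double"},
]}
# v2 adds a field with a default, so old records remain readable
schema_v2 = {"type": "record", "name": "Trade", "fields": [
    {"name": "symbol", "type": "string"},
    {"name": "price", "type": "double"},
    {"name": "venue", "type": "string", "default": "UNKNOWN"},
]}

buf = io.BytesIO()
fastavro.schemaless_writer(buf, schema_v1, {"symbol": "ACME", "price": 10.5})
buf.seek(0)

# Read v1 data with the evolved v2 schema; the missing field takes its default
record = fastavro.schemaless_reader(buf, schema_v1, schema_v2)
print(record)  # {'symbol': 'ACME', 'price': 10.5, 'venue': 'UNKNOWN'}
```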
There is an increasing need for large-scale recommendation systems. Typical solutions rely on periodically retrained batch algorithms, but for massive amounts of data, training a new model can take hours. This is a problem when the model needs to be more up-to-date: for example, when recommending TV programs while they are being transmitted, the model should take into consideration users who are watching a program at that time.
The promise of online recommendation systems is fast adaptation to changes, but online machine learning from streams is commonly believed to be more restricted, and hence less accurate, than batch-trained models. Combining batch and online learning could lead to a quickly adapting recommendation system with increased accuracy. However, designing a scalable data system that unites batch and online recommendation algorithms is a challenging task. In this talk we present our experiences in creating such a recommendation engine with Apache Flink and Apache Spark. A minimal sketch of the batch-plus-online idea follows.
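The sketch below illustrates the combination (not the talk's actual Flink/Spark code; sizes, rates, and events are hypothetical): factors come from a batch job, then single SGD steps fold in each streamed interaction:

```python
import numpy as np

n_users, n_items, k = 1000, 500, 16
rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(n_users, k))  # user factors (batch-initialized)
Q = rng.normal(scale=0.1, size=(n_items, k))  # item factors (batch-initialized)

def online_update(u, i, rating, lr=0.05, reg=0.02):
    """One SGD step on a single streamed (user, item, rating) interaction."""
    err = rating - P[u] @ Q[i]
    p_u = P[u].copy()
    P[u] += lr * (err * Q[i] - reg * P[u])
    Q[i] += lr * (err * p_u - reg * Q[i])

# Interactions as they arrive from the stream (e.g. a Kafka topic)
for u, i, r in [(3, 42, 1.0), (3, 7, 0.0), (8, 42, 1.0)]:
    online_update(u, i, r)

print(P[3] @ Q[42])  # freshly updated affinity score
```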
Deep learning is not just hype - it outperforms state-of-the-art ML algorithms, one by one. In this talk we will show how deep learning can be used for detecting anomalies on IoT sensor data streams at high speed, using DeepLearning4J on top of different big data engines like Apache Spark and Apache Flink. Key in this talk is the absence of any large training corpus, since we are using unsupervised machine learning - a domain that current DL research treats step-motherly. As we can see in this demo, LSTM networks can learn very complex system behavior - in this case, data coming from a physical model simulating bearing vibration data. One drawback of deep learning is that normally a very large labeled training data set is required. This is particularly interesting since we can show how unsupervised machine learning can be used in conjunction with deep learning - no labeled data set is necessary. We are able to detect anomalies and predict breaking bearings with 10-fold confidence. All examples and all code will be made publicly available and open sourced; only open source components are used.
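A toy sketch of that approach in Keras (the talk itself uses DeepLearning4J on Spark/Flink; the signal here is synthetic): train an LSTM to predict the next sample of normal behavior only, then flag points whose prediction error is far above the normal level:

```python
import numpy as np
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.models import Sequential

# Synthetic "vibration" signal standing in for normal bearing behavior
t = np.linspace(0, 100, 5000)
signal = np.sin(t) + 0.1 * np.random.randn(len(t))

window = 32
X = np.array([signal[i:i + window] for i in range(len(signal) - window)])[..., None]
y = signal[window:]

model = Sequential([LSTM(32, input_shape=(window, 1)), Dense(1)])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=3, batch_size=64, verbose=0)

# Anomaly score = prediction error; threshold is a simple statistical cutoff
errors = np.abs(model.predict(X, verbose=0).ravel() - y)
threshold = errors.mean() + 4 * errors.std()
print("anomalous indices:", np.where(errors > threshold)[0])
```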
QE automation for large systems is a great step forward in increasing system reliability. In the big data world, multiple components have to come together to provide end users with business outcomes. This means that QE automation scenarios need to be designed around actual use cases, cutting across components. The system tests potentially generate large amounts of data on a recurring basis, and verifying it is a tedious job. Given the multiple levels of indirection, false positives for actual defects are more frequent and generally wasteful.
At Hortonworks, we've designed and implemented an automated log analysis system - Mool - using statistical data science and ML. The current work in progress has a batch data pipeline followed by an ensemble ML pipeline that feeds into the recommendation engine. The system identifies the root cause of test failures by correlating the failing test cases with current and historical error records, to identify the root cause of errors across multiple components. It works in unsupervised mode, with no perfect model, stable builds, or source-code version to refer to. In addition, the system provides limited recommendations to file or reopen past tickets and compares run profiles with past runs.
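As a rough sketch of the correlation step only (not Mool itself; the messages are hypothetical), a failing test's error text can be matched against historical error records with TF-IDF similarity:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

historical = [
    "datanode connection refused during block replication",
    "yarn container killed for exceeding memory limits",
    "hive metastore thrift socket timeout",
]
failure = "container exceeded physical memory limits and was killed by yarn"

vec = TfidfVectorizer()
matrix = vec.fit_transform(historical + [failure])

# Similarity of the new failure against every historical record
scores = cosine_similarity(matrix[len(historical)], matrix[:len(historical)]).ravel()
best = scores.argmax()
print(f"most similar past error ({scores[best]:.2f}): {historical[best]}")
```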
Improving business performance is never easy! The Natixis Pack is like Rugby. Working together is key to scrum success. Our data journey would undoubtedly have been so much more difficult if we had not made the move together.
This session is the story of how ‘The Natixis Pack’ has driven change in its current IT architecture so that legacy systems can leverage some of the many components in Hortonworks Data Platform in order to improve the performance of business applications. During this session, you will hear:
• How and why the business and IT requirements originated
• How we leverage the platform to fulfill security and production requirements
• How we organize a community to:
o Guard all the players, no one gets left on the ground!
o Use the platform appropriately (not every problem is eligible for Big Data, and standard databases are not dead)
• What are the most usable, the most interesting and the most promising technologies in the Apache Hadoop community
We will finish the story of a successful rugby team with insight into the special skills needed from each player to win the match!
DETAILS
This session is part business, part technical. We will talk about infrastructure, security and project management as well as the industrial usage of Hive, HBase, Kafka, and Spark within an industrial Corporate and Investment Bank environment, framed by regulatory constraints.
HBase has established itself as the backend for many operational and interactive use cases, powering well-known services that support millions of users and thousands of concurrent requests. In terms of features, HBase has come a long way, offering advanced options such as multi-level caching on- and off-heap, pluggable request handling, fast recovery options such as region replicas, table snapshots for data governance, tuneable write-ahead logging, and so on. This talk is based on the research for the upcoming second edition of the speaker's HBase book, correlated with practical experience in medium to large HBase projects around the world. You will learn how to plan for HBase, starting with the selection of matching use cases and determining the number of servers needed, leading into performance tuning options. There is no reason to be afraid of using HBase, but knowing its basic premises and technical choices will make using it much more successful. You will also learn about many of the new features of HBase up to version 1.3, and where they are applicable.
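For a quick taste of talking to HBase from Python (a sketch using the happybase library over the Thrift gateway; host and table names are hypothetical):

```python
import happybase  # speaks to HBase through its Thrift server

connection = happybase.Connection("hbase-thrift-host")  # hypothetical host
table = connection.table("metrics")  # hypothetical table with column family "cf"

# Write one row, then read it back
table.put(b"sensor1-20180101", {b"cf:temp": b"21.5", b"cf:unit": b"C"})
print(table.row(b"sensor1-20180101"))

# Range scans by row-key prefix are HBase's bread and butter
for key, data in table.scan(row_prefix=b"sensor1-"):
    print(key, data)
```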
There has been an explosion of data digitising our physical world – from cameras, environmental sensors and embedded devices, right down to the phones in our pockets. Which means that, now, companies have new ways to transform their businesses – both operationally, and through their products and services – by leveraging this data and applying fresh analytical techniques to make sense of it. But are they ready? The answer is “no” in most cases.
In this session, we’ll be discussing the challenges facing companies trying to embrace the Analytics of Things, and how Teradata has helped customers work through and turn those challenges to their advantage.
In this talk, we will present a new distribution of Hadoop, Hops, that can scale the Hadoop Filesystem (HDFS) by 16X, from 70K ops/s to 1.2 million ops/s on Spotify's industrial Hadoop workload. Hops is an open-source distribution of Apache Hadoop that supports distributed metadata for HDFS (HopsFS) and for the ResourceManager in Apache YARN. HopsFS is the first production-grade distributed hierarchical filesystem to store its metadata normalized in an in-memory, shared-nothing database. For YARN, we will discuss optimizations that enable 2X throughput increases for the Capacity scheduler, enabling scalability to clusters with >20K nodes. We will discuss the journey of how we reached this milestone, including some of the challenges involved in efficiently and safely mapping hierarchical filesystem metadata state and operations onto a shared-nothing, in-memory database. We will also discuss the key database features needed for extreme scaling, such as multi-partition transactions, partition-pruned index scans, distribution-aware transactions, and the streaming changelog API. Hops (www.hops.io) is Apache-licensed open source and supports a pluggable database backend for distributed metadata, although it currently only supports MySQL Cluster as a backend. Hops opens up new directions for Hadoop when metadata is available for tinkering in a mature relational database.
In high-risk manufacturing industries, regulatory bodies stipulate continuous monitoring and documentation of critical product attributes and process parameters. On the other hand, sensor data coming from production processes can be used to gain deeper insights into optimization potentials. By establishing a central production data lake based on Hadoop and using Talend Data Fabric as a basis for a unified architecture, the German pharmaceutical company HERMES Arzneimittel was able to cater to compliance requirements as well as unlock new business opportunities, enabling use cases like predictive maintenance, predictive quality assurance or open world analytics. Learn how the Talend Data Fabric enabled HERMES Arzneimittel to become data-driven and transform Big Data projects from challenging, hard to maintain hand-coding jobs to repeatable, future-proof integration designs.
Talend Data Fabric combines Talend products into a common set of powerful, easy-to-use tools for any integration style: real-time or batch, big data or master data management, on-premises or in the cloud.
While you might be tempted to assume data is already safe in a single Hadoop cluster, in practice you have to plan for more. Questions like "What happens if the entire datacenter fails?" or "How do I recover into a consistent state of data, so that applications can continue to run?" are not at all trivial to answer for Hadoop. Did you know that HDFS snapshots do not treat open files as immutable? Or that HBase snapshots are executed asynchronously across servers and therefore cannot guarantee atomicity for cross-region updates (which includes tables)? There is no unified and coherent data backup strategy, nor is there tooling available for many of the included components to build such a strategy. The Hadoop distributions largely avoid this topic, as most customers are still in the "single use-case" or PoC phase, where data governance as far as backup and disaster recovery (BDR) is concerned is not (yet) important. This talk first introduces the overarching issues and difficulties of backup and data safety, then looks at each of the many components in Hadoop, including HDFS, HBase, YARN, Oozie, the management components and so on, to finally show a viable approach using built-in tools. You will also learn not to take this topic lightheartedly and what is needed to implement and guarantee continuous operation of Hadoop cluster based solutions.
JMeter webinar - integration with InfluxDB and Grafana – RTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
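As a hedged sketch of pulling JMeter metrics back out of InfluxDB with the 1.x Python client (the "jmeter" database, "avg" field, and "transaction" tag follow JMeter Backend Listener defaults, but treat them as assumptions for your setup):

```python
from influxdb import InfluxDBClient  # 1.x client: pip install influxdb

client = InfluxDBClient(host="localhost", port=8086, database="jmeter")

# Mean response time per transaction over the last hour of the load test
result = client.query(
    "SELECT MEAN(avg) FROM jmeter WHERE time > now() - 1h GROUP BY transaction"
)
for series_key, points in result.items():
    print(series_key, list(points))
```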
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
I have heard many times that architecture is not important for the front-end. I have also seen, many times, how developers implement front-end features just by following a framework's standard rules, thinking this is enough to successfully launch the project, and then the project fails. How do you prevent this, and which approach should you choose? I have launched dozens of complex projects, and during the talk we will analyze which approaches have worked for me and which have not.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... – DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
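A minimal sketch of the kind of notebook steps described above, using the pypowsybl binding on a bundled test network (API usage as I understand it; check the current PowSyBl documentation before relying on it):

```python
import pypowsybl as pp

# Load the bundled IEEE 14-bus test network instead of modelling from scratch
network = pp.network.create_ieee14()

# Run an AC power flow and inspect the convergence status
results = pp.loadflow.run_ac(network)
print(results[0].status)

# Bus voltages come back as a pandas DataFrame
print(network.get_buses()[["v_mag", "v_angle"]])
```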
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova... – Ramesh Iyer
In today's fast-changing business world, companies that fail to adapt and embrace new ideas struggle to keep up with the competition. However, fostering a culture of innovation takes much work: it takes vision, leadership, and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
GraphRAG is All You need? LLM & Knowledge Graph – Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
State of ICS and IoT Cyber Threat Landscape Report 2024 preview – Prayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Search and Society: Reimagining Information Access for Radical Futures – Bhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
PHP Frameworks: I want to break free (IPC Berlin 2024) – Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024 – Tobias Schneck
As AI technology pushes into IT, I found myself wondering, as an "infrastructure container Kubernetes guy", how this fancy AI technology gets managed from an infrastructure operations point of view. Is it possible to apply our beloved cloud native principles as well? What benefits could the two technologies bring to each other?
Let me take these questions and guide you on a short journey through existing deployment models and use cases for AI software. Using practical examples, we will discuss what cloud/on-premises strategy we may need to apply them to our own infrastructure and make them work from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I have already got working for real.
Essentials of Automations: Optimizing FME Workflows with Parameters – Safe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
DevOps and Testing slides at DASA Connect – Kari Kakkonen
Slides by me and Rik Marselis from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We finished with a lovely workshop in which the participants tried to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Identify Pain Point
It is hard to find attackers moving from a beachhead machine to more interesting machines in a sea of login data.
Every Windows machine has tens to hundreds of logins a day normally
Project datasets
Use the subset of Windows event login data most likely to contain attacker movement.
Constrain Output
Lateral movements (LMs) will only take certain login-chain shapes
Localized in time
Identify Algorithms
Use modern classifiers on the projected log event set to identify logins that are more or less likely to be part of a LM
Use motif search algorithms to find patterns of logins that fit the output constraints
Self-learning and feedback
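A minimal sketch of the motif idea under the constraints above (hypothetical event tuples; real input would be projected Windows logon events, e.g. event ID 4624): find A -> B -> C login chains by the same account within a time window:

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical projected login events: (timestamp, source_host, dest_host, user)
events = [
    (datetime(2024, 1, 1, 9, 0), "laptop01", "web01", "jdoe"),
    (datetime(2024, 1, 1, 9, 12), "web01", "db02", "jdoe"),
    (datetime(2024, 1, 1, 9, 20), "db02", "backup01", "jdoe"),
]

WINDOW = timedelta(hours=1)  # constraint: chains must be localized in time

# Index outgoing logins by (user, source host) for fast chaining
outgoing = defaultdict(list)
for ts, src, dst, user in events:
    outgoing[(user, src)].append((ts, dst))

# Motif search: a login into `dst` followed by a login out of `dst` soon after
for ts1, src, dst, user in events:
    for ts2, dst2 in outgoing.get((user, dst), []):
        if timedelta(0) < ts2 - ts1 <= WINDOW:
            print(f"candidate lateral movement: {src} -> {dst} -> {dst2} ({user})")
```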