This document discusses using machine learning and Python to detect malicious URLs. It presents a threat science framework with stages including know the user, know the threat, data acquisition and understanding, feature engineering, modeling and evaluation, and deployment. For detecting malicious URLs specifically, it describes collecting benign and malicious URL data, exploring and engineering features, using models like random forest and deep neural networks, and evaluating performance with metrics like F1 score and confusion matrices. Parameter tuning and model explainability are also covered. The overall goal is to build an intelligent ecosystem of ML models to provide superior cyber defense against evolving threats.
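The feature engineering and evaluation steps mentioned above can be sketched in a few lines of standard-library Python. This is only an illustration: the feature set, the function names `url_features`, `confusion_counts` and `f1_score`, and the sample labels are assumptions made for this example, not the features or code used in the presentation.

```python
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Extract a few simple lexical features from a URL (illustrative set)."""
    # urlparse only fills in the host when a scheme is present.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return {
        "length": len(url),
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at": "@" in url,
        # Crude check: host made only of digits and dots, e.g. 192.168.0.1
        "host_is_ip": host.replace(".", "").isdigit(),
        "uses_https": parsed.scheme == "https",
    }


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means malicious."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_score(y_true, y_pred) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    print(url_features("http://192.168.0.1/login"))
    # Hypothetical labels and predictions, only to exercise the metrics.
    print(confusion_counts([1, 1, 0, 0], [1, 0, 1, 0]))
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

In practice the feature dictionaries would be turned into vectors and passed to a classifier such as a random forest; the metrics above would then be computed on a held-out test set.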
Practical Applications of Machine Learning in Cybersecurity scoopnewsgroup
This document discusses machine learning and analytics applications in cybersecurity. It provides an overview of machine learning concepts and terms. It then discusses McAfee's analytic ecosystem and how machine learning, deep learning, and AI are applied across McAfee products. The document outlines risks in analytic development like bias, adversarial machine learning, and lack of explainability. It emphasizes the importance of an analytic development protocol that includes validation, verification, and risk assessment. The goal is to develop analytics in a responsible way and mitigate hype around new techniques.
Application of Machine Learning in Cybersecurity Pratap Dangeti
The document discusses applying machine learning techniques in cybersecurity. It provides examples of using ML for automatic intrusion detection, including phishing URL detection, malware detection, network behavior anomaly detection, and insider threat detection. Additional applications covered include assessing password strength and using deep steganography for encrypting messages. The document references several datasets and outlines the machine learning workflow and evaluation metrics for each application.
When Cyber Security Meets Machine Learning Lior Rokach
This document discusses machine learning approaches for cyber security, specifically malware detection. It begins with an introduction to cyber security and machine learning. It then discusses using machine learning for malware detection, including analyzing files through static and dynamic analysis. The document outlines extracting features from files and using text categorization approaches. It evaluates various machine learning classifiers and features for malware detection. Finally, it discusses applying these techniques on Android devices for abnormal state detection.
How is AI important to the future of cyber security Robert Smith
Technology now drives every aspect of our lives, and our dependence on it grows by the day. That dependence leaves us vulnerable and exposed to the recurring threat of cyber-attacks, which have plagued businesses, corporations, governments, and institutions.
Machine learning and artificial intelligence techniques are increasingly being used in cyber security to detect threats like malware, fraud, and intrusions. By analyzing large amounts of data, machine learning algorithms can learn patterns of both normal and anomalous behavior and make predictions about new or unseen data. This allows threats to be identified more accurately and in real-time without being explicitly programmed. Some key benefits of machine learning for cyber security include improved spam filtering, malware detection, identifying advanced threats, and detecting insider threats and data leaks. It is helping to address challenges of data overload, speed of threats, and unknown threats that traditional rule-based detection was unable to handle effectively.
Overview of Artificial Intelligence in Cybersecurity Olivier Busolini
If you are interested in understanding a bit more about the potential of Artificial Intelligence in Cybersecurity, you might want to have a look at this overview.
Written from my CISO (and non-AI-expert) point of view, for fellow security professionals to navigate the AI hype and (hopefully!) make better, informed decisions :-)
All feedback welcome!
Combating Cyber Security Using Artificial Intelligence Inderjeet Singh
Cyber Security & Data Protection India Summit 2018 aims to convene the best minds in cybersecurity under one roof to create an interactive milieu for the exchange of knowledge and ideas. The event will endeavour to address the emerging and continuing threats to cybersecurity and its changing landscape, as well as respond to the increasing risk of security breaches and to security governance, application security, cloud-based security, network, mobile and endpoint security and other cyber risks in India and abroad.
A technical seminar on machine learning in cybersecurity. Machine learning is a trending and sought-after subject, and this presentation demonstrates how it can be used to protect IT infrastructure.
Priyanshu Ratnakar is an Indian teen entrepreneur and founder of Protocol X. He discusses artificial intelligence and how it can help with cybersecurity. Machine learning uses neural networks to classify data with a reasonable degree of certainty and can modify its analysis to improve over time. Deep learning extends machine learning capabilities across multilayered neural networks to learn from massive amounts of data and perform advanced tasks like cancer detection. Artificial intelligence needs large relevant data sets and specific rules to examine the data in order to make useful decisions.
AI shows promise to help address challenges in cybersecurity by automating tasks, enhancing human abilities, and detecting complex patterns that humans cannot. However, developing effective AI solutions is difficult and requires expertise in both cybersecurity and data science. When evaluating AI products, organizations should consider factors like data and training requirements, error rates, integration with existing tools and processes, and potential new risks introduced. While AI may help alleviate strain on security teams, its use is still nascent, and human oversight will likely remain important.
Security in the age of Artificial Intelligence Faction XYZ
The document discusses how artificial intelligence will impact security and introduces both opportunities and challenges. It describes current AI techniques like deep learning and how they are being applied to security domains such as malware detection, network anomaly detection, and insider threat detection. While AI has the potential to make systems more scalable and adaptive, it also introduces new vulnerabilities if misused to generate sophisticated attacks. The document argues for developing morality systems to ensure autonomous systems continue making moral decisions even if compromised.
The document discusses cybersecurity, artificial intelligence, and how AI can help improve cybersecurity. It notes that while organizations spend billions on cybersecurity, chief information security officers still feel highly exposed. Traditional security methods focus on preventing infiltration but are always one step behind evolving threats. The document argues that AI can help enforce cyber hygiene practices like least privilege to shrink the attack surface, making the problem more bounded and manageable compared to always chasing threats. It discusses how AI is well-suited for understanding intended application behavior based on established rules and data from good software.
In an increasingly connected world shaped by the internet, by new devices such as smartphones and tablets, and by the wide use of wireless technologies, information security risks have increased. Both individuals and organizations are under regular attack for commercial or non-commercial gain. The objectives of such attacks may be to take revenge, malign the reputation of a competitor, learn a competitor's strategies and sensitive information, or simply have fun exploiting vulnerabilities. Hence the need to protect information assets and ensure that information security receives adequate attention.
In this session, I will discuss how AI and Machine Learning can be applied to detecting, predicting and preventing cyber security/information security vulnerabilities, and the benefits of using them. We also touch upon some of the tools available for the purpose.
The role of big data, artificial intelligence and machine learning in cyber i... Aladdin Dandis
The document discusses the role of big data, artificial intelligence, and machine learning in cyber intelligence. It provides definitions of cyber intelligence and distinguishes between raw threat data and true threat intelligence. The document also outlines drivers for adopting AI-based cybersecurity technologies, including accelerating incident detection and response as well as improving risk communication and situational awareness. A cyber intelligence framework is proposed that involves collecting security data from various sources, processing the data using machine learning algorithms, and generating reports and alerts. Challenges with implementing such a framework are also noted.
“AI techniques in cyber-security applications”. Flammini lnu susec19 Francesco Flammini
The document discusses using artificial intelligence techniques like Bayesian networks and event trees for cybersecurity applications. It describes how these techniques can help address issues with security operations centers being overwhelmed by too much information from various sensors and systems. Bayesian networks and event trees can help fuse data from different sources to detect threats more effectively. The document provides examples of how Bayesian networks can be built using historical threat data and customized for specific organizations. It also discusses how these models can be updated dynamically based on real-time data from systems.
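As a rough illustration of fusing alerts from several sources, the sketch below uses the simplest possible Bayesian network: a single "threat" node with sensors that are assumed conditionally independent given that node (a naive Bayes structure). The function name `fuse_alerts`, the sensor names and all probabilities are assumptions invented for this example; they do not come from the presentation.

```python
def fuse_alerts(prior: float, sensors: dict, observations: dict) -> float:
    """Posterior P(threat | observations) under a naive Bayes assumption.

    prior        -- P(threat) before looking at any sensor
    sensors      -- name -> (P(alert | threat), P(alert | no threat))
    observations -- name -> True if the sensor fired, False otherwise
    """
    p_threat = prior
    p_benign = 1.0 - prior
    for name, fired in observations.items():
        tpr, fpr = sensors[name]
        if fired:
            p_threat *= tpr
            p_benign *= fpr
        else:
            p_threat *= 1.0 - tpr
            p_benign *= 1.0 - fpr
    total = p_threat + p_benign
    # Guard against impossible evidence (both likelihoods zero).
    return p_threat / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical detection and false-alarm rates for two sensors.
    sensors = {"ids": (0.9, 0.05), "antivirus": (0.8, 0.1)}
    print(fuse_alerts(0.01, sensors, {"ids": True, "antivirus": True}))
    print(fuse_alerts(0.01, sensors, {"ids": True, "antivirus": False}))
```

Updating the model dynamically then amounts to re-running the fusion as new observations arrive, or re-estimating the per-sensor rates from recent historical data.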
The document is a presentation on threat hunting with Splunk. It discusses threat hunting basics, data sources for threat hunting, knowing your endpoint, and using the cyber kill chain framework. It outlines an agenda that includes a hands-on walkthrough of an attack scenario using Splunk's core capabilities. It also discusses advanced threat hunting techniques and tools, enterprise security walkthroughs, and applying machine learning and data science to security.
The document discusses the role of artificial intelligence in cyber security, explaining that AI and machine learning techniques can be used to detect cyber threats by analyzing large datasets to recognize abnormal behavior, and that AI approaches include defensive security applications like malware detection as well as offensive techniques such as creating conditional attacks. Key considerations for adopting AI-based cybersecurity platforms include how the system learns, data and resource requirements, and error rates.
How Machine Learning & AI Will Improve Cyber Security DevOps.com
Machine Learning (ML) and Artificial Intelligence (AI) have been proclaimed as perhaps the next great leap in human quality of life, as well as a potential reason for our extinction. Somewhere in between lies how ML & AI can potentially improve our Cyber Security efforts. But are ML & AI a true panacea or merely the next shiny trinket for the cyber industry to fixate on? In this webinar we will explore:
How ML & AI are currently being utilized in cyber security efforts.
What is working and what has not worked
What is on both the short-term and near-term horizon for ML & AI
Practical steps you can take now to begin leveraging these technologies to tangibly improve your cyber security posture
Join our panel of industry experts as we explore this brave new frontier in cyber security with a candid look cutting through the hype.
Threat hunting involves proactively searching networks to detect threats like advanced persistent threats that evade existing security systems. It is done through a hunting loop of forming hypotheses based on analytics, intelligence, or situational awareness, investigating through tools and data, uncovering patterns and indicators, and informing analytics. Various methods can be used for hunting like DNS fuzzing to find malicious domains, analyzing passive DNS data, web server logs, emails, and Windows logs. Open source tools used include Maltego CE, YARA, and AIEngine, while commercial tools are Sqrrl, Exabeam, Infocyte HUNT, Mantix4, and AI Hunter.
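One way to picture DNS fuzzing is as generating look-alike variants of a legitimate domain, which can then be checked against DNS or passive DNS data. The sketch below is a minimal assumption-laden example: the function name `fuzz_domain` and the three mutation types (omission, repetition, transposition) are choices made here for illustration, and only the leftmost label of the domain is mutated.

```python
def fuzz_domain(domain: str) -> set:
    """Generate simple typo-style variants of a domain's leftmost label."""
    # Assumption: everything after the first dot is kept unchanged.
    name, _, rest = domain.partition(".")
    variants = set()
    for i in range(len(name)):
        # Character omission: "example" -> "exmple"
        variants.add(name[:i] + name[i + 1:])
        # Character repetition: "example" -> "exxample"
        variants.add(name[:i] + name[i] * 2 + name[i + 1:])
    for i in range(len(name) - 1):
        # Adjacent transposition: "example" -> "exmaple"
        variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    # Drop the original label and any empty label produced by omission.
    variants.discard(name)
    variants.discard("")
    return {f"{variant}.{rest}" for variant in variants}


if __name__ == "__main__":
    for candidate in sorted(fuzz_domain("example.com")):
        print(candidate)
```

Each candidate would then be resolved or looked up in passive DNS records; registered look-alikes are leads for further investigation, not proof of malicious intent.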
Cybersecurity strategy planning in the banking sector Olivier Busolini
Olivier Busolini discusses cybersecurity strategy planning in the banking sector. He outlines an approach that includes understanding business risks, assessing gaps, agile planning, implementation, and monitoring. Key aspects are controls hygiene and compliance using frameworks like NIST and ANSSI. A security program should focus on people, processes, infrastructure, applications, and data, and increase maturity over multiple years. Risks and tips from experience are also covered, like focusing on people, defining risk appetite, and ensuring budget supports ongoing work.
The document discusses machine learning and its applications in cyber security. It provides an introduction to machine learning and how it is used to analyze large amounts of data and make decisions without being explicitly programmed. Examples of machine learning applications discussed include recommendation systems, activity recognition, weather forecasting, and image processing. The document also discusses how machine learning is being applied in cyber security to help detect sophisticated cyber attacks.
Use of Artificial Intelligence in Cyber Security - Avantika University Avantika University
There are many uses of artificial intelligence in cyber security. Although artificial intelligence has so many advantages over human intelligence, it is dependent on humans. Due to the ever-increasing demand for engineers, there is a bright scope in the field of cyber security. Avantika University is one of the top engineering colleges in India.
To know more details, visit us at : https://www.avantikauniversity.edu.in/engineering-colleges/use-of-artificial-intelligence-in-cyber-security.php
Insider Threats Detection in Cloud using UEBA Lucas Ko
Lucas Ko presented on detecting insider threats in the cloud using User and Entity Behavior Analytics (UEBA). The system collects Google Drive access logs and the directory tree structure to build a collaborative filtering recommendation model. It detects anomalies by measuring file proximity scores based on access behaviors and flagging uncommon cross-group access. The system was able to identify high-risk users improperly collecting files, compromised accounts, and a shared account being abused in case studies.
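The idea of flagging uncommon cross-group access can be sketched very simply. The example below is not the collaborative filtering model or the proximity score described in the talk; it is a much cruder stand-in that flags an access when the user's group accounts for only a small share of all accesses to that file. The function name `flag_cross_group`, the `min_share` threshold and the sample log are all assumptions for illustration.

```python
from collections import Counter, defaultdict


def flag_cross_group(access_log, user_group, min_share=0.2):
    """Flag (user, path) accesses where the user's group rarely touches the file.

    access_log -- iterable of (user, path) tuples
    user_group -- mapping of user -> group name
    min_share  -- flag when the group's share of accesses falls below this
    """
    access_log = list(access_log)
    # Count how often each group accessed each file.
    per_file = defaultdict(Counter)
    for user, path in access_log:
        per_file[path][user_group[user]] += 1

    flagged = []
    for user, path in access_log:
        counts = per_file[path]
        share = counts[user_group[user]] / sum(counts.values())
        if share < min_share:
            flagged.append((user, path))
    return flagged


if __name__ == "__main__":
    # Hypothetical users, groups and file paths.
    groups = {"alice": "hr", "bob": "hr", "carol": "eng"}
    log = (
        [("alice", "/hr/pay.xlsx")] * 3
        + [("bob", "/hr/pay.xlsx")] * 2
        + [("carol", "/hr/pay.xlsx")]
        + [("carol", "/eng/design.doc")] * 2
    )
    print(flag_cross_group(log, groups))
```

A real system would also need to account for legitimately shared files and for users who change teams, which a fixed threshold like this cannot capture.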
From SIEM to SOC: Crossing the Cybersecurity Chasm Priyanka Aash
You own a SIEM, but to be secure, you need a Security Operations Center! How do you cross the chasm? Do you hire staff or outsource? And what skills are needed? Mike Ostrowski, a cybersecurity industry veteran, will review common pitfalls experienced through the journey from SIEM to SOC, the pros and cons of an all in-house SOC vs. outsourcing, and the benefits of a hybrid SOC model.
Learning Objectives:
1: You own a SIEM, but to be secure, you need a SOC. How do you cross the chasm?
2: What are the pros and cons of in-house, fully managed and hybrid security?
3: What considerations go into deciding whether to employ a hybrid strategy?
(Source: RSA Conference USA 2018)
Cyber Security Trends
Business Concerns
Cyber Threats
The Solutions
Security Operation Center
requirement
SOC Architecture model
SOC Implementation
SOC & NOC
SOC & CSIRT
SIEM & Correlation
-----------------------------------------------------------
Definition
Gartner defines a SOC as both a team, often operating in shifts around the clock, and a facility dedicated to and organized to prevent, detect, assess and respond to cybersecurity threats and incidents, and to fulfill and assess regulatory compliance. The term "cybersecurity operation center" is often used synonymously for SOC.
A network operations center (NOC) is not a SOC: a NOC focuses on network device management rather than on detecting and responding to cybersecurity incidents. Coordination between the two is common, however.
A managed security service is not the same as having a SOC — although a service provider may offer services from a SOC. A managed service is a shared resource and not solely dedicated to a single organization or entity. Similarly, there is no such thing as a managed SOC.
Most of the technologies, processes and best practices that are used in a SOC are not specific to a SOC. Incident response or vulnerability management remain the same, whether delivered from a SOC or not. It is a meta-topic, involving many security domains and disciplines, and depending on the services and functions that are delivered by the SOC.
Services that often reside in a SOC are:
• Cyber security incident response
• Malware analysis
• Forensic analysis
• Threat intelligence analysis
• Risk analytics and attack path modeling
• Countermeasure implementation
• Vulnerability assessment
• Vulnerability analysis
• Penetration testing
• Remediation prioritization and coordination
• Security intelligence collection and fusion
• Security architecture design
• Security consulting
• Security awareness training
• Security audit data collection and distribution
Alternative names for SOC :
Security defense center (SDC)
Security intelligence center
Cyber security center
Threat defense center
Security intelligence and operations center (SIOC)
Infrastructure Protection Centre (IPC)
Security Operations Center
AI Cybersecurity: Pros & Cons. AI is reshaping cybersecurity Tasnim Alasali
Discover how AI is reshaping cybersecurity. This presentation delves into AI's role in enhancing threat detection, the balance of innovation and risk, and the strategies shaping the future of digital defense.
Architecting trust in the digital landscape, or lack thereof Jonathan Sinclair
This document discusses the zero-trust security model and its implementation challenges. It notes that many data breaches are caused by internal actors like employees. The zero-trust model proposes restricting access and assuming all users may be compromised. However, fully implementing it poses architectural complexities and risks hindering productivity. True security requires balancing controls with usability. Emerging technologies like blockchain and distributed ledgers may help establish new chains of trust across systems. Overall, simplification is needed as complexity breeds new vulnerabilities. There are no perfect solutions, only ongoing efforts to strengthen security through principles like transparency, resiliency and accountability.
SOC Architecture model
SOC Implementation
SOC & NOC
SOC & CSIRT
SIEM & Correlation
-----------------------------------------------------------
Definition
Gartner defines a SOC as both a team, often operating in shifts around the clock, and a facility dedicated to and organized to prevent, detect, assess and respond to cybersecurity threats and incidents, and to fulfill and assess regulatory compliance. The term "cybersecurity operation center "is often used synonymously for SOC.
A network operations center (NOC) is not a SOC, which focuses on network device management rather than detecting and responding to cybersecurity incidents. Coordination between the two is common, however.
A managed security service is not the same as having a SOC — although a service provider may offer services from a SOC. A managed service is a shared resource and not solely dedicated to a single organization or entity. Similarly, there is no such thing as a managed SOC.
Most of the technologies, processes and best practices that are used in a SOC are not specific to a SOC. Incident response or vulnerability management remain the same, whether delivered from a SOC or not. It is a meta-topic, involving many security domains and disciplines, and depending on the services and functions that are delivered by the SOC.
Services that often reside in a SOC are:
• Cyber security incident response
• Malware analysis
• Forensic analysis
• Threat intelligence analysis
• Risk analytics and attack path modeling
• Countermeasure implementation
• Vulnerability assessment
• Vulnerability analysis
• Penetration testing
• Remediation prioritization and coordination
• Security intelligence collection and fusion
• Security architecture design
• Security consulting
• Security awareness training
• Security audit data collection and distribution
Alternative names for SOC :
Security defense center (SDC)
Security intelligence center
Cyber security center
Threat defense center
security intelligence and operations center (SIOC)
Infrastructure Protection Centre (IPC)
مرکز عملیات امنیت
AI Cybersecurity: Pros & Cons. AI is reshaping cybersecurityTasnim Alasali
Discover how AI is reshaping cybersecurity. This presentation delves into AI's role in enhancing threat detection, the balance of innovation and risk, and the strategies shaping the future of digital defense.
Architecting trust in the digital landscape, or lack thereofJonathan Sinclair
This document discusses the zero-trust security model and its implementation challenges. It notes that many data breaches are caused by internal actors like employees. The zero-trust model proposes restricting access and assuming all users may be compromised. However, fully implementing it poses architectural complexities and risks hindering productivity. True security requires balancing controls with usability. Emerging technologies like blockchain and distributed ledgers may help establish new chains of trust across systems. Overall, simplification is needed as complexity breeds new vulnerabilities. There are no perfect solutions, only ongoing efforts to strengthen security through principles like transparency, resiliency and accountability.
Cyber security and attack analysis : how Cisco uses graph analyticsLinkurious
Linkurious is a French startup that uses graph analytics and visualization to help organizations make sense of complex, interconnected data and gain insights. For example, Cisco uses graphs to model cybersecurity data like domains and IP addresses, allowing them to identify connections between known bad domains and previously unknown domains involved in attacks. They can then block these new domains to prevent further attacks. The document provides an example of how Cisco might use graph analysis and visualization to identify additional domains connected to an initial phishing attack and help prevent the attack from spreading.
AI Security : Machine Learning, Deep Learning and Computer Vision SecurityCihan Özhan
This document discusses technologies related to machine learning, deep learning, computer vision, and artificial intelligence. It covers topics such as ML/DL algorithms, applications, data objects, cloud computing services, distributed systems, security issues, model lifecycles, publishing ML projects, and adversarial attacks against various AI systems including image, speech, NLP, remote sensing, autonomous vehicles, and industrial applications. It also provides links to the founder's online profiles and contact information.
This document outlines a roadmap for developing an effective actionable threat intelligence program. It discusses what threat intelligence is, how it can enable businesses, and provides recommendations for collecting intelligence from internal and external sources. The roadmap involves initially developing a foundation, then formalizing processes, and moving toward maturity with a goal of demonstrating return on investment from averted threats.
A presentation on AI, Artificial Intelligence.
Intro of the Author
Automation vs AI
What is AI
History& Trends
Framework of Agents
Ethics
Social Economic Implications
High time to add machine learning to your information security stackMinhaz A V
Machine learning and deep learning techniques are increasingly being used for cybersecurity applications like malware detection, spam filtering, and anomaly detection. As attacks become more sophisticated, machine learning can help security teams focus on important threats by analyzing large amounts of data. While machine learning is a powerful tool, security experts still need to provide guidance on what problems to solve and how to structure machine learning pipelines and evaluate results. Individuals and organizations should embrace machine learning by participating in online courses and challenges to gain hands-on experience applying these techniques.
ARTIFICIAL INTELLIGENCE IN CYBER SECURITYCynthia King
Artificial intelligence techniques can help address challenges in cyber security that are difficult for humans to handle alone. Neural networks have proven effective for tasks like pattern recognition and classification that are well-suited to their speed of operation. Expert systems allow codifying security expertise to help with tasks like intrusion detection and response. As cyber threats evolve rapidly, applying learning approaches from artificial intelligence can help security systems adapt dynamically instead of relying only on fixed algorithms. Overall, artificial intelligence shows promise for enhancing cyber security capabilities by accelerating the intelligence of security systems.
[DSC Europe 23][AI:CSI] Goran Gvozden Improving Cybersecurity Posture with an...DataScienceConferenc1
The document discusses how an AI security assistant powered by large language models can help improve cybersecurity posture by automating the process of mapping vulnerabilities to known cyber threats, providing security teams with a better understanding of adversary tactics and helping prioritize risks and defenses. It outlines the challenges with traditional vulnerability and threat management approaches and describes how the AI system leverages techniques like semantic mapping and question answering to integrate different data sources and help organizations strengthen their security posture.
This document introduces Mike Goffin and provides an overview of the Cyber Threat Intelligence Repository (CRITs) platform. CRITs is described as a flexible malware and threat data repository that aggregates data from various sources and allows users to search, pivot, and correlate disparate data. It supports numerous object types and features like services, notifications, and relationships to enhance analysis capabilities. The document also outlines CRITs' core technologies, use cases, supported data types, and notable features like services, favorites, and grouping.
Cyber Defense - How to be prepared to APTSimone Onofri
This document provides an overview of a presentation on cyber defense and cyber attack simulations. It begins with an agenda and introductions. It then discusses the evolving threats landscape, with attacks increasing in scale, scope and sophistication. It outlines the cyber attack simulation methodology, including researching the target, infiltrating networks, establishing footholds, moving laterally and exfiltrating data. It describes three scenario examples - a web attack, phishing email, and exploiting physical access. Each scenario provides the rules of engagement, attack overview and lessons learned. It concludes with quotes emphasizing the importance of preparation and deception in warfare.
Necmiye Genc, SITA, at International Women's Day Global Event Series. The information security field is expected to see a deficit of 1.5 professionals by 2020. In the face of the desperate need for information security professionals, the report released by (ISC)2, the education and certification body of information security professionals, depicts that women have represented only 10% of the total security workforce. This talk aims to build awareness of the opportunities that exist in security for women of all backgrounds and to introduce advanced technologies such as analytics, threat intelligence and digital forensics to help burgeoning security professionals.
AI-Driven Logical Argumentation in Active Cyber DefenseShawn Riley
Shawn Riley discusses using artificial intelligence techniques like symbolic AI (top-down) and non-symbolic AI (bottom-up) to automate logical argumentation in active cyber defense. Symbolic AI uses deductive reasoning from existing knowledge to generate explanations, while non-symbolic AI uses inductive reasoning from data to generate predictions. Cognitive playbooks capture human reasoning to automate the claim, evidence, reasoning framework. The techniques help automate different parts of the cyber OODA loop like sensing, sense-making, decision-making, and acting with feedback to improve defenses.
This document provides an overview of Teri Radichel's background and experience in cybersecurity. It details her progression from software engineer to cloud architect and into cybersecurity roles. It lists her certifications, entrepreneurial ventures, speaking engagements, and publications. The document then discusses different career paths in cybersecurity including security operations, intrusion response, and working as a hacker or for the government/military. It provides examples of security assessments and reviews common frameworks, best practices, and regulations. Finally, it discusses getting a job in cybersecurity through skills acquisition, networking, and continuous learning.
The Role of Threat Intelligence and Layered Securiy for Intrusion Prevention ...JoAnna Cheshire
The document discusses the role of threat intelligence and layered security for intrusion prevention in the post-Target breach era. It defines threat intelligence as the real-time collection, normalization and analysis of data from users, applications and infrastructure that impacts security risk. Threat intelligence works with a layered security approach, where intelligence from various security tools is consolidated and used to block malware across the network. Publicly shared threat intelligence from security organizations can also improve protection when integrated into an organization's layered security and active threat intelligence strategy.
Artificial Intelligence Techniques for Cyber SecurityIRJET Journal
This document discusses how artificial intelligence techniques can help address challenges in cyber security. It describes how expert systems, neural networks, and intelligent agents are currently being used or could be used to improve intrusion detection, malware detection, and response times to cyber attacks. While AI shows promise in enhancing cyber security capabilities, the document also notes that AI systems have limitations and still require human guidance and training to effectively respond to intelligent adversaries. Overall, the document advocates for a combined human-AI approach to cyber security to take advantage of the capabilities of both.
This document discusses how artificial intelligence techniques can help address challenges in cyber security. It describes how expert systems, neural networks, and intelligent agents are currently being used or could be used to improve intrusion detection, malware detection, and response times to cyber attacks. While AI shows promise in enhancing cyber security abilities, the author notes it is not a complete solution on its own and still requires human guidance and training to address evolving security threats. Overall, the integration of AI and human experts is posited as a promising approach for cyber security.
The document provides an overview of an introductory course on artificial intelligence (AI), machine learning (ML), and deep learning (DL). Some key details include:
- The course title is AI (Machine Learning / Deep Learning) and runs for 6 months.
- The course aims to provide employable skills in AI programming, data science, deep learning, computer vision, natural language processing, and ML operations.
- Learning outcomes cover topics like AI fundamentals, data analytics, deep learning, computer vision, natural language processing, and core skills.
- The course prepares students for jobs like Python developer, data analyst, machine learning engineer, and more.
Codeless Generative AI Pipelines
(GenAI with Milvus)
https://ml.dssconf.pl/user.html#!/lecture/DSSML24-041a/rate
Discover the potential of real-time streaming in the context of GenAI as we delve into the intricacies of Apache NiFi and its capabilities. Learn how this tool can significantly simplify the data engineering workflow for GenAI applications, allowing you to focus on the creative aspects rather than the technical complexities. I will guide you through practical examples and use cases, showing the impact of automation on prompt building. From data ingestion to transformation and delivery, witness how Apache NiFi streamlines the entire pipeline, ensuring a smooth and hassle-free experience.
Timothy Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://www.datainmotion.dev/
milvus, unstructured data, vector database, zilliz, cloud, vectors, python, deep learning, generative ai, genai, nifi, kafka, flink, streaming, iot, edge
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeWalaa Eldin Moustafa
Dynamic policy enforcement is becoming an increasingly important topic in today’s world where data privacy and compliance is a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences (3) They are context-aware, encoding a different set of transformations for different use cases (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
#SQL #Views #Privacy #Compliance #DataLake
End-to-end pipeline agility - Berlin Buzzwords 2024Lars Albertsson
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long time does it take for all downstream pipelines to be adapted to an upstream change," the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Round table discussion of vector databases, unstructured data, ai, big data, real-time, robots and Milvus.
A lively discussion with NJ Gen AI Meetup Lead, Prasad and Procure.FYI's Co-Found
Global Situational Awareness of A.I. and where its headedvikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be un-leashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
Machine Learning & Cyber Security: Detecting Malicious URLs in the Haystack
1. Machine Learning & Cyber Security
Detecting Malicious URLs in the Haystack
2. Team
Triss
Data Scientist working in cyber security
Loves motorsport, chin-ups, learning and the
West Coast Eagles (mighty big birds!)
Alistair
@dizzy_data
Research Masters student working in
cyber security.
Enjoys random strolls in foreign cities.
4. So why are we here?
We have been working in data science and cyber security for some time...
We hope to share with you:
1. How Design Thinking can help teams create more meaningful machine learning products;
2. How data science frameworks can provide structure to machine learning product development;
3. How Python can make your machine learning dreams become reality
7. Method To The Madness
Design Thinking + Data Science Process
8. Threat Science Framework
A framework for building human-centred machine learning in cyber security defence
Know The User
Nail The Problem
Ideate
Know The Threat
Data Acquisition & Understanding
Feature Engineering
Modeling & Evaluation
Deployment
10. Know The User - Challenges
Management of numerous security tools
Alert fatigue
High staff turnover and knowledge loss
11. Know The User - Security Analyst Persona
Security Analysts working in Security Operations Centres tasked
with defending organisations against cyber adversarial threats
Goals:
- Maintain security architecture
- Defend against myriad threat vectors (Incident Response)
- Identify security flaws
Needs:
- Rapid incident response
- Rich tool set
- Coverage across the cyber attack life cycle
- Free time to work on interesting projects such as threat hunting
Pain points:
- Alert fatigue
- Lack of integration across tools and intelligence feeds
- Keeping up with a constantly evolving threat landscape. What will the next attack look like?
12. Nail the problem
Problem statements (POV)
{User} needs {User’s need} so that {benefit}
Security Teams are faced with a broad and complex threat landscape. Historically, the common
answer has been to focus on adopting numerous tools and staff to build any adequate defence.
However, this approach has proven to be unsustainable.
Security Analysts need rapid and intelligent cyber defence capability so that they can stand a chance
against a growing and often superior threat
13. Ideate - How Might We
"How might we" statements, formed from the POV or Problem Statement
- Brainstorming (generate ideas from a seed question)
- Brainwriting (each team member generates a few ideas, thinks deeply about them, then prioritises)
- Mindmapping (grouping ideas together)
14. Ideate - The Vision
How might we build an automated and intelligent ecosystem of machine learning models that work in unison to provide superior defence against an ever-evolving threat landscape
15. Ideate - One Prototype At A Time
Source: https://www.reddit.com/r/reactiongifs/
How might we detect malicious
URLs using machine learning and
Python
25. Modeling & Evaluation - Candidate Models
Deep Neural Network, Random Forest, Word Embedding
26. Modeling & Evaluation - Candidate Models
Model                  Accuracy   F1 Score
Deep Neural Network    0.86       0.86
Random Forest          0.83       0.83
Word Embedding         0.79       0.80
27. Modeling & Evaluation: F1 Score
Image source: https://towardsdatascience.com/precision-vs-recall-386cf9f89488
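As a minimal sketch of how the F1 score on this slide relates to precision and recall, the helper below computes all three from confusion-matrix counts. The function name and the example counts are illustrative assumptions, not figures from the talk.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> dict:
    """Compute precision, recall and F1 from confusion-matrix counts.

    tp: malicious URLs correctly flagged
    fp: benign URLs wrongly flagged
    fn: malicious URLs missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only.
scores = f1_from_counts(tp=80, fp=20, fn=20)
print(scores)
```

Because F1 ignores true negatives, it stays informative when benign URLs heavily outnumber malicious ones, which is one plausible reason to report it alongside accuracy.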
32. Shout outs
Katie Ford (@katiegford) for the wonderful artwork
Yi Fang and Paul who continue to push us
Research Papers and the wonderful Python community
33. References
Detecting malicious URLs using machine learning techniques
(Frank Vanhoenshoven ; Gonzalo Nápoles ; Rafael Falcon ; Koen Vanhoof ; Mario Köppen)
What is design thinking?
https://dribbble.com/stories/2019/03/22/what-is-design-thinking
Interactive Design
https://www.interaction-design.org/literature/article/stage-1-in-the-design-thinking-process-empathise-with-your-users
Inc - Brainstorming
Microsoft Data Science Process
https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/overview
34. References
Choi, H., Zhu, B.B. and Lee, H., 2011. Detecting Malicious Web Links and Identifying Their Attack Types. WebApps, 11(11), p.218.
Bilge, L., Kirda, E., Kruegel, C. and Balduzzi, M., 2011, February. EXPOSURE: Finding Malicious Domains Using Passive DNS Analysis. In NDSS (pp. 1-17).
Hi I’m Alistair…
Data Scientist in Security
Like Aussie Rules, Cooking ...
Hello I’m Triss,
Research Masters student working in security
Around about 5 years ago…
Developing websites for a number of clients.
One afternoon, I get a call from an angry client, informing me
I was embarrassed, was concerned for the client, and the incident cost a lot of time and money to fix.
Now, in some way, shape or form, a lot of us have probably come across malicious stuff on the Internet.
So why are we here?
We hope to share with you:
How Design Thinking can help teams create more meaningful machine learning products;
How data science frameworks can provide structure to machine learning product development;
How Python can make your machine learning dreams become reality
Back in 2018, Triss and I were very keen to use machine learning to detect bad stuff on networks. Our Team offered a bunch of standard detection use cases but we struggled to make a decision on what to work on.
We were jumping between open-source data sets, trying to build things, without a clear direction.
The detection use cases on a spreadsheet didn’t resonate.
Constructive criticisms that came our way included:
What exactly is the problem you are trying to solve?
Can you even get real-world data for that use case?
Do you know what you’re doing?
This was mistake number 1. We hadn’t even thought about our end-user yet. We didn’t properly understand the domain we were trying to serve. And there we were attempting to jump straight into the code.
This is where Design Thinking was introduced to us. It’s not something you hear a lot about in Cyber Security and Data Science.
So what is design thinking?
Design thinking is a form of creative problem solving.
It provides a set of tools for all folks, including non-creatives, to come up with great ideas for meaningful problems.
It forces teams to focus on the end-user, which leads to better products.
If you’re starting a project in any domain, data science, security, finance, open-source, we suggest you take a look at what Design can offer your team.
For our approach, we focus on the first three phases, before kicking off our development.
https://dribbble.com/stories/2019/03/22/what-is-design-thinking
This is a bit like a forced marriage
Design Thinking
Threat Science Framework
So this is our highly leveraged, Threat Science Framework:
It is an end-to-end pattern used to guide our threat detection projects from ideation to deployment, picking the best parts from the processes I mentioned before.
I’ll go through each step.
Know thy user (empathize)
Put yourself in the end user’s shoes - How do they feel? What is their goal? What are their challenges?
Define the problem (define)
After empathizing with the end user, you can define the problem you’d like to solve: the problem statement.
Ideate
This is where you work with a diverse set of peers to come up with wild and wonderful ideas to solve the detection problem. This phase thrives when you include the unusual suspects.
Know thy threat
Next, begin to understand the threat you are trying to detect. In our case, what are the potential indicators of malicious websites? This understanding will drive what data we need to acquire.
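To make "potential indicators" concrete, here is a small sketch that pulls a few lexical signals out of a URL string using only the Python standard library. The specific indicators (URL length, dots in the host, an IP address in place of a domain, an "@" symbol, digit count, HTTPS) are assumptions chosen for illustration and are not necessarily the features used in this project.

```python
import re
from urllib.parse import urlparse

# Matches a dotted-quad IPv4 address used in place of a domain name.
IPV4_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def lexical_indicators(url: str) -> dict:
    """Extract a few simple lexical indicators from a URL string."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "host_is_ip": bool(IPV4_RE.match(host)),
        "has_at_symbol": "@" in url,
        "num_digits": sum(ch.isdigit() for ch in url),
        "uses_https": parsed.scheme == "https",
    }


# Hypothetical URL, for illustration only.
example = lexical_indicators("http://192.168.10.5/login.php?user=admin")
print(example)
```

Lexical indicators like these need no network access, so they are cheap to compute; host-based signals such as DNS or WHOIS data would require additional data sources.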
Data Acquisition & Understanding
Now we gather the possible data, and explore it to understand it as best we can. Other tasks include cleaning and wrangling the data. This is, in my opinion and that of many others, the most time-consuming step.
Feature engineering
Through getting to know your data, you can begin to refine and enhance its feature space in preparation for the modeling phase.
One Hot Encoding and Label Encoding for Categorical variables, Normalization and Standardization of Numerical variables, Binning & Discretization, Feature selection
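The techniques listed above can be sketched in a few lines of plain Python. This is an illustrative, dependency-free version under assumed toy data; in practice a library such as Scikit-learn would normally handle these transformations, and the helper names here are hypothetical.

```python
from statistics import mean, pstdev


def one_hot(values):
    """One-hot encode categorical values (columns sorted by category name)."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


def min_max_normalize(values):
    """Rescale numeric values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def standardize(values):
    """Rescale numeric values to zero mean and unit variance."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def bin_value(value, edges):
    """Return the index of the first bin whose upper edge is >= value."""
    for i, edge in enumerate(edges):
        if value <= edge:
            return i
    return len(edges)


# Hypothetical toy data: URL schemes and URL lengths.
schemes = ["http", "https", "http", "ftp"]
lengths = [20, 60, 100, 180]

encoded, columns = one_hot(schemes)
normalized = min_max_normalize(lengths)
standardized = standardize(lengths)
binned = [bin_value(n, edges=[50, 100]) for n in lengths]
```

Whichever transformation is used, its parameters (minimum, maximum, mean, bin edges) should be derived from the training data only and then reused on validation data, so that no information leaks between the two.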
Modeling & Evaluation
Next we set up our experiment: first we partition our dataset into training and validation subsets, then we define our performance metric (which aligns to our problem), and then we evaluate which models perform best.
Fast.ai
Scikit-learn
PyTorch
Tensorflow
Keras
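The experiment setup described above can be sketched without any of these libraries. The snippet below shows a seeded shuffle-and-split into training and validation subsets plus a simple metric function; the function names, the 80/20 split, and the toy data are assumptions for illustration, and accuracy stands in here for whichever metric fits the problem.

```python
import random


def train_validation_split(samples, labels, validation_fraction=0.2, seed=42):
    """Shuffle and split parallel lists into training and validation subsets."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # seeded so the experiment is repeatable
    n_val = int(len(indices) * validation_fraction)
    val_idx, train_idx = indices[:n_val], indices[n_val:]
    train = ([samples[i] for i in train_idx], [labels[i] for i in train_idx])
    val = ([samples[i] for i in val_idx], [labels[i] for i in val_idx])
    return train, val


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy dataset: 10 URLs with alternating benign (0) / malicious (1) labels.
urls = [f"http://site{i}.example/" for i in range(10)]
labels = [i % 2 for i in range(10)]

(train_x, train_y), (val_x, val_y) = train_validation_split(urls, labels)
print(len(train_x), "training samples,", len(val_x), "validation samples")
```

Fixing the seed means every candidate model is compared on the same validation subset, which keeps the comparison fair.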
Deployment
Lastly we deploy our model, and allow it to be used by the necessary interfaces best suited to our end-user.
Flask
Starlette
Docker
Kubernetes
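As a rough sketch of what exposing a model through an interface can look like, the example below defines a tiny JSON scoring endpoint. It deliberately uses the standard library's WSGI interface rather than Flask or Starlette so that it runs without extra dependencies; the `score_url` function is a hypothetical placeholder for a trained model, and the `url` query parameter is an assumed API design.

```python
import json
from urllib.parse import parse_qs


def score_url(url: str) -> float:
    """Hypothetical stand-in for a trained model: returns a score in [0, 1]."""
    return 1.0 if "@" in url else 0.0


def app(environ, start_response):
    """Minimal WSGI app: GET /?url=<url> returns a JSON maliciousness score."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    urls = params.get("url")
    if not urls:
        status, payload = "400 Bad Request", {"error": "missing 'url' parameter"}
    else:
        status, payload = "200 OK", {"url": urls[0], "score": score_url(urls[0])}
    body = json.dumps(payload).encode("utf-8")
    headers = [("Content-Type", "application/json"),
               ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]

# To try it locally, one option is the standard library's reference server:
#   from wsgiref.simple_server import make_server
#   make_server("127.0.0.1", 8000, app).serve_forever()
```

Since Flask applications are also WSGI applications, the same request-in, JSON-out shape carries over if the endpoint is later rewritten with Flask and packaged in a Docker image.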
[Picture of the process we defined]
With this approach, before we start anything, we get to know the end-user.
Our end-users were cyber security analysts working in security defence.
Design Thinking offers a bunch of tools and exercises to get to know your end-user:
Interviewing
Service Safaris
Guided Tours
Empathy maps
Affinity maps
Personas
Threat Intelligence
Open-source data
MITRE ATT&CK™
There are endless examples online if you’d like to search these.
So we interviewed security analysts and experts, to identify some key challenges faced in the industry. And these were...
Management of numerous security tools
The cyber security industry is a behemoth, and along with it comes a booming security software market that promises the world. Often security teams have too many tools at their disposal, meaning analysts must navigate between them to achieve an outcome.
Blue Teams employ a wide range of tools allowing them to detect an attack, collect forensic data, perform data analysis, and make changes to thwart future attacks and mitigate threats.
Alert Fatigue
With numerous tools, comes even more alerts. Analysts are inundated each day with false positives leading to “alert fatigue” and diminishing performance.
Alarm fatigue or alert fatigue occurs when one is exposed to a large number of frequent alarms (alerts) and consequently becomes desensitized to them.
Hard to hold onto top talent
The industry requires talented security analysts to maintain and utilise the complex security tools on the market. Not to mention, it takes considerable investment not only to develop newcomers but also to retain them.
From this we were able to build a persona of the user we’d like to help.
[Picture for each challenge]
Once we were across the key challenges, we built a persona of our end-user:
Security Analyst, Blue Team
Goals:
Defend, defend, defend
Respond, respond, respond
Needs:
Fast and effective triage and incident response
No false positives!
Tools providing detection and response capability across the attack cycle
Free time to work on interesting projects and threat hunting
Pain points:
Alert fatigue
Triaging numerous false positives
Keeping up with constantly evolving threat landscape. What will the next attack look like?
[Picture portrait of security analyst]
[Picture of the attack kill chain]
Problem/Needs Statements
How might we Statements
Now you know your end-user really well.
A typical structure when defining a problem/needs statement looks like so:
<Users> need <something> so that <benefit>.
For our project, we came up with:
Security Analysts need automated, intelligent monitoring and response capability across the cyber attack life cycle so that they can best defend against an ever-evolving threat landscape.
This phase ensures you have a coherent problem to solve.
[Diagram showing a bunch of models working together]
Arrived at a how might we statement
Real problem
And a broad
Great ways to bring all your ideas together include:
Brainstorming (generate ideas from a seed question)
Brainwriting (each team member generates a few ideas, think deeply about them, then prioritise)
Mindmapping (grouping ideas together)
It’s really important to include a diverse set of inputs for this. We consulted:
Data Engineers
Data Scientists
Threat Hunters
Security Analysts
Project Managers
And… we finally landed...
https://zwick.nyc/nickel-dime-savings-app-design-sprint
What came out of our How might we mode...
So now we have an idea we’d like to build out, and it’s time to understand the threat domain. Malicious URLs can be associated with a number of different threats including:
Phishing
Cybersquatting
Typosquatting
Domain Hijacking
Registrar Hacking
Domain Generation Algorithms (DGA)
Researching these threats gave us an understanding of their potential indicators, for example:
DGA URLs are associated with highly randomised strings
Typosquatting URLs include substrings that are highly similar to common web destinations such as google and facebook, e.g. goggle or facebok, to lure users
It is also worth mentioning that URLs associated with these threats may share common attributes, such as a short domain lifetime or an imminent domain expiry date.
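To make these indicators concrete, here is a minimal sketch using only the Python standard library. It assumes character entropy as a rough proxy for "highly randomised" strings and a similarity ratio as a proxy for look-alike domains. The function names, the brand list and the sample labels are all made up for illustration and are not taken from our project code.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def shannon_entropy(text):
    """Entropy in bits per character; higher means more random-looking."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def closest_brand(label, brands=("google", "facebook")):
    """Return the most similar known brand and its similarity ratio (0 to 1)."""
    scored = [(SequenceMatcher(None, label, brand).ratio(), brand) for brand in brands]
    ratio, brand = max(scored)
    return brand, ratio


# Hypothetical domain labels, for illustration only.
for label in ["google", "goggle", "facebok", "xq7zk2vbn9wd"]:
    brand, ratio = closest_brand(label)
    print(label, round(shannon_entropy(label), 2), brand, round(ratio, 2))
```

A label that is very similar to a brand without being identical to it is a typosquatting candidate, while a label with high entropy and no close brand looks more like DGA output. Real thresholds would need to be chosen from data.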
We begin by ingesting the data using the pandas library
To get a view of what the data structure looks like
Get shape and dimension
Screenshots etc
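As a rough sketch of this first look at the data, the snippet below loads a tiny in-memory CSV with pandas and prints its structure. The column names and the four sample rows are assumptions made for illustration; the real dataset would be read from file instead.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real URL dataset.
raw_csv = io.StringIO(
    "url,label\n"
    "google.com,benign\n"
    "facebook.com,benign\n"
    "goggle-login.example,malicious\n"
    "xq7zk2vbn9wd.example,malicious\n"
)

df = pd.read_csv(raw_csv)

print(df.head())                   # first few rows
print(df.shape)                    # (rows, columns)
print(df.dtypes)                   # type of each column
print(df["label"].value_counts())  # class balance
```

Checking the class balance early is useful because a heavily imbalanced dataset changes which evaluation metric makes sense later on.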
These are the data sources we have used for our model
Alexa Top 1 Million Domains
We assumed that these sites would be reliable examples of benign URLs because of their popularity and traffic
The Alexa rank is calculated based on the browsing behavior of Internet users. Using a combination of estimated average daily Unique Visitors and Pageviews over a course of 3 months, the site ranking is calculated. Traffic ranks are updated daily. Unique Visitors are users who visit a site on a given day. Pageviews are the total number of user URL requests for a site. The data is collected using one of 25,000 browser extensions for Google Chrome, Firefox, and Internet Explorer. From <https://www.iplocation.net/alexa-traffic-rank>
IANA IPv4 Address Space Registry
Each registry is allocated a range of IPv4 addresses
The allocation of Internet Protocol version 4 (IPv4) address space to various registries is listed here. Originally, all the IPv4 address spaces was managed directly by the IANA. Later parts of the address space were allocated to various other registries to manage for particular purposes or regional areas of the world.
PhishTank database -
A database of phishing website URLs
Malware Domain List -
A list of domains associated with malware
AlienVault Reputation Database -
A list of IP addresses with reputation values
Categorisation of malicious and benign hosts
Domain Generation Algorithm (DGA) Database
What are the limitations of the data?
What would a solution like this require in product?
What are the possible data sources available for integration and data enrichment?
Binning
Normalization
Digits percentage
Domain Age
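The features listed above could be sketched along these lines in plain Python. This is only an illustration: the function names, the bin thresholds and the sample rows are assumptions, and in practice the registration date would come from an enrichment source such as WHOIS.

```python
from datetime import date


def digit_percentage(url):
    """Share of characters in the URL that are digits, from 0.0 to 1.0."""
    return sum(ch.isdigit() for ch in url) / len(url) if url else 0.0


def domain_age_days(created, today):
    """Age of the domain in days, given its registration date."""
    return (today - created).days


def bin_age(age_days):
    """Bucket domain age into coarse categories (thresholds are arbitrary)."""
    if age_days < 30:
        return "new"
    if age_days < 365:
        return "recent"
    return "established"


def min_max_normalize(values):
    """Rescale numeric values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical rows: (url, registration date).
today = date(2020, 1, 1)
rows = [
    ("longstanding-site.example", date(2005, 6, 1)),
    ("a1b2c3d4.example", date(2019, 12, 20)),
]
ages = [domain_age_days(created, today) for _, created in rows]
for (url, _), age, scaled in zip(rows, ages, min_max_normalize(ages)):
    print(url, round(digit_percentage(url), 2), age, bin_age(age), round(scaled, 2))
```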
Say you have trained a machine learning model and obtained an amazing accuracy of 90%
How do you know it is performing well on real-world data that the model has not seen before?
Does it actually work?
This is why having a test set that represents unseen data is very important
It’s common practice to split the data into a training set, a test set and a validation set
The training set is used to train your model
The test set is used to measure the performance of your model on previously unseen data, so that you know how well it performs
The validation set is typically used for parameter and model selection
It’s common to hold out around 20% of the data as a test set, but that ultimately depends on the overall size of your data
If your data has a million rows, 1 or 2% each for the test and validation sets can be sufficient.
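A minimal sketch of such a three-way split, using only the standard library, might look like this. The function name and the 20%/20% default fractions are assumptions for illustration; scikit-learn's train_test_split offers the same idea with more options.

```python
import random


def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=42):
    """Shuffle rows and cut them into train, validation and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Stand-in for 100 labelled URLs.
rows = list(range(100))
train, val, test = train_val_test_split(rows)
print(len(train), len(val), len(test))
```

The important property is that the three sets do not overlap, so the test score really does reflect data the model has never seen.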
We tried a few models for this use case, and the top performing models are:
Deep neural network
Random forest
Word embedding, which is a lexicon based model
Neural network -
A neural network with 3 or more layers is considered a deep neural network
The layers and hidden units allow the model to learn complex features from high dimensional data
Typically, a deep neural network requires a lot of tuning, which means you need extensive knowledge of the model in order to use it. They are usually trained using frameworks such as PyTorch or TensorFlow.
Alternatively, you can use pretrained models for transfer learning, and libraries like fast.ai can make things a lot easier and faster, especially for beginners who are keen to get their hands dirty with DNNs.
fast.ai is a library built on top of PyTorch. It offers multiple pretrained models that you can use and some useful additional functionality.
We used the tabular model from fast.ai for this use case
Pros and cons of deep learning
DL is -
Powerful and can be used on many difficult learning tasks, such as image and video classification
Able to perform effective automatic feature extraction, reducing the need for manual feature engineering
The disadvantages are -
Requires massive amounts of training data
Can require huge computing power
Architectures can be complex and hard to tune
The models may not be easily interpretable - we don't know why the model relies on certain features
Random forest -
Random forest is an ensemble of decision trees, which basically means a collection of trees
Decision trees predict the final label by splitting at multiple decision points based on selected features.
In this case, having many trees provides better generalisation and reduces overfitting. The algorithm makes predictions based on a majority vote across the individual trees
pros -
commonly used and generates good predictions - performs pretty well
doesn't require extensive scaling of data
able to handle a mixture of feature types - numerical and categorical
cons -
model is difficult to interpret
not suitable for high dimensional data
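Here is a small sketch of a random forest using scikit-learn. The three features and all the numbers are invented for illustration and are not our real training data. One detail worth knowing: scikit-learn's implementation averages the probabilistic predictions of the trees rather than taking a hard majority vote, though the two usually agree.

```python
from sklearn.ensemble import RandomForestClassifier

# Hypothetical features per URL: [length, digit percentage, entropy].
X = [
    [10, 0.00, 2.6], [12, 0.00, 2.9], [11, 0.05, 2.8], [9, 0.00, 2.5],
    [24, 0.40, 4.1], [30, 0.35, 4.3], [27, 0.50, 4.0], [22, 0.45, 4.2],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = benign, 1 = malicious

# 50 trees, each trained on a bootstrap sample of the rows.
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X, y)

new_urls = [[11, 0.00, 2.7], [26, 0.42, 4.1]]
print(model.predict(new_urls))      # predicted labels
print(len(model.estimators_))       # number of individual trees
print(model.feature_importances_)   # rough view of which features matter
```

Note that the features were passed in unscaled, which reflects the point above that random forests do not need extensive scaling.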
Word embedding -
Word embedding is typically used to associate words with their label. For instance, in the IMDb movie review use case, we are able to find words that are associated with positive and negative reviews
In our use case, we used character-based embedding,
which creates a multi-dimensional vector representation of each character in the URL.
This was trained using TensorFlow
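To show what "a vector per character" means, here is a sketch of just the lookup step in plain Python. It is not our TensorFlow model: the vocabulary, the 4-dimensional vectors and the padding scheme are assumptions, and the vectors are random rather than trained. In a real model an embedding layer would learn these values during training.

```python
import random

# Assumed character vocabulary; index 0 is reserved for padding and unknowns.
VOCAB = "abcdefghijklmnopqrstuvwxyz0123456789-._/:"
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(VOCAB)}
EMBED_DIM = 4

rng = random.Random(0)
# Untrained lookup table: one vector per index, with a zero vector for index 0.
EMBEDDINGS = [[0.0] * EMBED_DIM] + [
    [rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in VOCAB
]


def encode(url, max_len=16):
    """Map a URL to a fixed-length list of character indices."""
    ids = [CHAR_TO_INDEX.get(ch, 0) for ch in url.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))


def embed(url, max_len=16):
    """Look up the vector for each character index."""
    return [EMBEDDINGS[i] for i in encode(url, max_len)]


vectors = embed("goggle.com")
print(encode("goggle.com"))
print(len(vectors), len(vectors[0]))  # sequence length, embedding dimensions
```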
The table shows performance that we have obtained for each of our models
And the deep neural network is the winner!
Here we have a few methods of evaluating machine learning models
Accuracy -
Is pretty straightforward
It’s the number of correct predictions divided by the total number of predictions
F1 score -
And we have the F1 score
F1 score is quite the standard evaluation method, especially when it comes to machine learning competitions like Kaggle
F1 score is most suitable for our use case because it takes both false positives and false negatives into account
We don’t want our model to give too many false positives to security analysts, because that would slow down incident response as they would be wasting time investigating false alerts
F1 score is also preferred over precision or recall alone,
because optimising one of precision or recall trades off the other, e.g. optimising precision will give you a lower recall value
Whereas optimising the F1 score, which is the harmonic mean of precision and recall, gives you a balance of both
The diagram on the right is an example of a confusion matrix, which shows true positives and true negatives, as well as false positives and false negatives
Here’s the confusion matrix for our winning model, the deep neural network
A confusion matrix gives a good visual representation of your model’s performance
As you can see here, our deep neural network model has a higher false negative rate compared to its false positive rate
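The metrics discussed above all fall out of the four cells of the confusion matrix. A short sketch follows; the counts are made up for illustration and are not the results of our model.

```python
def classification_metrics(tp, fp, fn, tn):
    """Derive accuracy, precision, recall and F1 from confusion matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts: more false negatives (20) than false positives (10).
metrics = classification_metrics(tp=80, fp=10, fn=20, tn=890)
for name, value in metrics.items():
    print(name, round(value, 3))
```

With these counts the accuracy looks excellent because benign URLs dominate, while recall and F1 expose the malicious URLs that were missed. That gap is the reason accuracy alone can be misleading.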
Explainability of machine learning models has become a topic of interest lately
Complex models like deep neural networks are called black boxes because it is difficult to know what is going on inside the model
Ideally we would want to avoid having models that rely on undesirable features, such as those that could lead to biases in production.
For instance, in the context of image classification, relying on a snowy background to decide whether a picture contains a wolf, instead of relying on the wolf itself.
SHAP is a very cool library which gives further insight into the internal workings of models
You are able to get a global or local explanation of your model’s predictions
Global tells you which features influenced the model as a whole
Local tells you which features influenced the outcome of an individual prediction
Here’s an example of how it works,
Take the meerkat example: red marks positive SHAP values, which increase the probability of it being a meerkat according to your model
And blue marks negative SHAP values, which reduce the probability of the class
This shows that the model is relying on the eyes to detect whether it’s a picture of a meerkat
You can find out more about this library on their GitHub page, which also contains more examples
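SHAP values are based on Shapley values from game theory. The sketch below computes exact Shapley values for a single prediction in plain Python, to show what a local explanation is. It does not use the SHAP library or its API, and the toy scoring function, the feature values and the baseline are all assumptions for illustration. The library exists because this brute-force calculation becomes far too slow with more than a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values for one prediction (a local explanation)."""
    n = len(instance)

    def value(subset):
        # Features in the subset take the instance's value, the rest the baseline's.
        mixed = [instance[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi += weight * (value(members | {i}) - value(members))
        phis.append(phi)
    return phis


def toy_model(features):
    """Hypothetical scoring function standing in for a trained model."""
    length, digit_pct, entropy = features
    return 0.01 * length + 0.5 * digit_pct + 0.1 * entropy * digit_pct


instance = [30, 0.4, 4.0]   # the URL being explained
baseline = [12, 0.0, 2.5]   # a reference "typical" URL
phis = shapley_values(toy_model, instance, baseline)
for name, phi in zip(["length", "digit_pct", "entropy"], phis):
    print(name, round(phi, 3))  # positive pushes the score up, negative down
```

The values always add up to the difference between the model's output for the instance and for the baseline. Averaging the absolute values over many instances gives a global view of which features matter.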
What is parameter tuning?
Parameter tuning (more precisely, hyperparameter tuning) is trying out different values for your model’s settings in order to obtain a better result
Take a neural network as an example: to tune it, we would try different numbers of layers and hidden units, learning rates, and activation functions such as sigmoid or ReLU
There are 2 main methods of optimizing parameters:
Grid search
Random search
Grid search -
searches exhaustively through the range of values that you have provided
And gives you the best combination within that grid
The downside of this is that it can be very time consuming and computationally expensive.
Random search -
On the other hand, random search samples randomly from the range of values provided
It requires less processing time.
But we aren’t guaranteed to find the optimal combination
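The difference between the two methods can be sketched in a few lines of plain Python. The search space and the score function here are made up: in practice score would train the model with the given settings and return its validation metric, which is why the number of combinations tried matters so much.

```python
import itertools
import random

# Hypothetical search space for a neural network.
search_space = {
    "layers": [1, 2, 3, 4],
    "hidden_units": [16, 32, 64, 128],
    "learning_rate": [0.001, 0.01, 0.1],
}


def score(params):
    """Stand-in for 'train the model and return its validation score'."""
    return -(
        abs(params["layers"] - 3)
        + abs(params["hidden_units"] - 64) / 64
        + abs(params["learning_rate"] - 0.01) * 10
    )


def grid_search(space):
    """Try every combination of the provided values."""
    keys = list(space)
    candidates = [dict(zip(keys, values)) for values in itertools.product(*space.values())]
    return max(candidates, key=score), len(candidates)


def random_search(space, n_trials, seed=0):
    """Try a fixed number of randomly sampled combinations."""
    rng = random.Random(seed)
    candidates = [{k: rng.choice(v) for k, v in space.items()} for _ in range(n_trials)]
    return max(candidates, key=score), len(candidates)


best_grid, n_grid = grid_search(search_space)
best_random, n_random = random_search(search_space, n_trials=10)
print(n_grid, best_grid)
print(n_random, best_random)
```

Grid search evaluates all 48 combinations, while random search evaluates only 10 and may or may not land on the same answer.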
I’ll hand this back to Alistair for the closing bit