Research work of my student Lucas Galante, presented at SBSEG2019. We discuss the implications of adopting distinct machine learning models for malware detection.
DETECTION OF MALICIOUS EXECUTABLES USING RULE BASED CLASSIFICATION ALGORITHMS (AAKANKSHA JAIN)
The slides present statistical mining of a malicious-executable dataset collected from various antivirus log files and other sources.
They further classify malicious code by its impact on the user's system and distinguish threats on the basis of their associated severity.
The JRIP, PART, and RIDOR algorithms are implemented in a more efficient manner to improve the accuracy of the classification results.
Today’s threats have become very complex and serious in their packing and encryption techniques. Every day, new malware variants grow in both quantity and quality by using packing and encryption techniques. The challenge in this research field is that traditional malware detection systems sometimes fail to detect new malware variants and produce false alarms. Malicious software in the form of viruses, worms, trojans, ransomware, and spyware harms our computer systems, network environments, and organizations in various ways. Therefore, malware analysis for detection and family classification plays a significant role in Cyber Crime Incident Handling Systems. This system contributes malware family classification with 10 prominent features obtained by conducting a feature selection process. The approach also contributes a process for labeling the malicious samples using regular expressions. The proposed malware classification system distinguishes 7 different families, including malware and benign, using machine learning classifiers. The findings from our experiment show that the selected 10 API features provide the best evaluation metrics in terms of accuracy, precision-recall, and ROC scores.
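The regular-expression labeling step mentioned above could look roughly like the following. This is a minimal sketch, not the authors' implementation: the patterns, the family names, and the `label_family` helper are all hypothetical, since the abstract does not list the actual expressions.

```python
import re

# Hypothetical patterns: the abstract does not say which expressions were used.
# Order matters: the first matching pattern decides the family.
FAMILY_PATTERNS = [
    ("ransomware", re.compile(r"ransom|cryptolocker", re.IGNORECASE)),
    ("trojan", re.compile(r"trojan", re.IGNORECASE)),
    ("worm", re.compile(r"worm", re.IGNORECASE)),
    ("spyware", re.compile(r"spy|keylog", re.IGNORECASE)),
    ("virus", re.compile(r"virus", re.IGNORECASE)),
]


def label_family(av_label: str) -> str:
    """Map a raw antivirus label to a family name, or 'unknown' if nothing matches."""
    for family, pattern in FAMILY_PATTERNS:
        if pattern.search(av_label):
            return family
    return "unknown"
```

Because antivirus labels often mention several categories at once, the pattern order acts as a priority list; a real labeling scheme would need to justify that order.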
Updated slides for my talk at the CHAQ meeting in Antwerp. I also added slides on some of my experiences performing empirical studies with open source and industrial software systems.
MINING PATTERNS OF SEQUENTIAL MALICIOUS APIS TO DETECT MALWARE (IJNSA Journal)
In the era of information technology and a connected world, detecting malware has been a major security concern for individuals, companies, and even states. The new generation of malware samples, upgraded with advanced protection mechanisms such as packing and obfuscation, frustrates anti-virus solutions. API call analysis is used to identify suspicious malicious behavior thanks to its ability to describe software functionality. In this paper, we propose an effective and efficient malware detection method that uses a sequential pattern mining algorithm to discover representative and discriminative API call patterns. Then, we apply three machine learning algorithms to classify malware samples. Based on the experimental results, the proposed method achieves favorable results, with a 0.999 F-measure on a dataset including 8152 malware samples belonging to 16 families and 523 benign samples.
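To make the idea of mining API call patterns concrete, here is a small sketch that counts contiguous API-call n-grams and keeps those that occur in at least `min_support` traces. This is a deliberate simplification and an assumption on my part: true sequential pattern mining also allows gaps between calls, and the abstract does not say which algorithm the paper uses. The function name and the example traces are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def frequent_ngrams(
    traces: List[List[str]], n: int, min_support: int
) -> Dict[Tuple[str, ...], int]:
    """Return contiguous n-grams that occur in at least `min_support` traces."""
    support: Counter = Counter()
    for trace in traces:
        # A set, so each n-gram is counted at most once per trace.
        seen = {tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)}
        support.update(seen)
    return {gram: count for gram, count in support.items() if count >= min_support}


# Illustrative traces only; not data from the paper.
traces = [
    ["OpenProcess", "VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"],
    ["CreateFile", "OpenProcess", "VirtualAllocEx", "WriteProcessMemory"],
    ["CreateFile", "ReadFile", "CloseHandle"],
]
patterns = frequent_ngrams(traces, n=2, min_support=2)
```

The resulting patterns can then serve as binary features (pattern present or absent in a trace) for a downstream classifier.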
IMPLEMENTATION OF MACHINE LEARNING IN E-COMMERCE & BEYOND (Rabi Das)
Presentation for the webinar held on 23rd May 2020, conducted by The IoT Academy for the FDP program in collaboration with E&ICT Academy, IIT Guwahati, and delivered by Mr. Shree Kant Das, Growth and Digital Strategy Manager from noon.com.
Reliability is concerned with decreasing faults and their impact. The earlier the faults are detected the better. That's why this presentation talks about automated techniques using machine learning to detect faults as early as possible.
Malware Detection Using Data Mining Techniques (Akash Karwande)
Computer programs that have destructive content and are applied to systems by an attacker are called malware, and the system on which such a program is applied is called the victim system.
Malware is classified into several kinds based on behavior or attack method.
An overview of malicious binaries on the Linux platform.
Awarded best undergraduate research project at SBSEG2018.
Presentation related to my co-advising of the student Lucas Galante.
Near-memory & In-Memory Detection of Fileless Malware (Marcus Botacin)
Proposal of a hardware-based AV embedded within the memory controller to mitigate the performance penalty when searching for fileless malware samples. Presented at 2020 MEMSYS.
Malware Detection Using Machine Learning Techniques (ArshadRaja786)
Malware can be easily detected using machine learning techniques such as the K-Means algorithm, the KNN algorithm, the Boosted J48 Decision Tree, and other data mining techniques. Among them, J48 proved to be more effective in detecting computer viruses and upcoming network worms...
COMPARISON OF MALWARE CLASSIFICATION METHODS USING CONVOLUTIONAL NEURAL NETWO... (IJNSA Journal)
Malicious software is constantly being developed and improved, so detection and classification of malware is an ever-evolving problem. Since traditional malware detection techniques fail to detect new/unknown malware, machine learning algorithms have been used to overcome this disadvantage. We present a Convolutional Neural Network (CNN) for malware type classification based on API (Application Program Interface) calls. This research uses a database of 7107 instances of API call streams and 8 different malware types: Adware, Backdoor, Downloader, Dropper, Spyware, Trojan, Virus, Worm. We used a 1-Dimensional CNN by mapping API calls as categorical and term frequency-inverse document frequency (TF-IDF) vectors and compared the results to other classification techniques. The proposed 1-D CNN outperformed other classification techniques with 91% overall accuracy for both categorical and TF-IDF vectors.
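As an illustration of the TF-IDF mapping mentioned above, the sketch below turns API call streams into sparse TF-IDF vectors. It assumes one common TF-IDF variant (relative term frequency times the natural log of N over document frequency); the abstract does not state which weighting the paper uses, and the `tfidf` function and example streams are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf(streams: List[List[str]]) -> List[Dict[str, float]]:
    """Compute one sparse TF-IDF vector (API call -> weight) per call stream."""
    n_docs = len(streams)
    # Document frequency: in how many streams each API call appears.
    doc_freq: Counter = Counter()
    for stream in streams:
        doc_freq.update(set(stream))

    vectors = []
    for stream in streams:
        counts = Counter(stream)
        total = len(stream)
        vectors.append({
            call: (count / total) * math.log(n_docs / doc_freq[call])
            for call, count in counts.items()
        })
    return vectors


# Illustrative streams only; not data from the paper.
streams = [
    ["CreateFile", "WriteFile", "WriteFile"],
    ["CreateFile", "RegSetValue"],
]
vectors = tfidf(streams)
```

With this variant, a call that appears in every stream (here `CreateFile`) gets weight zero, so only calls that discriminate between streams contribute to the vector.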
Near-memory & In-Memory Detection of Fileless Malware (Marcus Botacin)
My keynote at the Brazilian Security Symposium (SBSeg), as part of the Computer Forensics Workshop (WFC), talking about fileless malware, the challenges for antivirus detection, and new detection strategies. I present the prototype of a hardware AV with integrated signature matching to decrease the performance penalty imposed by software-only AVs.
GPThreats-3: Is Automated Malware Generation a Threat? (Marcus Botacin)
My talk about generating malware automatically using GPT-3, the differences for ChatGPT, limits, and possibilities. Multiple malware variants are generated and submitted to Antivirus (AV) scans. We also present a defense perspective on how defenders can use artificial intelligence to deobfuscate malware samples.
[HackInTheBox] All You Always Wanted to Know About Antiviruses (Marcus Botacin)
My talk at the HackInTheBox security conference Amsterdam 2023 about the reverse engineering of AV engines, covering signatures, whitelists, blocklists, kernel drivers, hooking, and much more.
[USENIX Enigma] Why Is Our Security Research Failing? Five Practices to Change! (Marcus Botacin)
My talk at USENIX Enigma 2023 discussing challenges and pitfalls in malware research. I discuss 5 aspects to change, from diversity of research work to the reproducibility crisis.
In this talk, I cover the basic idea of hardware-assisted, two-level architectures for security monitoring and its applications to the malware detection problem. I propose detection triggers involving branch predictor, MMU, memory controller, co-processors, and FPGAs.
Talk presented at the Real Time systems group seminar series at the University of York.
How do we detect malware? A step-by-step guide (Marcus Botacin)
Slides from my talk at Texas A&M University (TAMU) seminar series (2002), where I present a landscape of the malware detection pipeline currently used by the industry and how academia can contribute to that. I present new solutions ranging from the use of ML, sandbox solutions, and hardware support for the development of more performance-efficient Antivirus.
Among Viruses, Trojans, and Backdoors: Fighting Malware in 2022 (Marcus Botacin)
My talk at Federal University of Minas Gerais (UFMG) to present some aspects of modern malware research and some of my contributions to the field (derived from my PhD defense). I cover all steps of a detection pipeline: threat hunting, malware triage, sandbox execution, threat intelligence, and endpoint protection.
On the Malware Detection Problem: Challenges & Novel Approaches (Marcus Botacin)
Marcus Botacin's PhD Defense at Federal University of Paraná (UFPR).
Advisor: Dr André Grégio
Co-Advisor: Paulo de Geus
Evaluation Committee:
Dr Leigh Metcalf, Dr Leyla Bilge, Daniel Alfonso Oliveira
All You Need to Know to Win a Cybersecurity Adversarial Machine Learning Comp... (Marcus Botacin)
Describing our experience in the MLSec competition for the seminar series of the University of Waikato. Presented by Fabricio Ceschin and Marcus Botacin from the Federal University of Paraná.
Does Your Threat Model Consider Country and Culture? A Case Study of Brazilia... (Marcus Botacin)
My talk at USENIX ENIGMA 2021 about Brazilian Financial Malware. It encompasses desktop and mobile environments, analyzed both statically and dynamically.
On the Security of Application Installers & Online Software Repositories (Marcus Botacin)
My presentation for the DIMVA 2020 conference about the security of application installers. I show the operation dynamics of the repositories and reverse engineer some application installers to show their vulnerabilities, such as exposure to man-in-the-middle attacks.
Towards Malware Decompilation and Reassembly (Marcus Botacin)
I present RevEngE, the Reverse Engineering Engine, a PoC for the debug-based decompilation approach. Presentation given at the Reverse Engineering (ROOTS) conference in Vienna, Austria, 2019.
15. Motivation Methodology Evaluation Conclusion
Conclusion
Our results show that:
Dynamic features outperform static features
Discrete features present smaller accuracy variance
Dataset’s distinct characteristics impose challenges to ML models
Feature analysis can be used as feedback information
Machine Learning for Malware Detection: Beyond Accuracy Rates 15 / 17
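The point about accuracy variance can be made concrete with a short sketch that reports the spread of per-fold accuracies alongside the mean. The numbers below are made up for illustration and are not results from the slides; `summarize`, `model_a`, and `model_b` are hypothetical names.

```python
from statistics import mean, pstdev
from typing import Dict, List


def summarize(fold_accuracies: List[float]) -> Dict[str, float]:
    """Report the mean accuracy together with its spread across folds."""
    return {
        "mean": mean(fold_accuracies),
        "std": pstdev(fold_accuracies),  # population standard deviation
        "min": min(fold_accuracies),
        "max": max(fold_accuracies),
    }


# Made-up per-fold accuracies, for illustration only.
runs = {
    "model_a": [0.90, 0.92, 0.91, 0.89, 0.93],
    "model_b": [0.99, 0.80, 0.97, 0.84, 0.95],
}
summary = {name: summarize(accs) for name, accs in runs.items()}
```

Both hypothetical models have the same mean accuracy, yet the second one swings far more between folds, which is the kind of difference a single accuracy rate hides.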
16. Conclusion
Acknowledgement
This work is supported by:
Brazilian National Council of Technological and Scientific Development
CESeg assistance
17. Conclusion
Questions, Criticism, and Suggestions.
Contact
galante@lasca.ic.unicamp.br
Complete version
https://github.com/marcusbotacin/ELF.Classifier
Previous work
https://github.com/marcusbotacin/Linux.Malware
Reverse Engineering Workshop
Thursday @ 13:30