
by Warren Jin

No 1: Ontology-Driven Text Mining for Digital Forensics

Supervisors: Dr Warren Jin and Dr Nianjun Liu

Period: Semesters 1 and 2, 2007

The use of digital devices and services, such as computers, the Internet, personal digital assistants (PDAs), cell phones and cameras, as sources of evidence in terrorism, fraud, white-collar crime and other criminal investigations has been steadily increasing in recent years. Digital forensics involves understanding specific aspects of digital evidence and the general forensic procedures used when analysing any form of digital evidence. Digital evidence can be any information of probative value that is either stored or transmitted in binary form, such as emails, Office documents, computer system log files, and digital audio and video. It can be used to decide whether a crime has been committed and can provide a link between a crime and its victim or between a crime and its perpetrator.

This project aims at implementing (or developing) effective text mining techniques to analyse textual information, especially emails and computer system log files, by incorporating a terminological ontology. An ontology can bring in necessary background knowledge, e.g., that the terms “ball”, “football” and “basketball” are semantically related to each other. Driven by the ontology, textual information can then be indexed, summarised or analysed semantically. This enables text mining techniques to highlight the most interesting textual digital evidence automatically and effectively for a digital forensic investigation.

The project involves various techniques such as text mining, temporal data mining, information retrieval, machine learning and statistics. Applicants are expected to have a major in information technology, computer engineering, computer science or electrical engineering, preferably with excellent programming skills (C/C++, Java, and/or Python).
Applicants who are interested in research are also welcome, preferably with a strong background in information retrieval, temporal data mining, statistics and/or optimisation.

Contact Dr Warren Jin (Huidong.Jin@nicta.com.au)

No 2: Apply Data Mining Techniques for Cyber Intrusion Detection

Supervisors: Dr Warren Jin and Dr Nianjun Liu

Period: Semesters 1 and/or 2, 2007

Intrusion detection is the process of monitoring the events occurring in a computer system or network and analysing them for signs of intrusions, which are defined as attempts to bypass the security mechanisms of a computer or network (“compromise the confidentiality, integrity, availability of information resources”). Due to the proliferation of high-speed Internet access, more and more organisations are becoming vulnerable to time-varying cyber attacks (intrusions). Most existing intrusion detection systems are based on extensive knowledge of patterns associated
with known attacks provided by human experts. They are unable to detect novel and unanticipated attacks.

This project aims at applying data mining techniques to learn real-time profiles that represent the normal behaviour of users, hosts or networks, and then to detect attacks as significant deviations from these profiles. The project will implement (or develop) an efficient learning technique to establish stochastic process models from large volumes of network access data, such as source IP addresses and ports, protocol types and access timestamps. The dataset may be gigabytes in size. The stochastic process model can be temporal association rules, sequential patterns, dynamic Bayesian networks, or a mixture of Markov models. The technique will be evaluated on real-world network intrusion data. Its performance, as well as the signs of intrusion it finds, will be visualised to help non-domain experts understand them.

The project involves various techniques, including temporal data mining, time series analysis and computer network security. Applicants are expected to have a major in information technology, computer science, computer engineering or electrical engineering, preferably with excellent programming skills (Matlab, C/C++, R, and/or Python) for implementation. Applicants who are interested in research are also welcome, preferably with a strong background in data mining, statistical machine learning, artificial intelligence and/or time series analysis.

Contact Dr Warren Jin (Huidong.Jin@nicta.com.au)

No 3: Apply Dynamical Bayesian Network to Query Digital Forensics

Period: Semesters 1 and 2, 2007

Supervisors: Dr Nianjun Liu and Dr Warren Jin

Digital forensics undertakes the post-mortem reconstruction of the causal sequence of events arising from an intrusion perpetrated by one or more external agents, or as a result of unauthorised activities generated by authorised users, in one or more digital systems.
The field of digital forensics covers a broad set of applications, uses a variety of evidence and is supported by a number of techniques. Application areas include forensic accounting, law enforcement, commodity flow analysis and threat analysis. Forensic investigations often focus on unusual and interesting events that may not have arisen previously. A major objective of a digital investigation is to extract these interesting pieces of evidence and to identify the causal relationships between them.

This project aims at extending an existing Dynamical Bayesian Network model developed for digital forensics by investigating a number of possible topics. The existing model uses a combined Bayesian network and hidden Markov model network structure to (i) estimate typical digital crime scenario models from data and (ii) given such models, infer the most likely criminal act given current observations and past criminal acts.
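The inference step in (ii) can be illustrated with a minimal sketch of forward filtering in a plain hidden Markov model. This is a simplification for illustration only, not the BN+HMM model the project refers to: the state names, observation names and all probabilities below are hypothetical assumptions.

```python
# Forward filtering in a toy HMM: infer the most likely hidden act
# from a sequence of observed events.
# All states, observations and probabilities are hypothetical.

STATES = ["benign", "intrusion"]

# P(first hidden state)
START = {"benign": 0.9, "intrusion": 0.1}

# P(next hidden state | previous hidden state)
TRANS = {
    "benign": {"benign": 0.95, "intrusion": 0.05},
    "intrusion": {"benign": 0.20, "intrusion": 0.80},
}

# P(observed event | hidden state)
EMIT = {
    "benign": {"normal_login": 0.80, "failed_login": 0.15, "log_deleted": 0.05},
    "intrusion": {"normal_login": 0.20, "failed_login": 0.40, "log_deleted": 0.40},
}


def forward_filter(observations):
    """Return P(current hidden state | all observations so far)."""
    belief = dict(START)
    for step, obs in enumerate(observations):
        if step > 0:
            # Prediction: push the previous belief through the transition model.
            belief = {
                s: sum(belief[p] * TRANS[p][s] for p in STATES) for s in STATES
            }
        # Update: weight each state by how well it explains the observation.
        unnormalised = {s: belief[s] * EMIT[s][obs] for s in STATES}
        total = sum(unnormalised.values())
        belief = {s: v / total for s, v in unnormalised.items()}
    return belief


def most_likely_state(observations):
    """Return the hidden state with the highest filtered probability."""
    belief = forward_filter(observations)
    return max(belief, key=belief.get)


if __name__ == "__main__":
    events = ["failed_login", "log_deleted", "log_deleted"]
    print(forward_filter(events))
    print(most_likely_state(events))
```

The filtered belief depends on past acts only through the previous belief, which is what makes the recursion cheap. A model with additional Bayesian network structure inside each time slice would replace the single emission table with a product of conditional probability tables.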
The possible topics are as follows. The first is the addition of multiple forward linkages to the BN+HMM network structure, to allow direct linkages between BN nodes at consecutive time intervals. The second is the design of an SQL-based query tool to explore the activities of criminals and their interactions, to explain what happened in the past, and to predict what will happen in the near future. The last is the application of a graphical model to data mining of relational digital forensic databases, including construction of a relational pattern structural database for known types of digital crime portfolios and their associated forensic Bayesian Network models.

Applicants must have a major in information technology, computer science, or electrical engineering, preferably with excellent programming abilities (MATLAB, C/C++ and JAVA) OR a strong mathematical/machine learning/data mining/statistics background.

Contact Dr Nianjun Liu (nianjun.liu@nicta.com.au)

No 4: Intelligent Environmental Query on Spatial Data

Period: Semesters 1 or 2, 2007

Supervisors: Dr Nianjun Liu and Dr Warren Jin

Analysis of spatial information in natural resources management is crucial to supporting the decision making process. However, with the advent of various technologies for acquiring data, the analysis of multiple spatial data sets has become a very challenging area. These technologies produce data of differing accuracy and resolution. Beyond the multiple representations of spatial data, evidence about an area can come from different times and from different observers' viewpoints, which makes combining that evidence quite complicated. Combining spatial data or evidence is not simply a matter of combining evidence from different technologies; it also means combining multi-criteria evidence. The Australian Bureau of Rural Sciences (BRS) has developed a system known as the multi-criteria analysis shell for spatial decision support (MCAS-S).
The project is to incorporate an option to use (Dynamical) Bayesian network approaches to model multiple types of evidence for intelligent environmental query and decision support. The project involves various techniques: image processing to preprocess the spatial GIS data, machine learning, pattern recognition and probability theory.

Applicants must have a major in information technology, computer science, or electrical engineering, preferably with excellent programming abilities (MATLAB, C/C++ and JAVA) OR a strong mathematical/machine learning/data mining/statistics/GIS background.

Contact Dr Nianjun Liu (nianjun.liu@nicta.com.au)

No 5: Intelligent Land Planning on Relational Spatial Data

Period: Semesters 1 or 2, 2007
Supervisors: Dr Nianjun Liu and Dr Warren Jin

Multi-criteria approaches to the analysis of complex issues in environmental decision systems have found wide application across business, government and communities around the world. Such approaches may be readily applied in the context of land planning, which is a prerequisite to the development of a city, town or suburb. Generally, planners collect a range of information about an area, including information about natural resources, topography, demographics, political issues, economic characteristics and proximity to neighbouring settlements and services, and combine this information to make planning decisions. Computer-aided multi-criteria decision support tools allow for the measurement and analysis of alternatives or options, involving a variety of both qualitative and quantitative dimensions.

The project involves collaboration with the ACT Planning and Land Authority (ACTPLA) to present a sample demonstration of an Intelligent Land Planning tool using Bayesian Networks. Specifically, it will aim to develop a tool for the selection of optimal sites for community services within existing areas of settlement in the ACT, including recreational facilities, schools, childcare centres, aged care facilities and community centres. The factors of interest include existing settlement patterns, demographics (current and anticipated), future development, available land resources, existing services and community need.

Bayesian Networks are designed for situations where there is uncertainty about the evidence and about how it should be combined in decision making. The proposed approach is to use a supervised strategy whereby experts provide known thematic layers of the land cover GIS spatial database as known successful decisions. The Bayesian Networks are then trained to optimise the predictions of such decisions, with the aim of applying the optimal decision model to new situations or scenarios.
After exploring the ACT spatial database and other data sources, the scholar will identify the relevant decision factors with the aid of ACTPLA experts and will then build the Bayesian Network model. Through iterative data mining on the relational database, the model will be continually re-learned and its structure and parameters adjusted accordingly. Finally, the scholar will create the new tool and test it in a real-world context within ACTPLA.

The project involves various techniques: spatial relational databases, Structured Query Language (SQL), machine learning, pattern recognition and time series. Applicants must have a major in information technology, computer science, or electrical engineering, preferably with excellent programming abilities (C/C++ and JAVA) OR a strong mathematical/machine learning/data mining/statistics/demography background.

Contact Dr Nianjun Liu (nianjun.liu@nicta.com.au)
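
The supervised strategy described for project No 5 can be illustrated with a minimal sketch that uses the simplest possible Bayesian network, a naive Bayes structure in which one decision node points to every evidence node. This is an assumption made for illustration, not the model the project would build; the factor names, values and training records are all hypothetical.

```python
# Learn a naive Bayes network from expert-labelled site decisions, then
# score a new candidate site. All factors and records are hypothetical.
from collections import Counter, defaultdict

# Hypothetical expert-labelled examples of past siting decisions.
SITES = [
    {"population_density": "high", "existing_services": "few", "suitable": True},
    {"population_density": "high", "existing_services": "few", "suitable": True},
    {"population_density": "high", "existing_services": "few", "suitable": True},
    {"population_density": "high", "existing_services": "many", "suitable": False},
    {"population_density": "low", "existing_services": "few", "suitable": False},
    {"population_density": "low", "existing_services": "many", "suitable": False},
]


def train(records, label_key="suitable"):
    """Count label frequencies and per-label factor value frequencies."""
    label_counts = Counter(r[label_key] for r in records)
    factor_counts = defaultdict(Counter)  # (factor, label) -> value counts
    factor_values = defaultdict(set)      # factor -> set of observed values
    for record in records:
        for factor, value in record.items():
            if factor == label_key:
                continue
            factor_counts[(factor, record[label_key])][value] += 1
            factor_values[factor].add(value)
    return label_counts, factor_counts, factor_values


def posterior(model, evidence):
    """Return P(label | evidence) with Laplace-smoothed probability tables."""
    label_counts, factor_counts, factor_values = model
    total = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        p = n / total  # prior P(label)
        for factor, value in evidence.items():
            # Add-one smoothing avoids zero probabilities for unseen values.
            seen = factor_counts[(factor, label)][value]
            p *= (seen + 1) / (n + len(factor_values[factor]))
        scores[label] = p
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


if __name__ == "__main__":
    model = train(SITES)
    candidate = {"population_density": "high", "existing_services": "few"}
    print(posterior(model, candidate))
```

Re-running `train` as new labelled decisions arrive is one simple way to keep the parameters current; learning or adjusting the network structure itself would need a more general Bayesian network library.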
