This document discusses using Bayesian networks to evaluate DNA evidence in criminal and civil identification cases. It begins by providing background on Bayesian networks and their use in expert systems. It then discusses DNA databases in several European countries and how they differ in their entry criteria. The document analyzes a criminal case where DNA from a crime scene matches a suspect, showing how a Bayesian network can calculate the likelihood ratio to evaluate the hypotheses that the suspect is guilty or innocent. It also discusses how a Bayesian network could approach a civil identification problem involving a volunteer's DNA profile.
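The core calculation such a network performs can be sketched in a few lines. This is an illustrative sketch, not the document's actual network: the random-match probability `p` and the prior odds below are assumed values chosen for the example.

```python
# Likelihood ratio for a reported DNA match (illustrative sketch).
# Hp: the suspect is the source of the trace; Hd: an unrelated person is.
# P(E | Hp) = 1 (the suspect's profile matches by assumption);
# P(E | Hd) = p, the random-match probability of the profile.

def likelihood_ratio(random_match_prob: float) -> float:
    """LR = P(E|Hp) / P(E|Hd) for a reported DNA match."""
    return 1.0 / random_match_prob

def posterior_odds(prior_odds: float, lr: float) -> float:
    """Bayes' rule in odds form: posterior odds = prior odds * LR."""
    return prior_odds * lr

p = 1e-6                           # assumed random-match probability
lr = likelihood_ratio(p)           # LR of 1,000,000
post = posterior_odds(1 / 1000, lr)  # prior odds 1:1000 -> posterior 1000:1
```

A Bayesian network generalizes this two-node calculation to handle laboratory error, relatedness, and database-search scenarios, but the odds-form update above is the quantity it ultimately reports.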
This document discusses techniques for document classification using machine learning. It begins with an introduction to natural language processing and discusses challenges with manual document classification. It then covers approaches for vectorizing text using bag-of-words, TF-IDF, and Word2Vec models. Finally, it describes several machine learning algorithms for text classification, including Naive Bayes, support vector machines, and neural networks, focusing on their mathematical formulations and how they are applied to document classification tasks.
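The bag-of-words and TF-IDF weighting mentioned above can be sketched in pure Python. The toy corpus is invented for the example; real pipelines use a library vectorizer, but the arithmetic is the same.

```python
import math
from collections import Counter

# Minimal bag-of-words / TF-IDF sketch on a toy corpus (not from the document).
docs = ["the cat sat on the mat",
        "the dog barked at the cat",
        "stocks rallied on earnings news"]

tokenized = [d.split() for d in docs]

def tf(tokens):
    """Term frequency: count of each word divided by document length."""
    counts = Counter(tokens)
    return {w: counts[w] / len(tokens) for w in counts}

def idf(word):
    """Inverse document frequency: rarer words get higher weight."""
    df = sum(1 for toks in tokenized if word in toks)
    return math.log(len(tokenized) / df)

def tfidf(tokens):
    weights = tf(tokens)
    return {w: weights[w] * idf(w) for w in weights}

vec = tfidf(tokenized[0])
# "the" occurs in two documents (idf = log 3/2), "mat" in only one (idf = log 3),
# so "mat" gets the larger TF-IDF weight despite "the" being more frequent.
```

These sparse weight vectors are exactly what classifiers such as naive Bayes or an SVM consume in the approaches the document describes.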
- The document proposes a multi-view stacking ensemble method for drug-target interaction (DTI) prediction that combines predictions from multiple machine learning models trained on different drug and target feature view combinations.
- It generates 126 view combination datasets from 14 drug views and 9 target views, then trains extra trees, random forest, and XGBoost classifiers on each view combination. Predictions from these base models are then combined using a stacking ensemble with an extra trees meta-learner.
- The method is shown to outperform single models and voting ensembles, and calibration of the meta-learner and use of local imbalance measures provide further improvements to predictive performance on DTI prediction tasks.
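The stacking idea in the bullets above can be sketched abstractly: each (drug view, target view) base model emits a probability, and a meta-learner combines the columns. This is a toy, pure-Python stand-in; the paper's meta-learner is an extra-trees model, and the weights and scores below are assumptions.

```python
# Toy stacking sketch: rows are drug-target pairs, columns are hypothetical
# base-model probabilities, one per (drug view, target view) combination.
base_preds = [
    [0.9, 0.8, 0.7],   # pair 1
    [0.2, 0.1, 0.3],   # pair 2
    [0.6, 0.7, 0.5],   # pair 3
    [0.4, 0.2, 0.3],   # pair 4
]
labels = [1, 0, 1, 0]

def meta_learner(row, weights=(0.5, 0.3, 0.2), threshold=0.5):
    """Stand-in meta-learner: a weighted vote over base-model scores.
    (The paper trains an extra-trees meta-learner; these weights are
    illustrative assumptions.)"""
    score = sum(w * p for w, p in zip(weights, row))
    return 1 if score >= threshold else 0

stacked = [meta_learner(row) for row in base_preds]
accuracy = sum(p == y for p, y in zip(stacked, labels)) / len(labels)
```

The point of stacking over voting is visible even here: the meta-learner can weight reliable view combinations more heavily instead of counting every base model equally.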
Traffic analysis is a process of great importance when it comes to securing a network. This analysis can be carried out at different levels, and one of the most interesting is Deep Packet Inspection (DPI). DPI is a very effective way of monitoring the network, since it performs traffic inspection across most of the OSI model's layers (L3 to L7). Regular Expressions (RegExp), on the other hand, are used throughout computer science to turn a group of characters into a search pattern. Combined with a series of matching algorithms, this technique lets an analyst quickly locate a pattern within a text and even replace it with another value.
In this paper, we aim to show that Regular Expressions are considerably more productive and effective when used to create the matching rules needed in DPI. We design, test, and compare Regular Expression rules against the conventional methods. In addition, we present a case study of detecting the EternalBlue and DoublePulsar threats, in order to demonstrate the practical, real-world value of our proposal.
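A DPI matching rule of the kind discussed can be sketched with Python's `re` module on raw payload bytes. The pattern below is an illustrative assumption, not the paper's actual signature: it looks for the SMBv1 magic bytes followed by the Trans2 command byte (0x32), the transaction type abused by EternalBlue.

```python
import re

# Illustrative DPI-style rule (an assumption, not the paper's signature):
# SMBv1 protocol magic "\xffSMB" immediately followed by command 0x32
# (SMB_COM_TRANSACTION2), the command exploited by EternalBlue.
SMB_TRANS2_RULE = re.compile(b"\xffSMB\x32")

def inspect(payload: bytes) -> bool:
    """Return True if the payload matches the rule."""
    return SMB_TRANS2_RULE.search(payload) is not None

benign = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n"
suspect = b"\x00\x00\x00\x2f\xffSMB\x32\x00\x00\x00\x00\x18"
```

A production DPI engine compiles many such rules into a single automaton; the advantage the paper argues for is that each rule stays this short and auditable.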
UTILIZING XAI TECHNIQUE TO IMPROVE AUTOENCODER BASED MODEL FOR COMPUTER NETWO...IJCNCJournal
Machine learning (ML) and deep learning (DL) methods are being adopted rapidly, especially in computer network security, for tasks such as fraud detection, network anomaly detection, and intrusion detection. However, despite their impressive results, the lack of transparency of ML- and DL-based models is a major obstacle to their deployment, and they are often criticized for their black-box nature. Explainable Artificial Intelligence (XAI) is a promising area that can improve the trustworthiness of these models by explaining and interpreting their output. If the internal workings of ML- and DL-based models are understandable, that understanding can in turn help improve their performance. The objective of this paper is to show how XAI can be used to interpret the results of a DL model, in this case an autoencoder, and, based on that interpretation, to improve its performance for computer network anomaly detection. The kernel SHAP method, which is based on Shapley values, is used as a novel feature selection technique: it identifies only those features that actually cause the anomalous behaviour of the set of attack/anomaly instances. These feature sets are then used to train and validate the autoencoder, but on benign data only. The resulting SHAP_Model outperformed the other two models proposed on the basis of the feature selection method. The whole experiment is conducted on a subset of the CICIDS2017 network dataset. The overall accuracy and AUC of SHAP_Model are 94% and 0.969, respectively.
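The detection step behind an autoencoder trained on benign data only can be sketched independently of the model itself: an instance is flagged when its reconstruction error exceeds a threshold derived from benign validation errors. The numbers and the 3-sigma rule below are illustrative assumptions; the paper's exact thresholding is not specified here.

```python
import statistics

# Sketch of autoencoder-based anomaly detection: the model is trained on
# benign traffic, so benign instances reconstruct well (low error) while
# attacks reconstruct poorly (high error). Illustrative error values.
benign_errors = [0.10, 0.12, 0.09, 0.11, 0.13, 0.10, 0.12, 0.11]

mu = statistics.mean(benign_errors)
sigma = statistics.pstdev(benign_errors)
threshold = mu + 3 * sigma   # an assumed 3-sigma rule over benign errors

def is_anomaly(reconstruction_error: float) -> bool:
    """Flag an instance whose error exceeds the benign-derived threshold."""
    return reconstruction_error > threshold
```

The paper's contribution sits upstream of this step: kernel SHAP selects which input features the autoencoder sees, which changes the error distribution and hence the threshold's discriminative power.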
Neural Information Retrieval: In search of meaningful progressBhaskar Mitra
The emergence of deep learning based methods for search poses several challenges and opportunities, not just for modeling, but also for benchmarking and measuring progress in the field. Some of these challenges are new, while others have evolved from existing challenges in IR benchmarking, exacerbated by the scale at which deep learning models operate. Evaluation efforts such as the TREC Deep Learning track and the MS MARCO public leaderboard are intended to encourage research and track our progress, addressing big questions in our field. The goal is not simply to identify which run is "best" but to move the field forward by developing robust new techniques that work in many different settings and are adopted in research and practice. This entails a wider conversation in the IR community about what constitutes meaningful progress, how benchmark design can encourage or discourage certain outcomes, and about the validity of our findings. In this talk, I will present a brief overview of what we have learned from our work on MS MARCO and the TREC Deep Learning track, and reflect on the state of the field and the road ahead.
A Deep Learning Model to Predict Congressional Roll Call Votes from Legislati...mlaij
This document describes a deep learning model called the Predict Text Classification Network (PTCN) that was developed to predict the outcome (pass/fail) of congressional roll call votes based solely on the text of legislation. The PTCN uses a hybrid convolutional and long short-term memory neural network architecture to analyze legislative texts and predict whether a vote will pass or fail. The model was tested on legislative texts from 2000-2019 and achieved an average prediction accuracy of 67.32% using 10-fold cross-validation, suggesting it can recognize patterns in language that correlate with congressional voting behaviors.
Application of Boolean pre-algebras to the foundations of Computer ScienceMarcelo Novaes
Senior thesis
Field: Mathematical Logic
Supervisor: Steffen Lewitzka
University: Universidade Federal da Bahia (UFBA)
Abstract:
"Increasing the expressiveness of a logical system is a goal of many fields in Computer Science, such as Formal Systems, knowledge construction, Linguistics, Universal Logic, and Model Theory. This increase in expressiveness can be achieved through non-Fregean Logic, a non-classical logic. In non-Fregean Logic, formulas with the same truth value can have different denotations or meanings (also called situations). This breaks the Frege Axiom, hence the name non-Fregean Logic. Recently, it was shown that there is an equivalence between Boolean pre-algebras and models of non-Fregean Logic, a fact that linked fields which were already using Boolean pre-algebras to represent their semantic models. In this thesis, this equivalence is investigated and applications are presented in the fields of Modal Logic, Truth Theory, Logic with Quantifiers, and Epistemic Logic."
The full thesis can be found at http://repositorio.ufba.br/ri/handle/ri/1938
User Identity Linkage: Data Collection, DataSet Biases, Method, Control and A...IIIT Hyderabad
Online Social Networks (OSNs) are popular platforms for online users. Users typically register and maintain accounts (user identities) across different OSNs to share a variety of content and stay connected with their friends. Consequently, linking user identities across OSN platforms, referred to as user identity linkage (UIL), becomes a critical problem. Solving it enables a more comprehensive view of a user's activities across OSNs, which is highly beneficial for targeted advertisements, recommendations, and many other applications. In this thesis, we propose approaches for analyzing data collection methods, investigating biases in identity linkage datasets, linking user identities across social networks, controllability of user identity linkage, and applying user identity linkage solutions to related problems.
The document proposes a new approach to compare stock market patterns to DNA sequences using compression techniques. Stock market data is converted to binary sequences representing increases and decreases, which are then encoded into DNA nucleotides. These nucleotide sequences are divided and matched against human genome sequences using BLAST. The analysis found certain sub-sequences of the stock market patterns matched 100% to the human genome, suggesting this approach could potentially predict stock market behavior.
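The binary-to-nucleotide encoding step can be sketched concretely. The exact mapping is a hypothetical assumption (the document does not specify it here): daily movements become bits (1 = increase, 0 = decrease), and each pair of bits becomes one nucleotide.

```python
# Hypothetical encoding of price movements into a DNA-like sequence.
# 2-bit pairs map to the four nucleotides (mapping assumed for illustration).
PAIR_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}

def movements_to_dna(prices):
    """Encode a price series as nucleotides via up/down movement bits."""
    bits = "".join("1" if b > a else "0" for a, b in zip(prices, prices[1:]))
    bits = bits[: len(bits) - len(bits) % 2]   # drop a trailing odd bit
    return "".join(PAIR_TO_BASE[bits[i:i+2]] for i in range(0, len(bits), 2))

prices = [100, 101, 99, 102, 103, 101, 100]
# movements: 1 0 1 1 0 0 -> pairs 10, 11, 00 -> "GTA"
```

Sequences produced this way are what would then be submitted to BLAST for matching against genome databases.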
This document provides an overview of Bayesian statistics and its applications in data mining for pharmacovigilance. It discusses how Bayesian statistics uses probability to represent subjective degrees of belief and incorporates prior knowledge. It also describes how the Bayesian Confidence Propagation Neural Network (BCPNN) is used as a hypothesis generating tool to identify unexpected drug-adverse event associations in large safety databases. The BCPNN calculates an Information Component value to indicate the strength of associations. Bayesian approaches are advantageous for pharmacovigilance as they allow incorporation of external evidence and handle missing data.
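The Information Component at the heart of the BCPNN can be illustrated with raw frequencies. This sketch uses made-up report counts and a plain frequency estimate; the actual BCPNN shrinks these estimates with Bayesian priors, which the sketch omits.

```python
import math

# Information Component (IC) sketch:
#   IC = log2( P(drug, event) / (P(drug) * P(event)) )
# estimated with raw frequencies from illustrative counts. A positive IC
# means the drug-event pair is reported more often than independence predicts.

def information_component(n_joint, n_drug, n_event, n_total):
    p_joint = n_joint / n_total
    p_drug = n_drug / n_total
    p_event = n_event / n_total
    return math.log2(p_joint / (p_drug * p_event))

# Toy database: 10,000 reports; drug in 100, event in 200, together in 20.
# Independence would predict 100 * 200 / 10000 = 2 joint reports; 20 observed.
ic = information_component(20, 100, 200, 10_000)   # log2(10) ≈ 3.32
```

The BCPNN raises a signal when the credible interval of this IC excludes zero, which is how unexpected drug-adverse event associations are surfaced.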
This document discusses applying a neural network approach to decision making in a self-organizing computing network (SOCN). It proposes using concepts from fuzzy logic and neural networks to build a computing network that can handle mixed data types, like symbolic and numeric data. The network would have input, hidden, and output layers connected by transfer functions. The hidden cells would self-organize based on training data to learn relationships between input and output cells. This approach aims to allow the network to make decisions on data sets with diverse attribute types in a more effective way than other techniques.
On the identifiability of phylogenetic networks under a pseudolikelihood modelArrigo Coen
This document summarizes research on the identifiability of phylogenetic networks under a pseudolikelihood model. It presents two main results: 1) Hybridization cycles of size 4 or more nodes are detectable from concordance factors, while cycles of size 2 nodes are undetectable. Cycles of size 3 may be detectable under certain conditions. 2) Numerical parameters can be estimated for hybridization cycles of size 4 or more nodes, but not for cycles of size 3 nodes or less. The document discusses the implications of these results for using pseudolikelihood estimation to model evolution involving hybridization.
San Francisco Crime Analysis Classification Kaggle contestSameer Darekar
This document presents a project to analyze and predict crime in San Francisco using data mining techniques. The objectives are to analyze the spatial and temporal relationships of crime, predict the category of crime in a location based on variables like location and date, and suggest safest paths between places. The authors describe using naive Bayes, decision tree, random forest and support vector machine classifiers on a dataset of over 878,000 crime incidents to classify crimes and identify patterns. Cross-validation is used to evaluate the classifiers on the training data. The results are intended to help the police department understand crime patterns and deploy resources more efficiently.
Uncertainty classification of expert systems a rough set approachEr. rahul abhishek
In this paper, we discuss the uncertainty classification of expert systems using a rough set approach, a soft computing technique with which we classify the types of expert systems. An expert system has a unique structure, different from traditional programs: it is divided into two parts, one fixed and independent of the particular expert system (the inference engine) and one variable (the knowledge base). To run an expert system, the engine reasons over the knowledge base much as a human would. In the 1980s a third part appeared: a dialog interface for communicating with users, an ability that was later called "conversational". Rough set theory is a technique for dealing with uncertainty.
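The rough set machinery referred to above reduces to two set operations: the lower approximation (equivalence classes wholly inside the target set) and the upper approximation (classes that merely intersect it). The toy objects and attribute below are invented for illustration, not taken from the paper.

```python
# Minimal rough-set sketch: lower/upper approximations of a target set
# under the equivalence relation induced by one attribute. Toy data.
objects = {
    "s1": "rule-based", "s2": "rule-based",
    "s3": "fuzzy", "s4": "fuzzy",
    "s5": "neural",
}
target = {"s1", "s2", "s3"}   # e.g. systems judged to fall in one class

# Equivalence classes: objects grouped by their attribute value.
classes = {}
for obj, attr in objects.items():
    classes.setdefault(attr, set()).add(obj)

lower = {o for c in classes.values() if c <= target for o in c}
upper = {o for c in classes.values() if c & target for o in c}
boundary = upper - lower   # non-empty boundary => the target set is "rough"
```

Here the `fuzzy` class straddles the target, so `s3` and `s4` land in the boundary region: exactly the uncertainty that rough set theory makes explicit.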
An information-theoretic, all-scales approach to comparing networksJim Bagrow
My presentation at NetSci 2018 on Portrait Divergence, a new approach to comparing networks that is simple, general-purpose, and easy to interpret.
The preprint: https://arxiv.org/abs/1804.03665
The code: https://github.com/bagrow/portrait-divergence
The Systematic Methodology for Accurate Test Packet Generation and Fault Loca...IOSRjournaljce
As networks today are broadly distributed, administrators rely on various tools, such as ping and traceroute, to troubleshoot network problems. We therefore propose an automated and systematic approach for testing and debugging networks called Automatic Test Packet Generation (ATPG). ATPG first reads router configurations and generates a device-independent model. The model is used to generate a minimum number of test packets that cover every link and every forwarding rule in the network. ATPG can investigate both functional and performance problems. Test packets are sent at regular intervals, and a separate mechanism is used to localize faults. We also describe a few existing offline tools that automatically generate test packets, but ATPG goes beyond this prior work in static checking (verifying liveness and fault localization).
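The "minimum number of test packets covering every link" step is an instance of set cover, which is commonly approximated greedily. The topology and candidate packets below are hypothetical, and greedy selection is a standard heuristic sketch, not necessarily ATPG's exact algorithm.

```python
# Greedy set-cover sketch for minimal test-packet selection.
# Each candidate packet exercises a set of links; choose few packets
# whose union covers every link. Hypothetical topology.
links = {"A-B", "B-C", "C-D", "B-D", "A-C"}
candidate_packets = {
    "p1": {"A-B", "B-C", "C-D"},
    "p2": {"A-C", "C-D"},
    "p3": {"B-D"},
    "p4": {"A-B", "B-D"},
}

def greedy_cover(universe, candidates):
    """Repeatedly pick the packet covering the most still-uncovered links."""
    uncovered, chosen = set(universe), []
    while uncovered:
        best = max(candidates, key=lambda p: len(candidates[p] & uncovered))
        if not candidates[best] & uncovered:
            raise ValueError("some links cannot be covered by any packet")
        chosen.append(best)
        uncovered -= candidates[best]
    return chosen

test_packets = greedy_cover(links, candidate_packets)
```

Sending only this cover at regular intervals keeps probing overhead low while still exercising every link, which is the efficiency argument behind ATPG's design.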
This document discusses using artificial neural networks for fault detection and diagnosis of power systems. It presents neural networks as a suitable approach for power system fault diagnosis due to their ability to generalize, learn online, and operate in real-time. The document describes a case study where a neural network was trained on data from a model power system to detect and diagnose 53 different fault types based on relay and circuit breaker status inputs. The neural network was able to accurately diagnose faults, distinguish between single and multiple faults, and showed graceful degradation when diagnosis was imperfect.
Gears are among the most essential and most commonly used power transmission components, and are critical for operating machines at varying loads and speeds. When the load exceeds a certain limit, gear teeth frequently fail. Compared to metallic gears, composite materials offer significantly better mechanical properties, such as a higher strength-to-weight ratio and increased hardness, and hence a lower risk of failure. In this work, Al6063 and SiC were used to build a metal matrix composite for spur gear production.
Sound plays a crucial part in every element of human life, and it is a key component in the development of automated systems across a variety of domains, from personal security to critical monitoring. A few such systems exist on the market today, but their efficiency is a concern for use in real-world circumstances. Sound classification follows the same pattern as image classification: features are extracted from the input and machine learning algorithms are applied to categorize it.
Obesity has been linked to stroke, depression, and cancer, which are among the most serious threats to human life. Heart disease, stroke, obesity, and type II diabetes are all disorders that affect our way of life. Using data mining and machine learning approaches to predict disease from patients' treatment histories and health data has been a challenge for decades.
FAKE NEWS DETECTION WITH SEMANTIC FEATURES AND TEXT MININGijnlc
Nearly 70% of people are concerned about the propagation of fake news. This paper aims to detect fake news in online articles through the use of semantic features and various machine learning techniques. In this research, we compared recurrent neural networks against naive Bayes and random forest classifiers using five groups of linguistic features. Evaluated on a real-or-fake news dataset from kaggle.com, the best performing model achieved an accuracy of 95.66% using bigram features with the random forest classifier. The fact that bigrams outperform unigrams, trigrams, and quadgrams shows that word pairs, as opposed to single words or longer phrases, best indicate the authenticity of news.
FAKE NEWS DETECTION WITH SEMANTIC FEATURES AND TEXT MININGkevig
This document summarizes a research paper that aims to detect fake news articles through machine learning techniques. It compares random forest, naive bayes, and recurrent neural network classifiers using linguistic features including n-grams and word embeddings. The best performing model was random forest with bigram features, achieving 95.66% accuracy on a real or fake news dataset. Bigrams were found to outperform other n-grams as indicators of news authenticity.
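Bigram feature extraction, the feature family that performed best in both summaries above, is simple to sketch in pure Python; the headline below is an invented example.

```python
from collections import Counter

# Bigram (word-pair) feature extraction sketch; toy input text.
def bigrams(text):
    """Return the list of adjacent word pairs in the text."""
    words = text.lower().split()
    return [" ".join(pair) for pair in zip(words, words[1:])]

headline = "scientists discover shocking secret they hate"
features = Counter(bigrams(headline))
# The pair "shocking secret" becomes a single feature, which a
# random-forest classifier can weigh as one unit.
```

This hints at why bigrams can beat unigrams: clickbait-style phrasing lives in word pairs ("shocking secret") that single-word counts dilute.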
FAKE NEWS DETECTION WITH SEMANTIC FEATURES AND TEXT MININGijnlc
Nearly 70% of people are concerned about the propagation of fake news. This paper aims to detect fake news in online articles through the use of semantic features and various machine learning techniques. In this research, we investigated recurrent neural networks vs. the naive bayes classifier and random forest classifiers using five groups of linguistic features. Evaluated with real or fake dataset from kaggle.com, the best performing model achieved an accuracy of 95.66% using bigram features with the random forest classifier. The fact that bigrams outperform unigrams, trigrams, and quadgrams show that word pairs as opposed to single words or phrases best indicate the authenticity of news.
FAKE NEWS DETECTION WITH SEMANTIC FEATURES AND TEXT MININGkevig
This document summarizes a research paper that aims to detect fake news articles through machine learning techniques. It compares random forest, naive bayes, and recurrent neural network classifiers using linguistic features including n-grams and word embeddings. The best performing model was random forest with bigram features, achieving 95.66% accuracy on a real or fake news dataset. Bigrams were found to outperform other n-grams as indicators of news authenticity.
Similar to Criminal and Civil Identification with DNA Databases Using Bayesian Networks (20)
Level 3 NCEA - NZ: A Nation In the Making 1872 - 1900 SML.pptHenry Hollis
The History of NZ 1870-1900.
Making of a Nation.
From the NZ Wars to Liberals,
Richard Seddon, George Grey,
Social Laboratory, New Zealand,
Confiscations, Kotahitanga, Kingitanga, Parliament, Suffrage, Repudiation, Economic Change, Agriculture, Gold Mining, Timber, Flax, Sheep, Dairying,
How to Setup Warehouse & Location in Odoo 17 InventoryCeline George
In this slide, we'll explore how to set up warehouses and locations in Odoo 17 Inventory. This will help us manage our stock effectively, track inventory levels, and streamline warehouse operations.
How to Make a Field Mandatory in Odoo 17Celine George
In Odoo, making a field required can be done through both Python code and XML views. When you set the required attribute to True in Python code, it makes the field required across all views where it's used. Conversely, when you set the required attribute in XML views, it makes the field required only in the context of that particular view.
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPRAHUL
This Dissertation explores the particular circumstances of Mirzapur, a region located in the
core of India. Mirzapur, with its varied terrains and abundant biodiversity, offers an optimal
environment for investigating the changes in vegetation cover dynamics. Our study utilizes
advanced technologies such as GIS (Geographic Information Systems) and Remote sensing to
analyze the transformations that have taken place over the course of a decade.
The complex relationship between human activities and the environment has been the focus
of extensive research and worry. As the global community grapples with swift urbanization,
population expansion, and economic progress, the effects on natural ecosystems are becoming
more evident. A crucial element of this impact is the alteration of vegetation cover, which plays a
significant role in maintaining the ecological equilibrium of our planet.Land serves as the foundation for all human activities and provides the necessary materials for
these activities. As the most crucial natural resource, its utilization by humans results in different
'Land uses,' which are determined by both human activities and the physical characteristics of the
land.
The utilization of land is impacted by human needs and environmental factors. In countries
like India, rapid population growth and the emphasis on extensive resource exploitation can lead
to significant land degradation, adversely affecting the region's land cover.
Therefore, human intervention has significantly influenced land use patterns over many
centuries, evolving its structure over time and space. In the present era, these changes have
accelerated due to factors such as agriculture and urbanization. Information regarding land use and
cover is essential for various planning and management tasks related to the Earth's surface,
providing crucial environmental data for scientific, resource management, policy purposes, and
diverse human activities.
Accurate understanding of land use and cover is imperative for the development planning
of any area. Consequently, a wide range of professionals, including earth system scientists, land
and water managers, and urban planners, are interested in obtaining data on land use and cover
changes, conversion trends, and other related patterns. The spatial dimensions of land use and
cover support policymakers and scientists in making well-informed decisions, as alterations in
these patterns indicate shifts in economic and social conditions. Monitoring such changes with the
help of Advanced technologies like Remote Sensing and Geographic Information Systems is
crucial for coordinated efforts across different administrative levels. Advanced technologies like
Remote Sensing and Geographic Information Systems
9
Changes in vegetation cover refer to variations in the distribution, composition, and overall
structure of plant communities across different temporal and spatial scales. These changes can
occur natural.
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
Criminal and Civil Identification with DNA Databases Using Bayesian Networks
Marina Andrade & Manuel Alberto M. Ferreira
International Journal of Security (IJS), Volume 3, Issue 4: 65
Criminal and Civil Identification with DNA Databases Using
Bayesian Networks
Marina Andrade marina.andrade@iscte.pt
Department of Quantitative Methods
ISCTE – Lisbon University Institute
Lisbon, 1649-026, Portugal
Manuel Alberto M. Ferreira manuel.ferreira@iscte.pt
Department of Quantitative Methods
ISCTE – Lisbon University Institute
Lisbon, 1649-026, Portugal
Abstract
Forensic identification problems are examples in which the study of DNA profiles
is a common approach. Here we present some problems and develop their
treatment, focusing on the use of Object-Oriented Bayesian Networks (OOBN).
The use of DNA databases, which began in 1995 in England, has raised new
challenges about their use. In Portugal the legislation for the construction of a
genetic database was defined in 2008, so it is important to determine how to use
it in an appropriate way.
For a crime that has been committed, forensic laboratories identify genetic
characteristics in order to connect one or more individuals to it. Beyond the
laboratory results, it is a matter of great importance to quantify the information
obtained, i.e., to know how to evaluate and interpret the results, providing
support to the judicial system. Other forensic identification problems concern
body identification: the identification of one or more bodies found, together with
information on missing persons belonging to one or more known families, for
which there may be profiles of the family members who reported the
disappearance. In this work we discuss how to use the database, the hypotheses
of interest, and how the database is used to determine likelihood ratios, i.e., how
to evaluate the evidence in different situations.
Keywords: Bayesian networks, DNA profiles, identification problems.
1. INTRODUCTION
The use of networks transporting probabilities began with the geneticist Sewall Wright at the
beginning of the 20th century (1921). Since then their use has taken different forms in several
areas: in the social sciences and economics, where the models used are, in general, linear and
are named Path Diagrams or Structural Equation Models (SEM); and in artificial intelligence,
where the models are usually non-linear and are named Bayesian networks, also called
Probabilistic Expert Systems (PES), [11], [14].
Bayesian networks are graphical structures for representing the probabilistic relationships among
a large number of variables and for doing probabilistic inference with those variables, [13]. Before
we approach the use of Bayesian networks in the problems of interest, we briefly discuss, in
section 2, some aspects of PES in connection with uncertainty problems.
In section 3, after presenting some possible forensic identification problems, the creation and use
of DNA profile databases in some European countries is discussed, focusing on the entry criteria
and the differences observed. How to approach, evaluate and interpret the results, whether in a
criminal or a civil identification problem, is presented in section 4. For that, it is important to
describe the Portuguese law establishing the principles for maintaining a DNA database file for
civil and criminal identification purposes. The study of a criminal identification problem with a
single perpetrator, and of a civil identification problem with one volunteer, in two different
situations, is presented.
2. EXPERT SYSTEMS
Expert systems are attempts to crystallize and codify the knowledge and skills of one or more
experts into a tool that can be used by non-specialists, [14]. An expert system can be
decomposed as follows:
Expert system = knowledge base + Inference engine.
The first term on the right-hand side of the equation, the knowledge base, refers to the specific
knowledge domain of the problem. The inference engine is a set of algorithms which process the
codification of the knowledge base jointly with any specific information known for the application
under study.
Usually an expert system is presented as a software program, like the one we are going to show
hereafter, but that is not an imperative rule. Each of those parts is important for the inferences,
but the knowledge base is crucial. The inferences obtained depend naturally on the quality of the
knowledge base, in association, of course, with a sophisticated inference engine. The better
those parts are, the better the results we can get.
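The decomposition above can be sketched in a few lines. The rules and facts below are hypothetical, purely for illustration: a knowledge base of if-then rules and a forward-chaining inference engine that fires them against known facts.

```python
# Minimal expert-system sketch: knowledge base + inference engine.
# Rules and facts are hypothetical, not taken from the paper.

knowledge_base = [
    # (premises, conclusion): if all premises are known facts, conclude.
    ({"stain_found", "profile_match"}, "suspect_linked"),
    ({"suspect_linked", "no_alibi"}, "refer_to_court"),
]

def infer(facts, rules):
    """Forward chaining: fire rules until no new conclusion is produced."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts
```

Calling `infer({"stain_found", "profile_match", "no_alibi"}, knowledge_base)` fires both rules in turn, adding `suspect_linked` and then `refer_to_court` to the set of known facts.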
A PES is a representation of a complex probability structure by means of a directed acyclic graph,
having a node for each variable, and directed links describing probabilistic causal relationships
between variables, [1]. The Bayesian approach is the adequate one for making inferences in
probabilistic expert systems.
2.1 Bayesian networks
Bayesian networks are graphical representations expressing qualitative relationships of
dependence and independence between variables. A Bayesian network is a directed acyclic
graph (DAG) G with a set V of vertices or nodes and directed arrows. Each node υ ∈ V
represents a random variable X_υ with a set of possible values or states. The arrows connecting
the nodes describe conditional probability dependencies between the variables.
The set of parents, pa(υ), of a node υ comprises all the nodes in the graph with arrows ending in
υ. The probability structure is completed by specifying the conditional probability distribution of
each random variable X_υ given each possible configuration x_pa(υ) of the variables associated
with its parent nodes. The joint distribution is then
p(x) = ∏_{υ ∈ V} p(x_υ | x_pa(υ)).
There are algorithms to transform the network into a new graphical representation, named a
junction tree of cliques, so that the conditional probability p(x_υ | x_A) can be computed
efficiently for every υ ∈ V, any set of nodes A ⊆ V, and any configuration x_A of the nodes X_A.
The nodes in the conditioning set A are generally nodes of observation and input of evidence
X_A = x_A, or they may specify hypotheses being assumed.
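The factorization and the conditional query p(x_υ | x_A) can be made concrete with a minimal sketch. The two-node network below (A → B) and its probability tables are hypothetical; the posterior is obtained by brute-force enumeration, which is exactly what junction-tree algorithms compute efficiently on large networks.

```python
# Hypothetical two-variable Bayesian network A -> B with binary states.
p_a = {"t": 0.3, "f": 0.7}                       # p(A)
p_b_given_a = {"t": {"t": 0.9, "f": 0.1},        # p(B | A = t)
               "f": {"t": 0.2, "f": 0.8}}        # p(B | A = f)

def joint(a, b):
    """Joint probability from the DAG factorization: p(a, b) = p(a) p(b | a)."""
    return p_a[a] * p_b_given_a[a][b]

def posterior_a_given_b(b):
    """p(A | B = b) by enumeration over A (Bayes' theorem)."""
    weights = {a: joint(a, b) for a in p_a}
    z = sum(weights.values())                    # normalizing constant p(B = b)
    return {a: w / z for a, w in weights.items()}
```

For instance, `posterior_a_given_b("t")` returns p(A = t | B = t) = 0.27/0.41 ≈ 0.659, since p(t, t) = 0.3 × 0.9 = 0.27 and p(f, t) = 0.7 × 0.2 = 0.14.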
Software such as Hugin¹ can be used to build the Bayesian network through the graph G. That
can be done by specifying the graph nodes, their space of states and the conditional probabilities
p(x_υ | x_pa(υ)). In the compiling process the software constructs its internal junction tree
representation. Then, by entering the evidence X_A = x_A at the nodes in A, and requesting its
propagation to the remaining nodes in the network, the conditional probabilities p(x_υ | x_A) are
obtained. Version 6.4 of Hugin and upgrades allow the graphical use of OOBN.
OOBN are one example of the general class of Bayesian networks. An instance or object is a
regular network possessing input and output nodes as well as ordinary internal nodes. The
interface nodes have grey fringes, with the input nodes exhibiting a dotted line and the output
nodes a solid line. The instances of a given class have identical conditional probability tables for
non-input nodes. The objects are connected by directed links from output to input nodes. The
links represent identification of nodes. We use bold face to refer to the object classes and math
mode to refer to the nodes. The modular flexibility of the OOBN structure is of great advantage in
complex cases.
3. FORENSIC IDENTIFICATION PROBLEMS
The use of DNA profiles in forensic identification problems has become, in recent years, an
almost routine procedure in many different situations. Among those are: 1) disputed paternity
problems, in which it is necessary to determine whether the putative father of a child is or is not
the true father; 2) criminal cases, e.g. whether a certain individual Y was the origin of a stain
found at the scene of a crime, or, in more complex cases, whether one or more individuals
contributed to a mixture trace; 3) civil identification problems, i.e., the identification of a body,
together with the information of a missing person belonging to a known family, or the
identification of more than one body resulting from a disaster or an attack, and even immigration
cases in which it is important to establish family relations.
Here the focus is on the civil and criminal identification problems. The establishment and use of
DNA database files in a great number of European countries worked as a motivation to study in
more detail the mentioned problems and the use of these database files in identification.
The use of a DNA profile database may allow the identification of delinquents and/or the
connection of criminal conducts to the respective individuals, the exclusion of innocents, as well
as the recognition and civil identification of missing people. A genetic profile database may be an
important help in forensic investigation, particularly in crimes of repetitive tendency, when DNA
samples of convicted individuals are collected. In the context of civil identification it may be very
useful when unidentified corpses appear and may be identified by comparison of their DNA
profiles with family volunteers' profiles.
3.1 DNA database files
The discovery of biological fingerprints in 1984 opened new perspectives in the forensic
identification area. Since then, the technical advances and results obtained have allowed studying
and solving increasingly complex problems. Almost twenty-five years after the discovery by Alec
Jeffreys' team, Portugal established the legislation for the construction of a genetic database,
law n.º 5/2008.
The advances in DNA technology and knowledge opened new perspectives for civil and criminal
investigation. Apart from the ethical and legal questions that are in the domain of the legal
system, we should draw some attention to the experience and knowledge acquired by those
countries that already have their own databases operating and try to learn from them ways to
improve on how to operate with the Portuguese database, [5].
¹ http://www.hugin.com – OOBN is a resource available in the Hugin 6.4 software.
Country              Year  Entry criteria for suspects (convicted offenders)
England              1995  Any recordable offence
Austria              1997  Any recordable offence
Croatia/Switzerland  2000  Any recordable offence
Germany              1998  > 1 year in prison (after court decision)
Finland              1999  > 1 year in prison
Denmark              2000  > 1.5 years in prison
Norway               1999  Many serious offences (after court decision)
Hungary              2003  5 years in prison
Sweden               2000  No suspects entered (> 2 years convicted)
Belgium              2002  No suspects entered (after court decision)
Netherlands          1997  No suspects entered (> 4 years convicted)
France               2001  No suspects entered (serious offences, voluntary samples only)
Spain                1998  Phoenix program – civil database for civil ident. vol. donations
Portugal             2008  Vol., “problem samples”, “reference samples” (≥ 3 years convicted)
Italy                -     Law in preparation

TABLE 1: National DNA databases in Europe.

There has been considerable discussion about which individuals to include in a DNA profile
database, especially since countries with different legal systems have reached different results.
The main differences are linked to the emphasis each country gives to the individual or to the
social order.
In TABLE 1 it is possible to observe significant differences between the European countries in the
criteria for entering a person into a database (as a suspect or only after conviction, and for
different types of conviction). The criteria for removing records and the number of entries in the
database also differ considerably across Europe.
As can be seen, there are clear differences between countries in the north and in the south of
Europe, essentially due to the different weight given to the social order or to the individual.

4. CRIMINAL AND CIVIL IDENTIFICATION USING DNA PROFILE DATABASES

Let us consider a criminal case in which a DNA profile has been recovered from a crime scene,
and it is admitted to have been left by the culprit (a single perpetrator); and a civil identification
problem with one volunteer giving his/her genetic information.
The Portuguese law n.º 5/2008 establishes the principles for creating and maintaining a database
of DNA profiles for identification purposes, and regulates the collection, processing and
conservation of samples of human cells, their analysis and the collection of DNA profiles, the
methodology for comparison of DNA profiles taken from the samples, and the processing and
storage of the information in a computer file.
Here it is assumed that the database is composed of: a file containing information from samples
of convicted offenders sentenced to 3 years of imprisonment or more – α; a file containing the
information from samples of volunteers – β; and a file containing information on the "problem
samples" or "reference samples" from corpses, or parts of corpses, or from things or places
where the authorities collect samples – γ.
4.1 Criminal identification – one single perpetrator
For a crime that has been committed, forensic laboratories identify genetic characteristics in order
to connect one or more individuals to the crime. Beyond the laboratory results, it is a matter of
great importance to quantify the information obtained, i.e., to know how to evaluate and interpret
the results, providing support to the judicial decision. Here it is assumed that a DNA trace found
at the scene of a crime was left by a single perpetrator.
“The experience with some databases seems to indicate that before the commitment of a serious
crime, some suspects have already been involved in minor offences. This fact, associated with
the repetitive motif of some serious crimes, can support the importance of DNA databases not
only for the criminal investigation but also for the prevention of crime, mainly if there are large
entry criteria.”, [4].
Some notation: let C_c be the genetic characteristic (DNA profile) found at the crime scene, and
C_s the suspect's genetic characteristic. The evidence is E = (C_c, C_s), the DNA typing results
for the crime sample and the suspect. Our interest is to discuss how to evaluate the hypotheses
and the odds ratio. In court the hypotheses are: H_P: The suspect (s) left the crime stain vs.
H_D: Some other person left the crime stain.
There is a match between the crime scene profile and the suspect’s profile. The court wants to
compare the two preceding hypotheses. It is important to discuss the presentation of the
evidence in court and how to evaluate the hypotheses of interest. The posterior odds are:
P(H_P | E, s ∈ α) / P(H_D | E, s ∈ α)
  = [P(E | H_P, s ∈ α) / P(E | H_D, s ∈ α)] × [P(H_P | s ∈ α) / P(H_D | s ∈ α)]
  = LR × [P(H_P | s ∈ α) / P(H_D | s ∈ α)],
where LR = P(C_c, C_s | H_P, s ∈ α) / P(C_c, C_s | H_D, s ∈ α).
The likelihood ratio, LR, takes the form:
LR = P(C_c, C_s | H_P, s ∈ α) / P(C_c, C_s | H_D, s ∈ α)
   = [P(C_c | C_s, H_P, s ∈ α) / P(C_c | C_s, H_D, s ∈ α)] × [P(C_s | H_P, s ∈ α) / P(C_s | H_D, s ∈ α)].
Whether or not the suspect left the crime sample does not change our uncertainty about his/her
genetic characteristic or genotype, i.e., P(C_s | H_P, s ∈ α) = P(C_s | H_D, s ∈ α). Therefore

LR = P(C_c | C_s, H_P, s ∈ α) / P(C_c | C_s, H_D, s ∈ α).
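For a single marker with a match and no typing error, P(C_c | C_s, H_P, s ∈ α) = 1, so the ratio above reduces to the reciprocal of the random match probability. A minimal sketch under Hardy-Weinberg equilibrium, with hypothetical allele frequencies and no sub-population correction:

```python
# Hypothetical allele frequencies for one STR marker (illustration only).
freq = {"A": 0.05, "B": 0.10}

def match_probability(genotype):
    """P(Cc | Cs, HD): probability a random person has this genotype (HWE)."""
    x, y = genotype
    return freq[x] ** 2 if x == y else 2 * freq[x] * freq[y]

def likelihood_ratio(genotype):
    """LR = P(Cc | Cs, HP) / P(Cc | Cs, HD) = 1 / match probability, given a match."""
    return 1.0 / match_probability(genotype)
```

For the heterozygote (A, B) this gives LR = 1 / (2 × 0.05 × 0.10) = 100; in practice the per-marker LRs are multiplied across independent markers.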
In a case with a trace from a single perpetrator one can use, for each marker, the object-oriented
Bayesian network (OOBN) shown in Figure 1. Each node (instance) in the network is itself a
Bayesian network. Nodes spg and smg are of class founder, a network with only one node,
whose states are the alleles in the problem with the respective frequencies in the population;
they represent the suspect's paternal and maternal inheritance. Nodes sgt and cgt are of class
genotype. The genotype of an individual is an unordered pair of alleles inherited from the
paternal (pg) and maternal (mg) genes, here represented by gtmin := min{pg, mg} and
gtmax := max{pg, mg}, where pg and mg are input nodes identical to the gene node of founder.
The nodes cmg and cpg specify whether the corresponding allele is or is not from the suspect. If
s=c? has the value true then the true perpetrator's allele will be identical to the suspect's allele;
otherwise the true perpetrator's allele is chosen randomly from another man in the population.
The single node s=c? represents the binary query 'Is the suspect the perpetrator?'
FIGURE 1: Network for a criminal case with a single perpetrator.
Here, if the suspect, s, is in the file of convicted offenders then P(H_P | s ∈ α) > P(H_D | s ∈ α);
otherwise those probabilities may be assumed equal. Therefore, the posterior odds are:

If s is in the file of convicted offenders:
posterior odds = [1 / P(C_c | C_s, H_D, s ∈ α)] × [P(H_P | s ∈ α) / P(H_D | s ∈ α)]
               > 1 / P(C_c | C_s, H_D, s ∈ α),
since the prior odds exceed 1. Otherwise:
posterior odds = 1 / P(C_c | C_s, H_D, s ∈ α).
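Numerically, the two cases differ only in the prior odds that multiply the likelihood ratio. A minimal sketch with hypothetical values:

```python
def posterior_odds(lr, prior_odds=1.0):
    """Posterior odds = likelihood ratio x prior odds (Bayes' theorem in odds form)."""
    return lr * prior_odds

lr = 100.0                                     # hypothetical single-marker LR
in_file = posterior_odds(lr, prior_odds=2.0)   # s in file: prior odds > 1 (illustrative value)
not_in_file = posterior_odds(lr)               # otherwise: prior odds assumed equal to 1
```

With these hypothetical numbers the posterior odds are 200 when the suspect is in the convicted-offenders file and 100 otherwise; the evidence itself (the LR) is the same in both cases.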
We need to assess P(H_P | s ∈ α) and P(H_D | s ∈ α). The computation of the posterior odds is
in the judges' domain; we can only explain how to do it and how to interpret the evidence, i.e.,
how to use it as a decision support element.
4.2 Civil identification – one missing person and one volunteer
A missing individual is reported and a body is found. The hypotheses are: H_P: The body found is
the body of the claimed individual X vs. H_D: The body found is that of some other individual, not
the claimed individual X. People want their relatives alive and reject the hypothesis that their lost
relative is dead. A volunteer supplies genetic material to be used in the test of a partial match.
The evidence is E = (C_BF, C_vol) – the genetic characteristics of the body found, C_BF, and of
the volunteer, C_vol. The posterior odds are

P(H_P | E, vol ∈ β, γ) / P(H_D | E, vol ∈ β, γ)
  = [P(E | H_P, vol ∈ β, γ) / P(E | H_D, vol ∈ β, γ)] × [P(H_P | vol ∈ β, γ) / P(H_D | vol ∈ β, γ)].
One may assume P(H_P | vol ∈ β, γ) = P(H_D | vol ∈ β, γ); then

P(H_P | E, vol ∈ β, γ) / P(H_D | E, vol ∈ β, γ) = LR
  = P(C_BF, C_vol | H_P, vol ∈ β, γ) / P(C_BF, C_vol | H_D, vol ∈ β, γ)
  = [P(C_BF | C_vol, H_P, vol ∈ β, γ) / P(C_BF | C_vol, H_D, vol ∈ β, γ)]
    × [P(C_vol | H_P, vol ∈ β, γ) / P(C_vol | H_D, vol ∈ β, γ)].
In the last factor the conditioning does not include the information of the body found: whether or
not the volunteer is related to the individual whose body was found does not change our
uncertainty about the volunteer's genotype, so that factor equals 1.
Depending on the family relation between the volunteer and the claimed individual we may only
observe a partial match with one volunteer. Apart from that, it is important to check whether there
is a match between C_BF and any of the "problem samples" in γ. Assuming no match of C_BF
with any sample in γ, the likelihood ratio may be written as

LR = P(C_BF | C_vol, H_P) / P(C_BF | H_D).
In a case with only one volunteer the likelihood ratio can be computed using a Bayesian network.
Suppose we have a missing individual, an elderly person, and a son or a daughter who reported
the disappearance and gives a genetic profile voluntarily. The likelihood ratio can be computed
using the following network:
FIGURE 2: Network for civil identification with one volunteer, paternal or maternal relationship case.
As above, nodes ancpg and ancmg are of class founder, a network with only one node, whose
states are the alleles in the problem with the respective frequencies in the population; they
represent the paternal and maternal inheritance of the volunteer's ancestor. Nodes volgt and bfgt
are of class genotype, the genotypes of the volunteer and of the body found.
The nodes tancmg and tancpg specify whether the corresponding allele is or is not from the
volunteer's ancestor. If bf_match_vol? has the value true then the allele of the body found will be
identical to the ancestor's allele. The node bfancg defines the Mendelian inheritance by which the
allele of the individual whose body was found is chosen at random from the ancestor's paternal
and maternal genes.
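The parent-child computation just described can also be sketched by direct enumeration over the ancestor's possible genotypes, assuming Hardy-Weinberg equilibrium, a single marker, and hypothetical allele frequencies (the untyped other parent's allele is drawn from the population):

```python
from itertools import combinations_with_replacement

# Hypothetical allele frequencies for one STR marker (illustration only).
freq = {"A": 0.1, "B": 0.3, "C": 0.6}
alleles = list(freq)

def hw(g):
    """Hardy-Weinberg probability of unordered genotype g = (x, y)."""
    x, y = g
    return freq[x] ** 2 if x == y else 2 * freq[x] * freq[y]

def p_child_given_parent(child, parent):
    """P(child genotype | one parent's genotype); the other allele is from the population."""
    total = 0.0
    for passed in parent:                  # the parent passes each of its alleles w.p. 1/2
        for other in alleles:              # the untyped parent contributes a random allele
            if tuple(sorted((passed, other))) == tuple(sorted(child)):
                total += 0.5 * freq[other]
    return total

def lr_parent_child(body, vol):
    """LR = P(C_BF | C_vol, H_P) / P(C_BF | H_D) when the body is the volunteer's parent."""
    genotypes = list(combinations_with_replacement(sorted(alleles), 2))
    # Posterior over the true parent's genotype given the volunteer (the child):
    joint = {g: p_child_given_parent(vol, g) * hw(g) for g in genotypes}
    norm = sum(joint.values())
    numerator = joint[tuple(sorted(body))] / norm      # P(C_BF | C_vol, H_P)
    return numerator / hw(tuple(sorted(body)))         # denominator: random person (HWE)
```

An LR of zero means the body's genotype is incompatible with being the volunteer's parent for that marker (no shared allele); per-marker LRs would again be multiplied across independent markers.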
For a case with only one volunteer, but now, for example, a brother or a sister of the missing
individual being searched for, the likelihood ratio may be computed using the following Bayesian
network:
FIGURE 3: Network for civil identification with one volunteer, brother or sister relationship case.
Nodes ancpmg and ancppg are of class founder and represent the paternal and maternal
inheritance of the volunteer's ancestors. That inheritance passes to nodes volpg and volmg,
which form the volunteer's genotype. Nodes volgt and bfgt are of class genotype, the genotypes
of the volunteer and of the body found. The remaining nodes are the same as those presented in
the previous problem.
5. CONCLUSION & FUTURE WORK
Connecting an individual with a crime on the basis of a profile match may be dangerous because
the database may contain undetected errors. In order to avoid a misclassification with DNA from
the database it is important to perform, at least, a second and independent analysis.
After computing the likelihood ratio, whether in a criminal case or in a civil identification case, it is
possible to compute the posterior odds, i.e., to multiply the likelihood ratio by the prior odds, in
order to perform a comparative evaluation of the prosecution and defense hypotheses.
The database file α is a subset of the population set P, α ⊂ P. If the size of the database file is
small, then it may contain only a small fraction of the possible offenders. It is important to take
that into account; this topic should be considered in future work.
Whether in criminal or in civil identification, in many situations the evidence may involve more
than one individual. That must also be considered in future work. Furthermore, in civil
identification problems an important issue is to study how to compute the likelihood ratios when
there is a match or a partial match between the genetic characteristic of the individual whose
body was found and the file of "problem samples" and "reference samples", γ.
6. REFERENCES
1. A. P. Dawid, J. Mortera, V. L. Pascali, D. W. Van Boxel. “Probabilistic expert systems for
forensic inference from genetic markers”. Scandinavian Journal of Statistics, 29:577-595,
2002
2. B. P. Battula, K. Rani, S. Prasad, T. Sudha. "Techniques in Computer Forensics: A
Recovery Perspective". International Journal of Security, Volume 3, Issue 2:27-35, 2009
3. David J. Balding. ”The DNA database controversy”. Biometrics, 58(1):241-244, 2002
4. F. Corte-Real. “Forensic DNA databases”. Forensic Science International, 146s:s143-s144,
2004
5. G. Skinner. “Multi-Dimensional Privacy Protection for Digital Collaborations”. International
Journal of Security, Volume 1, Issue 1:22-31, 2007
6. I. Evett and B. S. Weir. “Interpreting DNA Evidence: Statistical Genetics for Forensic
Scientists”, Sinauer Associates, Inc. (1998)
7. M. Andrade, M. A. M. Ferreira. “Bayesian networks in forensic identification problems”.
Journal of Applied Mathematics. Volume 2, number 3, 13-30, 2009
8. M. Andrade, M. A. M. Ferreira, J. A. Filipe. “Evidence evaluation in DNA mixture traces”.
Journal of Mathematics and Allied Fields (Scientific Journals International-Published online).
Volume 2, issue 2, 2008
9. M. Andrade, M. A. M. Ferreira, J. A. Filipe., M. Coelho. “Paternity dispute: is it important to be
conservative?”. Aplimat – Journal of Applied Mathematics. Volume 1, number 2, 2008
10. M. Guillén, M. V. Lareu, C. Pestoni, A. Salas and A. Carracedo. "Ethical-legal problems of
DNA databases in criminal investigation". Journal of Medical Ethics, 26:266-271, 2000
11. M. N. Anyanwu, S. Shiva. “Comparative Analysis of Serial Decision Tree
Classification Algorithms”. International Journal of Computer Science and Security,
Volume 3, Issue 3:230-240, 2009
12. P. Martin. “National DNA databases – practice and practability. A forum for discussion”. In
International Congress Series 1261, 1-8
13. R. E. Neapolitan. “Learning Bayesian networks” , Pearson Prentice Hall, (2004)
14. R. G. Cowell, A. P. Dawid, S. L. Lauritzen, D. J. Spiegelhalter. “Probabilistic Expert Systems”,
Springer, New York, (1999).