1.
Computational Intelligence for Data Mining Włodzisław Duch Department of Informatics Nicholas Copernicus University Torun, Poland With help from R. Adamczak, K. Grąbczewski, K. Grudziński, N. Jankowski, A. Naud http://www.phys.uni.torun.pl/kmk WCCI 2002, Honolulu, HI
3.
Plan <ul><li>What is this tutorial about? </li></ul><ul><li>How to discover knowledge in data; </li></ul><ul><li>how to create comprehensible models of data; </li></ul><ul><li>how to evaluate new data. </li></ul><ul><li>AI, CI & Data Mining </li></ul><ul><li>Forms of useful knowledge </li></ul><ul><li>GhostMiner philosophy </li></ul><ul><li>Exploration & Visualization </li></ul><ul><li>Rule-based data analysis </li></ul><ul><li>Neurofuzzy models </li></ul><ul><li>Neural models </li></ul><ul><li>Similarity-based models </li></ul><ul><li>Committees of models </li></ul>
4.
AI, CI & DM <ul><li>Artificial Intelligence: symbolic models of knowledge. </li></ul><ul><li>Higher-level cognition: reasoning, problem solving, planning, heuristic search for solutions. </li></ul><ul><li>Machine learning, inductive, rule-based methods. </li></ul><ul><li>Technology: expert systems. </li></ul><ul><li>Computational Intelligence, Soft Computing: </li></ul><ul><li>methods inspired by many sources: </li></ul><ul><li>biology – evolutionary, immune, neural computing </li></ul><ul><li>statistics, pattern recognition </li></ul><ul><li>probability – Bayesian networks </li></ul><ul><li>logic – fuzzy, rough … </li></ul><ul><li>Perception, object recognition. </li></ul><ul><li>Data Mining, Knowledge Discovery in Databases. </li></ul><ul><li>discovery of interesting patterns, rules, knowledge. </li></ul><ul><li>building predictive data models. </li></ul>
5.
Forms of useful knowledge <ul><li>AI/Machine Learning camp: </li></ul><ul><li>Neural nets are black boxes. </li></ul><ul><li>Unacceptable! Symbolic rules forever. </li></ul><ul><li>But ... knowledge accessible to humans is in: </li></ul><ul><li>symbols, </li></ul><ul><li>similarity to prototypes, </li></ul><ul><li>images, visual representations. </li></ul><ul><li>What type of explanation is satisfactory? </li></ul><ul><li>Interesting question for cognitive scientists. </li></ul><ul><li>Different answers in different fields. </li></ul>
6.
Forms of knowledge <ul><li>Humans remember examples of each category and refer to such examples – as similarity-based or nearest-neighbors methods do. </li></ul><ul><li>Humans create prototypes out of many examples – as Gaussian classifiers, RBF networks, neurofuzzy systems do. </li></ul><ul><li>Logical rules are the highest form of summarization of knowledge. </li></ul><ul><li>Types of explanation: </li></ul><ul><li>exemplar-based: prototypes and similarity; </li></ul><ul><li>logic-based: symbols and rules; </li></ul><ul><li>visualization-based: maps, diagrams, relations ... </li></ul>
7.
GhostMiner Philosophy <ul><li>GhostMiner, data mining tools from our lab. </li></ul><ul><li>Separate the process of model building and knowledge discovery from model use => GhostMiner Developer & GhostMiner Analyzer </li></ul><ul><li>There is no free lunch – provide different types of tools for knowledge discovery: decision tree, neural, neurofuzzy, similarity-based, committees. </li></ul><ul><li>Provide tools for visualization of data. </li></ul><ul><li>Support the process of knowledge discovery, model building and evaluation, organizing it into projects. </li></ul>
8.
Wine data example <ul><li>Chemical analysis of wine from grapes grown in the same region in Italy, but derived from three different cultivars. Task: recognize the source of a wine sample. 13 quantities measured, continuous features: </li></ul><ul><li>alcohol content </li></ul><ul><li>ash content </li></ul><ul><li>magnesium content </li></ul><ul><li>flavanoids content </li></ul><ul><li>proanthocyanins content </li></ul><ul><li>OD280/OD315 of diluted wines </li></ul><ul><li>malic acid content </li></ul><ul><li>alkalinity of ash </li></ul><ul><li>total phenols content </li></ul><ul><li>nonflavanoid phenols content </li></ul><ul><li>color intensity </li></ul><ul><li>hue </li></ul><ul><li>proline. </li></ul>
9.
Exploration and visualization <ul><li>General info about the data </li></ul>
10.
Exploration: data <ul><li>Inspect the data </li></ul>
11.
Exploration: data statistics <ul><li>Distribution of feature values </li></ul>Proline has very large values, the data should be standardized before further processing.
12.
Exploration: data standardized <ul><li>Standardized data: unit standard deviation, about 2/3 of all data should fall within [mean-std,mean+std] </li></ul>Other options: normalize to fit in [-1,+1], or normalize rejecting some extreme values.
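A minimal sketch of this standardization step in NumPy; the values below are made-up stand-ins for features with very different scales (like proline vs. hue), not the actual wine data:

```python
import numpy as np

def standardize(X):
    """Z-score standardization: zero mean, unit standard deviation per feature."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    return (X - mean) / std

# Toy data: first column has very large values, second is near 1.
X = np.array([[1680.0, 1.04],
              [ 735.0, 1.05],
              [1450.0, 0.86],
              [1290.0, 1.23]])
Z = standardize(X)
# After standardization both columns have mean ~0 and std ~1,
# so roughly 2/3 of the values fall within [-1, +1].
```

The alternatives mentioned on the slide (rescaling to [-1, +1], or rescaling after rejecting extreme values) only change the `mean`/`std` terms to min/max or robust estimates.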
13.
Exploration: 1D histograms <ul><li>Distribution of feature values in classes </li></ul>Some features are more useful than others.
14.
Exploration: 1D/3D histograms <ul><li>Distribution of feature values in classes, 3D </li></ul>
15.
Exploration: 2D projections <ul><li>Projections on selected pairs of features (2D). </li></ul>
16.
Visualize data <ul><li>Hard to imagine relations in more than 3D. </li></ul>SOM mappings: popular for visualization, but rather inaccurate, with no measure of distortions. Measure of topographical distortions: map all points X_i from R^n to points x_i in R^m, m &lt; n, and ask: how well are the distances R_ij = D(X_i, X_j) reproduced by the distances r_ij = d(x_i, x_j)? Use m = 2 for visualization, higher m for dimensionality reduction.
17.
Visualize data: MDS <ul><li>Multidimensional scaling: invented in psychometry by Torgerson (1952), re-invented by Sammon (1969) and myself (1994) … </li></ul>Minimize the measure of topographical distortion by moving the x_i coordinates.
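A sketch of this idea, minimizing a Sammon-style stress over the 2D coordinates with SciPy; the data here are random toy points, and this is a simplified stand-in, not the actual GhostMiner implementation:

```python
import numpy as np
from scipy.optimize import minimize

def mds_stress(flat_x, D, m=2):
    """Sammon-style stress: mismatch between original distances D
    and the distances in the m-dimensional embedding."""
    x = flat_x.reshape(-1, m)
    r = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    iu = np.triu_indices(len(x), k=1)          # each pair once
    return np.sum((D[iu] - r[iu]) ** 2 / D[iu]) / np.sum(D[iu])

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 5))                   # 10 points in R^5
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
x0 = rng.normal(size=10 * 2)                   # random 2D start
res = minimize(mds_stress, x0, args=(D,), method="L-BFGS-B")
# Moving the 2D coordinates reduces the topographical distortion.
```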
18.
Visualize data: Wine <ul><li>3 clusters are clearly distinguished, 2D is fine. </li></ul>The green outlier can be identified easily.
19.
Decision trees <ul><li>Simplest things first: use decision tree to find logical rules. </li></ul>Test single attribute, find good point to split the data, separating vectors from different classes. DT advantages: fast, simple, easy to understand, easy to program, many good algorithms.
20.
Decision borders <ul><li>Univariate trees: test the value of a single attribute x < a . </li></ul>Multivariate trees: test on combinations of attributes. Result: feature space is divided in hyperrectangular areas.
21.
SSV decision tree <ul><li>Separability Split Value tree: based on the separability criterion. </li></ul>Define left and right sides of the splits. SSV criterion: separate as many pairs of vectors from different classes as possible; minimize the number of separated pairs from the same class.
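A toy sketch of this criterion for a single continuous feature; the simple "different-class pairs minus same-class pairs" score below is a simplified stand-in for the full SSV definition:

```python
def ssv_score(values, labels, threshold):
    """Separability of a split: pairs from different classes separated
    by the threshold, minus separated pairs from the same class."""
    left = [l for v, l in zip(values, labels) if v < threshold]
    right = [l for v, l in zip(values, labels) if v >= threshold]
    diff = sum(1 for a in left for b in right if a != b)
    same = sum(1 for a in left for b in right if a == b)
    return diff - same

# Hypothetical 1D feature: class 0 has low values, class 1 high values.
vals = [1.0, 1.2, 1.4, 3.0, 3.2, 3.5]
labs = [0, 0, 0, 1, 1, 1]
best = max(((t, ssv_score(vals, labs, t)) for t in [1.3, 2.0, 3.1]),
           key=lambda p: p[1])
# Threshold 2.0 separates all 9 different-class pairs and no same-class pairs.
```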
22.
SSV – complex tree <ul><li>Trees may always learn to achieve 100% accuracy. </li></ul>Very few vectors are left in the leaves!
23.
SSV – simplest tree <ul><li>Pruning finds the nodes that should be removed to increase generalization – accuracy on unseen data. </li></ul>Trees with 7 nodes left: 15 errors/178 vectors.
24.
SSV – logical rules <ul><li>Trees may be converted to logical rules. </li></ul><ul><li>Simplest tree leads to 4 logical rules: </li></ul><ul><li>if proline > 719 and flavanoids > 2.3 then class 1 </li></ul><ul><li>if proline < 719 and OD280 > 2.115 then class 2 </li></ul><ul><li>if proline > 719 and flavanoids < 2.3 then class 3 </li></ul><ul><li>if proline < 719 and OD280 < 2.115 then class 3 </li></ul>How accurate are such rules? Not 15/178 errors, i.e. 91.5% accuracy, on the whole dataset! Run 10-fold CV and average the results: about 85±10%. Run the whole CV 10 times!
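The CV bookkeeping can be sketched as follows. Note that in a real experiment the tree, and hence the rules, would be re-grown on each training fold; here the rules are fixed and the samples are hypothetical vectors constructed to match them, not the real wine data:

```python
import numpy as np

def rule_classify(proline, flavanoids, od280):
    """The four logical rules extracted from the simplest SSV tree."""
    if proline > 719:
        return 1 if flavanoids > 2.3 else 3
    return 2 if od280 > 2.115 else 3

def cv_accuracy(X, y, predict, k=10, seed=0):
    """k-fold cross-validation: mean and std of per-fold accuracy."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    accs = [np.mean([predict(*X[i]) == y[i] for i in fold])
            for fold in np.array_split(idx, k)]
    return float(np.mean(accs)), float(np.std(accs))

# Hypothetical samples consistent with the rules: (proline, flavanoids, OD280)
X = np.array([[800, 3.0, 2.5], [600, 1.0, 2.5],
              [800, 1.0, 1.0], [600, 1.0, 1.0]] * 5)
y = np.array([1, 2, 3, 3] * 5)
mean_acc, std_acc = cv_accuracy(X, y, rule_classify, k=5)
```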
25.
SSV – optimal trees/rules <ul><li>Optimal: estimate how well rules will generalize. </li></ul><ul><li>Use stratified cross-validation for training; </li></ul><ul><li>use beam search for better results. </li></ul><ul><li>if OD280/OD315 > 2.505 and proline > 726.5 then class 1 </li></ul><ul><li>if OD280/OD315 < 2.505 and hue > 0.875 and malic-acid < 2.82 then class 2 </li></ul><ul><li>if OD280/OD315 > 2.505 and proline < 726.5 then class 2 </li></ul><ul><li>if OD280/OD315 < 2.505 and hue > 0.875 and malic-acid > 2.82 then class 3 </li></ul><ul><li>if OD280/OD315 < 2.505 and hue < 0.875 then class 3 </li></ul>Note: 6/178 errors, or 96.6% accuracy! Run 10-fold CV: results are 90.4 ± 6.1%. Run it 10 times!
26.
Logical rules <ul><li>Crisp logic rules: for continuous x use linguistic variables (predicate functions). </li></ul>s_k(x) ≡ True[X_k ≤ x ≤ X'_k], for example: small(x) = True{x | x < 1} medium(x) = True{x | x ∈ [1,2]} large(x) = True{x | x > 2} Linguistic variables are used in crisp (propositional, Boolean) logic rules: IF small-height(X) AND has-hat(X) AND has-beard(X) THEN (X is a Brownie) ELSE IF ... ELSE ...
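The linguistic variables above translate directly into predicate functions, e.g.:

```python
# Linguistic variables as predicate functions on a continuous x,
# matching the slide: small(x), medium(x), large(x).
small  = lambda x: x < 1
medium = lambda x: 1 <= x <= 2
large  = lambda x: x > 2

def linguistic(x):
    """Map a continuous value to its crisp linguistic label."""
    if small(x):
        return "small"
    if medium(x):
        return "medium"
    return "large"
```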
27.
Crisp logic decisions <ul><li>Crisp logic is based on rectangular membership functions: </li></ul>True/False values jump from 0 to 1. Step functions are used for partitioning of the feature space. Very simple hyper-rectangular decision borders. Severe limitation on the expressive power of crisp logical rules!
28.
Logical rules - advantages <ul><li>Logical rules, if simple enough, are preferable. </li></ul><ul><li>Rules may expose limitations of black box solutions. </li></ul><ul><li>Only relevant features are used in rules. </li></ul><ul><li>Rules may sometimes be more accurate than NN and other CI methods. </li></ul><ul><li>Overfitting is easy to control; rules usually have a small number of parameters. </li></ul><ul><li>Rules forever !? A logical rule about logical rules is: </li></ul>IF the number of rules is relatively small AND the accuracy is sufficiently high THEN rules may be an optimal choice.
29.
Logical rules - limitations <ul><li>Logical rules are preferred but ... </li></ul><ul><li>Only one class is predicted: p ( C i | X , M ) = 0 or 1; </li></ul><ul><li>such a black-and-white picture may be inappropriate in many applications. </li></ul><ul><li>A discontinuous cost function allows only non-gradient optimization. </li></ul><ul><li>Sets of rules are unstable: a small change in the dataset leads to a large change in the structure of complex sets of rules. </li></ul><ul><li>Reliable crisp rules may reject some cases as unclassified. </li></ul><ul><li>Interpretation of crisp rules may be misleading. </li></ul><ul><li>Fuzzy rules are not so comprehensible. </li></ul>
30.
How to use logical rules? <ul><li>Data has been measured with unknown error. Assume a Gaussian error distribution: </li></ul>x becomes a fuzzy number with a Gaussian membership function. A set of logical rules R is applied to such fuzzy input vectors: Monte Carlo simulations work for arbitrary systems => p ( C i | X ). Analytical evaluation of p ( C | X ) is based on the cumulative distribution function; the error function is practically identical to the logistic function (difference < 0.02).
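A minimal Monte Carlo sketch of p(C|X) for a single hypothetical crisp rule under Gaussian input uncertainty; the rule, threshold, and uncertainty s_x are made up for illustration:

```python
import numpy as np

def rule_class(x):
    """Hypothetical one-feature crisp rule: class 1 if x > 2, else class 2."""
    return 1 if x > 2.0 else 2

def p_class_mc(x, sx, n=100_000, seed=0):
    """Monte Carlo estimate of p(C|X): sample the Gaussian-blurred input
    and count how often each class fires."""
    rng = np.random.default_rng(seed)
    samples = x + sx * rng.normal(size=n)
    p1 = float(np.mean([rule_class(s) == 1 for s in samples]))
    return {1: p1, 2: 1.0 - p1}

# A measured x = 2.0 sits exactly on the rule boundary:
# with Gaussian uncertainty both classes get probability ~0.5,
# instead of the misleading crisp 0/1 answer.
p = p_class_mc(2.0, sx=0.5)
```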
31.
Rules - choices <ul><li>Simplicity vs. accuracy. </li></ul><ul><li>Confidence vs. rejection rate. </li></ul>p++ is a hit; p-+ a false alarm; p+- a miss. Specificity S-(M) = p(-|-) = p--/p- Sensitivity S+(M) = p(+|+) = p++/p+ Rejection rate R(M) = p+r + p-r = 1 - L(M) - A(M) Error rate L(M) = p+- + p-+ Accuracy (overall) A(M) = p++ + p--
32.
Rules – error functions <ul><li>The overall accuracy is a combination of sensitivity and specificity weighted by the a priori probabilities: </li></ul>A(M) = p+ S+(M) + p- S-(M) Optimization of rules for the C+ class: a large γ means no errors but a high rejection rate. E(M) = L(M) - A(M) = (p+- + p-+) - (p++ + p--) min_M E(M;γ) ⇔ min_M {(1+γ) L(M) + R(M)} Optimization with different costs of errors: min_M E(M;α) = min_M {α p+- + p-+} ROC (Receiver Operating Characteristic) curve: p++(p-+), hit rate as a function of the false-alarm rate.
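These quantities follow directly from the confusion counts; a sketch with toy counts and no rejected cases (so R(M) = 0), verifying that the overall accuracy is indeed the prior-weighted combination of sensitivity and specificity:

```python
def rates(tp, fn, fp, tn):
    """Confusion counts -> sensitivity, specificity, accuracy, error rate."""
    n = tp + fn + fp + tn
    p_plus, p_minus = (tp + fn) / n, (fp + tn) / n   # a priori probabilities
    sens = tp / (tp + fn)        # S+(M) = p(+|+)
    spec = tn / (tn + fp)        # S-(M) = p(-|-)
    acc  = (tp + tn) / n         # A(M)
    err  = (fn + fp) / n         # L(M); with no rejections R(M) = 1-L-A = 0
    # Overall accuracy = prior-weighted combination of sens and spec:
    assert abs(acc - (p_plus * sens + p_minus * spec)) < 1e-12
    return sens, spec, acc, err

sens, spec, acc, err = rates(tp=40, fn=10, fp=5, tn=45)
```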
33.
Fuzzification of rules <ul><li>Rule R_a(x) = {x > a} is fulfilled by a Gaussian input G_x with probability: </li></ul>p(R_a | G_x) ≈ σ(β(x − a)). The error function is approximated by the logistic function; assuming the error distribution σ(x)(1 − σ(x)), for s² = 1.7 it approximates the Gaussian to within 3.5%. Rule R_ab(x) = {b > x > a} is fulfilled by G_x with probability: p(R_ab | G_x) ≈ σ(β(x − a)) − σ(β(x − b)).
34.
Soft trapezoids and NN <ul><li>The difference of two sigmoids makes a soft trapezoidal membership function. </li></ul>Conclusion: fuzzy logic with σ(x − a) − σ(x − b) membership functions is equivalent to crisp logic + Gaussian input uncertainty.
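The soft trapezoid as a difference of two logistic sigmoids can be sketched directly; the interval [a, b] and the slope β below are illustrative values:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soft_trapezoid(x, a, b, beta=5.0):
    """Difference of two sigmoids: ~1 inside [a, b], ~0 outside,
    with slope softness controlled by beta."""
    return sigmoid(beta * (x - a)) - sigmoid(beta * (x - b))

# As beta grows, the membership function approaches the crisp
# rectangular window [a, b] used by crisp logic rules.
inside  = soft_trapezoid(1.5, a=1.0, b=2.0, beta=50.0)   # ~1
outside = soft_trapezoid(3.0, a=1.0, b=2.0, beta=50.0)   # ~0
edge    = soft_trapezoid(1.0, a=1.0, b=2.0, beta=50.0)   # ~0.5 at the border
```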
35.
Optimization of rules <ul><li>Fuzzy: large receptive fields, rough estimations. </li></ul><ul><li>G_x – uncertainty of inputs, small receptive fields. </li></ul>Minimization of the number of errors – difficult, non-gradient; but now Monte Carlo or analytical p ( C | X ; M ) is available. <ul><li>Gradient optimization works for a large number of parameters. </li></ul><ul><li>Parameters s_x are known for some features; use them as optimization parameters for the others! </li></ul><ul><li>Probabilities instead of 0/1 rule outcomes. </li></ul><ul><li>Vectors that were not classified by crisp rules now have non-zero probabilities. </li></ul>
36.
Mushrooms <ul><li>The Mushroom Guide: no simple rule for mushrooms; no rule like ‘leaflets three, let it be’ for Poisonous Oak and Ivy. </li></ul>8124 cases, 51.8% edible, the rest non-edible. 22 symbolic attributes, up to 12 values each, equivalent to 118 logical features, or 2^118 ≈ 3·10^35 possible input vectors. Odor: almond, anise, creosote, fishy, foul, musty, none, pungent, spicy. Spore print color: black, brown, buff, chocolate, green, orange, purple, white, yellow. Safe rule for edible mushrooms: odor = (almond ∨ anise ∨ none) ∧ spore-print-color ≠ green – 48 errors, 99.41% correct. This is why animals have such a good sense of smell! What does it tell us about odor receptors?
37.
Mushrooms rules <ul><li>To eat or not to eat, this is the question! Not any more ... </li></ul>A mushroom is poisonous if: R1) odor ≠ (almond ∨ anise ∨ none); 120 errors, 98.52% R2) spore-print-color = green; 48 errors, 99.41% R3) odor = none ∧ stalk-surface-below-ring = scaly ∧ stalk-color-above-ring ≠ brown; 8 errors, 99.90% R4) habitat = leaves ∧ cap-color = white; no errors! R1 + R2 are quite stable, found even with 10% of the data; R3 and R4 may be replaced by other rules, e.g.: R'3) gill-size = narrow ∧ stalk-surface-above-ring = (silky ∨ scaly) R'4) gill-size = narrow ∧ population = clustered Only 5 of 22 attributes used! Simplest possible rules? 100% in CV tests – the structure of this data is completely clear.
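Rules R1 and R2 applied to a dict of symbolic attributes; the records below are toy examples, not the actual UCI encoding:

```python
def poisonous(mushroom):
    """The two stable rules from the slide; remaining cases (which the
    full rule set R1-R4 covers) default to edible in this sketch."""
    if mushroom["odor"] not in ("almond", "anise", "none"):
        return True                      # R1: suspicious odor
    if mushroom["spore-print-color"] == "green":
        return True                      # R2: green spore print
    return False

safe = {"odor": "almond", "spore-print-color": "brown"}
bad  = {"odor": "foul",   "spore-print-color": "white"}
```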
38.
Recurrence of breast cancer <ul><li>Institute of Oncology, University Medical Center, Ljubljana. </li></ul>286 cases: 201 no recurrence (70.3%), 85 recurrence (29.7%). 9 symbolic features: age (9 bins), tumor-size (12 bins), nodes involved (13 bins), degree-malignant (1,2,3), area, radiation, menopause, node-caps. Example record: no-recurrence, 40-49, premeno, 25-29, 0-2, ?, 2, left, right_low, yes. Many systems tried, 65-78% accuracy reported. Single rule: IF nodes-involved ∉ [0,2] ∧ degree-malignant = 3 THEN recurrence ELSE no-recurrence – 77% accuracy; only trivial knowledge is in the data: highly malignant cancer involving many nodes is likely to strike back.
39.
Neurofuzzy system <ul><li>Feature Space Mapping (FSM) neurofuzzy system. </li></ul><ul><li>Neural adaptation, estimation of the probability density distribution (PDF) using a single-hidden-layer network (RBF-like) with nodes realizing separable functions: </li></ul>Fuzzy: the crisp (no/yes) value x is replaced by a degree of membership μ(x). Triangular, trapezoidal, Gaussian or other membership functions. Membership functions in many dimensions:
40.
FSM <ul><li>Rectangular functions: simple rules are created, many nearly equivalent descriptions of this data exist. </li></ul><ul><li>If proline > 929.5 then class 1 (48 cases, 45 correct </li></ul><ul><li>+ 2 recovered by other rules). </li></ul><ul><li>If color < 3.79285 then class 2 (63 cases, 60 correct) </li></ul><ul><li>Interesting rules, but overall accuracy is only 88±9% </li></ul>Initialize using clusterization or decision trees. Triangular & Gaussian f. for fuzzy rules. Rectangular functions for crisp rules. Between 9-14 rules with triangular membership functions are created; accuracy in 10xCV tests about 96±4.5% Similar results obtained with Gaussian functions.
41.
Prototype-based rules <ul><li>IF P = arg min_R D(X,R) THEN Class(X) = Class(P) </li></ul>P-rules have the above form: D(X,R) is a dissimilarity (distance) function, determining decision borders around prototype P. C-rules (crisp) are a special case of F-rules (fuzzy rules); F-rules are a special case of P-rules (prototype rules). P-rules are easy to interpret! IF X = You are most similar to P = Superman THEN You are in the Super-league. IF X = You are most similar to P = Weakling THEN You are in the Failed-league. “Similar” may involve different features or a different D(X,P).
42.
P-rules Euclidean distance leads to Gaussian fuzzy membership functions with product as the T-norm. Manhattan distance => μ(X;P) = exp{−|X−P|}. Various distance functions lead to different membership functions, e.g. data-dependent distance functions for symbolic data:
43.
Promoters DNA strings, 57 nucleotides, 53 + and 53 − samples tactagcaatacgcttgcgttcggtggttaagtatgtataatgcgcgggcttgtcgt Euclidean distance: symbolic s = a, c, t, g replaced by x = 1, 2, 3, 4. PDF distance: symbolic s = a, c, t, g replaced by p(s|+).
44.
P-rules New distance functions from information theory => interesting membership functions. Membership functions => new distance functions, with a local D(X,R) for each cluster. Crisp logic rules: use the L∞ norm: D_Ch(X,P) = ||X − P||_∞ = max_i W_i |X_i − P_i|; D_Ch(X,P) = const => rectangular contours. Chebyshev distance with threshold θ_P: IF D_Ch(X,P) ≤ θ_P THEN C(X) = C(P) is equivalent to a conjunctive crisp rule IF X_1 ∈ [P_1 − θ_P/W_1, P_1 + θ_P/W_1] ∧ … ∧ X_N ∈ [P_N − θ_P/W_N, P_N + θ_P/W_N] THEN C(X) = C(P)
45.
Decision borders Euclidean distance from 3 prototypes, one per class. Minkowski (α = 20) distance from 3 prototypes. Contours of D(P,X) = const and decision borders D(P,X) = D(Q,X).
47.
Neural networks <ul><li>MLP – Multilayer Perceptrons, most popular NN models. </li></ul><ul><li>Use soft hyperplanes for discrimination. </li></ul><ul><li>Results are difficult to interpret, complex decision borders. </li></ul><ul><li>Prediction, approximation: infinite number of classes. </li></ul><ul><li>RBF – Radial Basis Functions. </li></ul><ul><li>RBF with Gaussian functions are equivalent to fuzzy systems with Gaussian membership functions, but … </li></ul><ul><li>No feature selection => complex rules. </li></ul><ul><li>Other radial functions => not separable! </li></ul><ul><li>Use separable functions, not radial => FSM. </li></ul><ul><li>Many methods to convert MLP NN to logical rules. </li></ul>
48.
Rules from MLPs <ul><li>Why is it difficult? </li></ul>Multi-layer perceptron (MLP) networks: stack many perceptron units, performing threshold logic: M-of-N rule: IF ( M conditions of N are true ) THEN ... Problem: for N inputs the number of subsets is 2^N. Exponentially growing number of possible conjunctive rules.
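An M-of-N rule itself is trivial to evaluate with one threshold unit; the cost lies in re-expressing it as conjunctions, which needs C(N, M) terms. A sketch:

```python
def m_of_n(conditions, m):
    """M-of-N rule: true when at least m of the n conditions hold.
    A single perceptron with equal weights and threshold m realizes it;
    expanded into conjunctive rules it needs C(n, m) terms, hence the
    exponential blow-up mentioned on the slide."""
    return sum(bool(c) for c in conditions) >= m

# A 2-of-3 rule:
fires     = m_of_n([True, True, False], 2)    # two conditions hold
no_fire   = m_of_n([True, False, False], 2)   # only one holds
```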
49.
MLP2LN <ul><li>Converts MLP neural networks into a network performing logical operations (LN). </li></ul>Network structure: input layer; aggregation (better features); linguistic units (windows, filters); rule units (threshold logic); output layer (one node per class).
50.
MLP2LN training <ul><li>Constructive algorithm: add as many nodes as needed. </li></ul>Optimize the cost function: minimize errors + enforce zero connections + leave only +1 and -1 weights; this makes interpretation easy.
51.
L-units <ul><li>Create linguistic variables. </li></ul>Numerical representation for R-nodes: V_sk = (…) for s_k = low, V_sk = (…) for s_k = normal. L-units: 2 thresholds as adaptive parameters; logistic σ(x) or tanh(x) ∈ [−1,+1]. Soft trapezoidal functions change into rectangular filters (Parzen windows). 4 types, depending on the signs S_i. A product of bi-central functions is a logical rule, used by the IncNet NN.
52.
Iris example <ul><li>Network after training: </li></ul>iris setosa: q=1 (0,0,0;0,0,0;+1,0,0;+1,0,0) iris versicolor: q=2 (0,0,0;0,0,0;0,+1,0;0,+1,0) iris virginica: q=1 (0,0,0;0,0,0;0,0,+1;0,0,+1) Rules: If (x3 = s ∧ x4 = s) then setosa. If (x3 = m ∧ x4 = m) then versicolor. If (x3 = l ∧ x4 = l) then virginica. 3 errors only (98%).
53.
Learning dynamics Decision regions shown every 200 training epochs in x3, x4 coordinates; borders are optimally placed with wide margins.
54.
Thyroid screening <ul><li>Garavan Institute, Sydney, Australia </li></ul><ul><li>15 binary, 6 continuous features </li></ul><ul><li>Training: 93+191+3488 Validate: 73+177+3178 </li></ul><ul><li>Determine important clinical factors </li></ul><ul><li>Calculate the probability of each diagnosis. </li></ul>(Network diagram: clinical findings – age, sex, TSH, T3, TT4, T4U, TBG, … – feed hidden units producing the final diagnoses: normal, hyperthyroid, hypothyroid.)
55.
Thyroid – some results. Accuracy of diagnoses obtained with several systems – rules are accurate.
Method                     Rules/Features   Training %   Test %
MLP2LN optimized           4/6              99.9         99.36
CART/SSV Decision Trees    3/5              99.8         99.33
Best Backprop MLP          -/21             100          98.5
Naïve Bayes                -/-              97.0         96.1
k-nearest neighbors        -/-              -            93.8
56.
Psychometry <ul><li>Use CI to find knowledge, create Expert System. </li></ul><ul><li>MMPI (Minnesota Multiphasic Personality Inventory) psychometric test. </li></ul><ul><li>Printed forms are scanned or computerized version of the test is used. </li></ul><ul><li>Raw data: 550 questions, ex: I am getting tired quickly: Yes - Don’t know - No </li></ul><ul><li>Results are combined into 10 clinical scales and 4 validity scales using fixed coefficients. </li></ul><ul><li>Each scale measures tendencies towards hypochondria, schizophrenia, psychopathic deviations, depression, hysteria, paranoia etc. </li></ul>
60.
Psychometry: goal <ul><li>There is no simple correlation between single values and the final diagnosis. </li></ul><ul><li>Results are displayed in the form of a histogram, called ‘a psychogram’. Interpretation depends on the experience and skill of an expert and takes into account correlations between peaks. </li></ul>Goal: an expert system providing evaluation and interpretation of MMPI tests at an expert level. Problem: experts agree only about 70% of the time; alternative diagnoses and personality changes over time are important.
62.
Psychometric data <ul><li>1600 cases for women, the same number for men. </li></ul><ul><li>27 classes: norm, psychopathic, schizophrenia, paranoia, neurosis, mania, simulation, alcoholism, drug addiction, criminal tendencies, abnormal behavior due to ... </li></ul>Extraction of logical rules: 14 scales = features. Define linguistic variables and use FSM, MLP2LN, SSV – giving about 2-3 rules/class.
63.
Psychometric results 10-CV accuracy for FSM is 82-85%, for C4.5 it is 79-84%. Adding input uncertainty G_x of about 1.5% (best ROC) improves FSM results to 90-92%.
Method   Data   N. rules   Accuracy %   + G_x %
C4.5     ♀      55         93.0         93.7
C4.5     ♂      61         92.5         93.1
FSM      ♀      69         95.4         97.6
FSM      ♂      98         95.9         96.9
64.
Psychometric Expert <ul><li>Probabilities for different classes. For greater uncertainties more classes are predicted. </li></ul><ul><li>Fitting the rules to the conditions: </li></ul><ul><li>typically 3-5 conditions per rule, Gaussian distributions around measured values that fall into the rule interval are shown in green. </li></ul><ul><li>Verbal interpretation of each case, rule and scale dependent. </li></ul>
68.
Visualization <ul><li>Probability of classes versus input uncertainty. </li></ul><ul><li>Detailed input probabilities around the measured values vs. change in a single scale; changes over time define the ‘patient’s trajectory’. </li></ul><ul><li>Interactive multidimensional scaling: zooming in on the new case to inspect its similarity to other cases. </li></ul>
72.
Summary <ul><li>Computational intelligence methods: neural, decision trees, similarity-based & other, help to understand the data. </li></ul><ul><li>Understanding data: achieved by rules, prototypes, visualization. </li></ul><ul><li>Small is beautiful => simple is the best! </li></ul><ul><li>Simplest possible, but not simpler - regularization of models; accurate but not too accurate - handling of uncertainty; </li></ul><ul><li>high confidence, but not paranoid - rejecting some cases. </li></ul><ul><li>Challenges: </li></ul><ul><li>hierarchical systems, discovery of theories rather than data models, integration with image/signal analysis, reasoning in complex domains/objects, applications in bioinformatics, text analysis ... </li></ul>
73.
References <ul><li>Many papers and comparisons of results for numerous datasets are kept at: </li></ul><ul><li>http://www.phys.uni.torun.pl/kmk </li></ul><ul><li>See also my homepage at: </li></ul><ul><li>http://www.phys.uni.torun.pl/~duch </li></ul><ul><li>for this and other presentations and some papers. </li></ul>We are slowly getting there. All this and more is included in GhostMiner, data mining software (in collaboration with Fujitsu), just released … http://www.fqspl.com.pl/ghostminer/