This document discusses data mining and the RapidMiner tool. It defines data mining as a discipline that studies methods for extracting knowledge or finding patterns from large amounts of data. It outlines the CRISP-DM process for data mining, including data collection, preprocessing, modeling, evaluation, and knowledge. It describes common data preprocessing steps, modeling techniques such as classification, clustering, and association, and performance metrics. RapidMiner is presented as a popular open-source tool for visualizing the data mining process with an intuitive graphical user interface.
Webinar: Data Mining with RapidMiner | Universitas Budi Luhur
1. DATA MINING WITH RAPIDMINER
Dr. Achmad Solichin, M.T.I | Universitas Budi Luhur
Ngobrol Sore Santai (casual afternoon chat)
Via Zoom | Friday, 8 January 2021 @ 19.00 WIB
http://youtube.com/c/AchmadSo
2. What Is Data Mining?
A discipline that studies methods for extracting knowledge or finding patterns in large amounts of data.
4. THE DATA MINING PROCESS
  1. Dataset (Understand and Prepare the Data)
    DATA PREPROCESSING
    • Data Cleaning
    • Data Integration
    • Data Reduction
    • Data Transformation
  2. Data Mining Method (Choose a Method Suited to the Character of the Data)
    MODELING
    • Estimation
    • Prediction
    • Classification
    • Clustering
    • Association
  3. Knowledge (Understand the Model and the Relevant Knowledge)
    MODEL
    • Formula
    • Tree
    • Cluster
    • Rule
    • Correlation
  4. Evaluation (Analyze the Model and the Method's Performance)
    PERFORMANCE
    • Accuracy
    • Error Rate
    • Number of Clusters
    MODEL
    • Attributes/Factors
    • Correlation
    • Weights
6. DATA PREPROCESSING
Data cleaning
• Fill in missing values
• Smooth noisy data
• Identify or remove outliers
• Resolve inconsistencies
Data reduction
• Dimensionality reduction
• Numerosity reduction
• Data compression
Data transformation and discretization
• Normalization
• Concept hierarchy generation
Data integration
• Integration of multiple databases or files
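Two of the preprocessing steps listed above, filling in missing values and normalization, can be illustrated with a minimal Python sketch. This is not taken from the slides or from RapidMiner: the function names, the choice of mean imputation and min-max scaling, and the `ages` sample column are all assumptions made for illustration.

```python
from statistics import mean

def fill_missing_with_mean(values):
    """Data cleaning: replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def min_max_normalize(values):
    """Data transformation: rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical column with one missing value.
ages = [20, None, 40, 30]
cleaned = fill_missing_with_mean(ages)
normalized = min_max_normalize(cleaned)
print(cleaned)     # [20, 30, 40, 30]
print(normalized)  # [0.0, 0.5, 1.0, 0.5]
```

Mean imputation is only one of several ways to fill gaps; the median or a most-frequent value may suit skewed or categorical data better.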
7. DATA MINING METHODS
Classification
• Decision Tree / C4.5
• Naïve Bayes
• K-NN
• ID3
• etc.
Clustering
• K-Means
• K-Medoids
• DBSCAN
• Fuzzy C-Means
• etc.
Association
• Apriori / Association Rule
• FP-Growth
• etc.
Estimation and Forecasting
• Linear Regression
• Neural Network
• Support Vector Machine
• etc.
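As one concrete instance of the classification methods listed above, K-NN can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the RapidMiner operator: the function name `knn_predict`, the use of Euclidean distance, and the toy height/weight data are assumptions.

```python
from collections import Counter
from math import dist  # Euclidean distance, available since Python 3.8

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs.
    """
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (height_cm, weight_kg) -> clothing size label.
train = [
    ((150, 45), "S"), ((155, 50), "S"), ((160, 52), "S"),
    ((175, 70), "L"), ((180, 80), "L"), ((185, 85), "L"),
]
print(knn_predict(train, (158, 51)))  # S
print(knn_predict(train, (178, 76)))  # L
```

In practice the features would usually be normalized first, since K-NN is sensitive to attributes measured on different scales.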
8. EVALUATING DATA MINING MODELS
Classification
• Confusion Matrix: Accuracy
• ROC Curve: Area Under Curve (AUC)
• etc.
Clustering
• Davies–Bouldin index
• Dunn index
• etc.
Association
• Lift Ratio
• F-measure
• etc.
Estimation and Forecasting
• RMSE
• MSE
• MAPE
• etc.
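Several of the metrics listed above reduce to short formulas. The following Python sketch shows accuracy from a binary confusion matrix, MSE, RMSE, and MAPE for estimation, and the lift ratio for an association rule. The function names, argument layout, and sample numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import sqrt

def accuracy(tp, tn, fp, fn):
    """Accuracy from a binary confusion matrix: correct / all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error: the square root of MSE."""
    return sqrt(mse(actual, predicted))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100 * total / len(actual)

def lift(n_total, n_a, n_b, n_ab):
    """Lift of the rule A -> B: confidence(A -> B) divided by support(B)."""
    confidence = n_ab / n_a
    support_b = n_b / n_total
    return confidence / support_b

# Hypothetical confusion matrix counts.
print(accuracy(tp=40, tn=45, fp=5, fn=10))  # 0.85

# Hypothetical actual vs. predicted values.
actual = [100, 200, 400]
predicted = [110, 190, 380]
print(mse(actual, predicted))   # 200.0
print(rmse(actual, predicted))
print(mape(actual, predicted))

# Hypothetical counts: 100 transactions, 40 contain A, 50 contain B, 30 contain both.
print(lift(n_total=100, n_a=40, n_b=50, n_ab=30))  # 1.5
```

A lift above 1 suggests that A and B occur together more often than would be expected if they were independent.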