DATA MINING AND DATA WAREHOUSE
1. DATA MINING AND DATA WAREHOUSE
Satish Kumar (21n78e@gmail.com)
Architect, Urban Planner, Researcher, G.C.A, Lucknow,
Jamia Millia Islamia-Delhi, G.B.U-Greater Noida
UP-103, Statistical Technique and Computer Programming
2. Data Warehouse
• W.H. Inmon, widely regarded as the father of data warehousing, introduced the data warehouse concept.
• A data warehouse is the primary depository of an organization's historical
data, its corporate memory.
• It contains the raw material for management's decision support system.
• The critical factor leading to the use of a data warehouse is that a data analyst
can perform complex queries and analysis, such as data mining, on the information
without slowing down the operational systems.
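• A data warehouse separates two kinds of workload:
1. OLAP (On-line analytical processing): the analytical workload a data warehouse is
optimized for, with its own functional and performance requirements.
2. OLTP (On-line transaction processing): the applications traditionally supported by
the operational databases, whose requirements are quite different.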
• Machine learning builds computer systems that are able to improve their
performance in a given domain through experience.
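The difference between the two workloads can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the `sales` table, its columns, and the sample rows are hypothetical, and a single in-memory database stands in for both the operational system and the warehouse, which in practice are kept separate.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")

# OLTP-style work: many small writes, each committed as its own transaction.
orders = [
    ("North", 2022, 120.0),
    ("North", 2023, 150.0),
    ("South", 2022, 80.0),
    ("South", 2023, 95.0),
    ("South", 2023, 30.0),
]
for order in orders:
    with conn:  # the connection context manager commits on success
        conn.execute("INSERT INTO sales VALUES (?, ?, ?)", order)

# OLAP-style work: one read-only query that aggregates over the history.
olap_rows = conn.execute(
    "SELECT region, year, SUM(amount) FROM sales "
    "GROUP BY region, year ORDER BY region, year"
).fetchall()

for region, year, total in olap_rows:
    print(region, year, total)
```

Running the analytical query against a separate warehouse copy, rather than the live transactional tables, is what keeps the operational systems from slowing down.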
3. Data Mining
• It has been defined as "the nontrivial extraction of implicit, previously
unknown, and potentially useful information from data" and "the science of
extracting useful information from large data sets or databases".
• Data mining involves sorting through large amounts of data and picking
out relevant information.
• It is used by business intelligence organizations and financial analysts.
• It is increasingly used in the sciences to extract information from the enormous
data sets generated by modern experimental and observational methods.
Steps of the data mining process:
1. Data Cleaning: Removes noisy data and corrects inconsistencies and errors in
the data.
2. Data Integration: A data pre-processing technique that combines data from
multiple heterogeneous data sources into a coherent data store.
3. Data Selection: The data relevant to the analysis task is retrieved from the
complete database.
4. Data Transformation: Data is transformed or consolidated into forms appropriate
for mining by performing summary or aggregation operations.
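The four pre-processing steps above can be sketched in plain Python. The two sources (`shop_a`, `shop_b`), their field names, and the sample records are all hypothetical, and each step is reduced to its simplest form for illustration.

```python
from collections import defaultdict

# Two hypothetical sources with different field names and untidy values.
shop_a = [
    {"customer": "Asha", "city": "Lucknow ", "amount": "250"},
    {"customer": "Ravi", "city": "delhi", "amount": None},  # missing amount
]
shop_b = [
    {"name": "Meena", "town": "Delhi", "total": "400"},
    {"name": "Asha", "town": "lucknow", "total": "150"},
]

def clean(name, city, amount):
    """Step 1, cleaning: drop records with a missing amount, normalise text."""
    if amount is None:
        return None
    return {
        "customer": name.strip(),
        "city": city.strip().title(),
        "amount": float(amount),
    }

cleaned_a = [clean(r["customer"], r["city"], r["amount"]) for r in shop_a]
cleaned_b = [clean(r["name"], r["town"], r["total"]) for r in shop_b]

# Step 2, integration: combine both sources into one store with one schema.
integrated = [r for r in cleaned_a + cleaned_b if r is not None]

# Step 3, selection: keep only records relevant to the task (here, Lucknow).
selected = [r for r in integrated if r["city"] == "Lucknow"]

# Step 4, transformation: aggregate the amounts per customer.
totals = defaultdict(float)
for r in selected:
    totals[r["customer"]] += r["amount"]
totals = dict(totals)

print(totals)
```

Real pipelines usually repair or impute bad values rather than simply dropping them; dropping is used here only to keep the sketch short.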
4. Data Mining
5. Data Mining: Defined as extracting information from a
large set of data, that is, mining knowledge from the
large amounts of data held in a database.
6. Pattern Evaluation: In this step the results are analysed
to find whether certain kinds of patterns occur in the
data.
7. Knowledge Presentation: The summarised knowledge
and analysis report are presented.
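Steps 5 to 7 can be illustrated with a very small frequent-pair count. This is a sketch of one simple mining technique (co-occurrence counting with a support threshold), not the only way to mine data; the baskets and the `min_support` value are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, already pre-processed transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "milk", "jam"},
]

# Step 5, data mining: count how often each pair of items occurs together.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Step 6, pattern evaluation: keep pairs whose support (share of
# transactions containing the pair) reaches the chosen threshold.
min_support = 0.5
n = len(transactions)
patterns = {
    pair: count / n
    for pair, count in pair_counts.items()
    if count / n >= min_support
}

# Step 7, knowledge presentation: report the surviving patterns.
for pair, support in sorted(patterns.items(), key=lambda kv: -kv[1]):
    print(f"{pair[0]} + {pair[1]}: support {support:.2f}")
```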
Features of a data warehouse:
1. Subject-oriented: it provides data organised around subjects
(such as products and customers) rather than around the
organisation's ongoing operations.
2. Integrated: it combines data from heterogeneous sources,
which helps in better integration, interrelation and
analysis of data.
3. Time-variant: its data is collected for particular time
periods and includes historical records.
4. Non-volatile: past data is not deleted when
current data is added.
5. Data warehouse vs Data Mining
1. A classic data mining example is the fraudulent use of your credit card by someone
else in a location far from where you live: the credit card company is alerted to a
possible fraud because its data mining shows that you don't normally make
purchases in that city.
2. A data warehouse example is Facebook, which gathers all of your data (your
friends, your likes, who you stalk, etc.) and stores it in one central
repository.
3. Facebook wants to make sure that you see the most relevant ads, so it mines the
data in the data warehouse, extracting meaningful data and patterns
from the aggregated data.
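The credit card example can be sketched as a simple rule over a customer's purchase history. This is a deliberately naive illustration: the `is_unusual` helper, the `min_share` threshold, and the sample history are hypothetical, and real fraud detection relies on far richer models.

```python
from collections import Counter

# Hypothetical purchase history for one cardholder (city of each purchase).
history = ["Lucknow", "Lucknow", "Delhi", "Lucknow", "Lucknow", "Delhi"]

def is_unusual(city, past_cities, min_share=0.05):
    """Flag a purchase whose city makes up less than min_share of past purchases."""
    if not past_cities:
        return False  # no history, so nothing to compare against
    share = Counter(past_cities)[city] / len(past_cities)
    return share < min_share

print(is_unusual("Lucknow", history))  # a familiar city
print(is_unusual("Mumbai", history))   # a city never seen before
```

The history plays the role of the warehouse (stored, historical data), while the rule applied to it plays the role of the mined pattern.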
Examples in Urban Planning:
1. E-governance in social and economic life: E-governance is not merely
providing information about the various activities of a government to its citizens and
other organizations; it also enables citizens to communicate with the government
and participate in its decision-making process.
2. Traffic control and management: Infrastructure-based sensor systems
collect real-time data on important aspects of driver and traffic behaviour, vehicle
emissions, pollutant dispersion and concentration, and human exposure.