Data extraction is the first stage in data analysis. It requires business executives, skilled data specialists, and a good ETL data extraction tool or an automated script. Reach us at info@in2inglobal.com or call 095525 54566.
Data Extraction Service | Extract, Transform and Load
TYPES OF EXTRACTION IN ETL
ETL helps break down data silos and lets your data scientists analyze
business data. Data scientists turn this data into business intelligence
reports, which play a key role in understanding how your business is
performing (profit, loss).
What is the ETL process in a data warehouse?
Before going into the types of extraction in ETL, we will first understand the
ETL process. ETL stands for Extract, Transform and Load. It is a process in
which the data is extracted from different source systems, transformation
logic is applied in the staging area and finally, the transformed data is
loaded into the data warehouse.
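The three stages described above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the `run_etl` function, the `sales` table, and the sample rows are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

def run_etl(source_rows):
    """Extract rows, transform them in a staging list, load them into a warehouse table."""
    # Extract: a real pipeline would read from one or more source systems here.
    extracted = list(source_rows)

    # Transform (staging area): tidy up names and convert amounts to numbers.
    staged = [(name.strip().title(), float(amount)) for name, amount in extracted]

    # Load: write the transformed rows into the warehouse (in-memory SQLite here).
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
    warehouse.executemany("INSERT INTO sales VALUES (?, ?)", staged)
    warehouse.commit()
    return warehouse

if __name__ == "__main__":
    wh = run_etl([("  alice ", "120.50"), ("BOB", "80")])
    rows = wh.execute("SELECT customer, amount FROM sales ORDER BY customer").fetchall()
    print(rows)  # [('Alice', 120.5), ('Bob', 80.0)]
```

Keeping the transformation in its own step, separate from the read and the write, mirrors the role of the staging area: the source system is never modified, and the warehouse only ever receives cleaned data.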
What are the types of extraction in ETL?
In ETL, there are two extraction methods by which data can be loaded from
a source system into the staging area. The picture below illustrates the
ETL [Extraction, Transformation, and Loading] process in data warehousing.
The two extraction types in ETL are:
1) Logical extraction and
2) Physical extraction
Now let us see them in detail:
LOGICAL EXTRACTION OF DATA
Logical extraction can be done by two methods as explained below. They
are
Full extraction of data
In this method of extraction, the data is extracted from the source system
in a single pass. There is no need to keep track of changes, because the
extraction reflects all the data currently in the source. For example,
exporting a full table into a flat file. This is a less complicated process
if the right data extraction tools are used.
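As a rough sketch of the "export the full table into a flat file" example, the snippet below dumps every row of a table to CSV. The `full_extract` helper and the `customers` table are hypothetical names chosen for illustration, and SQLite stands in for whatever source system is actually in use.

```python
import csv
import io
import sqlite3

def full_extract(conn, table, out):
    """Write every row of `table` to `out` as CSV and return the number of rows written."""
    # The table name is interpolated directly, so it must come from trusted code.
    cursor = conn.execute(f"SELECT * FROM {table}")
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cursor.description])  # header row
    count = 0
    for row in cursor:
        writer.writerow(row)
        count += 1
    return count

if __name__ == "__main__":
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    source.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])

    flat_file = io.StringIO()  # a real job would open a file on disk instead
    print(full_extract(source, "customers", flat_file))  # 2
    print(flat_file.getvalue())
```

No bookkeeping is needed between runs: each run simply re-reads the whole table, which is what makes full extraction simple but potentially expensive for large sources.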
Incremental extraction of data
In this method, extraction is an ongoing and more complex process that is
not limited to the initial retrieval. We need to track the changes made in
the source system since the last extraction. Determining these recent
changes to the source data requires additional logic. This logic is called
Change Data Capture (CDC).
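One common way to implement this tracking is a timestamp "watermark": remember the time of the last extraction and pull only rows modified after it. The sketch below assumes the source table carries an `updated_at` column holding ISO-8601 text timestamps; that column, the `customers` table, and the `incremental_extract` function are assumptions made for illustration, and other CDC approaches (triggers, transaction logs) exist.

```python
import sqlite3

def incremental_extract(conn, last_extracted_at):
    """Return rows changed after the watermark, together with the new watermark."""
    rows = conn.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_extracted_at,),
    ).fetchall()
    # Advance the watermark to the newest change seen; keep it if nothing changed.
    new_watermark = rows[-1][2] if rows else last_extracted_at
    return rows, new_watermark

if __name__ == "__main__":
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE customers (id INTEGER, name TEXT, updated_at TEXT)")
    source.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [
            (1, "Alice", "2024-01-01T09:00:00"),
            (2, "Bob", "2024-01-02T10:30:00"),
            (3, "Carol", "2024-01-03T08:15:00"),
        ],
    )
    changed, watermark = incremental_extract(source, "2024-01-01T23:59:59")
    print(changed)    # rows for Bob and Carol only
    print(watermark)  # 2024-01-03T08:15:00
```

The returned watermark would be stored and passed to the next run, so each extraction picks up exactly where the previous one stopped.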
Change Data Capture (CDC)
Incremental extraction can be done with Change Data Capture (CDC). The CDC
process captures the changes made to the source system and applies them
throughout the enterprise. Because only the changed data is processed, CDC
minimizes the resources required for ETL. A data warehouse should maintain
a history of the changes the business is undergoing on a day-to-day basis.
CDC helps achieve this goal.
In the above example, Santhosh and Piyush are making regular transactions
such as deposits and withdrawals. CDC aims to capture these changes and
calculate the correct balance. The final calculated amount is updated in
the table, as shown above.
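The balance calculation can be sketched as follows. The amounts are made up for illustration and are not taken from the original figure; the `apply_changes` function and the tuple layout of a captured change are likewise hypothetical.

```python
def apply_changes(balances, changes):
    """Apply captured deposit/withdrawal events and return the updated balances."""
    updated = dict(balances)  # copy, so the previous state is left untouched
    for account, kind, amount in changes:
        if kind == "deposit":
            updated[account] = updated.get(account, 0) + amount
        elif kind == "withdrawal":
            updated[account] = updated.get(account, 0) - amount
        else:
            raise ValueError(f"unknown change type: {kind}")
    return updated

if __name__ == "__main__":
    # Illustrative opening balances and captured changes (invented values).
    opening = {"Santhosh": 1000, "Piyush": 500}
    captured = [
        ("Santhosh", "deposit", 200),
        ("Piyush", "withdrawal", 100),
        ("Santhosh", "withdrawal", 50),
    ]
    print(apply_changes(opening, captured))  # {'Santhosh': 1150, 'Piyush': 400}
```

Only the captured changes are processed, not the full account history, which is where the resource savings of CDC come from.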
Now let us jump into another type of extraction in ETL.
PHYSICAL EXTRACTION OF DATA
Physical extraction can be done by two methods, as explained below. They
are
Online extraction of data
In this method of extraction, data is extracted directly from the source
system. The extraction process connects either to the source system itself
or to an intermediate system through which the data is accessed.
Offline extraction of data
In this method of extraction, data is not extracted directly from the source
system but is instead staged explicitly outside the original source system.
The data either already has a structure or was created by an extraction
routine. The following structures are considered:
1) Flat files: data in a generic format
2) Dump files: database-specific files
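For the flat-file case, offline extraction amounts to reading a file that was staged earlier, without touching the source system. A minimal sketch, assuming the flat file is a CSV with a header row (the `read_staged_flat_file` name and the sample content are hypothetical):

```python
import csv
import io

def read_staged_flat_file(handle):
    """Parse a staged CSV flat file into a list of dictionaries, one per row."""
    return list(csv.DictReader(handle))

if __name__ == "__main__":
    # A StringIO stands in for a file that an earlier export left in staging.
    staged = io.StringIO("id,name\n1,Alice\n2,Bob\n")
    for row in read_staged_flat_file(staged):
        print(row["id"], row["name"])
```

Dump files would need the database vendor's own import tooling instead, since their format is database-specific.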
CONCLUSION
The choice of extraction type depends on the type of source and on the
business needs. In2In global provides data extraction as a solution for data
analysis at affordable prices.