Excerpt from the CloverETL Basic Training slides.
The basic course lasts 3 days and covers basic principles, CloverETL Designer walkthrough, transaction analysis, lookups, database connections, working with structured data, XML etc.
More at www.cloveretl.com/services/training
CloverETL Basic Training Excerpt
1. Basic Training Course
for CloverETL software
Training teaser ―
excerpt from Basic Training Course
All rights reserved Javlin 2011
2. Training Course Documentation
This presentation accompanies the training course
delivery
It can serve as a baseline for self-study
The course focuses on fundamentals of CloverETL
platform which are needed for graph development
and management
This document includes additional topics which are
intended to be used as introductions to more
advanced concepts and techniques
The additional topics are not a formal part of the
course; they may or may not be referenced during the
class time depending on factors such as time
constraints and project relevance
3. Training Course Objectives
On successful completion of this course you will be
able to:
› Develop solutions to business problems using CloverETL
platform
› Compose graphs using Designer and Engine components
› Describe data formats with metadata definitions
› Access data from multiple sources including files and
databases
› Detect and react to errors in data
› Optimize your existing graphs
› Deploy and manage graphs in CloverETL Server
environment
4. Agenda
DAY 1
Introduction
Basic Principles
Getting Started
Designer Walkthrough
Transaction Analysis
5. Agenda
DAY 2
Graphs for Real World
Customer Profile Analysis
Lookups: Searching in Data
6. Agenda
DAY 3
Database Datasources
Working with Structured Data
XML input/output
Final Review
Test
Q&A
7. B6. Task Discussion
Sometimes data need to be enriched with referential
information:
Who are the debtors?
Steps:
› Find identifiers of customers who have a negative personal
balance
› Look up details for all such customers – first and last name.
How:
› Use lookup tables to prepare the data for searching
› Use LookupJoin component to search the table
8. Lookup Tables
Lookup tables are data structures that allow fast
searches over data
Simple lookup is a hash table in memory
Database lookup is a database table with local cache
Range lookup allows performing range queries
› “Is the value A in range <10,20> or (20,100> ?”
Persistent lookup uses index files to search data
Aspell lookup allows similarity search over strings
› “Find matches for keyword ‘car’”. “Bar, card, cars”
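To illustrate what a similarity search returns, here is a minimal Python sketch. It is not Aspell and not CloverETL code: it swaps in the standard library's difflib matcher, and the word list and the cutoff value are assumptions made up for the example.

```python
import difflib

# Hypothetical dictionary of known words; not taken from the course data.
DICTIONARY = ["bar", "card", "cars", "house", "truck"]


def similar_words(keyword, words, cutoff=0.6):
    """Return the words whose similarity ratio to keyword is at least cutoff."""
    # n=len(words) lifts the default cap of three results.
    return difflib.get_close_matches(keyword, words, n=len(words), cutoff=cutoff)


matches = similar_words("car", DICTIONARY)
print(sorted(matches))
```

Unlike a simple lookup, the search does not need an exact key: near misses such as "card" are returned while unrelated words are dropped.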
9. Lookup Table Structure
Data stored in lookup tables has the following
structure:
Search key
› One or multiple fields
Return value
› Returned when a match with key is found
› Some tables allow storing duplicate keys
› More than one match can be found
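The structure above can be sketched in Python as a hash table keyed by one or more fields. This is only an illustration of the idea, not the CloverETL implementation; the class name, method names and sample records are hypothetical.

```python
from collections import defaultdict


class SimpleLookup:
    """In-memory lookup keyed by one or more fields; duplicate keys are kept."""

    def __init__(self, key_fields):
        self.key_fields = tuple(key_fields)
        self._table = defaultdict(list)

    def put(self, record):
        # The search key is built from the configured key fields.
        key = tuple(record[field] for field in self.key_fields)
        self._table[key].append(record)

    def get_all(self, *key_values):
        """Return every record stored under the key (possibly none)."""
        return list(self._table.get(tuple(key_values), []))

    def get(self, *key_values):
        """Return the first record stored under the key, or None."""
        matches = self.get_all(*key_values)
        return matches[0] if matches else None


# Hypothetical sample data with a two-field key and one duplicate key.
people = SimpleLookup(["first_name", "last_name"])
people.put({"first_name": "Ann", "last_name": "Lee", "city": "Prague"})
people.put({"first_name": "Ann", "last_name": "Lee", "city": "Brno"})
people.put({"first_name": "Bob", "last_name": "Ray", "city": "Plzen"})

print(people.get("Bob", "Ray"))
print(len(people.get_all("Ann", "Lee")))
```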
10. Populating Lookup Tables
Data for a lookup table can be provided by several means:
Manual data entry
› Data are part of lookup table definition
File reference
› Table definition contains URL of the input file
› Metadata describe format of input file
› Simple parsing
Dynamic population
› Designated component for writing into lookup tables
› Data can be created dynamically by a graph
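The file-reference option can be pictured with a short Python sketch: the metadata names the fields, and simple delimited parsing fills the table. The field names, the delimiter and the sample rows are assumptions, and an in-memory buffer stands in for the input file.

```python
import csv
import io

# Hypothetical metadata: the field names describing the input format.
METADATA = ["customer_id", "first_name", "last_name"]

# Stand-in for the input file referenced by the table definition.
INPUT_FILE = io.StringIO("1|Ann|Lee\n2|Bob|Ray\n")


def load_lookup(source, field_names, key_field, delimiter="|"):
    """Parse delimited rows and index them by key_field."""
    reader = csv.DictReader(source, fieldnames=field_names, delimiter=delimiter)
    return {row[key_field]: row for row in reader}


customers = load_lookup(INPUT_FILE, METADATA, "customer_id")
print(customers["2"]["first_name"])
```

All values stay strings here because the sketch does no type conversion; input that needs more than simple parsing is the case for dynamic population by a graph.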
11. Using Lookup Tables
Lookup tables are reusable and can be accessed from
all reformat-like components.
Reduce the size of the lookup by reducing record
width and including only applicable records in it.
Lookup table must fit into memory or the graph will
fail
› does not apply to database and persistent lookups
Comparable to Hash Join in performance
Offer more flexibility than joiners for partial matching
12. Component LookupTableReaderWriter
The component can read or write the contents of a
lookup table of any type
Use the component to:
› Dynamically populate lookup table with data
› Prepare the data for lookup when advanced parsing is
needed
› Dump lookup table into file or database
Found in the Others section of Component Palette
To configure the component, you need to provide:
› Target lookup table
13. B6. Complete Graph Section
Step B6. Populate lookup table with data
Key points:
Use Simple lookup table type
Drop unnecessary fields prior to loading into table.
Split the graph into two phases, 0 and 1.
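The key points can be sketched in Python as two functions that run one after the other: the first narrows the records and fills the table, the second only reads from it. This is a conceptual illustration rather than a CloverETL graph; the field names and sample records are hypothetical.

```python
# Hypothetical raw customer records with fields the lookup does not need.
RAW_CUSTOMERS = [
    {"customer_id": 1, "first_name": "Ann", "last_name": "Lee", "street": "Main St 1"},
    {"customer_id": 2, "first_name": "Bob", "last_name": "Ray", "street": "Oak Ave 7"},
]

KEPT_FIELDS = ("customer_id", "first_name", "last_name")


def phase_0(customers):
    """Populate the lookup table, keeping only the fields needed later."""
    table = {}
    for record in customers:
        narrow = {field: record[field] for field in KEPT_FIELDS}
        table[narrow["customer_id"]] = narrow
    return table


def phase_1(table, customer_ids):
    """Search the fully populated table; None marks a missing customer."""
    return [table.get(customer_id) for customer_id in customer_ids]


lookup_table = phase_0(RAW_CUSTOMERS)      # must finish before any search starts
details = phase_1(lookup_table, [2, 99])
print(details)
```

Running the population step to completion first means that no search sees a half-filled table, which is the reason for splitting the work into two phases.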
14. Component LookupJoin
The LookupJoin component searches a lookup table for
matches with records from the regular data flow.
Use the component to:
› Search any kind of lookup table for a match.
› Find records that did not have any match
› Comfortably handle multiple matches
Found in the Joiners section of Component Palette
To configure the component, you need to provide:
› Lookup table
› Joining key
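A minimal Python sketch of the join semantics listed above: every incoming record is looked up by the joining key, each match produces one joined record, and records without a match are collected separately. It is not the component's implementation; the function name and sample data are hypothetical.

```python
def lookup_join(records, table, key_field):
    """Join records against table; return (joined, unmatched)."""
    joined, unmatched = [], []
    for record in records:
        matches = table.get(record[key_field], [])
        if not matches:
            unmatched.append(record)
        for match in matches:
            # One output record per match, so duplicate keys are handled too.
            joined.append({**record, **match})
    return joined, unmatched


# Hypothetical lookup table: customer 1 has two entries (duplicate key).
PHONES = {
    1: [{"phone": "555-0100"}, {"phone": "555-0101"}],
    2: [{"phone": "555-0200"}],
}

ORDERS = [
    {"customer_id": 1, "order": "A"},
    {"customer_id": 2, "order": "B"},
    {"customer_id": 3, "order": "C"},
]

joined, unmatched = lookup_join(ORDERS, PHONES, "customer_id")
print(len(joined), unmatched)
```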
15. B6. Complete Graph Section
Step B6. Populate lookup table with data
Key points:
Use ExtFilter to find customers with negative
balance.
Use LookupJoin to search lookup table
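The whole step can be sketched end to end in Python: a filter keeps the records with a negative balance, and a lookup adds first and last name to each of them. The predicate stands in for ExtFilter and the dictionary search for LookupJoin; field names and sample values are hypothetical.

```python
# Hypothetical account balances from the regular data flow.
BALANCES = [
    {"customer_id": 1, "balance": 250.0},
    {"customer_id": 2, "balance": -1200.0},
    {"customer_id": 3, "balance": -40.0},
]

# Hypothetical lookup table populated in the earlier phase.
CUSTOMERS = {
    1: {"first_name": "Ann", "last_name": "Lee"},
    2: {"first_name": "Bob", "last_name": "Ray"},
    3: {"first_name": "Eva", "last_name": "Kim"},
}


def is_debtor(record):
    """Filter condition: the customer owes money."""
    return record["balance"] < 0


def find_debtors(balances, customers):
    """Filter debtors, then enrich each one with name details."""
    debtors = []
    for record in filter(is_debtor, balances):
        details = customers.get(record["customer_id"])
        if details is not None:
            debtors.append({**record, **details})
    return debtors


debtors = find_debtors(BALANCES, CUSTOMERS)
print([d["last_name"] for d in debtors])
```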
16. B7. Task Discussion
Range queries can be used to group similar records:
What level of risk do the debtors impose?
Steps:
› Use three risk levels: low, medium, high
› Risk level is assigned based on amount of money owed
How:
› Use range lookup table to accommodate the range query
› Use lookup(<table_name>).get() to search the table from
transformation code
17. Range Lookup Definition
Data for range lookup:
-1000|0|Low
-10000|-1000|Medium
-1000000|-10000|High
[Figure callouts: interval range (first two fields), return value (third field), interval inclusivity]
Notes
› Only first match is returned -> order of data matters
› null value in range definition means “unlimited”
• Data to match everything:
||the rest
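The notes above can be checked with a small Python sketch of a range lookup. It parses the data lines shown on this slide, treats an empty bound as unlimited, and returns the first interval that contains the value. The function names and the inclusivity defaults (start inclusive, end exclusive) are assumptions made for the example, not settings confirmed by the slides.

```python
RANGE_DATA = """-1000|0|Low
-10000|-1000|Medium
-1000000|-10000|High
||the rest"""


def parse_ranges(text):
    """Turn 'start|end|value' lines into tuples; an empty bound becomes None."""
    ranges = []
    for line in text.splitlines():
        start, end, value = line.split("|")
        ranges.append((
            float(start) if start else None,
            float(end) if end else None,
            value,
        ))
    return ranges


def range_lookup(ranges, x, start_inclusive=True, end_inclusive=False):
    """Return the value of the first interval containing x, or None."""
    for start, end, value in ranges:
        above = start is None or (x >= start if start_inclusive else x > start)
        below = end is None or (x <= end if end_inclusive else x < end)
        if above and below:
            return value  # only the first match is returned
    return None


RISK_LEVELS = parse_ranges(RANGE_DATA)
for balance in (-500, -5000, -50000, 250):
    print(balance, range_lookup(RISK_LEVELS, balance))
```

Because only the first match is returned, the catch-all line has to come last; placed first, it would match every value and hide the three risk levels.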
18. B7. Complete Graph Section
Step B7. What level of risk do the debtors impose?
Key points:
Use range lookup to create risk level intervals
Use Reformat and lookup() to perform search