A "How To" for loading CSV files into HPCC Systems and querying them. You can use this method to migrate your RDBMS data (MySQL / Oracle / SQL) into HPCC Systems.
2. Non-Indexed Full Data Set
1 20
Customers Development Business
http://hpccsystems.com/why-hpcc/benchmarks
3. ECL (Enterprise Control Language)
C++ based query language
SQL w/ JOINS
Map/Reduce
GraphDB
Machine Learning
Simple to Complex Queries
4. “I’m sub-second fast.”
“I can query all or part of your data.”
Architecture
Thor Roxie
Hard Disk
Index (optional)
Hard Disk
Index (optional)
In-memory Index
SSD
Either/Both
5. Example
File -> Load File into HPCC -> Query
CSV data sample source
http://catalog.data.gov/dataset/consumer-complaint-database
7. Load Data
1. Upload file*
2. Distribute to cluster
3. Name of file in cluster
4. Most CSV have \t (add ,\t)
5. Push to cluster
*2GB file size limit through web
No limit if uploaded via SOAP
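Before uploading, it can help to check which route a file needs and which separator it uses. The sketch below is only an illustration and is not part of HPCC Systems: the function names are hypothetical, "2GB" is assumed to mean 2 * 1024**3 bytes, and the separator guess just compares tab and comma counts on the first line.

```python
import os

# Assumption: the slide's "2GB" web limit is taken as 2 * 1024**3 bytes.
WEB_UPLOAD_LIMIT_BYTES = 2 * 1024**3


def choose_upload_route(size_bytes: int) -> str:
    """Return 'web' if the file fits under the web limit, otherwise 'soap'."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return "web" if size_bytes <= WEB_UPLOAD_LIMIT_BYTES else "soap"


def route_for_file(path: str) -> str:
    """Look up the size of a local file and pick the upload route for it."""
    return choose_upload_route(os.path.getsize(path))


def guess_separator(first_line: str) -> str:
    """Guess tab vs. comma by counting each character on the first line."""
    if first_line.count("\t") > first_line.count(","):
        return "\t"
    return ","
```

For example, `choose_upload_route(500_000_000)` returns `"web"`, while a 3 GiB file would return `"soap"`.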
9. How do I Query HPCC Systems?
What Is ECL?
ECL (Enterprise Control Language) is a C++ based query
language for use with the HPCC Systems Big Data platform.
ECL’s syntax and format are simple and easy to learn.
Note - ECL is very similar to Hadoop’s Pig, but
more expressive and feature rich.
11. Practice
1. Go to playground
2. Edit ECL
3. Pick “thor” Cluster
4. Submit
http://www.meetup.com/HPCC-SV/pages/ECL_EXAMPLE__-_CSV_LOAD_and_QUERY
12. Schema Made EZ
Storing a new file and want to make a quick schema?
Take a small part of your CSV data and
go to the link below to make an ECL Schema
http://hpccsystems.com/demos/data-profiling-demo
CSV IN -> Click -> Schema OUT
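To show what a quick schema might look like, here is a minimal Python sketch that turns the header row of a CSV sample into an ECL RECORD definition. It does not reproduce the demo tool. It assumes the first row is a header with unique column names, types every field as STRING, and uses hypothetical names throughout (`csv_sample_to_ecl_record`, `Layout`, and the sample columns). It also does not check for ECL reserved words.

```python
import csv
import io
import re


def ecl_field_name(header: str, index: int) -> str:
    """Turn a CSV column header into a plausible ECL identifier."""
    # Collapse every run of non-alphanumeric characters into one underscore.
    name = re.sub(r"[^0-9A-Za-z]+", "_", header.strip()).strip("_").lower()
    if not name:
        name = f"field{index}"
    if name[0].isdigit():
        # Keep the identifier from starting with a digit.
        name = "f_" + name
    return name


def csv_sample_to_ecl_record(sample: str, layout_name: str = "Layout") -> str:
    """Build an ECL RECORD from the header row; every field is a STRING."""
    header = next(csv.reader(io.StringIO(sample)))
    lines = [f"{layout_name} := RECORD"]
    for i, column in enumerate(header, start=1):
        lines.append(f"  STRING {ecl_field_name(column, i)};")
    lines.append("END;")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample; real column names come from your own file.
    sample = "Complaint ID,Product,Date received\n1,Mortgage,2014-01-01\n"
    print(csv_sample_to_ecl_record(sample))
```

With the hypothetical sample above, the generated record has the fields `complaint_id`, `product` and `date_received`. You can then narrow individual fields to more specific types by hand.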
14. For More HPCC “How To’s”
Go to SlideShare
http://www.slideshare.net/FujioTurner/
15. Watch how to install HPCC Systems in 5 Minutes
http://www.youtube.com/watch?v=8SV43DCUqJg
Download HPCC Systems Open Source Community Edition
http://hpccsystems.com/download/
or
Source Code
https://github.com/hpcc-systems