HPCC Systems - ECL for Programmers - Big Data - Data Scientist

HPCC Systems - ECL Intro
Big Data Querying Made EZ
By Fujio Turner
Enterprise Control Language
explained for Programmers
@FujioTurner

Comparison
Block Based vs. File Based
JAVA vs. C++
Petabytes; 1-80,000 Jobs/day; Since 2005
vs.
Exabytes; Non-Indexed 4X-13X; Indexed: 2K-3K Jobs/sec; Since 2000
? ? ? ? ? ? vs. Thor, Roxie

What Is ECL?
ECL (Enterprise Control Language) is a C++ based query
language for use with the HPCC Systems Big Data platform.
ECL's syntax and format are very simple and easy to learn.

Note - ECL is very similar to Hadoop's Pig, but
more expressive and feature rich.

Comparing ECL to General Programming
In this presentation you will see how loading and
querying data in ECL is just like reading and finding data in a
plain text file.
general programming (general common logic)
vs.
ECL
General: General Code HERE
ECL: ECL Code HERE

Example Text File
Name State Age
Kevin CA 45
Mark MI 27
Sara FL 64
Customer Data May 2010
Example file name: ~/cdata_2010.txt
ECL example file distributed in HPCC cluster: ~hpcc::cdata_2010

Opening File: general programming vs ECL
General: d = fopen('~/cdata_2010.txt')
ECL:     d := DATASET('~hpcc::cdata_2010', cs, THOR);
In both, the quoted string is the file location; fopen and DATASET are the open-file functions.
cs is the record schema defined on the Organizing slide.

Organizing: general programming vs ECL
General: split data (d) by row
new_d = split(d, "\r\n")
ECL: use this schema on the file to give structure to the data
cs := RECORD
    STRING20 Name;
    STRING2 State;
    INTEGER3 Age;
END;
Result in both cases, one row per record:
Kevin CA 45
Mark MI 27
Sara FL 64
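
To try the General side of the open-and-split steps, here is a minimal Python sketch. It is an illustration only, not code from the slides: the sample text is inlined in place of ~/cdata_2010.txt, and the names SAMPLE_TEXT and split_rows are hypothetical.

```python
# Stand-in for the contents of ~/cdata_2010.txt (assumption: one
# space-separated "Name State Age" record per line, no header row).
SAMPLE_TEXT = "Kevin CA 45\r\nMark MI 27\r\nSara FL 64\r\n"


def split_rows(text):
    """Split raw file text into rows, dropping empty lines."""
    # splitlines() handles "\r\n" as well as "\n" line endings.
    return [line for line in text.splitlines() if line.strip()]


new_d = split_rows(SAMPLE_TEXT)
print(new_d)  # ['Kevin CA 45', 'Mark MI 27', 'Sara FL 64']
```

The ECL RECORD goes one step further than this: it also names and types each column, which the General code still has to do by hand.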

Find “Sara”: general programming vs ECL
General: split each row by column, then compare column 0
for(x = 0; x < 3; x++){
    row = new_d[x]
    new_row = split(row, " ")
    if(new_row[0] == 'Sara'){
        print "Found Sara"
    }
}
Column: 0 1 2
Kevin CA 45
Mark MI 27
Sara FL 64
ECL: the schema splits the data by column; filter data by Name, then output
cs := RECORD
    STRING20 Name;
    STRING2 State;
    INTEGER3 Age;
END;
sara := d(Name = 'Sara');   // Filter Data By Name
OUTPUT(sara);               // Output
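
The General loop can be run as-is in most languages with small changes. A minimal Python sketch, assuming the rows are already split as on the Organizing slide (ROWS and find_by_name are hypothetical names, not from the slides):

```python
# Rows as they would look after splitting the example file by line.
ROWS = ["Kevin CA 45", "Mark MI 27", "Sara FL 64"]


def find_by_name(rows, name):
    """Return every row whose first column (index 0) equals name."""
    matches = []
    for row in rows:
        columns = row.split(" ")  # split the row by column
        if columns[0] == name:
            matches.append(row)
    return matches


sara = find_by_name(ROWS, "Sara")
print(sara)  # ['Sara FL 64']
```

The loop, the split, and the comparison here correspond to the single ECL line sara := d(Name = 'Sara');.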

Find “Sara” & Older than 50: general programming vs ECL
General:
for(x = 0; x < 3; x++){
    row = new_d[x]
    new_row = split(row, " ")
    if(new_row[0] == 'Sara' and new_row[2] > 50){
        print "Found Sara"
    }
}
Column: 0 1 2
Kevin CA 45
Mark MI 27
Sara FL 64
ECL:
cs := RECORD
    STRING20 Name;
    STRING2 State;
    INTEGER3 Age;
END;
sara := d(Name = 'Sara' AND Age > 50);
OUTPUT(sara);
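
One detail the General pseudocode glosses over: after a split, the age column is still text, so it has to be converted before comparing it with 50. The ECL schema handles this by declaring Age as an integer. A minimal Python sketch of the same idea (Customer and parse_row are hypothetical names, not from the slides):

```python
from typing import NamedTuple


class Customer(NamedTuple):
    """Typed row, playing the role of the ECL RECORD."""
    name: str
    state: str
    age: int


def parse_row(row):
    """Split one text row by column and convert Age to an integer."""
    name, state, age = row.split()
    return Customer(name, state, int(age))


ROWS = ["Kevin CA 45", "Mark MI 27", "Sara FL 64"]
customers = [parse_row(row) for row in ROWS]

# Same condition as the ECL filter: Name = 'Sara' AND Age > 50
sara = [c for c in customers if c.name == "Sara" and c.age > 50]
print(sara)
```

Once the rows are typed, the filter is a one-line condition, which is the shape the ECL version has from the start.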

ECL is EZ
•Make your own functions & libraries in ECL.
•Modularize your code with “Import”: reuse old code
Machine Learning Built-in
http://hpccsystems.com/ml

ECL Plugin for Eclipse IDE
http://hpccsystems.com/products-and-services/products/plugins/eclipse-ide

ECL + Other Languages
ECL is C++ based, so all your C/C++ code can be used in ECL.
&
Use other languages and methods to query too.

ECL GUIDE
http://hpccsystems.com/download/docs/ecl-language-reference
JOIN
MERGE
LENGTH
REGEX
ROUND
SUM
COUNT
TRIM
WHEN
AVE
ABS
CASE
DEDUP
NORMALIZE
DENORMALIZE
IF
SORT
GROUP
more ...

For More HPCC “How To’s”
Query with Plain SQL:
http://www.slideshare.net/hpccsystems/jdbc-hpcc
or SQL to ECL:
http://www.slideshare.net/FujioTurner/meet-up-sqldemopp

Watch how to install HPCC Systems in 5 Minutes
http://www.youtube.com/watch?v=8SV43DCUqJg
Download HPCC Systems Open Source Community Edition
http://hpccsystems.com/download/
or
Source Code
https://github.com/hpcc-systems
