SlideShare a Scribd company logo
HPCC Systems 
Loading csv Data 
& 
Querying 
By Fujio Turner 
@FujioTurner
Non-Indexed Full Data Set 
1 20 
Customers Development Business 
http://hpccsystems.com/why-hpcc/benchmarks
ECL (Enterprise Control Language) 
C++ based query language 
SQL w/ JOINS 
Map/Reduce 
GraphDB 
Machine 
Learning 
Simple to Complex Queries
“I’m sub-second 
fast.” 
“I can query all 
or part of your 
data.” 
Architecture 
Thor Roxie 
Hard Disk 
Index(optional) 
Hard Disk 
Index(optional) 
In-memory Index 
SSD 
Either/Both
Example 
File Load File into HPCC Query 
CSV data sample source 
http://catalog.data.gov/dataset/consumer-complaint-database
Administrator Web GUI! 
on 
IP / Url of HPCC install Port 8010
4. add ,t 
5. 
1. Upload file*! 
2. Distribute to cluster! 
3. Name of file in cluster! 
4. Most CSV have t! 
5. Push to cluster 
*2GB file size limit through web 
No limit if uploaded via SOAP 
Load !! ! ! Data
*optional file rename Loaded 
In Thor Cluster
How do I Query HPCC Systems ? 
What Is ECL? 
ECL (Enterprise Control Language) is a C++ based query 
language for use with HPCC Systems Big Data platform. 
ECLs syntax and format is very simple and easy to learn.! 
! 
Note - ECL is very similar to Hadoop’s pig ,but! 
more expressive and feature rich.
Query w/ ECL 
Com := DATASET(‘~test::complaints’,ComS, 
CSV(HEADING(1), SEPARATOR([',','t']))); 
ComS :=RECORD 
UNSIGNED3 ComplaintID; 
STRING23 Product; 
STRING38 State; 
…………………………. 
…………………………. 
STRING31Consumer_disputed; 
END; 
Ma := Com(State = ‘MA’); 
Ma; //output 
WHERE `State` = ‘MA’ 
File Type 
File Location,! 
“FROM Table” 
“USE DATABASE;” 
“SELECT * ….” 
Schema
1. Go to playground! 
2. Edit ECL! 
3. Pick “thor” Cluster! 
4. Submit 
Practice 
http://www.meetup.com/HPCC-SV/pages/ECL_EXAMPLE__- 
_CSV_LOAD_and_QUERY
Schema Made EZ 
http://hpccsystems.com/demos/data-profiling-demo 
CSV 
IN 
Schema 
Click OUT 
Storing a new file and want to make a quick schema? 
! 
Take a small part of your CSV data and 
go to the link below to make an ECL Schema
ECL Guide 
http://hpccsystems.com/download/docs/ecl-language-reference 
JOIN! 
MERGE! 
LENGTH! 
REGEX! 
ROUND! 
SUM! 
COUNT! 
TRIM! 
WHEN! 
AVE! 
ABS! 
CASE! 
DEDUP! 
NORMALIZE! 
DENORMALIZE! 
IF! 
SORT! 
GROUP! 
more ….
For More HPCC! 
“How To’s”! 
Go to SlideShare 
http://www.slideshare.net/FujioTurner/
Watch how to install 
HPCC Systems 
in 5 Minutes 
Download HPCC Systems 
Open Source 
Community Edition 
http://hpccsystems.com/download/ 
http://www.youtube.com/watch?v=8SV43DCUqJg 
or 
Source Code 
https://github.com/hpcc-systems

More Related Content

What's hot

Redis Developers Day 2015 - Secondary Indexes and State of Lua
Redis Developers Day 2015 - Secondary Indexes and State of LuaRedis Developers Day 2015 - Secondary Indexes and State of Lua
Redis Developers Day 2015 - Secondary Indexes and State of LuaItamar Haber
 
Introduction to SparkR | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to SparkR | Big Data Hadoop Spark Tutorial | CloudxLabIntroduction to SparkR | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to SparkR | Big Data Hadoop Spark Tutorial | CloudxLabCloudxLab
 
Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0Cloudera, Inc.
 
Hive vs Pig for HadoopSourceCodeReading
Hive vs Pig for HadoopSourceCodeReadingHive vs Pig for HadoopSourceCodeReading
Hive vs Pig for HadoopSourceCodeReadingMitsuharu Hamba
 
Native erasure coding support inside hdfs presentation
Native erasure coding support inside hdfs presentationNative erasure coding support inside hdfs presentation
Native erasure coding support inside hdfs presentationlin bao
 
Hypertable - massively scalable nosql database
Hypertable - massively scalable nosql databaseHypertable - massively scalable nosql database
Hypertable - massively scalable nosql databasebigdatagurus_meetup
 
Introduction to hadoop ecosystem
Introduction to hadoop ecosystem Introduction to hadoop ecosystem
Introduction to hadoop ecosystem Rupak Roy
 
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016Duyhai Doan
 
Database Architectures and Hypertable
Database Architectures and HypertableDatabase Architectures and Hypertable
Database Architectures and Hypertablehypertable
 
Hypertable
HypertableHypertable
Hypertablebetaisao
 
Hadoop Essential for Oracle Professionals
Hadoop Essential for Oracle ProfessionalsHadoop Essential for Oracle Professionals
Hadoop Essential for Oracle ProfessionalsChien Chung Shen
 
Redis/Lessons learned
Redis/Lessons learnedRedis/Lessons learned
Redis/Lessons learnedTit Petric
 
Datasets and tools_from_ncbi_and_elsewhere_for_microbiome_research_v_62817
Datasets and tools_from_ncbi_and_elsewhere_for_microbiome_research_v_62817Datasets and tools_from_ncbi_and_elsewhere_for_microbiome_research_v_62817
Datasets and tools_from_ncbi_and_elsewhere_for_microbiome_research_v_62817Ben Busby
 
Let's Compare: A Benchmark review of InfluxDB and Elasticsearch
Let's Compare: A Benchmark review of InfluxDB and ElasticsearchLet's Compare: A Benchmark review of InfluxDB and Elasticsearch
Let's Compare: A Benchmark review of InfluxDB and ElasticsearchInfluxData
 
Import web resources using R Studio
Import web resources using R StudioImport web resources using R Studio
Import web resources using R StudioRupak Roy
 
Presentation at the EMBL-EBI Industry RDF meeting
Presentation at the EMBL-EBI  Industry RDF meetingPresentation at the EMBL-EBI  Industry RDF meeting
Presentation at the EMBL-EBI Industry RDF meetingJohannes Keizer
 
Parquet Twitter Seattle open house
Parquet Twitter Seattle open houseParquet Twitter Seattle open house
Parquet Twitter Seattle open houseJulien Le Dem
 
Spark Cassandra 2016
Spark Cassandra 2016Spark Cassandra 2016
Spark Cassandra 2016Duyhai Doan
 

What's hot (20)

Redis Developers Day 2015 - Secondary Indexes and State of Lua
Redis Developers Day 2015 - Secondary Indexes and State of LuaRedis Developers Day 2015 - Secondary Indexes and State of Lua
Redis Developers Day 2015 - Secondary Indexes and State of Lua
 
Introduction to SparkR | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to SparkR | Big Data Hadoop Spark Tutorial | CloudxLabIntroduction to SparkR | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to SparkR | Big Data Hadoop Spark Tutorial | CloudxLab
 
Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0
 
Hive vs Pig for HadoopSourceCodeReading
Hive vs Pig for HadoopSourceCodeReadingHive vs Pig for HadoopSourceCodeReading
Hive vs Pig for HadoopSourceCodeReading
 
Native erasure coding support inside hdfs presentation
Native erasure coding support inside hdfs presentationNative erasure coding support inside hdfs presentation
Native erasure coding support inside hdfs presentation
 
Hadoop
HadoopHadoop
Hadoop
 
Hypertable - massively scalable nosql database
Hypertable - massively scalable nosql databaseHypertable - massively scalable nosql database
Hypertable - massively scalable nosql database
 
Introduction to hadoop ecosystem
Introduction to hadoop ecosystem Introduction to hadoop ecosystem
Introduction to hadoop ecosystem
 
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016
 
Database Architectures and Hypertable
Database Architectures and HypertableDatabase Architectures and Hypertable
Database Architectures and Hypertable
 
Hypertable
HypertableHypertable
Hypertable
 
Hadoop Essential for Oracle Professionals
Hadoop Essential for Oracle ProfessionalsHadoop Essential for Oracle Professionals
Hadoop Essential for Oracle Professionals
 
Redis/Lessons learned
Redis/Lessons learnedRedis/Lessons learned
Redis/Lessons learned
 
Datasets and tools_from_ncbi_and_elsewhere_for_microbiome_research_v_62817
Datasets and tools_from_ncbi_and_elsewhere_for_microbiome_research_v_62817Datasets and tools_from_ncbi_and_elsewhere_for_microbiome_research_v_62817
Datasets and tools_from_ncbi_and_elsewhere_for_microbiome_research_v_62817
 
Let's Compare: A Benchmark review of InfluxDB and Elasticsearch
Let's Compare: A Benchmark review of InfluxDB and ElasticsearchLet's Compare: A Benchmark review of InfluxDB and Elasticsearch
Let's Compare: A Benchmark review of InfluxDB and Elasticsearch
 
Import web resources using R Studio
Import web resources using R StudioImport web resources using R Studio
Import web resources using R Studio
 
Presentation at the EMBL-EBI Industry RDF meeting
Presentation at the EMBL-EBI  Industry RDF meetingPresentation at the EMBL-EBI  Industry RDF meeting
Presentation at the EMBL-EBI Industry RDF meeting
 
Parquet Twitter Seattle open house
Parquet Twitter Seattle open houseParquet Twitter Seattle open house
Parquet Twitter Seattle open house
 
Spark Cassandra 2016
Spark Cassandra 2016Spark Cassandra 2016
Spark Cassandra 2016
 
Full Text Search in PostgreSQL
Full Text Search in PostgreSQLFull Text Search in PostgreSQL
Full Text Search in PostgreSQL
 

Similar to Big Data - Load CSV File & Query the EZ way - HPCC Systems

Runtime Environment Of .Net Divya Rathore
Runtime Environment Of .Net Divya RathoreRuntime Environment Of .Net Divya Rathore
Runtime Environment Of .Net Divya RathoreEsha Yadav
 
Hacking Oracle From Web Apps 1 9
Hacking Oracle From Web Apps 1 9Hacking Oracle From Web Apps 1 9
Hacking Oracle From Web Apps 1 9sumsid1234
 
Sparkling Water 5 28-14
Sparkling Water 5 28-14Sparkling Water 5 28-14
Sparkling Water 5 28-14Sri Ambati
 
Accelerated data access
Accelerated data accessAccelerated data access
Accelerated data accessgordonyorke
 
Sql Injection Adv Owasp
Sql Injection Adv OwaspSql Injection Adv Owasp
Sql Injection Adv OwaspAung Khant
 
Advanced SQL Injection
Advanced SQL InjectionAdvanced SQL Injection
Advanced SQL Injectionamiable_indian
 
OWB11gR2 - Extending ETL
OWB11gR2 - Extending ETL OWB11gR2 - Extending ETL
OWB11gR2 - Extending ETL Suraj Bang
 
Quantifying Container Runtime Performance: OSCON 2017 Open Container Day
Quantifying Container Runtime Performance: OSCON 2017 Open Container DayQuantifying Container Runtime Performance: OSCON 2017 Open Container Day
Quantifying Container Runtime Performance: OSCON 2017 Open Container DayPhil Estes
 
20090109 Dsl2cpp Md Workbench
20090109 Dsl2cpp Md Workbench20090109 Dsl2cpp Md Workbench
20090109 Dsl2cpp Md Workbenchazubi
 
"PHP from soup to nuts" -- lab exercises
"PHP from soup to nuts" -- lab exercises"PHP from soup to nuts" -- lab exercises
"PHP from soup to nuts" -- lab exercisesrICh morrow
 
Fargate 를 이용한 ECS with VPC 1부
Fargate 를 이용한 ECS with VPC 1부Fargate 를 이용한 ECS with VPC 1부
Fargate 를 이용한 ECS with VPC 1부Hyun-Mook Choi
 
C5 c++ development environment
C5 c++ development environmentC5 c++ development environment
C5 c++ development environmentsnchnchl
 
Exam 1z0 062 Oracle Database 12c: Installation and Administration
Exam 1z0 062 Oracle Database 12c: Installation and AdministrationExam 1z0 062 Oracle Database 12c: Installation and Administration
Exam 1z0 062 Oracle Database 12c: Installation and AdministrationKylieJonathan
 
Purdue CS354 Operating Systems 2008
Purdue CS354 Operating Systems 2008Purdue CS354 Operating Systems 2008
Purdue CS354 Operating Systems 2008guestd9065
 
Virtual platform
Virtual platformVirtual platform
Virtual platformsean chen
 
Migrating ESM Resources From Oracle to CORR-Engine for ESM 6.5c SP1
Migrating ESM Resources From Oracle to CORR-Engine for ESM 6.5c SP1Migrating ESM Resources From Oracle to CORR-Engine for ESM 6.5c SP1
Migrating ESM Resources From Oracle to CORR-Engine for ESM 6.5c SP1Protect724mouni
 

Similar to Big Data - Load CSV File & Query the EZ way - HPCC Systems (20)

Runtime Environment Of .Net Divya Rathore
Runtime Environment Of .Net Divya RathoreRuntime Environment Of .Net Divya Rathore
Runtime Environment Of .Net Divya Rathore
 
Database driven web pages
Database driven web pagesDatabase driven web pages
Database driven web pages
 
Hacking Oracle From Web Apps 1 9
Hacking Oracle From Web Apps 1 9Hacking Oracle From Web Apps 1 9
Hacking Oracle From Web Apps 1 9
 
Sparkling Water 5 28-14
Sparkling Water 5 28-14Sparkling Water 5 28-14
Sparkling Water 5 28-14
 
Accelerated data access
Accelerated data accessAccelerated data access
Accelerated data access
 
Sql Injection Adv Owasp
Sql Injection Adv OwaspSql Injection Adv Owasp
Sql Injection Adv Owasp
 
Advanced SQL Injection
Advanced SQL InjectionAdvanced SQL Injection
Advanced SQL Injection
 
OWB11gR2 - Extending ETL
OWB11gR2 - Extending ETL OWB11gR2 - Extending ETL
OWB11gR2 - Extending ETL
 
Quantifying Container Runtime Performance: OSCON 2017 Open Container Day
Quantifying Container Runtime Performance: OSCON 2017 Open Container DayQuantifying Container Runtime Performance: OSCON 2017 Open Container Day
Quantifying Container Runtime Performance: OSCON 2017 Open Container Day
 
Ch23 system administration
Ch23 system administration Ch23 system administration
Ch23 system administration
 
htogcp.docx
htogcp.docxhtogcp.docx
htogcp.docx
 
20090109 Dsl2cpp Md Workbench
20090109 Dsl2cpp Md Workbench20090109 Dsl2cpp Md Workbench
20090109 Dsl2cpp Md Workbench
 
"PHP from soup to nuts" -- lab exercises
"PHP from soup to nuts" -- lab exercises"PHP from soup to nuts" -- lab exercises
"PHP from soup to nuts" -- lab exercises
 
Fargate 를 이용한 ECS with VPC 1부
Fargate 를 이용한 ECS with VPC 1부Fargate 를 이용한 ECS with VPC 1부
Fargate 를 이용한 ECS with VPC 1부
 
MattsonTutorialSC14.pdf
MattsonTutorialSC14.pdfMattsonTutorialSC14.pdf
MattsonTutorialSC14.pdf
 
C5 c++ development environment
C5 c++ development environmentC5 c++ development environment
C5 c++ development environment
 
Exam 1z0 062 Oracle Database 12c: Installation and Administration
Exam 1z0 062 Oracle Database 12c: Installation and AdministrationExam 1z0 062 Oracle Database 12c: Installation and Administration
Exam 1z0 062 Oracle Database 12c: Installation and Administration
 
Purdue CS354 Operating Systems 2008
Purdue CS354 Operating Systems 2008Purdue CS354 Operating Systems 2008
Purdue CS354 Operating Systems 2008
 
Virtual platform
Virtual platformVirtual platform
Virtual platform
 
Migrating ESM Resources From Oracle to CORR-Engine for ESM 6.5c SP1
Migrating ESM Resources From Oracle to CORR-Engine for ESM 6.5c SP1Migrating ESM Resources From Oracle to CORR-Engine for ESM 6.5c SP1
Migrating ESM Resources From Oracle to CORR-Engine for ESM 6.5c SP1
 

Recently uploaded

Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
 
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxUnpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxDavid Michel
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Product School
 
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlPeter Udo Diehl
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsVlad Stirbu
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...Product School
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Product School
 
UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1DianaGray10
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance
 
In-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT ProfessionalsIn-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT ProfessionalsExpeed Software
 
НАДІЯ ФЕДЮШКО БАЦ «Професійне зростання QA спеціаліста»
НАДІЯ ФЕДЮШКО БАЦ  «Професійне зростання QA спеціаліста»НАДІЯ ФЕДЮШКО БАЦ  «Професійне зростання QA спеціаліста»
НАДІЯ ФЕДЮШКО БАЦ «Професійне зростання QA спеціаліста»QADay
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...Product School
 

Recently uploaded (20)

Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxUnpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
In-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT ProfessionalsIn-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT Professionals
 
НАДІЯ ФЕДЮШКО БАЦ «Професійне зростання QA спеціаліста»
НАДІЯ ФЕДЮШКО БАЦ  «Професійне зростання QA спеціаліста»НАДІЯ ФЕДЮШКО БАЦ  «Професійне зростання QA спеціаліста»
НАДІЯ ФЕДЮШКО БАЦ «Професійне зростання QA спеціаліста»
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 

Big Data - Load CSV File & Query the EZ way - HPCC Systems

  • 1. HPCC Systems Loading csv Data & Querying By Fujio Turner @FujioTurner
  • 2. Non-Indexed Full Data Set 1 20 Customers Development Business http://hpccsystems.com/why-hpcc/benchmarks
  • 3. ECL (Enterprise Control Language) C++ based query language SQL w/ JOINS Map/Reduce GraphDB Machine Learning Simple to Complex Queries
  • 4. “I’m sub-second fast.” “I can query all or part of your data.” Architecture Thor Roxie Hard Disk Index(optional) Hard Disk Index(optional) In-memory Index SSD Either/Both
  • 5. Example File Load File into HPCC Query CSV data sample source http://catalog.data.gov/dataset/consumer-complaint-database
  • 6. Administrator Web GUI! on IP / Url of HPCC install Port 8010
  • 7. 4. add ,t 5. 1. Upload file*! 2. Distribute to cluster! 3. Name of file in cluster! 4. Most CSV have t! 5. Push to cluster *2GB file size limit through web No limit if uploaded via SOAP Load !! ! ! Data
  • 8. *optional file rename Loaded In Thor Cluster
  • 9. How do I Query HPCC Systems ? What Is ECL? ECL (Enterprise Control Language) is a C++ based query language for use with HPCC Systems Big Data platform. ECLs syntax and format is very simple and easy to learn.! ! Note - ECL is very similar to Hadoop’s pig ,but! more expressive and feature rich.
  • 10. Query w/ ECL Com := DATASET(‘~test::complaints’,ComS, CSV(HEADING(1), SEPARATOR([',','t']))); ComS :=RECORD UNSIGNED3 ComplaintID; STRING23 Product; STRING38 State; …………………………. …………………………. STRING31Consumer_disputed; END; Ma := Com(State = ‘MA’); Ma; //output WHERE `State` = ‘MA’ File Type File Location,! “FROM Table” “USE DATABASE;” “SELECT * ….” Schema
  • 11. 1. Go to playground! 2. Edit ECL! 3. Pick “thor” Cluster! 4. Submit Practice http://www.meetup.com/HPCC-SV/pages/ECL_EXAMPLE__- _CSV_LOAD_and_QUERY
  • 12. Schema Made EZ http://hpccsystems.com/demos/data-profiling-demo CSV IN Schema Click OUT Storing a new file and want to make a quick schema? ! Take a small part of your CSV data and go to the link below to make an ECL Schema
  • 13. ECL Guide http://hpccsystems.com/download/docs/ecl-language-reference JOIN! MERGE! LENGTH! REGEX! ROUND! SUM! COUNT! TRIM! WHEN! AVE! ABS! CASE! DEDUP! NORMALIZE! DENORMALIZE! IF! SORT! GROUP! more ….
  • 14. For More HPCC! “How To’s”! Go to SlideShare http://www.slideshare.net/FujioTurner/
  • 15. Watch how to install HPCC Systems in 5 Minutes Download HPCC Systems Open Source Community Edition http://hpccsystems.com/download/ http://www.youtube.com/watch?v=8SV43DCUqJg or Source Code https://github.com/hpcc-systems