ECL, the HPCC Systems query language, is easy to learn. If you are a software developer, you will have no problem learning it. Go through these slides to see a quick example.
HPCC Systems - ECL for Programmers - Big Data - Data Scientist
1. HPCC Systems - ECL Intro
Big Data Querying Made EZ
By Fujio Turner
Enterprise Control Language
explained for Programmers
@FujioTurner
2. Comparison
              Hadoop               HPCC Systems
Storage       Block based          File based
Language      Java                 C++
Scale         Petabytes            Exabytes
Throughput    1-80,000 jobs/day    Non-indexed: 4X-13X; indexed: 2K-3K jobs/sec
Since         2005                 2000
Components    ? ? ? ? ? ?          Thor, Roxie
3. What Is ECL?
ECL (Enterprise Control Language) is a C++ based query
language for use with the HPCC Systems Big Data platform.
ECL's syntax and format are very simple and easy to learn.
Note - ECL is very similar to Hadoop's Pig, but
more expressive and feature rich.
4. Comparing ECL to General Programming
In this presentation you will see how loading and querying
data in ECL is just like reading and finding data in a
plain text file.
general programming (general common logic)
vs.
ECL
Each of the following slides pairs general code (General) with its ECL equivalent (ECL).
5. Example Text File
Customer Data May 2010
Name   State  Age
Kevin  CA     45
Mark   MI     27
Sara   FL     64
Example file name: ~/cdata_2010.txt
The same example file distributed in the HPCC cluster, as named in ECL: ~hpcc::cdata_2010
6. Opening File: general programming vs ECL
General:
d = fopen('~/cdata_2010.txt')
ECL:
d := DATASET('~hpcc::cdata_2010', cs, THOR);
File Location: '~/cdata_2010.txt' (General), '~hpcc::cdata_2010' (ECL)
7. Opening File: general programming vs ECL
General:
d = fopen('~/cdata_2010.txt')
ECL:
d := DATASET('~hpcc::cdata_2010', cs, THOR);
File Location: '~/cdata_2010.txt' (General), '~hpcc::cdata_2010' (ECL)
Open File Function: fopen (General), DATASET (ECL)
8. Organizing: general programming vs ECL
General:
d = fopen('~/cdata_2010.txt')
new_d = split(d, "\r\n")    // Split Data (d) by Row
Kevin  CA  45
Mark   MI  27
Sara   FL  64
ECL:
d := DATASET('~hpcc::cdata_2010', cs, THOR);
9. Organizing: general programming vs ECL
General:
d = fopen('~/cdata_2010.txt')
new_d = split(d, "\r\n")    // Split Data (d) by Row
Kevin  CA  45
Mark   MI  27
Sara   FL  64
ECL:
cs := RECORD
    STRING20 Name;
    STRING2 State;
    INTEGER3 Age;
END;
// Use this schema on this file to give structure to the data
d := DATASET('~hpcc::cdata_2010', cs, THOR);
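The idea of applying a schema to raw rows can be sketched in Python. This is only an illustration of the concept, not how HPCC Systems reads a THOR file: the `Customer` type and the `parse_customers` helper are hypothetical names, the rows are inlined rather than read from a file, and whitespace-separated fields are assumed.

```python
from typing import List, NamedTuple

# Sample rows from the slide; in practice this text would come from a file.
RAW = "Kevin CA 45\r\nMark MI 27\r\nSara FL 64"


class Customer(NamedTuple):
    """Hypothetical Python counterpart of the `cs` record layout."""
    name: str
    state: str
    age: int


def parse_customers(text: str) -> List[Customer]:
    """Split the text into rows, then give each row named, typed fields."""
    customers = []
    for line in text.splitlines():  # handles "\r\n" and "\n" row endings
        if not line.strip():
            continue  # skip blank lines
        name, state, age = line.split()
        customers.append(Customer(name, state, int(age)))
    return customers


customers = parse_customers(RAW)
print(customers[2])  # Customer(name='Sara', state='FL', age=64)
```

Once the rows carry field names and types, later steps can refer to `name` or `age` directly instead of column positions, which is the role the record layout plays on the ECL side.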
10. Find “Sara”: general programming vs ECL
General:
d = fopen('~/cdata_2010.txt')
new_d = split(d, "\r\n")
for (x = 0; x < 3; x++) {
    row = new_d[x]
    new_row = split(row, " ")    // Split Data by Column
    if (new_row[0] == 'Sara') {
        print "Found Sara"
    }
}
Column index:
0      1   2
Kevin  CA  45
Mark   MI  27
Sara   FL  64
ECL:
cs := RECORD
    STRING20 Name;
    STRING2 State;
    INTEGER3 Age;
END;
d := DATASET('~hpcc::cdata_2010', cs, THOR);
11. Find “Sara”: general programming vs ECL
General:
d = fopen('~/cdata_2010.txt')
new_d = split(d, "\r\n")
for (x = 0; x < 3; x++) {
    row = new_d[x]
    new_row = split(row, " ")    // Split Data by Column
    if (new_row[0] == 'Sara') {  // Filter Data By
        print "Found Sara"
    }
}
Column index:
0      1   2
Kevin  CA  45
Mark   MI  27
Sara   FL  64
ECL:
cs := RECORD
    STRING20 Name;
    STRING2 State;
    INTEGER3 Age;
END;
d := DATASET('~hpcc::cdata_2010', cs, THOR);
12. Find “Sara”: general programming vs ECL
General:
d = fopen('~/cdata_2010.txt')
new_d = split(d, "\r\n")
for (x = 0; x < 3; x++) {
    row = new_d[x]
    new_row = split(row, " ")    // Split Data by Column
    if (new_row[0] == 'Sara') {  // Filter Data By
        print "Found Sara"       // Output
    }
}
Column index:
0      1   2
Kevin  CA  45
Mark   MI  27
Sara   FL  64
ECL:
cs := RECORD
    STRING20 Name;
    STRING2 State;
    INTEGER3 Age;
END;
d := DATASET('~hpcc::cdata_2010', cs, THOR);
13. Find “Sara”: general programming vs ECL
General:
d = fopen('~/cdata_2010.txt')
new_d = split(d, "\r\n")
for (x = 0; x < 3; x++) {
    row = new_d[x]
    new_row = split(row, " ")    // Split Data by Column
    if (new_row[0] == 'Sara') {  // Filter Data By
        print "Found Sara"       // Output
    }
}
Column index:
0      1   2
Kevin  CA  45
Mark   MI  27
Sara   FL  64
ECL:
cs := RECORD
    STRING20 Name;
    STRING2 State;
    INTEGER3 Age;
END;
d := DATASET('~hpcc::cdata_2010', cs, THOR);
sara := d(Name = 'Sara');        // Filter Data By
14. Find “Sara”: general programming vs ECL
General:
d = fopen('~/cdata_2010.txt')
new_d = split(d, "\r\n")
for (x = 0; x < 3; x++) {
    row = new_d[x]
    new_row = split(row, " ")    // Split Data by Column
    if (new_row[0] == 'Sara') {  // Filter Data By
        print "Found Sara"       // Output
    }
}
Column index:
0      1   2
Kevin  CA  45
Mark   MI  27
Sara   FL  64
ECL:
cs := RECORD
    STRING20 Name;
    STRING2 State;
    INTEGER3 Age;
END;
d := DATASET('~hpcc::cdata_2010', cs, THOR);
sara := d(Name = 'Sara');        // Filter Data By
OUTPUT(sara);                    // Output
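The contrast between looping over rows and simply stating a filter condition can be tried out in Python. The sketch below is an analogy only, with the three sample rows inlined as tuples (an assumption made to keep it self-contained); it does not call HPCC Systems.

```python
# Sample rows from the slide as (name, state, age) tuples.
rows = [("Kevin", "CA", 45), ("Mark", "MI", 27), ("Sara", "FL", 64)]

# Imperative version: visit every row and test it, as in the general code.
found_loop = []
for name, state, age in rows:
    if name == "Sara":
        found_loop.append((name, state, age))

# Declarative version: state the condition and let the language iterate,
# closer in spirit to the ECL filter d(Name = 'Sara').
found_filter = [row for row in rows if row[0] == "Sara"]

print(found_filter)  # [('Sara', 'FL', 64)]
```

Both versions select the same row; the second leaves the iteration implicit, which is what the ECL filter expression does as well.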
15. Find "Sara" & Older than 50: general programming vs ECL
General:
d = fopen('~/cdata_2010.txt')
new_d = split(d, "\r\n")
for (x = 0; x < 3; x++) {
    row = new_d[x]
    new_row = split(row, " ")
    if (new_row[0] == 'Sara' and new_row[2] > 50) {
        print "Found Sara"
    }
}
Column index:
0      1   2
Kevin  CA  45
Mark   MI  27
Sara   FL  64
ECL:
cs := RECORD
    STRING20 Name;
    STRING2 State;
    INTEGER3 Age;
END;
d := DATASET('~hpcc::cdata_2010', cs, THOR);
sara := d(Name = 'Sara' AND Age > 50);
OUTPUT(sara);
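One detail the general version has to handle by hand is that a field split out of a text row is still text, so the age needs converting before it is compared with a number. A minimal Python sketch, assuming whitespace-separated rows and using a hypothetical helper named `matches`:

```python
# Sample rows from the slide, kept as raw text lines.
rows = ["Kevin CA 45", "Mark MI 27", "Sara FL 64"]


def matches(row: str, name: str, min_age: int) -> bool:
    """Return True when the row's name equals `name` and its age exceeds `min_age`."""
    fields = row.split()
    # fields[2] is text such as "64"; convert it before the numeric comparison.
    return fields[0] == name and int(fields[2]) > min_age


older_saras = [row for row in rows if matches(row, "Sara", 50)]
print(older_saras)  # ['Sara FL 64']
```

With a record layout that declares Age as an integer, the ECL filter can compare Age with 50 directly, with no conversion step in the query itself.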
16. ECL is EZ
• Make your own functions & libraries in ECL.
• Modularize your code with "Import": reuse old code.
Machine Learning Built-in
http://hpccsystems.com/ml
17. ECL Plugin for Eclipse IDE
http://hpccsystems.com/products-and-services/products/plugins/eclipse-ide
18. ECL + Other Languages
ECL is C++ based, so all your C/C++ code can be used in ECL.
&
Use other languages and methods to query too.
20. For More HPCC “How To’s” Go to
Query with plain SQL: http://www.slideshare.net/hpccsystems/jdbc-hpcc
or SQL to ECL: http://www.slideshare.net/FujioTurner/meet-up-sqldemopp
21. Watch how to install HPCC Systems in 5 Minutes
http://www.youtube.com/watch?v=8SV43DCUqJg
Download HPCC Systems Open Source Community Edition
http://hpccsystems.com/download/
or Source Code
https://github.com/hpcc-systems