A First Look at HPCC Systems 7.0, Innovation in Action

As part of the 2018 HPCC Systems Summit Community Day event:

The latest version of the platform contains improvements to functionality, usability and interoperability. This talk gives an overview of the changes and explains how you might find them useful.

Gavin Halliday's primary focus is on the code generator, which converts ECL into the queries which run on the platform. Gavin enjoys working on problems together with the development team and the varied nature of the work keeps him engaged. Gavin shares how the platform compares with competitive platforms, including scalability and coding simplicity. He enjoys working on the platform and the elegant solutions the development team is able to implement. Gavin encourages people to give it a try!

Published in: Data & Analytics

  1. A First Look at HPCC Systems 7.0, Innovation in Action • Gavin Halliday • 2018 HPCC Systems® Community Day, October 9, 2018 • Innovation and Reinvention Driving Transformation
  2. Renewing the foundations • File processing • ECL Watch workunit interface • Visualization Framework • DESDL • Configuration manager HPCC 7.0
  3. Usability and Productivity
  4. ECL Watch • Goals: highlight important information; make it easier to understand queries; improve support for very large queries • Examples: Gantt, Graph Viewer, Timings, Log data visualizer
  5. Gantt chart
  6. New Graph Viewer
  7. New Graph Viewer
  8. Stats and Timings
  9. Visualization Framework • Version 2.0 now available • https://github.com/hpcc-systems/Visualization • Rebranded as hpcc-js in the node npm repository • New documentation, demos and gallery • Includes non-visualization items like the ESP comms layer • Dashy beta • Not tied to HPCC Systems • Visualizer Bundle 1.1
  10. ECL libraries • ECL library extensions • Date – timestamps, time zones, formatting • Unicode – words, prefixes and suffixes • Maths – infinity, fmod • Bundles • Data Patterns • ML – gradient boosted trees, boosted forests • Visualizer
  11. ESP improvements • DESDL improvements • Custom mappings • Fully integrated into ESP • Mixing DESDL and ESDL in one service • Allow disconnection from Dali • Support for persistent connections
  12. ECL Compiler • Activities in other languages: EXPORT streamed dataset(r) myDataset(unsigned numRows = numRows) := EMBED(javascript : activity) … • Multi-line string constants: message := '''One Two Three'''; • Code generator improvements • Faster archive generation • Faster syntax checking
  13. Interoperability
  14. Spark • “An open source distributed general-purpose cluster-computing framework” • Reading from Spark • Files and indexes • Filter rows • Select fields required • N to M parallel reads • Writing from Spark • File security • Spark cluster installation
  15. Log Data Visualizations
  16. Log Data Visualizations
  17. Log Data Visualizations • https://hpccsystems.com/blog/ELK_visualizations
  18. VS Code • https://code.visualstudio.com/
  19. VS Code
  20. Security
  21. User security • Session management • Avoid resending credentials • Users can log out • Allow sessions to lock and time out • Minimize the time passwords are retained
  22. System security • Spark • File access rights • Dafilesrv authentication of requests • The cloud • Verifying components • Encryption in transit • ROXIE HTTPS support
  23. Performance
  24. Thor • Keyed Join (HPCC-16476)
  25. Thor • LOOP • Synchronization overhead • LOCAL LOOP bodies • Child Queries • Reduced overhead • Improvements to buffering • Faster Startup
  26. Index improvements • Example database containing 250M unique items with 1000 updates each minute • Hourly: 60K rows, 0.02% of total • Daily: 1.4M rows, 0.6% of total • Weekly: 10M rows, 4% of total • Monthly: 43M rows, 17% of total • Historical: 520M rows, 100% of total
  27. Index improvements • Bloom filters • Supports multiple filters per index • User configurable probability • Automatically created • Richard’s blog post hpccsystems.com/blog/bloom-filters • Hash distributed keys • When distribution fields are filtered with equalities • Easier to create co-distributed keys • Lower overhead calculating the part containing a match
  28. Finally • WsSQL – now part of the core • Over 1,000 pull requests since 6.4
  29. Talk to us! • Bloom filters - Richard Chapman • DESDL - Yanrui Ma • ELK - Rodrigo Pastrana • Thor - Jake Cobbett-Smith • Visualizations - Gordon Smith • Security - Tony Fishbeck • Spark - Rodrigo Pastrana • Config Manager - Ken Rowland
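
The row counts on slide 26 look like they follow directly from the stated update rate of 1000 updates each minute. The short Python sketch below reproduces that arithmetic. It rests on two assumptions that the slide does not state: that "% of total" is measured against the 250M unique items, and that "Monthly" means 30 days. The names `UPDATES_PER_MINUTE`, `UNIQUE_ITEMS` and `delta_size` are made up for illustration.

```python
# Assumed inputs, taken from the example database on slide 26.
UPDATES_PER_MINUTE = 1000
UNIQUE_ITEMS = 250_000_000

# Minutes in each period; a 30-day month is an assumption.
MINUTES_PER_PERIOD = {
    "Hourly": 60,
    "Daily": 60 * 24,
    "Weekly": 60 * 24 * 7,
    "Monthly": 60 * 24 * 30,
}


def delta_size(minutes: int) -> tuple[int, float]:
    """Return (rows updated, rows as a percentage of unique items).

    The percentage is an upper bound on the share of items touched,
    because the same item may be updated more than once in a period.
    """
    rows = UPDATES_PER_MINUTE * minutes
    return rows, 100.0 * rows / UNIQUE_ITEMS


for period, minutes in MINUTES_PER_PERIOD.items():
    rows, pct = delta_size(minutes)
    print(f"{period}: {rows:,} rows, {pct:.2f}% of unique items")
```

Under these assumptions the computed values land close to the rounded figures on the slide, which suggests how small the recent index parts are compared with the historical one.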

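Slide 27 mentions Bloom filters with a user configurable probability. As a rough illustration of the general technique (not the HPCC Systems implementation, whose internals the slides do not describe), here is a minimal Python sketch. The class name `BloomFilter`, its parameters and the choice of SHA-256 with double hashing are all assumptions made for this example. A Bloom filter can report a key as possibly present when it is absent, but never reports a stored key as absent, so a lookup can skip an index part whenever the filter says "no".

```python
import hashlib
import math


class BloomFilter:
    """Tiny Bloom filter: no false negatives, tunable false-positive rate."""

    def __init__(self, expected_items: int, false_positive_rate: float):
        # Standard sizing formulas: m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2.
        self.num_bits = max(1, math.ceil(
            -expected_items * math.log(false_positive_rate) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive k bit positions from one SHA-256 digest using double hashing.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


# Usage: store 1000 keys with a requested 1% false-positive probability.
bloom = BloomFilter(expected_items=1000, false_positive_rate=0.01)
for i in range(1000):
    bloom.add(f"item-{i}")
```

Lowering `false_positive_rate` makes the filter larger and uses more hash positions per key, which is the usual trade-off behind a configurable probability.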