Meetup: Case Study - HPCC Systems implementation for an Aviation company

Have you booked tickets through websites such as Orbitz or TripAdvisor? Have you noticed those pages assembling different connections to match your request and criteria? Do you know how many possible connections are built in a day? Did I hear millions? No, it is billions. The aviation industry is one huge hub of data. Come and see how HPCC Systems solved an aviation company's problems and helped it scale its solution, and how HPCC Systems made it look easy all along.

Published in: Data & Analytics

  1. Hammer and beyond – An ensembling journey (WHT/082311 | ©2017, Cognizant)
  2. Who am I?
     Sunil Babu Peethambaram
     Architect, Cognizant Technology Solutions (NASDAQ: CTSH)
     Total IT experience: 13+ years
     Consulting with LexisNexis since 2013 (Chennai, Dayton, Buford, Alpharetta)
     Experience in HPCC Systems: more than 3 years
     Domains worked on:
       • Supply Chain Management
       • Logistics
       • Retail: Merchandise and Store Operations, Order Management, and Warehouse Management Systems
       • Insurance
       • Healthcare
       • Aviation
  3. Problem statement: how did it all start?
     Build valid flight connections (VFC) based on direct flight schedules (DFS)
     DFS come in a proprietary encoded format
     DFS span 1000 carriers and over 4 million records
     DFS cover a year or more into the future
     DFS change every day, and VFC (potentially) need to be versioned for every day
     Building VFC requires evaluating the feasibility of over 16 trillion potential connections
     Valid connections are identified by applying:
       • Circuity
       • Cabotage
       • BIETA and LCC
       • Schedule conflicts
       • Over 100,000 MCT (minimum connect time) rules, applied in sequence
  4. The Legacy Setup
       • Complex business logic
       • Data intensive
       • .NET/SQL Server
       • Local datacenter
       • Scaled-up architecture
       • Ageing hardware
       • Sequential processing
       • Low fault tolerance
       • Stale data delivery
       • 24 x 7 life support
  5. The ask: SOS!
  6. The ask: Not really!
  7. The ask
     Relevant data delivery: faster processing, parallelize independent tasks
     Don't marry the hardware (just friends with benefits)
     Performance as a configuration (take your time, hurry up, choice is yours, don't be late)
     Fail fast, recover faster
     Onboard new customers quickly
     Automated data delivery pipeline
     Better maintainability: support and enhance the complex business logic
  8. So what?
  9. Every project has complex business logic
  10. But we have to generate hundreds of millions of records…
  11. …which means we have a “big data” problem
  12. And we are going to do whatever it takes…
  13. OK Google... What is big data?
  14. This is what we got!
  15. Our problem was different
  16. We have a big “data problem” and the answers are a whole lot bigger!!!
  17. So, why HPCC Systems? Why not?
  18. So, why HPCC Systems?
      Our use case was data intensive and batch oriented
      Embarrassingly parallel
      ECL was built specifically for distributed data processing and gave us the fine control we needed
      Been there, done that: a lot of real-world experience to tap into
      Access to the HPCC Systems development team
      It is performant and maintainable
      We did a proof of concept and validated the fit anyway:
        • A 45-minute job ran in 1 second
        • A 4-hour job ran in 90 seconds
        • A proof of concept planned for 4 weeks was completed in 4 days
  19. What did Bill have to say about it?
  20. Why AWS?
      Bring a multi-node HPCC Systems cluster up or down at the click of a button
      Scale up or down with zero upfront cost
      Validate multiple configurations for performance and choose the best
      And...
      No need for data centers
      Pay as you use
      Go global
      Speed of computing
  21. High level flow
  22. Inside HPCC Systems: data warehouse as source of truth
      The data warehouse is the base on which our solution was built.
      It follows a push-pull architecture:
        • Push: raw data from different data sources is cleansed and transformed into data cubes.
        • Pull: the cubes act as views that are used by downstream applications, e.g. the connection builder.
      The data warehouse is the only way data enters the distributed data processing system.
      All views follow a common interface through which data can be accessed.
  23. Lifecycle of a view in DW
  24. How did we fare?
      Metric | Measure (Legacy, UTG) | Measure (HPCC Systems)
      Building connections (singles) | 40 hours | 1 hour
      Lines of code | 26535 (not including SQL) | 3973
      Delivery frequency | Weekly | Daily (possible)
      Hardware | Batch server: 24 GB and 12 cores; SQL Server: 384 GB and 24 cores | AWS: Thor master + middleware, 16 GB; Thor slaves, 64 GB, 16 cores across 4 nodes
      Further figures on the slide, labels not recoverable: 4.4 million, 100 million, 13.5 million
  25. Happy side effects
      Data warehouse as a framework for new data sources
      Data warehouse as an interface for downstream applications
      Plug and play by design
      File builder template: a blueprint for all data delivery jobs
      Unit testing framework for HPCC Systems
      Regression testing suite: can run all tests in the code base and provide a report
      We integrated a comparison testing tool from LNR into Hammer
      An HPCC Systems cluster can now be built in AWS at the click of a button (Puppet)
      Seamless sync between the external FTP location and the landing zone through S3
  26. What next?
  27. What next?
  28. What next?
  29. What next?
  30. What next?
  31. What next?
  32. What next?
  33. What next?
  34. What next?
  35. Questions?
  36. Thank you
      Reach out to me: Sunil.Babu@flightglobal.com
      Useful links
        • Cognizant: http://www.cognizant.com
        • FlightGlobal: http://www.flightglobal.com
        • HPCC Systems Portal: http://hpccsystems.com
        • Machine Learning: http://hpccsystems.com/ml
        • Online Training: http://learn.lexisnexis.com/hpcc
        • HPCC Systems Wiki & Red Book: https://wiki.hpccsystems.com
        • Our GitHub portal: https://github.com/hpcc-systems
        • Community Forums: http://hpccsystems.com/bb
        • Documentation: https://hpccsystems.com/download/documentation
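
Slide 3 describes building valid flight connections from direct flight schedules. The Python sketch below is only an illustration of that idea, not the ECL the team wrote. The `Flight` record, the MCT values, the maximum layover, and the "do not fly straight back to the origin" stand-in for a circuity check are all assumptions made for the example; the real system applies over 100,000 MCT rules plus cabotage, BIETA, LCC, and schedule-conflict checks. The sketch also shows that pairing 4 million schedule records with each other gives 16 trillion candidate pairs, which is consistent with the figure on the slide, although the slide does not say how that figure was derived.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Flight:
    """One direct flight (hypothetical record layout, not the DFS format)."""
    carrier: str
    origin: str
    destination: str
    departs: datetime
    arrives: datetime


# Hypothetical values for illustration only.
DEFAULT_MCT = timedelta(minutes=60)              # fallback minimum connect time
MCT_BY_AIRPORT = {"ATL": timedelta(minutes=45)}  # per-airport override
MAX_LAYOVER = timedelta(hours=6)                 # assumed upper bound on a layover


def potential_pairs(flight_count: int) -> int:
    """Size of the naive search space: every flight paired with every flight."""
    return flight_count * flight_count


def is_valid_connection(first: Flight, second: Flight) -> bool:
    """Apply the simplified rules to one candidate pair."""
    if first.destination != second.origin:
        return False  # the two legs do not meet at the same airport
    if second.destination == first.origin:
        return False  # crude stand-in for a circuity rule
    layover = second.departs - first.arrives
    minimum = MCT_BY_AIRPORT.get(first.destination, DEFAULT_MCT)
    return minimum <= layover <= MAX_LAYOVER


def build_connections(flights):
    """Pair each flight only with flights leaving its arrival airport."""
    by_origin = defaultdict(list)
    for flight in flights:
        by_origin[flight.origin].append(flight)

    connections = []
    for first in flights:
        for second in by_origin.get(first.destination, []):
            if is_valid_connection(first, second):
                connections.append((first, second))
    return connections
```

Grouping flights by origin airport means each flight is compared only with flights that could physically follow it. Each connecting airport can then be processed independently, which is the kind of work that parallelizes well.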

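Slide 22 describes a push-pull data warehouse in which every view follows a common interface. A minimal Python sketch of that shape follows. The class names (`View`, `ScheduleView`, `DataWarehouse`), the method names, and the cleansing rules are hypothetical and chosen only for illustration; the deck does not show the actual interface.

```python
from abc import ABC, abstractmethod


class View(ABC):
    """Common interface assumed for every warehouse view."""
    name: str

    @abstractmethod
    def build(self, raw_rows):
        """Push side: cleanse and transform raw rows into the view."""

    @abstractmethod
    def read(self):
        """Pull side: return the rows downstream applications consume."""


class ScheduleView(View):
    """Example view holding cleansed schedule rows."""
    name = "schedules"
    REQUIRED = ("carrier", "origin", "destination")

    def __init__(self):
        self._rows = []

    def build(self, raw_rows):
        cleaned = []
        for row in raw_rows:
            # Drop rows with a missing or empty required field.
            if any(not row.get(field) for field in self.REQUIRED):
                continue
            # Normalise codes: trim whitespace and upper-case.
            cleaned.append({f: row[f].strip().upper() for f in self.REQUIRED})
        self._rows = cleaned

    def read(self):
        return list(self._rows)  # copy, so callers cannot mutate the view


class DataWarehouse:
    """Single entry point for data going in (push) and coming out (pull)."""

    def __init__(self):
        self._views = {}

    def register(self, view: View):
        self._views[view.name] = view

    def push(self, view_name, raw_rows):
        self._views[view_name].build(raw_rows)

    def pull(self, view_name):
        return self._views[view_name].read()
```

A downstream job such as the connection builder would call only `pull("schedules")`, so a new data source can be added by registering another view without changing any consumer.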