Big Data Processing Above and Beyond Hadoop

Data-intensive computing is a new computing paradigm that addresses Big Data processing requirements with high-performance architectures supporting scalable parallel processing. It allows government, commercial organizations, and research environments to process massive amounts of data and implement new applications previously thought to be impractical or infeasible. The fundamental challenges of data-intensive computing are managing and processing exponentially growing data volumes, significantly reducing associated data analysis cycles to support practical, timely applications, and developing new algorithms that can scale to search and process massive amounts of data. The open source HPCC (High-Performance Computing Cluster) Systems platform offers a unified approach to Big Data processing requirements: (1) a scalable, integrated computer systems hardware and software architecture designed for parallel processing of data-intensive computing applications, and (2) a new programming paradigm in the form of a high-level, declarative, data-centric programming language designed specifically for Big Data processing. This presentation explores the challenges of data-intensive computing from a programming perspective and describes the ECL programming language and the HPCC architecture designed for data-intensive computing applications. HPCC is an alternative to the Hadoop platform, and ECL is compared to Pig Latin, a high-level language developed for the Hadoop MapReduce architecture.