Apache Drill Design proposal from OpenDremel team
HLD Version 0.2, 9/Sep/2012
Camuel Gilyadov & Constantine Peresypkin
Email: Camuel@BigDataCraft.com
Intro
• This is a high-level design proposal for project Apache Drill from the OpenDremel team.
• History slides and the usual "about us" material are moved to the end of the deck.
• A slide with all relevant links is also published at the end.
Design Tenet #1
• Apache Drill must support multi-tenant semantics internally, not be run wholesale inside guest VMs.
• It should be inspired by BigQuery, not only by the Dremel/PowerDrill/Tenzing papers.
• It is not practical to set up a dedicated cloud (billed hourly) just to run a query for a few seconds.
• The codebase must be clearly divided into a trusted part and an untrusted part. The trusted part must be kept to an absolute minimum and must be peer-reviewed, secured, audited and metered.
Design Tenet #2
• Apache Drill must be modular and customizable along many dimensions.
• The schema-on-read concept must be supported. Imperatively coded, high-performance data parsers must be embeddable into the query.
• SQL is no longer enough. New query languages must be easy to add, as must user-defined functions (UDFs) implementing deep analytics (such as statistics and machine learning).
• Additionally, various data formats must be supported: column stores, row stores, PAX, RCFile, etc.
Design Tenet #2 (cont.)
• We suggest relaxing the query plan format to an arbitrary executable, and the data format to an arbitrary opaque BLOB.
• This way new query languages and new data formats can be supported without changing the backend.
• As an added benefit, the backend becomes a generic, lightweight, homogeneous compute-storage cloud.
• Such an approach exhibits good separation of control: the cloud operator controls and bills for generic infrastructure, while the query engine is left completely under the control of the tenant/user.
Design Tenet #3
• Apache Drill requests/queries must be hyper-elastic, meaning the capability to exploit the compute capacity of thousands of servers for a short duration of just a few seconds. No resources must be kept spinning per user between queries or when idle.
• Traditional VMs are too heavyweight for that. Container approaches such as OpenVZ/LXC are not secure enough in a multi-tenancy context.
• We suggest making sandboxing pluggable, supporting ZeroVM (developed for OpenDremel) and LXC (fine for private clouds) to begin with.
Design Tenet #4
• Apache Drill must be efficient.
• Value-per-bit is extremely low with BigData.
• Overhead in the inner loop must be kept to a minimum.
• Java was found inefficient for general number crunching (such as data compression). The main problem with Java is that GC overhead is unavoidable for the whole data corpus being scanned. We went so far as to keep all data in byte arrays and auto-generate transformation code; it still underperformed and code complexity went through the roof.
Suggested Architecture
[Diagram: Browser/Client → Single-Tenant Frontend (JVM with Query Compiler, running inside a traditional guest VM) → query compiled into an executable job → Multi-Tenant Backend (scale-out object store with in-situ compute)]
Suggested Frontend Design
• Usual Java single-tenant web application.
• In charge of:
  – All interaction with the user
  – Query/job submission
  – Query/job progress monitoring
  – Result browsing
[Diagram: Client Tools (CLI, REST, AJAX App) → Java Servlet Gateway → Query Compiler]
Suggested AJAX
• Which AJAX framework?
• ExtJS?
• Look & feel – just clone the Google app with the trademarks and logos replaced?
• Why is Drill's WebUI more important than Hive's?
  – Drill is interactive; at least a basic WebUI must be provided with each release.
Suggested CLI Design
• Would Bash + curl suffice?
• Or a full-blown Java CLI tool?
Suggested REST-GW Design
• Usual vanilla Java WebApp with Spring!
Suggested Query Compiler Design #1
• The Query Compiler consists of two component libraries with a stable but language-dependent (so, unfortunately, no reuse) interface between them:
[Diagram: Query Text → Parsers (emit Syntax Errors) → SemanticModelReader → Planners (emit Semantic Errors) → Executable Script]
Suggested Query Compiler Design #2
• DrqlSemanticModelReader is ready and published under …..
• The SemanticModel that the parser produces closely follows the original language. The parser just parses the query text; it doesn't attempt to "give it meaning" or annotate it.
• Simplified example:
  – List<Expression> getResultColumns();
  – List<DrqlQuery> getFromClause();
  – List<ColumnId> getGroupByClause();
  – etc.
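The slide above can be sketched as a minimal Java interface plus a value object the parser might emit. This is an illustration only, not the published DrqlSemanticModelReader: Expression, DrqlQuery and ColumnId are simplified to String, and the class names are our own.

```java
import java.util.List;

// Hedged sketch of the SemanticModelReader idea: the parser produces a
// plain, read-only object mirroring the query text, with no interpretation.
interface DrqlSemanticModelReader {
    List<String> getResultColumns();  // Expression simplified to String
    List<String> getFromClause();     // DrqlQuery simplified to String
    List<String> getGroupByClause();  // ColumnId simplified to String
}

// A trivial value-object implementation the parser could hand to planners.
class ParsedDrqlQuery implements DrqlSemanticModelReader {
    private final List<String> columns, from, groupBy;
    ParsedDrqlQuery(List<String> columns, List<String> from, List<String> groupBy) {
        this.columns = columns;
        this.from = from;
        this.groupBy = groupBy;
    }
    public List<String> getResultColumns() { return columns; }
    public List<String> getFromClause()    { return from; }
    public List<String> getGroupByClause() { return groupBy; }
}
```

Keeping the model this dumb is what lets planners for different languages share nothing but the reader interface.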
Suggested Query Compiler Design #3
• What is an Executable Script?
  – A self-contained, serializable, executable object. When executed with an appropriate executor, it yields the correct query result on given input data of the expected format.
  – Self-contained means no dependencies; everything is included in that executable object.
  – In particular, the data parsing logic is included.
  – However, data access logic is NOT included.
  – The model for a script is: "here is your blob of size N mapped to memory starting from address S; you have time T to generate your result of up to size R in memory starting from address D. You will be terminated without advance notice for any attempted violation of any restriction."
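The "blob in, bounded result out, hard deadline" contract can be sketched in Java, with ByteBuffers standing in for the memory regions at S and D. The interface name, signature and toy script below are our illustration of the model, not a Drill or Dazo API.

```java
import java.nio.ByteBuffer;

// Hypothetical executor contract: the input blob (size N at address S) is a
// read-only buffer, the result region (size R at address D) is a bounded
// buffer, and the executor enforces deadline T, killing on any violation.
interface ExecutableScript {
    // Returns the number of result bytes written into 'output'.
    int execute(ByteBuffer input, ByteBuffer output, long deadlineNanos);
}

// Toy script: copies the input blob to the output region unchanged,
// refusing (as the real sandbox would) when the result limit R is exceeded.
class IdentityScript implements ExecutableScript {
    public int execute(ByteBuffer input, ByteBuffer output, long deadlineNanos) {
        int n = input.remaining();
        if (n > output.remaining())
            throw new IllegalStateException("result exceeds limit R");
        output.put(input.duplicate());
        return n;
    }
}
```

Because the script never opens files or sockets itself, the same object can run unchanged under Janino, LXC or ZeroVM executors.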
Suggested Query Compiler Design #3 (cont.)
• How is the executable script generated?
  1. A query object implementing the SemanticModelReader interface is provided to the planner by the parser.
  2. The planner logic examines the semantic model through the SemanticModelReader interface and produces a query plan object that implements the QueryPlanModelReader interface. Query analysis and optimization take place during this stage; if needed, additional interfaces QueryPlanModelRewriter and/or QueryPlanModelVisitor could be created for this purpose. However, DrQL is a simple language with little (or no) search space, so the optimizer's value is small. We suggest bypassing query rewriting and query optimization altogether for initial releases.
  3. Once the query plan is generated, the most appropriate code template script is selected. A template engine then processes the template together with the QueryPlanModelReader object to produce the executable script.
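Step 3 can be sketched as plain template substitution. A real implementation would use a template engine (a later slide mentions Apache Velocity); here a simple String.replace stands in for it, and the class, template syntax and plan values are illustrative only.

```java
import java.util.Map;

// Hedged sketch of step 3: the selected code template is filled in with
// values extracted from the query plan object. String.replace stands in
// for a real template engine such as Apache Velocity.
class TemplateCodeGenerator {
    static String generate(String template, Map<String, String> planValues) {
        String code = template;
        for (Map.Entry<String, String> e : planValues.entrySet())
            code = code.replace("${" + e.getKey() + "}", e.getValue());
        return code;
    }
}
```

The generated source would then be compiled (e.g. by Janino for the Java executor, or GCC for the C executors) into the self-contained executable script.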
Suggested Backend Design
• TODO
• Executors per se:
  – Janino-based Java executor
  – LXC-GCC based C executor
  – ZeroVM-GCC based C executor
• Storage platforms with collocated data processing:
  – Local files (non-distributed)
  – HDFS
  – OpenStack Swift
OpenDremel/Dazo
[Diagram, mirroring the Suggested Architecture slide:
 – Client: two separate unfinished jQuery apps & a cmdline app, with no particular codenames.
 – Frontend: we call it Metaxa (historic reasons) – BQL parser and a compiler based on Apache Velocity, running in a JVM; unfinished.
 – Backend: we call it Zwift (Swift + ZeroVM), receiving the query as an executable job; alpha quality.]
What is Swift?
"Swift is a highly available, distributed, eventually consistent object/blob store. Organizations can use Swift to store lots of data efficiently, safely, and cheaply."
Don't get it?
Swift is THE open-source implementation of Amazon S3
What is ZeroVM?
Highly secure, low-overhead, low-latency container-style virtualization based on the Google Native Client project. The critical security code is taken verbatim from the Chrome browser project and is therefore as secure as the Chrome browser. More info: http://ZeroVM.org and http://news.ycombinator.com/item?id=3746222
ZeroVM highlights
1. Disposable VM per request
2. Hyper-elasticity per request
3. Embeddable into everything
4. High performance (x86/ARM)
5. Erlang-inspired clustering
6. Written in pure C, no deps
Don't get it?
ZeroVM is to virtualization what SQLite is to databases
OpenDremel Story: 2010
• Camuel Gilyadov started a Dremel implementation in summer 2010, named OpenDremel.
• David Gruzman joined the effort a few months later, followed by Constantine Peresypkin.
• There wasn't a comprehensive design or architecture. The goal was to get the hierarchical-columnar transformation working smoothly and in strict accordance with the Dremel paper. Several working implementations were published by us under the Apache License.
• Hong San was hired as the first full-timer to speed up development. The Metaxa milestone was set.
OpenDremel Story: 2011
• OpenDremel's early design was found too naive, mainly due to Java's underperformance in inner number-crunching loops.
• After fierce brainstorming, the project was restarted from scratch under the new name Dazo. With Dazo, the query plan is an arbitrary piece of executable native code, with a Java frontend.
• From then on we drew our inspiration from BigQuery rather than from the Dremel paper.
• We decided to use Google NaCl as the sandboxing technology to isolate queries as well as meter resource consumption. The new sandbox was named ZeroVM.
• For storage we decided to use OpenStack Swift.
OpenDremel Story: 2012
• Four people full-time, several others part-time. We still don't have a fully integrated version, but we are satisfied with what we have achieved and convinced that the decisions behind Dazo were correct.
• We believe ZeroVM could be a disruptive technology in itself, revolutionizing the BigData@Cloud space.
• We are excited by the Apache Drill initiative and hope to be useful to it.
• Check the blog: http://BigDataCraft.com