DAG Definition:
directed = the connections between the nodes (edges) have a direction: A -> B is not the same as B -> A
acyclic = "non-circular": moving from node to node by following the edges, you will never encounter the same node a second time.
graph = a structure consisting of nodes that are connected to each other by edges
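A directed acyclic graph generalizes a tree: every tree whose edges point away from the root is a DAG, but a node in a DAG may have more than one parent, so not every DAG is a tree.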
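The definition can be checked mechanically. The sketch below is a minimal illustration, assuming the graph is given as a list of (source, target) edge pairs; the function names `topological_order` and `is_dag` are made up for this example. It uses Kahn's algorithm: repeatedly remove nodes that have no incoming edges. If every node can be removed, the graph has no cycle.

```python
from collections import deque


def topological_order(edges):
    """Return the nodes in topological order, or None if the graph has a cycle."""
    successors = {}
    in_degree = {}
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)
        successors.setdefault(dst, [])
        in_degree[dst] = in_degree.get(dst, 0) + 1
        in_degree.setdefault(src, 0)

    # Start with the nodes that nothing points to.
    ready = deque(node for node, degree in in_degree.items() if degree == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                ready.append(nxt)

    # Nodes on a cycle never reach in-degree 0, so they are never emitted.
    return order if len(order) == len(in_degree) else None


def is_dag(edges):
    return topological_order(edges) is not None


# Directed and acyclic; note that D has two incoming edges (from B and C).
diamond = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
# Following the edges from A leads back to A, so this is not acyclic.
cycle = [("A", "B"), ("B", "C"), ("C", "A")]

print(is_dag(diamond))  # True
print(is_dag(cycle))    # False
```

The order returned for a DAG is one valid execution order for its nodes, which is why DAG-based engines can schedule stages from it.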
In-memory to on-disk failover [DAG v MR]
Storage Compatibility [HDFS v proprietary]
Resource Management [YARN v other]
File format compatibility [Parquet/Columnar, Avro/Row, JSON/Hierarchic, Textfile/Linear]
Expressive language [Declarative/Functional v Imperative]
Streaming + batch support [Temporal/Dimensional Partitioning v Tabular Scans]
Connectivity to other systems [ODBC vs WS, Virtual vs Physical]
Ease of use – Web UIs [IDE vs Putty, Dashboard vs File-based aggregation]
Execution, monitoring, debugging, logging [Centralized v Decentralized; Integrated with CM v Fragmented]
Security [Authentication with LDAP/AD/Kerberos, Authorization/ACLs with Sentry/POSIX, Encryption on disk v wire, Key Management central v isolated]
Integrated graph, temporal, and statistical analytics [one framework vs Multiple Libraries]
Integrated Files, Tables, Datasets, DataFrame “views” – Ability to Share Results
https://cwiki.apache.org/confluence/display/DRILL/Release+Notes
Drill now features complete support for UNION ALL and COUNT(DISTINCT). Drill 0.8 also includes new functions such as unix_timestamp and the window functions sum, count and rank. Note that these window functions should be considered beta.
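To make the two SQL features concrete, here is a small sketch of what UNION ALL and COUNT(DISTINCT) compute. It runs against an in-memory SQLite database from Python's standard library as a stand-in, not against Drill, and the tables `web_orders` and `store_orders` are hypothetical. Drill's own connection setup and any dialect differences are not shown.

```python
import sqlite3


def run_demo():
    # SQLite is used only as a stand-in engine; table names are hypothetical.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE web_orders (customer TEXT, amount INTEGER);
        CREATE TABLE store_orders (customer TEXT, amount INTEGER);
        INSERT INTO web_orders VALUES ('ann', 10), ('bob', 20);
        INSERT INTO store_orders VALUES ('ann', 10), ('cat', 5);
    """)

    # UNION ALL concatenates both result sets and keeps duplicate rows.
    all_rows = conn.execute("""
        SELECT customer, amount FROM web_orders
        UNION ALL
        SELECT customer, amount FROM store_orders
    """).fetchall()

    # COUNT(DISTINCT ...) counts each distinct customer value once.
    distinct_customers = conn.execute("""
        SELECT COUNT(DISTINCT customer) FROM (
            SELECT customer FROM web_orders
            UNION ALL
            SELECT customer FROM store_orders
        )
    """).fetchone()[0]

    conn.close()
    return all_rows, distinct_customers


all_rows, distinct_customers = run_demo()
print(len(all_rows))        # 4 rows: the duplicate ('ann', 10) is kept
print(distinct_customers)   # 3 distinct customers: ann, bob, cat
```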