The document provides an overview of Hive architecture and workflow. It discusses how Hive converts HiveQL queries to MapReduce jobs through its compiler. The compiler includes components like the parser, semantic analyzer, logical and physical plan generators, and logical and physical optimizers. It analyzes sample HiveQL queries and shows the transformations done at each compiler stage to generate logical and physical execution plans consisting of operators and tasks.
2. Why?Hive is good tool for non-specialist!The number of M/R controls the Hive processing time.↓How can we reduce the number?What can we do for this on writing HiveQL?↓How does Hive convert HiveQLto M/R jobs?On this, what optimizing processes are adopted?7/6/2011HIVE - A warehouse solution over Map Reduce Framework2
3. Don’t you have..This fb’s paper has a lot of information!But this is a little old..7/6/2011HIVE - A warehouse solution over Map Reduce Framework3
5. Hive Architecture / Exec Flow7/6/2011HIVE - A warehouse solution over Map Reduce Framework5ClientHadoopMetastoreDriverCompiler
6. ClientHadoopDriverCompilerHive WorkflowHive has the operators which are minimum processing units.The process of each operator is done with HDFS operation or M/R jobs.The compiler converts HiveQL to the sets of operators.7/6/2011HIVE - A warehouse solution over Map Reduce Framework6Metastore
8. ClientHadoopMetastoreDriverCompilerHive WorkflowFor M/R processing, Hiveuses ExecMaper and ExecReducer.On processing, we have 2 modes.Local processing modeDistributed processing mode7/6/2011HIVE - A warehouse solution over Map Reduce Framework8
9. ClientHadoopMetastoreDriverCompilerHive WorkflowOn 1(Local mode)Hive fork the process with hadoop command.The plan.xml is made just on 1 and the single node processes this.On 2(Distributed mode).Hive send the process to exsistingJobTracker.The information is housed on DistributedCacheand processed on multi nodes.7/6/2011HIVE - A warehouse solution over Map Reduce Framework9
10. Compiler : How to Process HiveQL7/6/2011HIVE - A warehouse solution over Map Reduce Framework10ClientHadoopMetastoreDriverCompiler
11. “Plumbing” of HIVE compiler7/6/201111HIVE - A warehouse solution over Map Reduce Framework
12. “Plumbing” of HIVE compiler7/6/201112HIVE - A warehouse solution over Map Reduce Framework