Apache Arrow - A cross-language development platform for in-memory data
Kouhei Sutou
Apache Arrow is the future for data processing systems. This talk describes how to reduce data-sharing overhead in data processing systems such as Spark and PySpark. It also describes how to use Apache Arrow to accelerate computation on large data.
Apache development with GitHub and Travis CI
Jukka Zitting
Much of the recent innovation in development tooling has happened around Git-based cloud services like GitHub and Travis CI. While these services are not part of the official Apache infrastructure, it's still possible to use them to complement the tooling available to Apache projects. Based on experience from Apache Jackrabbit, this presentation shows how to leverage such external services while staying true to Apache principles and policies.
Oak, the architecture of Apache Jackrabbit 3
Jukka Zitting
Apache Jackrabbit is just about to reach the 3.0 milestone based on a new architecture called Oak. Based on concepts like eventual consistency and multi-version concurrency control, and borrowing ideas from distributed version control systems and cloud-scale databases, the Oak architecture is a major leap ahead for Jackrabbit. This presentation describes the Oak architecture and shows what it means for the scalability and performance of modern content applications. Changes to existing Jackrabbit functionality are described and the migration process is explained.
AEM6 comes with a fresh new repository backend designed for improved performance and scalability. This session introduces the new repository architecture and describes the key differences and improvements for developers and operations teams. Topics covered include content migration, backwards compatibility, key deployment scenarios and configuration options, and custom search indexes.
25. Ⓒ Classmethod, Inc.
Hive on Tez
• https://cwiki.apache.org/confluence/display/Hive/Hive+on+Tez
• Uses Tez instead of MapReduce (MR) as the execution engine for Hive
http://tm.durusau.net/?p=48476
26. Ⓒ Classmethod, Inc.
Comparison with Hive on MR
• Representing the work as a DAG makes it possible to reduce the number of stages
http://www.slideshare.net/Hadoop_Summit/w-235phall1pandey/9
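The point about a DAG reducing the number of stages can be illustrated with a toy model. This is only a sketch under a simplifying assumption, not Hive's or Tez's actual planner: classic MapReduce is modeled as requiring every reduce step to be preceded by a map step in the same job, so a reduce that directly follows another reduce costs one extra pass-through map stage. The function names and the example pipeline are hypothetical.

```python
def mr_stage_count(ops):
    """Count stages in the toy MapReduce model.

    Assumption: a 'reduce' must directly follow a 'map', so a reduce that
    follows another reduce needs an extra pass-through map stage first.
    """
    stages = 0
    prev = None
    for op in ops:
        if op == "reduce" and prev != "map":
            stages += 1  # extra pass-through map that starts a new MR job
        stages += 1      # the operator itself
        prev = op
    return stages


def dag_stage_count(ops):
    """In the toy DAG model, each operator is exactly one stage."""
    return len(ops)


# Hypothetical pipeline: one map followed by two consecutive reduce steps.
pipeline = ["map", "reduce", "reduce"]

print("MR stages: ", mr_stage_count(pipeline))
print("DAG stages:", dag_stage_count(pipeline))
```

In this model the chained pipeline needs fewer stages as a DAG because no pass-through map has to be inserted between the two reduce steps.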