Tim Hunter presented on TensorFrames, which allows users to run TensorFlow models on Apache Spark. Some key points:
- TensorFrames embeds TensorFlow computations into Spark's execution engine to enable distributed deep learning across a Spark cluster.
- It is faster than alternatives such as Scala UDFs because it avoids serialization and uses direct memory copies between processes.
- The demo showed TensorFrames using GPUs, both in local mode and at scale on a cluster, to speed up numerical workloads such as kernel density estimation.
- Future work includes better integration with Tungsten and MLlib as well as official GPU support in Databricks. TensorFrames aims to provide a simple API for distributed numerical computing that