Unrestricted © Siemens AG. 2016. All rights reserved.
Distributed Multi-device Execution of
TensorFlow – an Outlook
Meetup “TensorFlow & OpenAI – a match made in Heaven?” | 2016-03-01
Page 2 March 2016 Sebnem Rusitschka Unrestricted © Siemens AG. 2016. All rights reserved.
What is TensorFlow?
numerical computation library
using data flow graphs
deployable on heterogeneous distributed systems
Machine Learning Perspective
Distributed Computing Perspective
source:http://www.tomlichtenheld.com/childrens_books/duckrabbit!.html
What is TensorFlow?
numerical computation library
using data flow graphs
deployable on heterogeneous distributed systems
Machine Learning Perspective
Distributed, Embedded Computing Perspective
source:http://www.tomlichtenheld.com/childrens_books/duckrabbit!.html
TensorFlow from a distributed computing
perspective
processor, memory, network hierarchies
automatically assign to computational devices
execute in parallel
multidimensional data flow computations
source: https://www.tensorflow.org/
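To make "execute in parallel" concrete, here is a minimal sketch in plain Python (not TensorFlow code). It groups the nodes of a small data flow graph into waves whose members have all inputs ready at the same time. The graph, the node names and the function `parallel_waves` are hypothetical and chosen for illustration only.

```python
def parallel_waves(deps):
    """Group the nodes of a data flow graph into waves.

    deps maps each node to the list of nodes whose outputs it consumes.
    All nodes in one wave have their inputs available at the same time,
    so a runtime would be free to execute them in parallel.
    """
    remaining = {node: set(inputs) for node, inputs in deps.items()}
    waves = []
    while remaining:
        # A node is ready once none of its inputs are still pending.
        ready = sorted(n for n, inputs in remaining.items() if not inputs)
        if not ready:
            raise ValueError("graph contains a cycle or an unknown input")
        waves.append(ready)
        for n in ready:
            del remaining[n]
        for inputs in remaining.values():
            inputs.difference_update(ready)
    return waves


# Hypothetical graph: two independent matrix products feeding one sum.
graph = {
    "W1": [], "W2": [], "x": [],
    "h1": ["W1", "x"],
    "h2": ["W2", "x"],
    "sum": ["h1", "h2"],
}
waves = parallel_waves(graph)
for i, wave in enumerate(waves):
    print(i, wave)
```

The two products `h1` and `h2` land in the same wave, which is the kind of independence a data flow runtime can exploit across devices.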
TensorFlow from a distributed computing
perspective
processor, memory, network hierarchies
multidimensional data flow computations
Task Scheduling: placement, parallelization
Resource Management: resource availability, costs
Google's cluster management system “Borg” 1)
“Significant area of future work: improving the placement and node scheduling algorithms” 1)
1) http://download.tensorflow.org/paper/whitepaper2015.pdf
source: https://www.tensorflow.org/
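A rough sketch of what cost-driven placement can look like, loosely in the spirit of a greedy heuristic that puts each node on the device where it is estimated to finish first. This is plain Python, not TensorFlow's implementation. All device names, run times and the transfer cost are invented for illustration. Devices are visited in sorted order, so ties are broken the same way on every run.

```python
def greedy_placement(order, deps, compute_cost, transfer_cost):
    """Place each node on the device where it is estimated to finish first.

    order         -- nodes in a valid topological order
    deps          -- node -> list of input nodes
    compute_cost  -- node -> {device: estimated run time}
    transfer_cost -- estimated time to move one tensor between devices
    """
    placement = {}     # node -> chosen device
    finish = {}        # node -> estimated finish time
    device_free = {}   # device -> time at which it becomes idle
    for node in order:
        best = None
        # Sorted iteration makes tie-breaking deterministic.
        for device in sorted(compute_cost[node]):
            # Inputs that live on another device pay the transfer cost.
            ready = max(
                (finish[d] + (transfer_cost if placement[d] != device else 0.0)
                 for d in deps[node]),
                default=0.0,
            )
            start = max(ready, device_free.get(device, 0.0))
            end = start + compute_cost[node][device]
            if best is None or end < best[0]:
                best = (end, device)
        finish[node], placement[node] = best
        device_free[placement[node]] = finish[node]
    return placement, finish


# Hypothetical three-node graph and cost estimates.
deps = {"load": [], "conv": ["load"], "small_add": ["load"]}
costs = {
    "load": {"cpu:0": 1.0},
    "conv": {"cpu:0": 8.0, "gpu:0": 1.0},
    "small_add": {"cpu:0": 0.5, "gpu:0": 0.4},
}
placement, finish = greedy_placement(
    ["load", "conv", "small_add"], deps, costs, transfer_cost=2.0
)
print(placement)
```

With these numbers the expensive `conv` moves to the GPU despite the transfer cost, while `small_add` stays on the CPU because the transfer would cost more than it saves.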
TensorFlow from a distributed, embedded systems
perspective?
A presentation by Pete Warden, Tech Lead
of the TensorFlow Mobile/Embedded team:
“GoogLeNet v1 is 7MB after just quantization”
http://ip.cadence.com/uploads/presentations/1100AM_TensorFlow_on_Embedded_Devices_PeteWarden.pdf
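The quote does not say which quantization scheme was used. As an assumption for illustration, the sketch below uses simple linear 8-bit quantization, which stores one byte per weight instead of four and so shrinks the weights by a factor of four relative to 32-bit floats. The function names and the weight values are hypothetical.

```python
import struct


def quantize(weights):
    """Map float weights to 8-bit codes plus a (minimum, scale) pair."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0   # avoid a zero scale for constant input
    codes = bytes(round((w - lo) / scale) for w in weights)
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Recover approximate float weights from the 8-bit codes."""
    return [lo + c * scale for c in codes]


# Made-up weights, for illustration only.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, lo, scale = quantize(weights)
restored = dequantize(codes, lo, scale)

# Size of the same weights stored as 32-bit floats.
float32_bytes = struct.pack(f"{len(weights)}f", *weights)
print(len(float32_bytes), "bytes as float32 vs", len(codes), "bytes quantized")
```

The price is a rounding error of at most half a quantization step per weight.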
https://www.youtube.com/watch?v=b0hqhcwDIi4
https://www.autonomous.ai/deep-learning-robot
http://www.nvidia.com/object/embedded-systems.html
http://www.iphoneincanada.ca/news/tesla-autopilot-summon/
https://www.youtube.com/watch?v=AbcRlDBnwjM
http://www.dexterindustries.com/shop/gopigo-starter-kit-2/
TensorFlow from a distributed, embedded systems
perspective
All things Tensor
• embedded systems sense multidimensional, multimodal, streaming data
• tensor networks for easy implementation of most complex mathematical operations
Dataflow paradigm
• data is king
• deterministic data acquisition & calculation
• real-time constraints
• concurrency
• multi-core, GPU, FPGA
• enables true portability
source: https://www.tensorflow.org/
TensorFlow from a distributed, embedded systems
perspective
Insufficient tensor support
• BLAS up to matrix-matrix ops
a start: extensions to Eigen by Benoit Steiner for TensorFlow
http://eigen.tuxfamily.org/dox-devel/unsupported/classEigen_1_1Tensor.html
Heuristic placement algorithm
• suited for cloud resources
need: determinism
Resource Management
• suited for large-scale clusters
need: including resources in embedded systems
source: https://www.tensorflow.org/
Upcoming workshop on tensor computing for IoT
Topics of interest
• multidimensional IoT data
• tensor methods and deep learning
• distributed data and computing models
• across heterogeneous architectures of multi-core cluster and embedded computing
• optimized and verifiable composition of operations in an n-dimensional array/tensor algebra
(Perfect timing, TensorFlow!) Manifesto will be available here:
http://www.dagstuhl.de/en/program/calendar/semhp/?semnr=16152
Sneak Peek: Multidimensional IoT data
Large-scale autonomous systems generate massive amounts of data captured by embedded devices
• about dynamic flows
• in dynamic networks
• streaming, GPS-synchronized
• captures various aspects, measurements
• highly correlated, coming from networked systems
Sneak Peek: Tensor Networks (TN)
“Geometrization”, graphical representation
• modify, optimize TN structure
• reduce complexity, compare, analyze structures
• detect common, hidden components
Links between TNs & graphical models in ML
example notation
Example transformation
contraction unfolding
matrix factorization
SVD
reshaping
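The transformations named above can be tried out in a few lines of plain Python. This is a sketch under assumptions: tensors are nested lists, and the helper names and example values are hypothetical. It contracts a 3-way tensor with a matrix in two ways. The first is direct. The second unfolds the tensor into a matrix, does an ordinary matrix product and reshapes the result.

```python
def unfold(t):
    """Unfold a tensor of shape (I, J, K) into an (I*J) x K matrix."""
    return [row for mat in t for row in mat]


def matmul(a, b):
    """Ordinary matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def fold(m, i, j):
    """Reshape an (I*J) x L matrix into a tensor of shape (I, J, L)."""
    return [m[r * j:(r + 1) * j] for r in range(i)]


def contract(t, b):
    """Direct contraction C[i][j][l] = sum over k of T[i][j][k] * B[k][l]."""
    return [[[sum(v * b[k][l] for k, v in enumerate(fibre))
              for l in range(len(b[0]))]
             for fibre in mat]
            for mat in t]


# Made-up tensor of shape (2, 2, 3) and matrix of shape (3, 2).
T = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]
B = [[1, 0], [0, 1], [1, 1]]

direct = contract(T, B)
via_matrix = fold(matmul(unfold(T), B), 2, 2)
print(direct == via_matrix)
```

Both routes give the same tensor, which is why a contraction can be handed to an optimized matrix-matrix routine once the operands are unfolded.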
Sneak Peek: Mathematics of Arrays, Psi Calculus
• indexing operations based on shapes
• compose array operations to minimize temporary arrays
Determinism
• for any number of tensor operations, predict
• length of contiguous blocks
• values in each block
• correctly pre-fetch blocks
• overlap computation & IO
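A simplified sketch of the idea in plain Python, not the full Psi Calculus. It assumes a row-major memory layout. Under that assumption, offsets and contiguous block lengths follow from the shape alone, and an operation such as a transpose can be folded into the index arithmetic instead of producing a temporary array. All function names are hypothetical.

```python
def strides(shape):
    """Row-major strides: elements skipped per unit step along each axis."""
    out, step = [], 1
    for extent in reversed(shape):
        out.append(step)
        step *= extent
    return out[::-1]


def offset(index, shape):
    """Flat row-major offset of a full index into an array of this shape."""
    return sum(i * s for i, s in zip(index, strides(shape)))


def contiguous_block(shape, prefix):
    """Start and length of the block selected by fixing the leading indices.

    Both values are computed from the shape only, before any data is
    touched, so the block could be pre-fetched ahead of the computation.
    """
    length = 1
    for extent in shape[len(prefix):]:
        length *= extent
    padded = list(prefix) + [0] * (len(shape) - len(prefix))
    return offset(padded, shape), length


def transposed_offset(index, shape):
    """Offset of element `index` of the transpose, without building it."""
    return offset(index[::-1], shape)


shape = (2, 3, 4)
flat_offset = offset((1, 2, 3), shape)
block = contiguous_block(shape, (1, 2))
# Element (2, 1) of the transpose of a 2 x 3 array is element (1, 2).
elem = transposed_offset((2, 1), (2, 3))
print(flat_offset, block, elem)
```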
What, now?
Stay tuned, try out
• https://github.com/tensorflow/
• Distributed TensorFlow 2/26/2016
• uses gRPC http://www.grpc.io/
• TensorFlow Serving 2/16/2016
• model lifecycle management
• Dagstuhl perspectives: Tensor Computing for IoT
• intuitive handling of tensor operations, optimizations
• deterministic placement and scheduling
• applications in cyber-physical systems
• reference implementations, evaluations & publications
• Embedded Multicore Building Blocks EMB2 https://github.com/siemens/embb
• Eigen Tensor Module
https://bitbucket.org/eigen/eigen/src/265a621240a21b201cc9e73cffc1021e12e6fc93/unsupported/Eigen/CXX11/src/Tensor/?at=default
The future of embedded computing is being built now
– starting at the processor level
“Neo – The tiny chip that could disrupt exascale computing”
Raspberry Pi Zero: 1 GHz Linux computer for $5
http://www.nvidia.com/object/embedded-systems.html
http://rexcomputing.com/REX_OCPSummit2015.pdf
http://www.nextplatform.com/2015/03/12/the-little-chip-that-could-disrupt-exascale-computing/
https://medium.com/software-is-eating-the-world/what-s-next-in-computing-e54b870b80cc#.r6k84z51m
