Dl 0n mobile jeff shomaker_jan-2018_final

Presentation at Global AI Conference
Santa Clara, CA
1-17-18
Jeff Shomaker
Founder, 21 SP, Inc.
Deep Learning on
Mobile Devices

21 SP, Inc.
Proprietary and Confidential
Introduction
• Neural network (NN) software will increasingly be available on phones,
watches, drones, sensors and other devices.
• Talk will cover:
– AI-Capable Smartphones
– Deep Learning (DL) on mobile devices
• Hardware
• Software
– Future applications
• Wireless communications
• Unmanned aerial vehicles (UAVs)
• Virtual reality
• Internet of Things (IoT)

AI-Capable Smartphone Forecast 1)
Future smartphones will use specialized AI-capable chips

2018 Smartphones with AI 2)
– Samsung S8/S8 Plus/Note 8
• Bixby is their AI-based assistant that can learn the user’s behavior and
then act on that information.
– Honor View 10 (Huawei)
• This is the most sophisticated of the AI-based phones
• Their system allows: 1) secure face unlocking, 2) camera benefits, 3)
translations, 4) enhanced automation, and 5) voice assistance.
– Apple iPhone X
• Has sophisticated Face ID for security.
• Improved voice recognition for Siri.
– Google Pixel 2/Pixel 2 XL
• Improved imaging capabilities on cameras
• Improved Google Assistant language understanding.

AI-Capable Smartphone Chipmakers 1)
– Apple is a powerful player with its A11 Bionic chip. The company
expects its facial recognition technology to drive adoption of AI on
smartphones.
• Apple is forecast to have the strongest sales.
– Qualcomm will likely capture the second-highest sales volume.
• Qualcomm’s processors make up 40 percent of the Android
mobile market.
• In 2017, they released an SDK for their Neural Processing
Engine so developers can run NNs on their processors.
– Huawei is an emerging participant and in 2017 stated that its phones
will include a dedicated Neural Processing Unit on the Kirin 970
chip.

Nvidia Wants AI on the Edge 3)
– Nvidia’s graphics processing units (GPUs) have been driving a lot
of the growth in deep learning.
– The company is now focusing on embedding AI technology into
edge devices such as security cameras and drones.
– According to a company VP, there are four reasons to shift
processing from the cloud to local devices:
• Bandwidth – capacity to send data from an estimated 1 billion security
cameras in 2020 is unlikely to be available.
• Latency – Some applications require extremely fast decisions. Self-
driving cars are one example.
• Privacy – Transferring personal information can increase security risk.
• Availability – In many parts of the world, cloud availability is limited or
intermittent. Emergency services need 100% availability, for example.

Nvidia’s Jetson TX2 Platform 3)
– In 2017, Nvidia announced its new credit-card-sized module for
embedded AI-based computing.
– The unit runs twice as fast as the TX1, per Nvidia.
– At an event in San Francisco, the following companies stated they will
be using the new AI processing chip on edge devices:
• Cisco will use it on a new device that, for example, will recognize
people speaking at a meeting and focus the camera on them.
• Artec will use it to remove the cloud link that its earlier products
relied on, creating 3D scanning images in real time at the edge.
• Teal Drones is expected to ship a $1,300 drone that can react to what
it sees. One possible use is for counting cattle on large farms.
• EnRoute says the new TX2 will allow their Zion drone to fly faster while
still avoiding collisions.

Squeezing DL on Mobile Devices 4)
– “The blending of learning algorithms and mobile computing taking place
today is only the beginning.”
– The following commercial entities are making frameworks, tools and
libraries available:
• Google (TensorFlow), Facebook (Caffe2), Qualcomm (SNPE) and
ARM (Compute Library) are examples.
– Even assuming the models will only be executed on the device (with all
training done off-device), there are still three challenges to running
sophisticated NNs on resource-constrained devices:
• Limited memory, limited computational power and unusually long
inference time.
– A lot of progress has been made in the last eighteen months on phones,
watches and sensors. In time, these devices may be able to handle
control and decision activities, as well as other logic-based tasks where
the DL will need to learn and adapt to complexity.

Neural Networks 5)
– Neural networks are a paradigm for processing information loosely
based on the idea of neurons that communicate information in the brain
and spinal cord. 6)
– Deep NNs (DNNs) and convolutional NNs (CNNs) “… routinely are
composed of thousands of interconnected units, and millions of
parameters.” 18)

Shrinking Software: Quantization 7)
– Current deep neural networks are usually big cloud-based
structures so it is difficult to run them on mobile and in-sensor
devices.
– One way to address this is with special-purpose chips. A second
way is to create new, smaller representations of deep NNs.
– To accomplish this, a representation “… must encode both the
connectivity pattern of synapses and the weights of those synapses.”
– In this paper, the authors use quantization theory to develop an
approach that, in certain circumstances, allows them to calculate the
optimal quantizer.
– This approach can allow them to quantize (i.e., shrink) learned
weights so deep NN models can more easily fit on resource-
constrained devices.

TensorFlow’s Eight-Bit Quantization 8)
– Earlier NNs used floating-point arithmetic, but today more
efficient inference is needed.
– Quantization is used to describe methods that “…store numbers
and perform calculations on them in more compact formats than
32-bit floating point.”
– By storing only minimum and maximum values and then changing
float values to eight-bit integers, it is possible to reduce a file size
by 75%.
– Using eight-bit arithmetic reduces power consumption and shortens the
time to process an NN. These characteristics will, in turn, make it
easier to bring intelligent products to the IoT market.
– The TensorFlow team “… found that we can get extremely good
performance on mobile and embedded devices by using eight-bit
arithmetic rather than floating point.”
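The min/max scheme described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not TensorFlow's implementation: the function names and sample values are assumptions, and real toolchains handle rounding and per-layer ranges with more care.

```python
def quantize_uint8(values):
    """Map floats linearly onto 0..255, keeping only the min and max as floats."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant input
    codes = bytes(round((v - lo) / scale) for v in values)
    return codes, lo, hi


def dequantize_uint8(codes, lo, hi):
    """Recover approximate floats from the eight-bit codes."""
    scale = (hi - lo) / 255
    return [lo + c * scale for c in codes]


# Hypothetical weights for illustration.
weights = [-1.0, -0.5, 0.0, 0.25, 0.5, 1.0, 0.75, -0.25]
codes, lo, hi = quantize_uint8(weights)
restored = dequantize_uint8(codes, lo, hi)

float_bytes = 4 * len(weights)          # 32-bit floats: 4 bytes each
int8_bytes = len(codes)                 # eight-bit codes: 1 byte each
saving = 1 - int8_bytes / float_bytes   # ignores the two stored floats
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Going from 4 bytes to 1 byte per value is where the 75% reduction comes from. The price is a rounding error of at most about half a quantization step per value.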

DeepRebirth: An Acceleration Framework 9)
– Quantization is one way to shrink NN models.
– In this paper, the authors develop an approach called DeepRebirth
that they believe speeds up processing much more than other techniques,
such as quantization.
– DeepRebirth reduces the number of layers by merging the “…
parameter-free layers with their neighbor convolutional layers to a
single dense layer.”
– The authors propose two types of merging:
• Streaming merging – layers in a hierarchy are merged and a new
Rebirth Layer is created (processing time reduced: 154 ms to 17 ms).
• Branch merging – parallel branches at the same level are merged
(processing time reduced: 56 ms to 21 ms).
– Experiments were done on a Samsung Galaxy S5 smartphone.

ProjectionNet: A Two NN Architecture 10)
– The author proposes an approach that uses two NNs.
– “The two networks are trained jointly using backpropagation, where
the projection network learns from the full network similar to
apprenticeship learning.”
– The approach allows distributed training, and the resulting
projection network is sized to fit on devices, where it runs with lower
memory and computation costs.
– This method differs from others (e.g., weight quantization) because
here the operations and intermediate representations (i.e., hidden
units) are what is made smaller.
– Experimental Results:
• Handwriting recognition (MNIST) – 92.3% precision with 388x
compression ratio.
• Image classification (CIFAR-100, 50K color images) – 17.8% precision
with 70x compression ratio.
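One way to picture joint training is a single objective that sums three terms: a loss for the full network, a loss for the projection network against the true label, and a loss that pulls the projection network toward the full network's predictions. The sketch below computes such a combined loss for one example. The weighting factors, function names and probabilities are assumptions for illustration and are not taken from the paper.

```python
import math


def cross_entropy(target, predicted):
    """Cross-entropy between a target distribution and a predicted distribution."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)


def joint_loss(label, full_probs, proj_probs, w_full=1.0, w_mimic=0.1, w_proj=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    full_term = cross_entropy(label, full_probs)        # full network vs. label
    mimic_term = cross_entropy(full_probs, proj_probs)  # projection vs. full network
    proj_term = cross_entropy(label, proj_probs)        # projection vs. label
    return w_full * full_term + w_mimic * mimic_term + w_proj * proj_term


label = [0.0, 1.0, 0.0]            # one-hot ground truth
full_probs = [0.05, 0.90, 0.05]    # output of the large network
proj_probs = [0.20, 0.60, 0.20]    # output of the small projection network
loss = joint_loss(label, full_probs, proj_probs)
```

Because all three terms are differentiable in the network outputs, a single backpropagation pass can update both networks. Only the small projection network would then be shipped to the device.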

CoINF: An Offloading Framework 11)
– CoINF is a proposed DL framework that allows wearables,
such as smartwatches and smart glasses, to work with
smartphones when making inferences.
– Wearables can capture a wide range of data. Examples include a
person’s pulse, physical gestures, fitness metrics, and eye tracking.
– Since wearables are extremely resource limited, the authors
developed a system that offloads DL computation from the
wearable to a local mobile device.
– The system uses TensorFlow and runs on Android for
handhelds and on Android Wear for wearables.
– Experiments show favorable results compared with wearable-only
and handheld-only execution:
• 15.9X to 23.0X execution speedup
• 81.3% to 85.5% energy savings
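The offloading trade-off can be illustrated with a simple latency model: run the model on the wearable, or pay a transfer cost and run it on the faster handheld. This is a hedged sketch of the general idea only. The cost model, parameter names and numbers are assumptions and do not describe CoINF's actual scheduler.

```python
def local_latency(ops, wearable_ops_per_s):
    """Seconds to run the whole model on the wearable."""
    return ops / wearable_ops_per_s


def offload_latency(ops, input_bytes, link_bytes_per_s, handheld_ops_per_s):
    """Seconds to ship the input to the handheld and run the model there."""
    return input_bytes / link_bytes_per_s + ops / handheld_ops_per_s


def should_offload(ops, input_bytes, wearable_ops_per_s,
                   handheld_ops_per_s, link_bytes_per_s):
    """Offload only when it is expected to finish sooner than running locally."""
    remote = offload_latency(ops, input_bytes, link_bytes_per_s, handheld_ops_per_s)
    return remote < local_latency(ops, wearable_ops_per_s)


# Hypothetical workload: 2e9 operations on a 150 KB input frame.
decision = should_offload(
    ops=2e9,
    input_bytes=150_000,
    wearable_ops_per_s=1e9,     # assumed smartwatch throughput
    handheld_ops_per_s=2e10,    # assumed smartphone throughput
    link_bytes_per_s=250_000,   # assumed Bluetooth-class link
)
```

In this example the wearable would need about 2 seconds, while offloading takes roughly 0.7 seconds, so the model is sent to the handheld. A very small model over the same link would stay on the wearable because the transfer time dominates.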

NNs in Wireless Communications 12)
– In the wireless communications arena, NNs provide the technology
for two types of applications.
– First, NNs are useful for data analysis, predicting the future and
making inferences.
• In these circumstances, systems can capture data from user behavior,
environmental metrics and other information.
• NNs can then take action based on this information.
– Second, by implementing AI at a network’s edge, self-organizing
network operations can be enabled.
• Edge computing will include placing processors on many components,
including base stations as well as user devices.
• Examples of self-organizing operations include resource management,
data offloading and user association.

NNs in Unmanned Aerial Vehicles (UAVs) 12)
– Work is progressing on using UAVs to complement terrestrial
communications networks by providing wireless service to end
users.
– This technology is likely to be used in post-5G cellular networks. 13)
– There are two ways that NNs are likely to be helpful to UAV-based
wireless communications.
– First, reinforcement learning (RL) NNs can “… dynamically adjust
their locations, flying directions, resource allocation decisions, and
path planning to serve their ground users and adapt to the users’
dynamic environment.”
– Second, the ground environment can be mapped by UAVs and
then NNs can make predictions about the user’s behavior. For
example, NNs can predict where the user is moving to.
13) 5G is expected to be available by about 2020, drawing on many possible
technologies, and to allow an HD movie to be downloaded in under 1 second.

NNs for Wireless Virtual Reality (VR) 12)
– Virtual reality is expected to allow users to “… experience and
interact with a wealth of virtual and immersive environments
through a first-person view.”
– The technology allows for surround sound and 360-degree vision.
– NNs are expected to help address problems with VR.
– First, since NNs can predict a user’s movements (e.g., head
direction), waste of capacity-limited bandwidth can be avoided.
• Based on the viewing direction, only the desired image will be
displayed, as opposed to the entire 360-degree view.
– Second, cellular networks can have varying quality. NNs can drive
VR image adjustment based on network quality.
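The bandwidth saving from view prediction can be illustrated by splitting the 360-degree panorama into vertical tiles and sending only those that overlap the predicted field of view. The tile count, field of view and function name are assumptions for this sketch. In the setting described above, the predicted yaw would come from an NN.

```python
def visible_tiles(yaw_deg, fov_deg=90, tile_count=12):
    """Return indices of the horizontal tiles that overlap the field of view."""
    tile_width = 360 / tile_count
    start = yaw_deg - fov_deg / 2
    end = yaw_deg + fov_deg / 2
    first = int(start // tile_width)
    # Subtract a tiny epsilon so a view ending exactly on a tile boundary
    # does not pull in the next tile.
    last = int((end - 1e-9) // tile_width)
    # Wrap around so a view straddling 0 degrees still maps to valid tiles.
    return sorted({i % tile_count for i in range(first, last + 1)})


tiles = visible_tiles(yaw_deg=10)      # predicted head direction, in degrees
fraction_sent = len(tiles) / 12        # share of the panorama transmitted
```

With a 90-degree field of view and twelve 30-degree tiles, a view centered at 10 degrees touches four tiles, so only about a third of the panorama needs to be transmitted.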

NNs for the Internet of Things (IoT) 12)
– “In the foreseeable future, it is envisioned that trillions of machine-
type devices such as wearables, sensors, connected vehicles, or
mundane objects will be connected to the Internet, forming a
massive IoT ecosystem .…”
– Smart services are expected to be provided, but massive
connectivity can create bottlenecks.
– NNs are likely to help IoT systems in four ways:
• NNs can use big data analytics to compress massive amounts of data.
• Utilizing user and wireless environmental data, RL NNs can self-
organize and, for example, switch frequencies to optimize
communication.
• NNs can analyze sensor data for immediate or future use, perhaps at
off-peak hours.
• Since NNs can predict user behavior, intelligent services may be
offered when, for example, a user leaves work.

McKinsey on the IoT 14), 15)
– “The Internet of Things (IoT) … is transforming how we live and
work.”
– It is already providing the following products and services:
• Soil moisture and nutrient data are being transmitted from farms to
experts at distant locations.
• Homeowners are benefiting from IoT-based security systems that
employ very long-lasting batteries.
• It is now standard for production-line sensors to notify factory managers
of conditions on the floor.
– This technology is expected to drive economic benefits of $4 to $11
trillion by 2025.
– However, McKinsey’s research suggests that benefits and widespread
adoption of IoT applications could materialize more slowly
than many people think.

Hype vs Reality Panel 16)
– Many predictions about self-driving cars and drones are irrational.
• “It turns out that much of what appears in mainstream media about
self-driving cars is very much overstated, said Kumar. Fully
autonomous cars are many years away, in his view.”
– DL has constraints as well, per Fung.
• “… A deep-learning algorithm that can just do speech recognition,
which is translating what you are saying, has to be trained on millions
of hours of data and uses huge data farms …. And while a deep-learning
network might have hundreds of thousands of neurons, the
human brain has trillions.”
– At this point, computers can only do narrow tasks.
– Work is underway, however, on “affective computing” (i.e., allowing
machines to pick up on our voices, body language, etc.).
– This type of communication is extremely hard to achieve, however.

Appendix

AI Technologies and Applications 17)
Source: Boston Consulting Group, September 2017.

End Notes
• 1) Rayna Hollander (2017 Oct 25). Apple Drives Native AI Adoption in Smartphones. Business
Insider, www.businessinsider.com.
• 2) Karen Bajai (2017 Dec 11). 4 Smartphones with Artificial Intelligence to Look for in 2018. The
Economic Times, m.economictimes.com.
• 3) Tekla S. Perry (2017 Mar). Nvidia Wants AI to Get Out of the Cloud and into the Camera, Drone,
or Other Gadget Near You. IEEE Spectrum, spectrum.ieee.org.
• 4) Nicholas D. Lane, et al (2017 Jul-Sep). Squeezing Deep Learning into Mobile and Embedded
Devices. Pervasive Computing, ieeexplore.ieee.org.
• 5) S. Raschka (2016). What is the Difference Between Deep Learning and Regular Machine
Learning, Kdnuggets, www.kdnuggets.com.
• 6) Geoffrey Hinton, et al (2012 Oct). Neural Networks for Machine Learning course. U of Toronto,
Coursera.com, accessed 2013.
• 7) Avhishek Chatterjee, et al (2017 Aug 15). Towards Optimal Quantization of Neural Networks.
Information Theory (ISIT), 2017 IEEE International Symposium.
• 8) How to Quantize Neural Networks (2017 Nov 2 Update). TensorFlow, www.tensorflow.org,
accessed Dec 28, 2017.
• 9) Dawei Li, et al (2017 Aug 16). DeepRebirth: A General Approach for Accelerating Deep Neural
Network Execution on Mobile Devices. Under review conference paper, arXiv:1708.04728 [cs.CV],
1708.04728.pdf, accessed Dec 27, 2017.

End Notes (cont.)
• 10) Sujith Ravi (2017 Aug 9). ProjectionNet: Learning Efficient On-Device Deep Networks using
Neural Projections. arXiv:1708.00630v2 [cs.LG].
• 11) Mengwei Xu, et al (2017 Dec 1). Enabling Cooperative Inference of Deep Learning on Wearables
and Smartphones. arXiv:1712.03073v1 [cs.CY].
• 12) Mingzhe Chen, et al (2017 Oct 9). Machine Learning for Wireless Networks with Artificial
Intelligence: A Tutorial on Neural Networks. Preprint, arXiv:1710.02913v1 [cs.IT], 1710.02913.pdf, 1-
98. Accessed Nov 1, 2017.
• 13) Amy Nordrum, et al (2017 Jan 27). Everything You Need to Know About 5G. IEEE Spectrum,
spectrum.ieee.org.
• 14) Mark Patel, et al (2017 May). What’s New with the Internet of Things? McKinsey & Company,
www.mckinsey.com.
• 15) Daniel Alsen, et al (2017 Nov). The Future of Connectivity: Enabling the Internet of Things.
McKinsey & Company, www.mckinsey.com.
• 16) (2017 Jul 14). The Future of Artificial Intelligence: Why the Hype Has Outrun Reality. Panel
discussion titled Engineering the Future of Business. Knowledge at Wharton,
knowledge.wharton.upenn.edu.
• 17) Frank Felden, et al (2017 Oct 6). Time to Double Down on AI and Robotics. Boston Consulting
Group. www.bcg.com.
• 18) Nicholas D. Lane, et al (2016 Nov-Dec). DXTK: Enabling Resource-efficient Deep Learning on
Mobile and Embedded Devices with the DeepX Toolkit. MobiCASE 2016 Proceedings of the 8th
EAI Intl Conference on Mobile Computing, Applications and Services, dl.acm.org, 98-107.

Contacts
• Jeff Shomaker – Founder/President 21 SP, Inc.
–jshomaker@21spinc.com
–www.21spinc.com
–650-302-7491
• 21 SP, Inc. is a small privately held startup developing and marketing
expert systems-based decision support software to use in genetic-
based personalized medicine. The company's mission is to create
tools that will reduce the use of traditional trial-and-error medicine by
using pharmacogenetics and other evidence-based data, such as the
results of high quality clinical trials, in the medical clinic.