Understanding the
Characteristics of
Android Wear OS
Renju Liu, Felix Xiaozhu Lin
Purdue ECE
Presentation By: Pratik Jain
Motivation
 Interactive wearables, like smart watches, are a newcomer to the
spectrum of mobile computers.
 They integrate computing even more tightly with our daily lives.
 Substantial increase in demand for smart watches.
Usage
Patterns
&
Device
Hardware
 Users interact with wearable devices frequently throughout the
day
 Each interaction is short ( < 10s ), and is dedicated to a simple task
 Due to the limited content that can be displayed on one screen,
users spend a short time on one screen before switching to the
next.
 Tiny battery capacity (200–400 mAh)
 Slower CPU – Fewer cores
 Simpler CPU – Scaled-down but often architecturally identical to
handheld’s CPU
Android Wear OS
 One of the most popular OSes for interactive wearables.
 Wearable OS with the most public information.
 Supports third-party applications and features a redesigned system
UI, including Cards for notifications, context streams, and voice
input.
 Apps – renovated UI – follow Android’s conventional
programming paradigm – written in Java – compiled ahead-of-time –
executed atop the managed Android Runtime.
 Major OS components –
 System Server – Key daemon hosting the core OS services
 Surface Flinger – Daemon controlling UI animation
 Clockwork – OS shell that implements the system UI
Benchmark
Scenarios
 A benchmark suite that consists of 15 benchmarks falling into the
following 4 categories:
1. Wakeup – Due to internal or external events, the device transitions out
of suspended mode and presents brief information. Because wakeups happen
frequently throughout the day, energy efficiency is the most important metric.
2. Single input – A waking wearable device responds to a single input
from the user. Because the user is waiting, the device needs to achieve
low UI latency.
3. Continuous interaction – Users are interacting with the device
continuously. The resultant UI animation requires the device to
produce a steady stream of graphic frames.
4. Sensing – For the execution of wearable apps, sensor data is sampled
and processed periodically to collect context information.
METHODOLOGY
Experimental
Setup
 All the benchmarks are run on 2 state-of-the-art Android Wear
devices
 LG Watch R
 Samsung Gear Live
 Qualcomm’s APQ8026 system-on-chip
 Android Wear 5.0 “Lollipop”
Power
Measurement
 Batteries have tiny contacts which are incompatible with
commodity power monitors.
 A compatible interface circuit is carved out from a smartphone
battery.
 Used the interface as an adapter between the smart watch and an
external power monitor.
[Figure: the battery interface carved out from a Nexus 5]
[Figure: the interface (flipped) connected to the LG Watch R]
Toolset
 Used the following to examine system behaviors at different levels
and granularities
 Systrace – for capturing global system events such as scheduling, I/O
activities, and IPC
 Android Runtime’s built-in function tracer – for recording function
call history in individual processes
 Linux perf – for sampling CPU performance counters.
Tackling
profiling
overhead
 Event tracing – major profiling overhead
 Memory overhead can be overwhelming in tracing function
invocations.
 Two techniques are used to tackle the overhead:
 In quantifying global system behaviors, the paper relies only on
system events. Function traces are collected in separate, extra runs.
 In quantifying function-level activities, the paper deducts a constant
overhead of 4 µs from each traced function invocation.
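The constant-overhead deduction can be illustrated with a minimal sketch. The 4 µs figure comes from the slide; the function name, the sample durations, and the choice to clamp at zero are assumptions made for illustration only, not the paper's tooling.

```python
# Constant per-invocation tracing overhead, as stated on the slide.
TRACE_OVERHEAD_US = 4.0


def deduct_overhead(durations_us, overhead_us=TRACE_OVERHEAD_US):
    """Subtract a constant tracing overhead from each measured duration.

    Clamping at zero is an assumption of this sketch: a very short call
    may measure less than the overhead itself.
    """
    return [max(0.0, d - overhead_us) for d in durations_us]


# Hypothetical measured durations (µs) of four traced invocations.
measured = [12.0, 4.5, 3.0, 100.0]
corrected = deduct_overhead(measured)
print(corrected)  # [8.0, 0.5, 0.0, 96.0]
```

A flat deduction like this treats every invocation independently; it does not account for tracing overhead accumulated inside nested callees.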
CPU Usage
 CPU usage is collected at two granularities
 Task-level breakdown. An analyzer is built to identify the tasks.
 Function-level breakdown. To further locate the performance
hot spots in System Server, the following 2 metrics are employed:
 Exclusive CPU cycles are spent in the function’s own code
 Inclusive CPU cycles are spent in the function’s code as well as
in all subroutines being called
 Both metrics include the time spent in both user and kernel spaces
and do not cover the time when a task is off CPU due to being
scheduled out.
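The difference between the two metrics can be sketched on a small call tree. This is an illustrative sketch only: the `Call` structure, the function names, and the cycle counts are hypothetical, and the timestamps are assumed to already exclude off-CPU time.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Call:
    """One traced invocation; start/end are on-CPU cycle timestamps."""
    name: str
    start: int
    end: int
    children: list = field(default_factory=list)


def accumulate(call, inclusive, exclusive):
    # Inclusive: everything between entry and exit, callees included.
    incl = call.end - call.start
    # Exclusive: inclusive cycles minus the cycles spent in direct callees.
    in_children = sum(c.end - c.start for c in call.children)
    inclusive[call.name] += incl
    exclusive[call.name] += incl - in_children
    for child in call.children:
        accumulate(child, inclusive, exclusive)


def profile(root):
    inclusive, exclusive = defaultdict(int), defaultdict(int)
    accumulate(root, inclusive, exclusive)
    return dict(inclusive), dict(exclusive)


# Hypothetical trace: handleMessage calls layout (which calls measure) and draw.
trace = Call("handleMessage", 0, 100, [
    Call("layout", 10, 60, [Call("measure", 20, 40)]),
    Call("draw", 70, 90),
])
inclusive, exclusive = profile(trace)
print(inclusive)  # {'handleMessage': 100, 'layout': 50, 'measure': 20, 'draw': 20}
print(exclusive)  # {'handleMessage': 30, 'layout': 30, 'measure': 20, 'draw': 20}
```

Exclusive cycles sum to the root's inclusive total, which makes them suitable for ranking hot functions. Inclusive totals would double count under recursion, which this sketch does not handle.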
Idle Time
Analysis
 The number and duration of the observed idle episodes are unusual.
 Some idle episodes match system events known to cause idle, e.g.
I/O and power management.
 Others are often rooted in OS services stalling while serving apps’ requests.
 IdleChecker is an analyzer that helps map anomalous idle episodes
to the responsible code regions, based on a simple rationale:
the function calls and IPC transactions spanning an anomalous idle
episode are suspicious.
 IdleChecker runs the following steps for each idle episode.
 Identifies suspicious app tasks that are blocked throughout the entire
idle episode but run after the episode.
 For each suspicious task, it identifies two suspicious CPU time quanta:
the one right before the idle episode and the one right after it.
 Examines the suspicious quanta, looking for IPC transactions spanning
across the idle episode.
 Identifies the function invocations that either coincide with the IPC
or span across the idle episode.
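The four steps above can be sketched roughly as follows. This is a simplified illustration, not IdleChecker itself: the tuple-based trace format, the task and function names, and the reading of "coincide" as "overlap in time" are all assumptions.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share any time."""
    return a_start < b_end and b_start < a_end


def idle_check(idle, quanta, ipcs, calls):
    """idle: (start, end) of one idle episode.
    quanta: task -> list of (start, end) on-CPU intervals.
    ipcs, calls: lists of (task, name, start, end).
    """
    idle_start, idle_end = idle
    report = {}
    for task, runs in quanta.items():
        # Step 1: the task must be off CPU for the whole episode and run after it.
        if any(overlaps(s, e, idle_start, idle_end) for s, e in runs):
            continue
        before = [r for r in runs if r[1] <= idle_start]
        after = [r for r in runs if r[0] >= idle_end]
        if not before or not after:
            continue
        # Step 2: the quanta right before and right after the episode.
        q_before, q_after = max(before), min(after)
        # Step 3: IPC transactions that start in one quantum and end in the other.
        sus_ipcs = [
            (name, s, e) for t, name, s, e in ipcs
            if t == task
            and q_before[0] <= s <= q_before[1]
            and q_after[0] <= e <= q_after[1]
        ]
        # Step 4: calls that span the episode or overlap a suspicious IPC.
        sus_calls = [
            name for t, name, s, e in calls
            if t == task and (
                (s <= idle_start and e >= idle_end)
                or any(overlaps(s, e, i_s, i_e) for _, i_s, i_e in sus_ipcs)
            )
        ]
        report[task] = {"ipcs": [n for n, _, _ in sus_ipcs], "calls": sus_calls}
    return report


# Hypothetical trace around an idle episode lasting from t=100 to t=200.
quanta = {"app": [(80, 100), (200, 220)], "logger": [(50, 60)]}
ipcs = [("app", "binder_call", 95, 205)]
calls = [("app", "onDraw", 90, 210), ("app", "helper", 82, 88)]
report = idle_check((100, 200), quanta, ipcs, calls)
print(report)  # {'app': {'ipcs': ['binder_call'], 'calls': ['onDraw']}}
```

In this toy trace, `logger` is dropped because it never runs after the episode, and `helper` is dropped because it returns before the suspicious IPC begins.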
Thread-level
Parallelism
 Metric widely used for gauging an interactive system’s need for
core count.
 Average number of busy CPU cores during the non-idle time.
$$\mathrm{TLP} = \frac{\sum_{i=1}^{n} i \cdot c_i}{1 - c_0}$$
 $c_0$ – fraction of time when no threads are running
 $c_i$ – fraction of time when exactly $i$ threads are running simultaneously
 $n$ – number of cores available
 For measurement, all 4 cores are forced online
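As a worked example of the TLP metric (average number of busy cores during non-idle time), here is a minimal sketch. The time breakdown is made up for illustration and is not a measurement from the paper. Working with raw times instead of fractions gives the same ratio, because the total time cancels out.

```python
def tlp(time_with_i_busy):
    """Thread-level parallelism from a time breakdown.

    time_with_i_busy[i] is the time during which exactly i cores are busy,
    for i = 0..n. Dividing numerator and denominator by the total time gives
    the usual form with fractions c_i: sum(i * c_i) / (1 - c_0).
    """
    idle = time_with_i_busy[0]
    non_idle = sum(time_with_i_busy) - idle
    if non_idle == 0:
        raise ValueError("no non-idle time: TLP is undefined")
    weighted = sum(i * t for i, t in enumerate(time_with_i_busy))
    return weighted / non_idle


# Hypothetical 100 ms window on a 4-core device:
# 50 ms idle, 30 ms with 1 core busy, 15 ms with 2, 4 ms with 3, 1 ms with 4.
breakdown_ms = [50, 30, 15, 4, 1]
result = tlp(breakdown_ms)
print(result)  # 1.52
```

A TLP of 1.0 would mean that only one core is ever busy at a time; values well above 1 indicate that extra cores are actually used during non-idle periods.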
Microarchitectural
behaviors
 Microarchitecture design is a Mystery
 Using Linux perf, the paper samples the performance
counters of the Cortex-A7 CPU on the test devices.
 It observes branch prediction, cache, and TLB behavior in all benchmarks
RESULTS
Where do CPU
cycles go?
 Intensive OS execution often dominates the global CPU usage.
 Many costly OS services are likely to make software unnecessarily
complicated
 The CPU time distribution of hot functions is highly skewed.
 Manipulating basic data structures consumes substantial CPU
cycles.
 Legacy OS functions may become serious performance
bottlenecks
 OS Execution Bottlenecks
 setLight(), Layout(), computeOom(), getSimpleName()
Idle Episodes
 Plentiful and of a variety of lengths
 Improper OS Designs
 Interference from voice UI
 Legacy support for device suspending
 Performance overprovisioning during continuous interaction
 Design Implications
 Hunting OS inefficiencies
 Filling idle time with useful work
 Reducing CPU and GPU clock rates, which will shrink idle episodes
 Predictive execution
Thread-level
parallelism
 Short interactions exhibit substantial TLP, which is on par with
desktop workloads.
 While apps are mostly single-threaded, OS daemons contribute to
TLP significantly.
 A wearable device needs at least two cores.
Microarchitectural
behaviors
 A significant mismatch exists between the OS and CPU
microarchitecture, particularly in L1 icache, iTLB, and branch
predictor.
 The mismatch is largely due to the OS code complexity, and will
not be eliminated by a unilateral enhancement of wearable CPU.
 OS should be trimmed down to match the simplicity of its apps.
Related Work
 Gao et al. find that smartphone workloads show limited TLP,
concluding that they need no more than two cores.
 ProfileDroid contributes an approach for characterizing smartphone
apps at multiple layers
 Min et al. study the battery usage of smart watches
 WearDrive creates synthetic benchmarks to shed light on
wearable storage.
 RisQ and TypingRing target gesture recognition
 iShadow tracks gaze in real time
 Ha et al. build a wearable system for cognitive assistance
 Cornelius et al. focus on user identification
Recap
 In-depth analysis of one of the most popular wearable OSes,
Android Wear.
 Examination of 4 key aspects: CPU usage, idle episodes, TLP, and
micro-architectural behaviors – in fifteen benchmarks.
 Discovery of serious OS inefficiencies and system bottlenecks that
were widespread but unknown before.
 The results clearly point out the system bottlenecks for immediate
optimization and have strong implications on future wearable
system software and hardware design.
THANK YOU!
