SNAPDRAGON SOC FAMILY AND ARM ARCHITECTURE
Presented by : Abdulaziz Tagawy
Course : EC 681
Information
 Book(s)
 COMPUTER ORGANIZATION AND ARCHITECTURE: DESIGNING FOR PERFORMANCE, NINTH EDITION
 Addison Wesley - ARM System-on-Chip Architecture, 2nd Ed.
 Web Materials
 http://en.wikipedia.org/wiki/Qualcomm_Snapdragon
 http://www.arm.com/products/processors/cortex-a/index.php
 http://www.tomshardware.com/reviews/snapdragon-810-benchmarks,4053-2.html
Qualcomm Snapdragon
 Introduction
 Qualcomm Incorporated is an
American global semiconductor
company that designs and markets
wireless telecommunications
products and services.
 Snapdragon is a family of mobile
systems on a chip (SoC) by
Qualcomm. Qualcomm considers
Snapdragon a "platform" for use in
smartphones, tablets, and
smartbook devices.
 The original Snapdragon CPU,
dubbed Scorpion, is Qualcomm's
own design. It has many features
similar to those of the ARM Cortex-A8
core and is based on the
ARMv7 instruction set.
 The successor to Scorpion, found in
S4 Snapdragon SoCs, is named Krait
and has many similarities with the
ARM Cortex-A15 CPU and is also
based on the ARMv7 instruction set.
 The majority of Snapdragon
processors contain the circuitry to
decode high-definition (HD) video
at 720p or 1080p resolution,
depending on the Snapdragon chip.
 Adreno, the company's proprietary
GPU series integrated into
Snapdragon chips, is Qualcomm's own
design.
 All Snapdragons feature one or more
DSPs called Hexagon. The multimedia
Hexagons are mostly used for audio
encoding/decoding; the newer Snapdragons
have a hardware block called Venus for video
encoding/decoding.
Qualcomm Snapdragon
 Scorpion (CPU)
 Scorpion is a central
processing unit (CPU) core
designed by Qualcomm for
use in their Snapdragon
mobile systems on chips
(SoCs).
 It is designed in-house, but
has many architectural
similarities with the ARM
Cortex-A8 and Cortex-A9
CPU cores.
 Krait (CPU)
 Krait is an ARM-based
central processing unit
included in Qualcomm
Snapdragon S4 and
Snapdragon 400/600/800
(Krait 200, Krait 300, Krait
400 and Krait 450) systems on
chips.
 It was introduced in 2012 as a
successor to the Scorpion
CPU and has architectural
similarities to ARM Cortex-
A15.
Qualcomm Snapdragon
 Scorpion (CPU)
 10/12 stage integer pipeline with 2-way
decode and 3-way out-of-order
speculatively issued superscalar
execution
 Pipelined VFPv3 and 128-bit wide
NEON (SIMD)
 32 KB + 32 KB L1 cache
 256 KB (single-core) or 512 KB (dual-
core) L2 cache
 Single or dual-core configuration
 2.1 DMIPS/MHz
 Krait (CPU)
 11-stage integer pipeline with 3-way
decode and 4-way out-of-order
speculative issue superscalar execution
 Pipelined VFPv4 and 128-bit wide
NEON (SIMD)
 4 KB + 4 KB direct mapped L0 cache
 16 KB + 16 KB 4-way set associative
L1 cache
 1 MB 8-way set associative (dual-core)
or 2 MB (quad-core) L2 cache
 Dual or quad-core configurations
 Performance (DMIPS/MHz) Krait 200:
3.3
 Performance (DMIPS/MHz) Krait 300:
3.39
 Performance (DMIPS/MHz) Krait 450:
3.51
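
The DMIPS/MHz figures above only become a throughput estimate once they are multiplied by a clock frequency. The following is a minimal sketch of that arithmetic. The 1000 MHz clock used in the demo and the function name `estimated_dmips` are illustrative assumptions, not values from the slides.

```python
# DMIPS/MHz figures as quoted on the slide above.
DMIPS_PER_MHZ = {
    "Scorpion": 2.1,
    "Krait 200": 3.3,
    "Krait 300": 3.39,
    "Krait 450": 3.51,
}


def estimated_dmips(core: str, clock_mhz: float) -> float:
    """Per-core Dhrystone estimate: DMIPS/MHz multiplied by the clock in MHz."""
    return DMIPS_PER_MHZ[core] * clock_mhz


if __name__ == "__main__":
    # The 1000 MHz clock is an assumed, illustrative value so that the cores
    # are compared at the same frequency.
    for core in DMIPS_PER_MHZ:
        print(f"{core}: {estimated_dmips(core, 1000.0):.0f} DMIPS at 1000 MHz")
```

Comparing at a common clock isolates the per-cycle improvement from Scorpion to Krait; real devices also differ in clock speed and core count.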
Qualcomm Snapdragon
Semiconductor technology | CPU instruction set | CPU | CPU cache | GPU | Utilizing devices
45 nm | ARMv7 | Up to 1.2 GHz quad-core ARM Cortex-A5 | (not listed) | Adreno 203 (FWVGA/FWVGA) | HTC Desire 600
28 nm LP | ARMv7 | Up to 1.5 GHz quad-core Krait | L0: 4 KB + 4 KB, L1: 16 KB + 16 KB, L2: 2 MB | Adreno 320 (QXGA/1080p) at 400 MHz | Sony Xperia Z
Family : S4
Qualcomm Snapdragon
 4 KiB + 4 KiB L0 cache, 16 KiB + 16 KiB L1 cache and 2 MiB L2
cache
 4K × 2K UHD video capture and playback
 Up to 21 Megapixel, stereoscopic 3D dual image signal processor
 Adreno 330 GPU
 USB 2.0 and 3.0
 Display controller: MDP 5, 2 RGB, 2 VIG, 2 DMA, 4K
 Free Linux drivers exist for Qualcomm's Adreno GPU
 Free Linux drivers exist for the Qualcomm Atheros WNICs
 LLVM supports the Qualcomm Hexagon DSP
Family : 800 series
Qualcomm Snapdragon
 ARMv8-A (64-bit architecture)
 Dolby Atmos
 Directional microphone support
 Wi-Fi 802.11ac / 802.11ad
 Built-in Shazam app
 H.265/HEVC encoding/decoding
 eMMC 5.0 support
 Native 4K support
 Native Bluetooth 4.1 support
 14-bit dual ISP
 Support for triple-band Wi-Fi, i.e. IEEE 802.11n, IEEE 802.11ac and IEEE
802.11ad (60 GHz)
Family : 810
Qualcomm Snapdragon
 Architecture : (figure not reproduced)
Family : 810
ARM Architecture
 Introduction
 ARM is a family of RISC-based microprocessors and microcontrollers
designed by ARM Inc., Cambridge, England.
 The company doesn’t make processors but instead designs microprocessor
and multicore architectures and licenses them to manufacturers.
 ARM chips are high-speed processors that are known for their small die
size and low power requirements.
 ARM chips are the processors in Apple’s popular iPod and iPhone devices.
 The origins of ARM technology can be traced back to the British-based
Acorn Computers company.
 The Acorn RISC Machine became the Advanced RISC Machine.
 The company dropped the designation Advanced RISC Machine in the late
1990s. It is now simply known as the ARM architecture.
ARM Architecture
 According to the ARM Web site arm.com , ARM
processors are designed to meet the needs of
three system categories:
 Embedded real-time systems: Systems for
storage, automotive body and power-train,
industrial, and networking applications
 Application platforms: Devices running open
operating systems including Linux, Palm OS,
Symbian OS, and Windows CE in wireless,
consumer entertainment and digital imaging
applications
 Secure applications: Smart cards, SIM
cards, and payment terminals
Introduction
ARM Architecture
ARM CACHE ORGANIZATION
ARM Architecture
 The ARM7 models used a unified L1 cache, while all subsequent models use
a split instruction/data cache.
 All of the ARM designs use a set-associative cache, with the degree of
associativity and the line size varying.
 ARM cached cores with an MMU use a logical cache for processor families
ARM7 through ARM10, including the Intel StrongARM and Intel XScale
processors.
 The ARM11 family uses a physical cache.
 An interesting feature of the ARM architecture is the use of a small
first-in-first-out (FIFO) write buffer to enhance memory write performance.
 The write buffer is interposed between the cache and main memory and
consists of a set of addresses and a set of data words.
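
The write buffer described above can be pictured as a short queue of (address, data) pairs that drains to memory in arrival order. The following is a toy sketch of that idea only. The class name `WriteBuffer`, the default capacity of four entries, and the dictionary standing in for main memory are all assumptions made for illustration and do not describe any specific ARM core.

```python
from collections import deque


class WriteBuffer:
    """Toy FIFO write buffer between a cache and main memory (illustrative)."""

    def __init__(self, capacity: int = 4):  # capacity is an assumed value
        self.capacity = capacity
        self.entries = deque()  # (address, data) pairs, oldest first

    def write(self, address: int, data: int, memory: dict) -> None:
        """Queue a write; if the buffer is full, retire the oldest entry first."""
        if len(self.entries) == self.capacity:
            self.drain_one(memory)  # models the stall while memory catches up
        self.entries.append((address, data))

    def drain_one(self, memory: dict) -> None:
        """Retire the oldest pending write to main memory."""
        if self.entries:
            address, data = self.entries.popleft()
            memory[address] = data

    def drain_all(self, memory: dict) -> None:
        """Retire every pending write, preserving arrival order."""
        while self.entries:
            self.drain_one(memory)


if __name__ == "__main__":
    memory = {}
    buffer = WriteBuffer(capacity=2)
    buffer.write(0x10, 1, memory)
    buffer.write(0x14, 2, memory)   # both writes are still pending here
    buffer.write(0x18, 3, memory)   # buffer full: the oldest write is retired
    buffer.drain_all(memory)
    print(memory)
```

The point of the structure is that the processor continues after queuing a write and only waits when the buffer is full.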
ARM CACHE ORGANIZATION
ARM Architecture
ARM CACHE ORGANIZATION
ARM Architecture
 Logical and Physical Cache
A logical cache, also known as a virtual cache, stores data using
virtual addresses. The processor accesses the cache directly, without going
through the MMU. A physical cache stores data using main memory physical
addresses. One obvious advantage of the logical cache is that cache access is
faster than for a physical cache, because the cache can respond before the MMU
performs an address translation. The disadvantage has to do with the fact that
most virtual memory systems supply each application with the same virtual
memory address space. That is, each application sees a virtual memory that
starts at address 0. Thus, the same virtual address in two different
applications refers to two different physical addresses. The cache memory must
therefore be completely flushed with each application context switch, or extra
bits must be added to each line of the cache to identify which virtual address
space the address refers to.
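
The context-switch problem described above can be reproduced with a few lines of simulation. The following is a minimal sketch, not a model of any real ARM cache: the class `VirtualCache`, the address-space identifiers 1 and 2, and the two-entry `PAGE_TABLE` and `MEMORY` dictionaries are hypothetical names and values chosen for illustration.

```python
# Hypothetical setup: two applications both use virtual address 0, but the
# page tables map it to different physical addresses.
MEMORY = {0x1000: "data of application A", 0x2000: "data of application B"}
PAGE_TABLE = {1: {0: 0x1000}, 2: {0: 0x2000}}  # address-space id -> mapping


class VirtualCache:
    """Toy logical (virtual) cache, optionally tagged with an address-space id."""

    def __init__(self, tag_with_space_id: bool):
        self.tag_with_space_id = tag_with_space_id
        self.lines = {}

    def _key(self, space_id: int, vaddr: int):
        # The "extra bits" option: include the address-space id in the tag.
        return (space_id, vaddr) if self.tag_with_space_id else vaddr

    def read(self, space_id: int, vaddr: int):
        key = self._key(space_id, vaddr)
        if key in self.lines:
            return self.lines[key]               # hit: no MMU translation
        paddr = PAGE_TABLE[space_id][vaddr]      # miss: translate, then fetch
        self.lines[key] = MEMORY[paddr]
        return self.lines[key]

    def flush(self) -> None:
        """The other option: discard everything on a context switch."""
        self.lines.clear()


if __name__ == "__main__":
    untagged = VirtualCache(tag_with_space_id=False)
    print(untagged.read(1, 0))   # application A fills the line
    print(untagged.read(2, 0))   # application B wrongly hits A's line
    untagged.flush()
    print(untagged.read(2, 0))   # correct after a flush

    tagged = VirtualCache(tag_with_space_id=True)
    print(tagged.read(1, 0))
    print(tagged.read(2, 0))     # correct without any flush
```

A physical cache avoids the issue altogether because its tags are physical addresses, at the cost of translating before every lookup.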
ARM CACHE ORGANIZATION
ARM Architecture
 Line Size
 When a block of data is retrieved
and placed in the cache, not only
the desired word but also some
number of adjacent words are
retrieved.
 As the block size increases from
very small to larger sizes, the hit
ratio will at first increase because
of the principle of locality, which
states that data in the vicinity of a
referenced word are likely to be
referenced in the near future.
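 As the block size increases, more
useful data are brought into the
cache. The hit ratio will begin to
decrease, however, as the block
becomes even larger and the
probability of using the newly
fetched data falls below the
probability of reusing the data
that had to be replaced.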
 Two specific effects come into
play:
 Larger blocks reduce the
number of blocks that fit into a
cache. Because each block
fetch overwrites older cache
contents, a small number of
blocks results in data being
overwritten shortly after they
are fetched.
 As a block becomes larger,
each additional word is farther
from the requested word and
therefore less likely to be
needed in the near future.
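
Both effects can be observed with a very small simulation. The following is a sketch under stated assumptions: it models a fully associative cache with LRU replacement (chosen for simplicity, not because any ARM core works this way), and the 64-byte capacity and the two address traces are made up for illustration.

```python
from collections import OrderedDict


def hit_ratio(trace, cache_bytes: int, line_bytes: int) -> float:
    """Hit ratio of a fully associative LRU cache for a list of byte addresses."""
    num_lines = cache_bytes // line_bytes   # larger lines mean fewer lines
    cache = OrderedDict()                   # block number -> True, LRU first
    hits = 0
    for address in trace:
        block = address // line_bytes
        if block in cache:
            hits += 1
            cache.move_to_end(block)        # mark as most recently used
        else:
            if len(cache) == num_lines:
                cache.popitem(last=False)   # evict the least recently used
            cache[block] = True
    return hits / len(trace)


if __name__ == "__main__":
    # Sequential 4-byte words: strong spatial locality, larger lines help.
    sequential = list(range(0, 1024, 4))
    # Four widely separated words reused in a loop: larger lines leave too
    # few lines to hold them all, so each fetch evicts data still in use.
    scattered = [0, 1000, 2000, 3000] * 10

    for line_bytes in (16, 32):
        print(line_bytes,
              hit_ratio(sequential, 64, line_bytes),
              hit_ratio(scattered, 64, line_bytes))
```

With the same 64-byte capacity, doubling the line size improves the sequential trace but collapses the scattered one, which is the trade-off the bullets describe.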
ARM CACHE ORGANIZATION
ARM Architecture
 Line Size
 No definitive optimum value has been found.
 A line size of 8 to 64 bytes seems
reasonably close to optimum.
ARM CACHE ORGANIZATION
ARM Architecture
 Logical Cache and Physical Cache (figures not reproduced)
ARM CACHE ORGANIZATION
ARM Architecture
 ARM is a family of RISC
architectures.
 “ARM” is the abbreviation
of “Advanced RISC
Machines”.
 ARM does not
manufacture its own VLSI
devices.
 ARM7 - von Neumann
architecture
 ARM9 - Harvard
architecture
 Cortex-A8 Processor Modes :
 User - used for executing most application
programs
 FIQ - used for handling fast interrupts
 IRQ - used for general-purpose interrupt
handling
 Supervisor - a protected mode for the
Operating System
 Undefined - entered upon Undefined
Instruction exceptions
 Abort - entered after Data or Prefetch
Aborts
 System - privileged user mode for the
Operating System
 Monitor - a secure mode for TrustZone
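
On ARMv7-A cores such as the Cortex-A8, the current mode is held in the low five bits of the CPSR. The following is a small sketch that decodes those bits into the mode names listed above. The bit encodings come from the ARMv7-A architecture rather than from the slide, and the helper names `decode_mode` and `is_privileged` are hypothetical.

```python
# CPSR mode field M[4:0] encodings defined by the ARMv7-A architecture.
MODE_NAMES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10110: "Monitor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_mode(cpsr: int) -> str:
    """Return the processor mode named by the low five bits of a CPSR value."""
    return MODE_NAMES.get(cpsr & 0x1F, "Reserved")


def is_privileged(cpsr: int) -> bool:
    """Every defined mode except User is privileged."""
    return decode_mode(cpsr) not in ("User", "Reserved")


if __name__ == "__main__":
    # Example CPSR values are illustrative, not taken from the slides.
    for value in (0x00000010, 0x600001D3, 0x0000001F):
        print(hex(value), decode_mode(value), is_privileged(value))
```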
ARM processor :
ARM Architecture
 Memory Protection
ARM processor :
ARM Architecture
 Full Cortex-A8 Pipeline
Diagram
ARM processor :
ARM Architecture
 ARM Cortex-A Architecture
ARM processor :
Questions/Discussions
 Question One
 Define logical and physical
caches, and state the advantages and
disadvantages of each.
 Question Two
 Summarize the most important
advanced features of the ARM architecture.
