TEGRA 4I EXPANDS MARKET
Cortex-A9r4 CPU Peps Up Nvidia’s First Integrated Processor
By Linley Gwennap (March 4, 2013)

The moment Nvidia hinted at when it purchased Icera has now arrived. The new Tegra 4i processor (formerly known by its code-name, Grey) combines Icera’s cellular technology with Nvidia’s application processors to create an integrated smartphone processor. Although Nvidia is far from the first to make this combination, the new product will greatly expand the company’s target market, which today is restricted to tablets and premium smartphones (Nvidia calls them superphones). By reducing the cost and size of reference designs, Tegra 4i aims for mainstream smartphones that sell for a few hundred dollars.

Nvidia demonstrated the new chip at Mobile World Congress and expects it to ship in commercial smartphones by the end of this year. Tegra 4i offers four Cortex-A9 CPUs, upgraded to release 4 (r4), running at clock speeds as high as 2.3GHz. As Table 1 shows, the graphics unit is considerably improved from Tegra 3 and is similar to that of Tegra 4. The new chip also includes Tegra 4’s computational-photography unit. Its integrated cellular modem, based on Nvidia’s standalone i500 chip, is compatible with LTE networks as well as older 3G services.

                    Tegra 3                 Tegra 4i                Tegra 4
Main CPU Cores      4xCortex-A9             4xCortex-A9r4           4xCortex-A15
Max CPU Speed       1.7GHz                  2.3GHz                  1.9GHz
Companion Core?     Yes                     Yes                     Yes
L2 Cache Size       1MB                     1MB                     2MB
SPECint Score*      590                     920                     1,168
GPU Shaders         8 texture + 4 vertex    48 texture + 12 vertex  48 texture + 24 vertex
GPU Clock Speed     520MHz                  660MHz                  672MHz
GLBenchmark 2.5†    12fps                   30fps§                  57fps
Video Decode‡       1080p 24fps             1080p 60fps             4K 30fps
Photog Engine?      No                      Yes                     Yes
DRAM Channels       1x32-bit                1x32-bit                2x32-bit
Max DRAM Speed      LPDDR2-1066             LPDDR3-2133             LPDDR3-2133
LTE Baseband        External                Integrated              External
Process Technology  40nm LP                 28nm HPM                28nm HPL
Die Size            80mm²§                  62mm²§                  85mm²§
Package             14mm PoP                12mm PoP                14mm PoP
Production          4Q11                    4Q13 (est)              2Q13 (est)

Table 1. Comparison of recent Nvidia Tegra processors. Performance data for Tegra 4/4i is preliminary. *SPECint2000_base compiled using GCC -O3; †Egypt C24Z16 Offscreen 1080p; ‡H.264 High Profile. (Source: Nvidia, except §The Linley Group estimate)

Extending Cortex-A9

Tegra 4i is the first processor to use the r4 version of Cortex-A9. Unlike previous releases, which contained mostly bug fixes, the A9r4 includes some significant improvements to the branch predictor, TLB, and cache-memory system. Because the basic microarchitecture stays the same, these improvements have no effect on simple benchmarks such as Dhrystone. But Nvidia has measured a 15% improvement in SPECint performance and a 25% gain in BrowserBench performance at the same clock speed.

ARM announced Cortex-A9 in 2007, eons ago in the fast-moving mobile market. The company designed it to run simple user interfaces and small programs such as email readers; modern apps did not yet exist (the Apple App Store opened in July 2008). In this environment, ARM focused on die area and cost, keeping critical elements in the CPU small. Since that time, Cortex-A9 has been applied to many new tasks, but the design is poorly optimized for some of them.

The branch history table (BHT), for example, has only 512 entries in the original Cortex-A9. This size is fine for small programs, but in more complex software, multiple active branches often hash to the same entry, confusing the BHT’s predictions. The A9r4 expands the BHT to 2,048 entries (the same size as Cortex-A15’s). Nvidia measured the branch-misprediction rate of one SPECint2000 program at 48% on the original A9 and only 8% on the A9r4. This example is extreme, but many programs will see some benefit from the expanded BHT.

Similarly, Cortex-A9’s original TLB had 128 entries, providing address translations for 512KB of data (using standard 4KB pages). This space is enough for small programs, but complex modern apps need more. The A9r4 increases the TLB to 512 entries, offering access to four times as much data without thrashing the TLB.

The new CPU retains the same 32KB data cache as in previous Cortex-A9 designs, but it improves the prefetcher’s effectiveness. The original A9 included prefetch logic that attempted to detect a series of sequential memory accesses and continue fetching additional data before it was needed. It was the first ARM CPU with this feature, however, and the prefetcher too often fetches the wrong data, wasting cycles and power; most operating systems simply turn off this feature. The new prefetcher, based on a few more years of experience, correctly handles most common access patterns.

As with all Cortex CPUs, ARM implemented the new design. Nvidia provided vigorous input regarding the changes and is the lead customer for this version. ARM will deliver the r4 design to other Cortex-A9 licensees, so we expect to see this version become more widespread over time.

The design improvements are unusual for an existing core, and their performance impact is significant. ARM declined to create a new name for this core, perhaps to avoid diluting its emphasis on Cortex-A15. We believe Cortex-A10 or Cortex-A9+ would be more appropriate monikers than Cortex-A9 r4. (Readers should avoid confusing this core with Cortex-R4, a low-end ARM design intended for real-time applications.)

Flooring the Gas Pedal

After working with ARM to rev up the CPU’s logical design, Nvidia then optimized the physical design to maximize its clock speed. Tegra 4i targets 2.3GHz, compared with 1.7GHz for the Cortex-A9 CPUs in Tegra 3+. Much of this speed boost comes from the shrink to TSMC’s 28nm HPM process, but 2.3GHz is among the fastest Cortex-A9 speeds yet announced, trailing only ST-Ericsson’s 2.5GHz design (which uses exotic 28nm FD-SOI technology).

Tegra 4i’s power curve will affect its working speeds. The company did not disclose the chip’s power, but we expect it requires overvoltage to achieve its top speed, pushing power to about 5W with all four CPUs running at 2.3GHz. To fit Tegra 4i into the power envelope of a smartphone, Nvidia is likely to limit the CPUs to a slower clock rate (perhaps 1.8GHz) when all four are running. With only a single CPU running, however, the chip should operate at its full rated speed.

As in Tegra 3, Tegra 4i includes a fifth “companion” core that uses the Cortex-A9 microarchitecture but is optimized for low power and runs at a lower clock speed (see MPR 11/21/11, “Nvidia Leads With Quad-Core AP”). The low-power core handles light workloads, like email and social media, but it transfers operation to the main CPUs when the processing load picks up. In maximum-performance mode, the four main CPUs run while the low-power core is shut down.

Digging Into the DXP

The Tegra 4i cellular modem derives from technology Nvidia received when it acquired startup Icera in 2011 (see MPR 5/16/11, “Nvidia Picks Up the Phone”). The modem is the same as in the i500 LTE chip that Nvidia recently announced. It supports a number of protocols, including GSM/EDGE, WCDMA/HSPA, and Release 8 LTE in FDD and TDD modes.

As part of the Tegra 4i launch, Nvidia revealed details of the Icera architecture for the first time. The modem employs a processor known as the DXP, which implements a custom instruction set optimized for cellular processing. For example, it includes instructions to accelerate voice codecs and encryption. As Figure 1 shows, the DXP is essentially a RISC CPU with a large vector unit. The CPU fetches and executes one instruction per cycle and has a standard 32-entry register file called the C registers, which are backed by a small data cache configurable as either 16KB or 32KB.

[Figure 1 block diagram. Front end: 32KB I-cache and 128KB instruction memory feeding branch prediction, fetch, and decode (one instruction). Scalar side: 32x32-bit C registers, ALU/branch and load/store/address units, 32KB data cache. Vector side: 64x256-bit D registers, 512KB data memory (D-Mem), permute unit, vector control, and a chain of vector ALUs.]

Figure 1. Nvidia DXP microarchitecture. The DXP pairs a simple scalar CPU (left) with a 256-bit-wide multistage vector engine (right) to generate large amounts of compute at low power.

The vector unit has its own register file with 64 entries. These D registers are logically 256 bits wide, but physically they are broken into four “channels.” Each channel can contain two 32-bit values, four 16-bit values, or eight 8-bit values. The vector unit performs the same operation across all the values and across all four channels, creating a large amount of data parallelism. The D registers are fed by a local 512KB memory, which can provide one 256-bit operand per cycle.

The vector unit is unusual in being deep as well as wide. Although Nvidia did not disclose full details, the unit has several pipelined stages that can each be configured for different computations. In this way, a single instruction can perform a complicated operation such as a matrix multiply. Nvidia claims each channel can execute up to 95 8-bit arithmetic operations (e.g., add or multiply) with a single instruction. This approach provides lots of computational horsepower with a small amount of instruction decoding. With vast amounts of parallelism, this architecture is well suited to the high-speed signal processing required by modern cellular algorithms.

Tegra 4i includes two full DXP cores that operate at up to 1.3GHz. These two cores handle the entire physical layer, implementing algorithms such as a rake receiver, diversity, turbo decoding, and HARQ (error correction) in firmware. A third DXP core implements only the scalar portion of the instruction set; this smaller, simpler core handles the cellular protocol stack.

In addition to the 1MB of local SRAM for the two vector units, the chip includes about 6MB of additional SRAM to implement HARQ buffers and other data storage. As Figure 2 shows, this memory consumes more than half the baseband area. (Figure 2 shows an actual die photo of the i500 baseband, which differs radically from the artistic rendering of the chip that has been widely published.) Because it is a standalone device, the i500 includes a number of system interfaces, such as a USB port and serial ports, that are unnecessary when the baseband is implemented as part of the Tegra 4i SoC.

[Figure 2 die photo. Labeled i500 regions: two vector DXP cores, one scalar DXP core, two D-Mem blocks, two scratch-memory blocks, and system interfaces; the Qualcomm MDM9215 die is shown behind for scale.]

Figure 2. Die photo of Nvidia i500 baseband (foreground). The i500 requires only 40% of the die area of a comparable Qualcomm modem chip (background). (Photo by Nvidia, overlay by The Linley Group)

Software-Defined Radio

Most other baseband designs use a combination of DSPs and hard-wired accelerators. These accelerators are customized for each protocol; thus, the traditional baseband has separate units for GSM, WCDMA, and LTE. Nvidia provides a single set of hardware that can switch protocols simply by switching firmware. As a result, the programmable baseband is smaller than competing designs; for example, the i500 die measures 14mm², versus 35mm² for Qualcomm’s MDM9215, a 28nm LTE modem chip with similar data rates. To be fair, we note that the Qualcomm chip includes a Cortex-A5 application CPU and a GPS baseband, and it has been in production for nearly a year.

The programmable design can be upgraded via new firmware. The initial release of the i500 (and Tegra 4i) will support Category 3 LTE (100Mbps down, 50Mbps up), twice the speed of Nvidia’s previous implementation. But the company is still optimizing its firmware for LTE, and it expects to hit Category 4 speeds (150Mbps down) with a future release. Similarly, it expects to boost the top HSPA speed from the initial 42Mbps to 84Mbps. The company (as Icera) has followed a similar path in the past, improving the speed of its HSPA modem from 10Mbps to 14Mbps to 17Mbps using only firmware upgrades.

Nvidia is planning additional firmware upgrades to implement LTE Release 10 features such as carrier aggregation, which allows data to be transmitted on two frequencies (carriers) at once to reach the maximum rate. This feature is important because few cellular providers have the 20MHz of contiguous spectrum required to maximize LTE performance on a single carrier. Nvidia is also developing firmware for TD-SCDMA. The company did not announce a schedule for delivering any of these speed or feature improvements.

Software execution typically requires more power than offloading functions to hard-wired engines, but Nvidia’s team has been careful to minimize power. With its wide vector units and custom instruction set, the DXP uses less power than a traditional DSP for cellular processing. Because a single instruction can execute hundreds of operations, the DXP wastes little power in overhead tasks such as instruction decoding and branching. Other architectures use a mix of DSP and hard-wired logic, so on average, their power is similar to that of the Icera design. The i500 will adjust the DXP clock speed and voltage as needed for the available data rate, so it will run at 1.3GHz only when performing LTE at the peak data rate, which happens rarely (if ever) in the real world.

The original 3G Icera modem has been certified with carriers around the world for use in data cards and USB dongles, but not for voice devices. The ZTE Mimosa X, which began shipping in 2Q12, was the first smartphone to use this modem design; it achieved voice certification with two carriers (Swisscom and EE, a UK carrier) and also shipped to a few carriers that do not require certification. Nvidia’s LTE modem has undergone certification for AT&T’s data network, and the company is working to certify this design with other LTE carriers.

Scoring in the Low 60s

Tegra 4i’s GPU retains the same split-shader design as Tegra 3 and Tegra 4, preventing it from supporting modern graphics APIs such as DirectX 10 or 11 and OpenCL. But software optimized for other Tegra processors should run well on Tegra 4i. The new chip includes 60 shaders, far more than in Tegra 3. As Figure 3 shows, the design allocates 48 texture shaders and 12 vertex shaders, providing the same pixel processing but half the vertex processing of Tegra 4. Both chips clock the GPU at about the same speed: 660MHz for Tegra 4i and 672MHz for Tegra 4.

[Figure 3 block diagram. Three vertex units feed IDX / clip / setup and raster / early-Z stages, followed by two texture pipelines, each with its own L1, then a shared L2 cache and a memory controller with one 32-bit channel (Chan 0).]

Figure 3. Tegra 4i GPU design. The GPU has two pixel pipelines with 24 fragment shaders each, plus three vertex units with 4 shaders each.

Whereas Tegra 4 supports two 32-bit memory channels, Tegra 4i has only one, halving its peak memory bandwidth. Thus, graphics tests that are limited by either memory bandwidth or vertex processing will run half as fast on Tegra 4i relative to Tegra 4. On the other hand, the two chips are equally good at pushing pixels. We estimate Tegra 4i will score about 30fps on GLBenchmark 2.5, putting it in the same class as high-end application processors shipping in phones today.

Compared with Tegra 3, Tegra 4i offers a theoretical 2x gain in memory bandwidth, even though both use a single memory channel. This doubling assumes the use of LPDDR3-2133 memory chips, however, which do not exist today. Initial Tegra 4i smartphones are likely to use LPDDR3-1600, providing a 50% memory-bandwidth boost over Tegra 3. As faster LPDDR3 speed grades become available, Tegra 4i can take advantage of them.

Tegra 4i includes the computational-photography engine that Nvidia introduced with Tegra 4 (see MPR 1/21/13, “Tegra 4 Shows First Quad A15”). This unit supports advanced features such as high-dynamic-range (HDR) images. The new chip can encode or decode 1080p video at 60fps (H.264 High Profile), twice the rate of Tegra 3 but half the rate of Tegra 4, which also supports UltraHD video.

Although Tegra 4i doesn’t quite match Tegra 4’s performance, the changes are designed to keep the cost of the chip low. Each Cortex-A9 r4 CPU measures just 1.15mm², 57% smaller than the Cortex-A15 cores in Tegra 4. The GPU is also smaller, given the reduced number of shaders, and the video engine is about half the size. Cutting the second DRAM channel both simplifies the memory-controller logic and greatly reduces the number of pads. At 1MB, the L2 cache is also half the size of Tegra 4’s. These changes leave room for the integrated baseband, which we estimate consumes only 8mm². Even after adding the baseband, Tegra 4i has a die size in the “low 60s,” according to Nvidia, compared with the “mid-80s” for Tegra 4.

Rebirth of the Reference Design

The integrated baseband allows Tegra 4i to fit into compact and inexpensive phones. Nvidia offers a reference design, code-named Phoenix, for Tegra 4i. As Figure 4 shows, the main components fit into a narrow space on the phone’s left edge. In addition to the processor, only a few other components are required. Shrinking the circuit-board size leaves more room for the battery; smartphone designers can choose a smaller battery, yielding a thinner phone, or a larger battery for longer life. The Phoenix design is 8mm thick, but OEMs may be able to reduce this dimension. (X-Men aficionados will appreciate the connection between Grey and Phoenix.)

Figure 4. Nvidia’s Phoenix reference design. The main circuitry of the smartphone, including the Tegra 4i processor (shown in false color), fits on a PC board less than one inch wide. (Photo source: Nvidia)

The reference design includes Nvidia’s ICE9245 RF transceiver. This chip, which also works with the i500, has inputs and outputs for eight configurable bands, with diversity for each of the receive bands. It can support additional bands by using external switches and converged power amps (PAs). Built in a TSMC 65nm process to reduce cost, the chip integrates all low-noise amplifiers (LNAs) for a highly integrated solution.

Unlike most of its competitors, Nvidia does not supply its own connectivity chips. The Phoenix design uses a Broadcom Wi-Fi combo with a separate Broadcom GPS chip, probably the BCM4334 and BCM4752, respectively. Nvidia has also qualified Tegra 4i with Wi-Fi combos from Texas Instruments, but TI is a poor second supplier, since it announced it will exit the smartphone market this year. Because Broadcom competes with Nvidia in mainstream smartphones, it can charge customers that use Tegra 4i a higher Wi-Fi price compared with customers that use Broadcom’s own processors, tilting the playing field in its favor. Nvidia declined to purchase TI’s Wi-Fi business, leaving it with no acquisition options in this area.

Although Nvidia calls Phoenix a reference design, the company has yet to reach the same level as MediaTek and Qualcomm, which offer much more complete packages (see MPR 2/25/13, “Qualcomm Clashes With MediaTek”) that even the smallest smartphone makers can use. Phoenix will help Nvidia attract mid-tier and large OEMs. ZTE, which will follow Mimosa X with a smartphone using Tegra 4 and the i500, is a likely Tegra 4i customer.

Finding a Hole in the Wall

Tegra 4i exploits a hole in Qualcomm’s otherwise solid product line. For premium phones, Qualcomm offers the 2.3GHz 8974, a quad-core Krait processor that will outrun Tegra 4i but carries a premium price tag. For mainstream smartphones, the company offers the MSM8960T, which has only two CPUs. On single-thread programs, this chip’s 1.7GHz Krait CPU should match the 2.3GHz Cortex-A9, but the MSM8960T lacks both the marketing cachet and the multithreaded-benchmark scores of a quad-core chip.

This situation should give Tegra 4i an advantage in mainstream phones. The Tegra 4i die is about half the size of the 8974, making it difficult for Qualcomm to compete on price using this die. The company may be able to develop a cost-reduced quad-core chip by the time Tegra 4i is available, or perhaps shortly thereafter, but such a chip may fail to match Tegra 4i’s performance.

Other integrated quad-core chips will fall far short of Tegra 4i. MediaTek’s MT6589 and Qualcomm’s MSM8226 both use Cortex-A7 CPUs running at speeds of about 1.2GHz. For its MP6530, Renesas uses an unusual “Two and a Half Men” configuration, with two A15 CPUs and two A7s, that won’t add up to the performance of Nvidia’s quad A9r4 cores.

Nvidia’s lack of carrier qualifications for both voice and LTE is a concern. The company is working feverishly to remedy this situation, and at least one customer (ZTE) is moving forward with the i500. Nvidia’s initial focus appears to be on AT&T, the only carrier with which it has achieved LTE certification; if that carrier also signs off on Nvidia’s 3G voice, it will create a sizable smartphone opportunity. But to generate enough business to pay back its Icera investment, Nvidia needs more than one carrier customer. The good news is that the company still has several months to certify its modem technology before the first Tegra 4i phones are ready to ship.

Price and Availability

Tegra 4i is currently sampling to lead customers; Nvidia expects the first smartphones using the processor to ship in 4Q13. The company withheld pricing. For more information on Tegra 4i, access Nvidia’s web site at www.nvidia.com/object/tegra-4-processor.html.

Closer to Tegra 4

At 2.3GHz, Tegra 4i delivers a 50% boost in CPU performance (on SPECint) compared with Tegra 3 and at least twice the graphics performance. In fact, the new chip is closer to Tegra 4 than to Tegra 3 in performance, although Nvidia has been careful to leave enough of a gap that Tegra 4 remains a viable product for premium smartphones. According to the company’s testing, Tegra 4 scores 27% better on SPECint than Tegra 4i, and it is considerably better on both 3D graphics and video decoding. The second DRAM channel will boost Tegra 4’s performance, particularly with the larger screens used in tablets. Both Tegra 4i and Tegra 4 offer Nvidia’s new computational-photography engine.

For a smartphone processor with integrated modem, Tegra 4i provides excellent performance. Few integrated quad-core processors have been announced, and many of them target the low-end smartphone market with lower-frequency CPUs such as Cortex-A5 and Cortex-A7. Integrated processors with dual Cortex-A15 or Krait cores won’t match Tegra 4i in benchmark performance, although they will perform well on most apps. Nvidia has also packed plenty of GPU performance into Tegra 4i; on graphics tests, it should outperform the quad-A7 chips and at least match the dual-A15 processors. Qualcomm’s 8974 is the only integrated processor that should equal Tegra 4i’s performance, but we don’t expect that company to cut the price of its flagship processor enough to compete with the integrated Tegra.

Nvidia has been successful in tablets, but until now, its premium positioning and lack of a low-cost option have limited its sales into smartphones. Tegra 2 and Tegra 3 have appeared mainly in hero phones such as the HTC One X, but these high-end devices ship in relatively small volumes. As a result, Nvidia holds less than 2% of the smartphone market. A move into the mainstream offers greater volume opportunities; here, Nvidia will compete mainly against Qualcomm’s Snapdragon. By bringing Tegra features and performance to lower price points, the new processor is an attractive alternative to Snapdragon. ♦
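
Several of the figures quoted in this article (TLB reach, D-register lane counts, memory-bandwidth gains, and the SPECint gaps discussed under "Closer to Tegra 4") follow from simple arithmetic on Table 1 and the text. The short Python sketch below reproduces that arithmetic. It is only a sanity check: the inputs are the article's numbers, the function and variable names are illustrative and not from Nvidia or ARM, and the bandwidth formula assumes the usual theoretical peak (transfer rate times bus width), which real systems do not sustain.

```python
# Sanity checks on figures quoted in the article. All inputs come from the
# text or Table 1; the function and variable names are illustrative only.

def tlb_reach_kb(entries, page_kb=4):
    """Address space covered by a TLB, assuming one 4KB page per entry."""
    return entries * page_kb


def peak_dram_mb_per_s(mega_transfers_per_s, bus_bits=32, channels=1):
    """Theoretical peak: transfers per second x bytes per transfer x channels."""
    return mega_transfers_per_s * (bus_bits // 8) * channels


def lanes_per_channel(value_bits, register_bits=256, channels=4):
    """Number of values of a given width that fit in one D-register channel."""
    return (register_bits // channels) // value_bits


# SPECint2000 scores from Table 1 (preliminary for Tegra 4 and 4i).
specint = {"Tegra 3": 590, "Tegra 4i": 920, "Tegra 4": 1168}

tegra3_bw = peak_dram_mb_per_s(1066)              # LPDDR2-1066, 1x32-bit
tegra4i_bw = peak_dram_mb_per_s(2133)             # LPDDR3-2133, 1x32-bit
tegra4i_initial_bw = peak_dram_mb_per_s(1600)     # LPDDR3-1600, likely at launch
tegra4_bw = peak_dram_mb_per_s(2133, channels=2)  # LPDDR3-2133, 2x32-bit

print("TLB reach (KB):", tlb_reach_kb(128), "->", tlb_reach_kb(512))
print("Lanes per channel (32/16/8-bit):",
      [lanes_per_channel(bits) for bits in (32, 16, 8)])
print("8-bit ops per instruction (4 channels x 95):", 4 * 95)
print("Tegra 4i vs Tegra 3 bandwidth: %.2fx max, %.2fx initial"
      % (tegra4i_bw / tegra3_bw, tegra4i_initial_bw / tegra3_bw))
print("Tegra 4 vs Tegra 4i bandwidth: %.1fx" % (tegra4_bw / tegra4i_bw))
print("SPECint, Tegra 4i vs Tegra 3: %.2fx"
      % (specint["Tegra 4i"] / specint["Tegra 3"]))
print("SPECint, Tegra 4 vs Tegra 4i: %.2fx"
      % (specint["Tegra 4"] / specint["Tegra 4i"]))
print("i500 die area vs MDM9215: %.0f%%" % (100 * 14 / 35))
```

Under these assumptions, the Table 1 scores put Tegra 4i at roughly 1.56x Tegra 3 on SPECint (the text cites a 50% boost) and Tegra 4 at about 1.27x Tegra 4i. The single-channel bandwidth gain over Tegra 3 works out to about 2x with LPDDR3-2133 and 1.5x with LPDDR3-1600, matching the figures in the text.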



© The Linley Group • Microprocessor Report                                                                      March 2013

AT&T's data network, and the company is working to certify this design with other LTE carriers.

Scoring in the Low 60s

Tegra 4i's GPU retains the same split-shader design as Tegra 3 and Tegra 4, preventing it from supporting modern graphics APIs such as DirectX 10 or 11 and OpenCL. But software optimized for other Tegra processors should run well on Tegra 4i. The new chip includes 60 shaders, far more than in Tegra 3. As Figure 3 shows, the design allocates 48 texture shaders and 12 vertex shaders, providing the same pixel processing but half the vertex processing of Tegra 4. Both chips clock the GPU at about the same speed: 660MHz for Tegra 4i and 672MHz for Tegra 4.

Figure 3. Tegra 4i GPU design. The GPU has two pixel pipelines with 24 fragment shaders each, plus three vertex units with 4 shaders each. [Block diagram labels: three vertex units; IDX / clip / setup; raster / early Z; two texture units, each with an L1 cache; L2 cache; memory controller with one 32-bit channel (Chan 0).]

Whereas Tegra 4 supports two 32-bit memory channels, Tegra 4i has only one, halving its peak memory bandwidth. Thus, graphics tests that are limited by either memory bandwidth or vertex processing will run half as fast on Tegra 4i relative to Tegra 4. On the other hand, the two chips are equally good at pushing pixels. We estimate Tegra 4i will score about 30fps on GLBenchmark 2.5, putting it in the same class as high-end application processors shipping in phones today.

Compared with Tegra 3, Tegra 4i offers a theoretical 2x gain in memory bandwidth, even though both use a single memory channel. This doubling assumes the use of LPDDR3-2133 memory chips, however, which do not exist today. Initial Tegra 4i smartphones are likely to use LPDDR3-1600, providing a 50% memory-bandwidth boost over Tegra 3. As faster LPDDR3 speed grades become available, Tegra 4i can take advantage of them.

Tegra 4i includes the computational-photography engine that Nvidia introduced with Tegra 4 (see MPR 1/21/13, "Tegra 4 Shows First Quad A15"). This unit supports advanced features such as high-dynamic-range (HDR) images. The new chip can encode or decode 1080p video at 60fps (H.264 High Profile), twice the rate of Tegra 3 but half the rate of Tegra 4, which also supports UltraHD video.

Although Tegra 4i doesn't quite match Tegra 4's performance, the changes are designed to keep the cost of the chip low. Each Cortex-A9r4 CPU measures just 1.15mm², 57% smaller than the Cortex-A15 cores in Tegra 4. The GPU is also smaller, given the reduced number of shaders, and the video engine is about half the size. Cutting the second DRAM channel both simplifies the memory-controller logic and greatly reduces the number of pads. At 1MB, the L2 cache is also half the size of Tegra 4's. These changes leave room for the integrated baseband, which we estimate consumes only 8mm². Even after adding the baseband, Tegra 4i has a die size in the "low 60s," according to Nvidia, compared with the "mid-80s" for Tegra 4.

Rebirth of the Reference Design

The integrated baseband allows Tegra 4i to fit into compact and inexpensive phones. Nvidia offers a reference design, code-named Phoenix, for Tegra 4i. As Figure 4 shows, the main components fit into a narrow space on the phone's left edge. In addition to the processor, only a few other components are required. Shrinking the circuit-board size leaves more room for the battery; smartphone designers can choose a smaller battery, yielding a thinner phone, or a larger battery for longer life. The Phoenix design is 8mm thick, but OEMs may be able to reduce this dimension. (X-Men aficionados will appreciate the connection between Grey and Phoenix.)

Figure 4. Nvidia's Phoenix reference design. The main circuitry of the smartphone, including the Tegra 4i processor (shown in false color), fits on a PC board less than one inch wide. (Photo source: Nvidia)

The reference design includes Nvidia's ICE9245 RF transceiver. This chip, which also works with the i500, has inputs and outputs for eight configurable bands, with diversity for each of the receive bands. It can support additional bands by using external switches and converged
power amps (PAs). Built in a TSMC 65nm process to reduce cost, the chip integrates all low-noise amplifiers (LNAs) for a highly integrated solution.

Unlike most of its competitors, Nvidia does not supply its own connectivity chips. The Phoenix design uses a Broadcom Wi-Fi combo with a separate Broadcom GPS chip, probably the BCM4334 and BCM4752, respectively. Nvidia has also qualified Tegra 4i with Wi-Fi combos from Texas Instruments, but TI is a poor second supplier, since it announced it will exit the smartphone market this year. Because Broadcom competes with Nvidia in mainstream smartphones, it can charge customers that use Tegra 4i a higher Wi-Fi price compared with customers that use Broadcom's own processors, tilting the playing field in its favor. Nvidia declined to purchase TI's Wi-Fi business, leaving it with no acquisition options in this area.

Although Nvidia calls Phoenix a reference design, the company has yet to reach the same level as MediaTek and Qualcomm, which offer much more complete packages (see MPR 2/25/13, "Qualcomm Clashes With MediaTek") that even the smallest smartphone makers can use. Phoenix will help Nvidia attract mid-tier and large OEMs. ZTE, which will follow Mimosa X with a smartphone using Tegra 4 and the i500, is a likely Tegra 4i customer.

Finding a Hole in the Wall

Tegra 4i exploits a hole in Qualcomm's otherwise solid product line. For premium phones, Qualcomm offers the 2.3GHz 8974, a quad-core Krait processor that will outrun Tegra 4i but carries a premium price tag. For mainstream smartphones, the company offers the MSM8960T, which has only two CPUs. On single-thread programs, this chip's 1.7GHz Krait CPU should match the 2.3GHz Cortex-A9, but the MSM8960T lacks both the marketing cachet and the multithreaded-benchmark scores of a quad-core chip.

This situation should give Tegra 4i an advantage in mainstream phones. The Tegra 4i die is about half the size of the 8974, making it difficult for Qualcomm to compete on price using this die. The company may be able to develop a cost-reduced quad-core chip by the time Tegra 4i is available, or perhaps shortly thereafter, but such a chip may fail to match Tegra 4i's performance.

Other integrated quad-core chips will fall far short of Tegra 4i. MediaTek's MT6589 and Qualcomm's MSM8226 both use Cortex-A7 CPUs running at speeds of about 1.2GHz. For its MP6530, Renesas uses an unusual "Two and a Half Men" configuration, with two A15 CPUs and two A7s, that won't add up to the performance of Nvidia's quad A9r4 cores.

Nvidia's lack of carrier qualifications for both voice and LTE is a concern. The company is working feverishly to remedy this situation, and at least one customer (ZTE) is moving forward with the i500. Nvidia's initial focus appears to be on AT&T, the only carrier with which it has achieved LTE certification; if that carrier also signs off on Nvidia's 3G voice, it will create a sizable smartphone opportunity. But to generate enough business to pay back its Icera investment, Nvidia needs more than one carrier customer. The good news is that the company still has several months to certify its modem technology before the first Tegra 4i phones are ready to ship.

Closer to Tegra 4

At 2.3GHz, Tegra 4i delivers a 50% boost in CPU performance (on SPECint) compared with Tegra 3 and at least twice the graphics performance. In fact, the new chip is closer to Tegra 4 than to Tegra 3 in performance, although Nvidia has been careful to leave enough of a gap that Tegra 4 remains a viable product for premium smartphones. According to the company's testing, Tegra 4 scores 27% better on SPECint than Tegra 4i, and it is considerably better on both 3D graphics and video decoding. The second DRAM channel will boost Tegra 4's performance, particularly with the larger screens used in tablets. Both Tegra 4i and Tegra 4 offer Nvidia's new computational-photography engine.

For a smartphone processor with integrated modem, Tegra 4i provides excellent performance. Few integrated quad-core processors have been announced, and many of them target the low-end smartphone market with lower-frequency CPUs such as Cortex-A5 and Cortex-A7. Integrated processors with dual Cortex-A15 or Krait cores won't match Tegra 4i in benchmark performance, although they will perform well on most apps. Nvidia has also packed plenty of GPU performance into Tegra 4i; on graphics tests, it should outperform the quad-A7 chips and at least match the dual-A15 processors. Qualcomm's 8974 is the only integrated processor that should equal Tegra 4i's performance, but we don't expect that company to cut the price of its flagship processor enough to compete with the integrated Tegra.

Nvidia has been successful in tablets, but until now, its premium positioning and lack of a low-cost option have limited its sales into smartphones. Tegra 2 and Tegra 3 have appeared mainly in hero phones such as the HTC One X, but these high-end devices ship in relatively small volumes. As a result, Nvidia holds less than 2% of the smartphone market. A move into the mainstream offers greater volume opportunities; here, Nvidia will compete mainly against Qualcomm's Snapdragon. By bringing Tegra features and performance to lower price points, the new processor is an attractive alternative to Snapdragon. ♦

Price and Availability

Tegra 4i is currently sampling to lead customers; Nvidia expects the first smartphones using the processor to ship in 4Q13. The company withheld pricing. For more information on Tegra 4i, access Nvidia's web site at www.nvidia.com/object/tegra-4-processor.html.

© The Linley Group • Microprocessor Report March 2013
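
The memory-bandwidth ratios quoted in "Scoring in the Low 60s" can be checked with simple arithmetic. The Python sketch below assumes that theoretical peak bandwidth is channel count times bus width times transfer rate. It also assumes that Tegra 3 drives its single 32-bit channel at roughly 1,066 megatransfers per second. That Tegra 3 rate is an assumption inferred from the 2x and 50% figures in the text, not a number the article states. The function and constant names are illustrative only.

```python
# Back-of-the-envelope check of the article's memory-bandwidth ratios.
# Assumption: peak bandwidth = channels * bus width in bytes * transfer rate.
# Assumption: Tegra 3 runs its single 32-bit channel at about 1,066 MT/s
# (inferred from the article's 2x and 50% figures, not stated in it).

BUS_BITS = 32          # per-channel width given in the article
TEGRA3_MTS = 1066.0    # assumed Tegra 3 transfer rate, in MT/s


def peak_bandwidth_gbs(channels: int, mega_transfers: float,
                       bus_bits: int = BUS_BITS) -> float:
    """Return theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_bits / 8
    return channels * bytes_per_transfer * mega_transfers / 1000.0


tegra3 = peak_bandwidth_gbs(1, TEGRA3_MTS)
tegra4i_initial = peak_bandwidth_gbs(1, 1600.0)  # LPDDR3-1600
tegra4i_future = peak_bandwidth_gbs(1, 2133.0)   # LPDDR3-2133

# Ratios relative to the assumed Tegra 3 baseline.
gain_initial = tegra4i_initial / tegra3
gain_future = tegra4i_future / tegra3

# Two channels versus one at the same speed grade (the Tegra 4 vs. 4i case).
dual_vs_single = peak_bandwidth_gbs(2, 1600.0) / peak_bandwidth_gbs(1, 1600.0)

if __name__ == "__main__":
    print(f"Tegra 3 (assumed rate): {tegra3:.2f} GB/s")
    print(f"Tegra 4i, LPDDR3-1600:  {tegra4i_initial:.2f} GB/s "
          f"({gain_initial:.2f}x Tegra 3)")
    print(f"Tegra 4i, LPDDR3-2133:  {tegra4i_future:.2f} GB/s "
          f"({gain_future:.2f}x Tegra 3)")
    print(f"Two channels vs. one:   {dual_vs_single:.1f}x")
```

Under these assumptions, LPDDR3-1600 works out to about 1.5 times the Tegra 3 baseline and LPDDR3-2133 to about 2 times, matching the article's figures. These are theoretical peaks; sustained bandwidth in a real phone will be lower.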