2. Hey y’all, let’s introduce ourselves!
• I’m Sanath, and I’m currently pursuing 4th-year Engineering here in the ECE
Dept.
• I’m very excited to teach you for these few hours! Hope you guys like it!
• There is a course outline that we are going to follow, simple enough…just to
get you guys started with the actual event on Saturday.
3. Before that… some caveats…
• How many of you have come across the term computer architecture before?
How many of you have learnt the basics of this in previous years?
• Doesn’t matter. But do bear these statistics in mind. So, don’t get
overwhelmed. Just enjoy the class.
• I care more about how well you understand than about how many topics
we cover. If you have any doubt, you can ask me. And sure, I don’t know
everything, and you guys can beat me with Google 😁
4. Some Stats
• The average 16-year-old’s attention span is ~35 minutes. Allowing a
little leeway, I’m taking 30 minutes.
• You guys are not machines, and I’m not the most efficient teacher either. ✌️
• So I want you guys to retain at least 30% of what I tell you today until
PhaseShift day. That will help y’all a lot.
5. Some tips to remember more.
• The Forgetting curve
• Google!
I can’t stress this enough….if you want to learn more,
you guys have to google!
6. Just a test! Show your google skills people!
Fastest fingers first!
• Here’s a famous guy. Identify him.
• How many languages could this guy speak?
• Where did he work?
• He was pretty well known for his eccentricity…. Could
you tell what it was in regards to work and music?
• What was his principal contribution to the atomic bomb?
7. Just a test! Show your google skills people!
Fastest fingers first!
• Here’s a famous guy. Identify him. John von Neumann
• How many languages could this guy speak? 7
• Which university did he work at in America? Princeton (IAS)
• He was pretty well known for his eccentricity…. Could
you tell what it was in regards to work and music?
He worked better amid loud, chaotic noise
• What was his principal contribution to the atomic bomb?
He helped design the “explosive lenses” for the atomic bomb
8. This is the board you guys will be working with…
• On the 26th, you’ll be working with the
board that is shown here. It is a
pretty cool board.
• It has a nice display, some sensors
and lots of things to play around
with.
• You guys will learn a lot with this
board.
• Go Google about it!
10. Let’s Start! (Outline)
• Introduction to Computer Architecture
• The computer stack explained
• Basics of computer architecture
• Performance metrics
• Memory Hierarchy and Pipelining
• RTOS and More OS concepts
11. Introduction to Computer Architecture
• What is computer architecture?
• It’s the science and art of designing computing platforms based on specific
functionality and requirements.
• Different computing platforms exist today for different use cases.
• Computers, laptops, mobile phones, smartwatches, etc. Each has its own
goals and needs.
12. Introduction to Computer Architecture
• Why this topic? Why should we care?
• Fundamentally, this drives some of the most important tools in the world: our
devices.
• There is a lot of demand for people who know about this stuff. Very simple law of
supply and demand.
• The newest buzzwords: AI, ML, hardware accelerators and many more!
• Also….. It’s fun! 🤘
• Check out this cool playlist where this person builds a computer from scratch!
13. Introduction to Computer Architecture
https://www.youtube.com/watch?v=HyznrdDSSGM&list=PLowKtXNTBypGqImE405J2565dvjafglHU
Just google “Ben Eater 8-bit computer” on YouTube
14. Moore's law
• The semiconductor industry has changed a lot since the introduction of
the first microchip, and that change can be characterized by
a certain law.
• Gordon Moore, who later co-founded Intel and became its CEO, made an
observation which has been turned into a law.
• Moore's law is the observation that the number of
transistors in a dense integrated circuit (IC)
doubles about every two years.
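To get a feel for what "doubles about every two years" means, here is a small sketch. The function name and the starting count are made up for illustration; it only applies the doubling rule stated above and is not a model of any real chip.

```python
def projected_transistors(initial_count, years_elapsed, doubling_period_years=2.0):
    """Project a transistor count assuming it doubles every `doubling_period_years`."""
    return initial_count * 2 ** (years_elapsed / doubling_period_years)


if __name__ == "__main__":
    start = 1_000_000  # hypothetical starting transistor count
    for years in (0, 2, 4, 10, 20):
        print(years, "years:", round(projected_transistors(start, years)))
```

Twenty years of doubling every two years is ten doublings, so the count grows by a factor of about a thousand.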
15. Moore’s Law
• But Moore’s Law has been slowing down since around 2009.
• Many researchers even claim it’s dead.
• Currently, transistors are no longer shrinking as quickly as before, but we are still adding
more and more of them to chips.
• Moore’s Law was exactly why people used to work in computer
architecture, innovating much more rapidly than any other industry.
16. Levels of Abstraction
• Architecture
• A set of specifications that allows developers to write software and firmware
• These include the instruction set.
• Microarchitecture/Organization
• The logical organization of the inner structure of the computer
• Hardware or Implementation
• The realization or the physical structure, i.e., logic design and chip packaging
17. The computer architecture stack!
• Looks a little bit complex doesn’t it?
• We need to understand the concept of
“modularity” and “abstraction”.
• Here, each layer does not need to know how the
layer below works; it only focuses
on providing what the layer above needs.
21. Gates/ Modules
Simple Logic Gates that can be used to
construct more complex stuff
Importance of memory also comes here…we’ll talk about it
later. Just remember these are something called registers.
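As a rough sketch of how simple gates build "more complex stuff", the code below models gates as Python functions and builds a 4-bit adder out of nothing but NAND. All function names are made up for illustration, and real hardware is of course wired, not called.

```python
def nand(a, b):
    """The only primitive gate; everything else below is built from it."""
    return 0 if (a and b) else 1


def not_(a):
    return nand(a, a)


def and_(a, b):
    return not_(nand(a, b))


def or_(a, b):
    return nand(not_(a), not_(b))


def xor(a, b):
    return and_(or_(a, b), nand(a, b))


def half_adder(a, b):
    """Add two bits; returns (sum, carry)."""
    return xor(a, b), and_(a, b)


def full_adder(a, b, carry_in):
    """Add two bits plus an incoming carry; returns (sum, carry_out)."""
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, or_(c1, c2)


def add_4bit(x, y):
    """Chain four full adders (a ripple-carry adder); returns (4-bit result, carry_out)."""
    carry = 0
    result = 0
    for i in range(4):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry


if __name__ == "__main__":
    print(add_4bit(5, 6))
```

Notice that `add_4bit` never looks inside `nand`. Each function only uses the one below it, which is the same modularity idea as in the stack.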
23. Microarchitecture
The 8051 Microcontroller Architecture
The 8086 Microprocessor Architecture
In computer engineering, microarchitecture, also called computer organization and sometimes abbreviated as µarch
or uarch, is the way a given instruction set architecture (ISA) is implemented in a particular processor.
25. Instruction Set Architecture
• From here on, it is mostly software. This is the power of logical abstraction.
Everything from here up is just a concept being executed by the lower layers.
• As you can see above, some aspects of “programming” or “writing code” start to
appear here.
27. Machine and Assembly Code
• Machine Code is something that the
machines will understand. Or something
that is “hardware friendly”.
• Assembly code is the lowest layer of “code
writing” that a programmer can do. Below
this layer, everything is just 1’s and 0’s. It is not
preferred for everyday programming.
• These two parts are connected by
something called an assembler, which
converts assembly language into machine
code.
Note: Your computers, running on the x86 architecture, are programmed in x86 assembly language.
But your phones run on the ARM architecture, hence they are programmed in ARM assembly.
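Here is a minimal sketch of what an assembler does, assuming a made-up toy instruction set (not x86 or ARM). Every instruction is one byte: a 4-bit opcode in the upper half and a 4-bit operand in the lower half. The mnemonics and opcode values in `OPCODES` are hypothetical and chosen only for illustration.

```python
# Hypothetical toy ISA: mnemonic -> 4-bit opcode
OPCODES = {"LDA": 0x1, "ADD": 0x2, "OUT": 0xE, "HLT": 0xF}


def assemble(source):
    """Translate toy assembly text into a list of machine-code bytes."""
    machine_code = []
    for line in source.splitlines():
        line = line.split(";")[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue
        parts = line.split()
        mnemonic = parts[0].upper()
        operand = int(parts[1]) if len(parts) > 1 else 0
        if mnemonic not in OPCODES:
            raise ValueError("unknown instruction: " + mnemonic)
        if not 0 <= operand <= 0xF:
            raise ValueError("operand must fit in 4 bits")
        machine_code.append((OPCODES[mnemonic] << 4) | operand)
    return machine_code


if __name__ == "__main__":
    program = """
        LDA 14   ; load the value stored at address 14
        ADD 15   ; add the value stored at address 15
        OUT      ; show the result
        HLT      ; stop
    """
    print([format(byte, "08b") for byte in assemble(program)])
```

The output is just 1’s and 0’s. The human-readable mnemonics exist only on the assembly side.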
29. Programming Language
• From here on, nobody actually has
to know how the architecture
works….They are above it all.
• Here your languages like C, C++,
Java, Python, Go, Rust… and all
comes along.
• These are categorized into
compiled and interpreted
languages.
• From dead languages to popular
ones, we have created
around 9000 programming
languages in total!
31. Algorithms and Applications
• Here, the domain of theoretical computer science and computational mathematics
comes in.
• A software application is defined in terms of algorithms and APIs (Application
Programming Interfaces).
• We embedded and electronics engineers do not usually contribute much to this layer.
32. There’s the end to our stack journey!
• Was it long and complicated?
• Yes.
• But this is what runs in the real world. Each layer has been under decades of
development and without that, we wouldn’t have the knowledge or the skills to build
and test new architectures and platforms.
• Each layer is as essential as the layers above and below it. So yeah…now you
know the hardware-to-software path!
34. Basics of a Computer: Von Neumann Architecture
• Remember the guy I asked y’all to google
earlier? This guy was the first one to propose
this kind of architecture.
• A computer consists of a basic compute unit or
CPU, a data unit (memory) and input and output
devices.
• A simple example of this is the Von
Neumann Architecture, in which instructions and data share the same memory.
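The sketch below shows the idea of a stored-program computer under some loud assumptions: a made-up 16-byte machine where each instruction is one byte (upper 4 bits opcode, lower 4 bits address), with hypothetical opcodes 0x1 = load, 0x2 = add, 0xE = output and 0xF = halt. The point is only that the program and its data sit in the same memory, and the CPU repeats fetch, decode, execute.

```python
def run(memory):
    """Run a toy program. `memory` holds BOTH instructions and data."""
    program_counter = 0
    accumulator = 0
    outputs = []
    while True:
        instruction = memory[program_counter]      # fetch
        program_counter += 1
        opcode = instruction >> 4                   # decode
        address = instruction & 0xF
        if opcode == 0x1:                           # execute: load
            accumulator = memory[address]
        elif opcode == 0x2:                         # execute: add (8-bit wraparound)
            accumulator = (accumulator + memory[address]) & 0xFF
        elif opcode == 0xE:                         # execute: output
            outputs.append(accumulator)
        elif opcode == 0xF:                         # execute: halt
            return outputs
        else:
            raise ValueError("unknown opcode")


if __name__ == "__main__":
    # Addresses 0-3 hold the program, addresses 14 and 15 hold the data.
    memory = [0x1E, 0x2F, 0xE0, 0xF0] + [0] * 10 + [28, 14]
    print(run(memory))
```

In a Harvard-style machine, the only change to this sketch would be fetching instructions from one list and reading data from a second, separate list.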
35. Basics of a Computer: Harvard Architecture
• Consists of the same structure as the Von
Neumann architecture, with one change: the
instruction memory and data memory are kept
separate.
• Instructions are executed on the CPU, which uses
the data from the data memory.
• A simple intuitive example: do not keep the
instructions for preparing tea and the ingredients in
the same place!
36. Other Important Structures: RAM and ROM
• Apart from the CPU, I/O and memory, there are still other important
structures.
• Memory can be divided into RAM and ROM. RAM is used for non-persistent,
short-term storage that the CPU can access quickly during computation.
• ROM is non-volatile: it keeps its contents without power, so firmware and other long-lived data are stored there.
38. Other Important Structures: Clocks
• Another important point is the clock. Without the clock, the device doesn’t
run. In general, the clock refers to a microchip that regulates the timing and
speed of all computer functions. In the chip is a crystal that vibrates at a
specific frequency when electricity is applied. The shortest time in which a
computer can perform an operation is one clock cycle, or one vibration of the clock
chip.
40. System on Chips
• When you integrate all components into a single
chip, you get a System on Chip (SoC). In fact, the main chip in your
phone is an SoC.
• A modern mobile phone SoC (2019) may contain
more than 7 billion transistors.
• It will integrate:
• Multiple processor cores
• A GPU
• A large number of specialized accelerators
• Large amounts of on-chip memory
• High bandwidth interfaces to off-chip memory
[Figure: A high-level block diagram of a mobile phone SoC, showing a GPU, a Neural Processor Unit (NPU), 4 “big” cores, 4 “small” cores, L3 cache memory, other accelerators and four memory interfaces]
41. Performance Metrics
• So, we discussed a lot of abstractions involved in building a computer. But how do we
measure if they are good or not?
• Over time, technology scaling provided much greater numbers of faster and lower power
transistors.
• The “iron law” of processor performance:
Time = instructions executed x clocks per instruction (CPI) x clock period
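• Clocks per instruction (CPI): the average number of clock cycles each instruction takes.
• Instructions per cycle (IPC): the inverse of CPI, i.e., IPC = 1 / CPI.
• A high-frequency design with a good CPI is much harder to build, and it costs transistors (area) and
power.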
• Performance improvements are results of reduction in instruction count, better compiler
optimizations, and improvements in IPC.
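The iron law is easy to try out with numbers. This is a small sketch with made-up workloads; the function names and the example figures (one billion instructions, a 1 GHz clock) are assumptions for illustration, not measurements of any real processor.

```python
def execution_time(instruction_count, cpi, clock_hz):
    """Iron law: time = instructions executed x CPI x clock period."""
    clock_period = 1.0 / clock_hz  # seconds per clock cycle
    return instruction_count * cpi * clock_period


def ipc(cpi):
    """Instructions per cycle is the inverse of clocks per instruction."""
    return 1.0 / cpi


if __name__ == "__main__":
    baseline = execution_time(1_000_000_000, cpi=2.0, clock_hz=1_000_000_000)
    better_cpi = execution_time(1_000_000_000, cpi=1.0, clock_hz=1_000_000_000)
    fewer_instructions = execution_time(500_000_000, cpi=2.0, clock_hz=1_000_000_000)
    print("baseline:", baseline, "s")
    print("halved CPI:", better_cpi, "s")
    print("halved instruction count:", fewer_instructions, "s")
```

Halving the CPI and halving the instruction count give the same speedup here, which is why both the hardware designer and the compiler matter.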
42. The Smartphone
• A single smartphone will contain many different processor cores.
• All cores here are ARM processors.
[Figure: block diagram of a smartphone’s subsystems (apps processor, 2G/3G/4G/5G, Wi-Fi, Bluetooth, GPS, flash controller, power management, sensor hub, camera, touchscreen & sensor hub), each labelled with the Cortex-A, Cortex-R and/or Cortex-M cores it contains]
43. Memory Hierarchy
• Up until now, we have assumed that the
memory is one block of memory cells, which
take up a large area.
• But do all processes ( instruction for a certain
program) use all the memory cells?
• No, and reaching into one large block for every access is slow, so there’s a lot of latency there. In this
image, you can see that the top 3 processes
consume 60% of the memory.
• So we combine memories with different access
latencies to balance memory cost and speed.
44. Memory Hierarchy
• Example:
• Let’s look at my workplace. If I wanted to
access my rough working book, I would
store it on my desk rather than my shelf.
• If I wanted to read a novel, I would keep it
on the shelf rather than the desk.
• Last month’s newspaper, I would rather
keep it in the cabinet than the desk or the
shelf, as once I am done with it, I probably
won’t refer to it again.
• Hence I’m creating a hierarchy of memory
for faster reaching.
[Figure: desk, shelf, cabinet]
45. Memory Hierarchy
• There are other concepts called temporal and
spatial locality: recently used data, and data stored near it, are likely to be used again soon. Based on that, some clever
people came up with a hierarchy of memories.
• The entire structure is called the memory hierarchy.
On top are the fastest but smallest memory
cells. At the bottom are the slowest but largest
cells.
• The L1 cache is a small memory (8-64 KB)
composed of SRAM cells
• The L2 cache is larger and slower (128 KB – 4
MB) (SRAM cells)
• The main memory is even larger (1 – 64 GB)
(DRAM cells)
[Figure: L1 cache, L2 cache, Main memory]
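To see why a small, fast memory in front of a big, slow one helps, here is a toy cache simulation. It is a sketch under stated assumptions: the cache holds 4 lines of 4 addresses each and evicts the least recently used line (LRU). Real caches differ in size and policy, and the function name is made up.

```python
from collections import OrderedDict


def hit_rate(addresses, cache_lines=4, line_size=4):
    """Fraction of accesses served by a tiny LRU cache."""
    cache = OrderedDict()  # keys are line numbers, oldest first
    hits = 0
    for address in addresses:
        line = address // line_size  # neighbouring addresses share one line
        if line in cache:
            hits += 1
            cache.move_to_end(line)  # mark as most recently used
        else:
            cache[line] = True       # bring the line in from slower memory
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return hits / len(addresses)


if __name__ == "__main__":
    print("walking through neighbours:", hit_rate(list(range(16))))
    print("reusing one address:", hit_rate([0] * 10))
    print("cycling over too many lines:", hit_rate([0, 4, 8, 12, 16] * 4))
```

The first pattern benefits from spatial locality and the second from temporal locality, while the third cycles over five lines in a four-line cache and never hits. It is the desk, shelf and cabinet idea in code.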
47. Pipelining
• Another awesome concept is called pipelining. It’s used to increase the speed of
execution.
• Let’s take a car analogy. Let’s assume that to produce a car, you need to do
three things:
1. Chassis Production
2. Engine Fitting and Testing
3. Paint Job
• These stages can work on different cars at the same time, which is the main advantage.
48. Pipelining: Example
[Figure: two timing diagrams showing the order of manufacturing (Car A, B, and then C) against time across the Chassis, Engine and Paint stages]
49. Pipelining
• Similar to car production, instruction handling can also be overlapped.
• In a classic processor design, each instruction passes through 5 stages: fetch, decode, execute, memory access and write-back.
• In a pipelined system, the task to be performed is divided into a series of discrete
stages.
• A key feature of pipelining is that it increases the throughput of the system, that is,
the number of instructions executed per unit time, but it may also slightly increase
the latency of each individual instruction.
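The car analogy can be counted out in code. This sketch assumes an ideal pipeline where every stage takes exactly one time step and nothing ever stalls; real pipelines are messier. The function names are made up for illustration.

```python
def sequential_steps(item_count, stage_count):
    """Without pipelining, each item finishes every stage before the next one starts."""
    return item_count * stage_count


def pipelined_steps(item_count, stage_count):
    """With an ideal pipeline, a new item enters every step once the pipe is full."""
    if item_count == 0:
        return 0
    return stage_count + (item_count - 1)


def pipeline_schedule(items, stages):
    """Map each time step to the (item, stage) pairs that are active in it."""
    schedule = {}
    for item_index, item in enumerate(items):
        for stage_index, stage in enumerate(stages):
            schedule.setdefault(item_index + stage_index, []).append((item, stage))
    return schedule


if __name__ == "__main__":
    cars = ["A", "B", "C"]
    stages = ["Chassis", "Engine", "Paint"]
    print("sequential:", sequential_steps(len(cars), len(stages)), "steps")
    print("pipelined:", pipelined_steps(len(cars), len(stages)), "steps")
    for step, work in sorted(pipeline_schedule(cars, stages).items()):
        print("step", step, work)
```

Each car still takes three steps from start to finish, so latency does not improve. What improves is throughput: once the pipeline is full, one car is completed every step.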
51. RTOS
• A Real-Time Operating System (RTOS) is an operating system designed specifically
to support real-time operations.
• In order to be accepted as an RTOS, it must:
• Have a predictable response time
• Be deterministic
• Real-time operating systems are typically designed for and used with embedded
systems.
52. RTOS allows Multi-tasking
An RTOS is software that manages the time and resources of a CPU
Application is split into multiple tasks
The RTOS’s job is to run the most important task that is ready-to-run
On a single CPU, only one task executes at any given time
[Figure: the RTOS (code) selects the highest-priority task from the tasks that are ready to run (Task A, Task B, Task C … Task N, each with its own code, data and stack, ordered from high priority to low priority) and runs it on the CPU+FPU+MPU (8, 16, 32 or 64-bit); events are signals/messages from tasks or ISRs]
53. Task States
• Tasks have 3 states during their
lifespan:
• Running
• Ready (possibly: suspended,
pending)
• Blocked (possibly: waiting,
dormant, delayed)
• A task blocks itself, and it is moved back to the Ready state (and on to Running once it is the highest-priority ready task) when another task or an interrupt
signals that the blocking condition
has been removed/satisfied.
54. Context switching
Scheduler: schedules/shuffles tasks between the Running and Ready states
When a task with a higher priority than the ‘running’ task is unblocked, the scheduler ‘switches’ context immediately
(in all pre-emptive RTOSs)
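Here is a toy model of a pre-emptive, priority-based scheduler. It is only a sketch: the class and method names are made up, a higher number is assumed to mean a higher priority, and a real RTOS would also save and restore CPU registers and the stack on each context switch.

```python
from dataclasses import dataclass

READY, RUNNING, BLOCKED = "ready", "running", "blocked"


@dataclass
class Task:
    name: str
    priority: int          # assumption: higher number = higher priority
    state: str = READY


class Scheduler:
    def __init__(self, tasks):
        self.tasks = list(tasks)
        self.running = None

    def schedule(self):
        """Pick the highest-priority task that is ready to run."""
        candidates = [t for t in self.tasks if t.state in (READY, RUNNING)]
        if not candidates:
            self.running = None
            return None
        best = max(candidates, key=lambda t: t.priority)
        if best is not self.running:
            # Context switch: the old task (if still runnable) goes back to Ready.
            if self.running is not None and self.running.state == RUNNING:
                self.running.state = READY
            best.state = RUNNING
            self.running = best
        return best

    def block(self, task):
        """The task waits for an event, so something else may run."""
        task.state = BLOCKED
        return self.schedule()

    def unblock(self, task):
        """The event arrived; pre-empt the running task if this one is more important."""
        task.state = READY
        return self.schedule()


if __name__ == "__main__":
    logger = Task("logger", priority=1)
    motor = Task("motor_control", priority=5)
    scheduler = Scheduler([logger, motor])
    print(scheduler.schedule().name)      # highest priority runs first
    print(scheduler.block(motor).name)    # it waits, so the low-priority task runs
    print(scheduler.unblock(motor).name)  # it is ready again and pre-empts
```

On a single CPU only one task is Running at a time, and the low-priority task runs only while the high-priority one is blocked.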
55. Benefits of Using an RTOS
• Allows us to split and prioritize the application code
• The RTOS always runs the highest-priority task that is ready
• Adding low-priority tasks doesn’t affect the responsiveness of high-priority tasks
• Tasks wait for events and avoid polling
• RTOSs make it easy to add middleware components:
• TCP/IP stack
• USB stacks
• File system
• Graphical User Interface (GUI)
56. Multicore Processors
• Eventually, it made sense to shift from
single-core to multicore designs.
• From ~2005, multicore designs became
mainstream.
• The number of cores on a single chip
increased over time.
• Clock frequencies increased more
slowly.
• Individual cores were designed to be as
power efficient as possible.
• e.g., 4 x Arm Cortex-A72 processors, each with their own L1 caches and a
shared L2 cache
57. Conclusion
• Hope you guys had a good time!
• These were some heavy and new concepts that we discussed today!
• If you guys have any other doubts or need any assistance in regards to this topic you
can ping us!
• We’ll be happy to help! And yeah….remember to remember these things!
• Sanath N U: sanathn.ec19@bmsce.ac.in
• Dheraj S K: dherajsk.ec19@bmsce.ac.in
58. References
• Wikipedia Readings:
• Microarchitecture , Instruction Set Architecture, Registers
• CISC and RISC Architecture: CISC, RISC
• Onur Mutlu Computer Architecture Lectures link
• Basic Computer Architecture
https://www.cse.iitd.ac.in/~srsarangi/archbooksoft.html
• Computer Architecture: A Quantitative Approach, 6th Edition
• Learning Tips: Forgetting Curve, Attention Span