COMPUTER
ARCHITECTURE
Oversimplified by Arki-Tehcs
Prabhanshu Katiyar- 190050088
Sibasis Nayak - 190050115
Gurnoor Singh - 190050045
Paarth Jain - 190050076
Sahasra Ranjan - 190050102
A for Amdahl’s Law
What they teach: Oversimplified:
In computer architecture, Amdahl's law (or Amdahl's argument) is a formula which gives
the theoretical speedup in latency of the execution of a task at fixed workload that can
be expected of a system whose resources are improved. It is named after computer
scientist Gene Amdahl, and was presented at the AFIPS Spring Joint Computer
Conference in 1967. Amdahl's law is often used in parallel computing to predict the
theoretical speedup when using multiple processors. For example, if a program needs 20
hours to complete using a single thread, but a one-hour portion of the program cannot be
parallelized, so that only the remaining 19 hours (p = 0.95) of execution time can be
parallelized, then regardless of how many threads are devoted to a parallelized execution
of this program, the minimum execution time cannot be less than one hour. Hence, the
theoretical speedup is limited to at most 20 times the single thread performance. Amdahl's
law is often conflated with the law of diminishing returns, whereas only a special case of
applying Amdahl's law demonstrates the law of diminishing returns. If one picks optimally (in
terms of the achieved speedup) what is to be improved, then one will see monotonically
decreasing improvements as one improves. If, however, one picks non-optimally, after
improving a sub-optimal component and moving on to improve a more optimal
component, one can see an increase in the return. Note that it is often rational to improve
a system in an order that is "non-optimal" in this sense, given that some improvements are
more difficult or require larger development time than others. Amdahl's law does represent
the law of diminishing returns when considering what sort of return one gets by adding more
processors to a machine, assuming one is running a fixed-size computation that will use all
available processors to their capacity. Each new processor added to the system will add
less usable power than the previous one. Each time one doubles the number of processors
the speedup ratio will diminish, as the total throughput heads toward the limit of 1/(1 − p).
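To make the formula behind all this explicit, here is a minimal sketch (the usual statement of the law, with parallel fraction p and improvement factor s; the numbers below just replay the 20-hour example):

# Amdahl's law: overall speedup when a fraction p of the work is sped up by a factor s.
def amdahl_speedup(p, s):
    return 1.0 / ((1.0 - p) + p / s)

p = 0.95                       # 19 of the 20 hours can be parallelized
for s in (2, 8, 64, 1_000_000):
    print(s, round(amdahl_speedup(p, s), 2))
# As s grows without bound the speedup approaches 1 / (1 - p) = 20:
# the run can never finish faster than the serial one-hour portion.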
B for Branch Predictors
What they teach: Oversimplified:
In computer architecture, a branch predictor is a digital circuit that tries to guess which way a branch (e.g., an if–then–
else structure) will go before this is known definitively. The purpose of the branch predictor is to improve the flow in the
instruction pipeline. Branch predictors play a critical role in achieving high effective performance in many modern
pipelined microprocessor architectures such as x86. Two-way branching is usually implemented with a conditional jump
instruction. A conditional jump can either be "not taken" and continue execution with the first branch of code which
follows immediately after the conditional jump, or it can be "taken" and jump to a different place in program memory
where the second branch of code is stored. It is not known for certain whether a conditional jump will be taken or not
taken until the condition has been calculated and the conditional jump has passed the execution stage in the
instruction pipeline. Without branch prediction, the processor would have to wait until the conditional jump instruction
has passed the execute stage before the next instruction can enter the fetch stage in the pipeline. The branch predictor
attempts to avoid this waste of time by trying to guess whether the conditional jump is most likely to be taken or not
taken. The branch that is guessed to be the most likely is then fetched and speculatively executed. If it is later detected
that the guess was wrong, then the speculatively executed or partially executed instructions are discarded and the
pipeline starts over with the correct branch, incurring a delay. The time that is wasted in case of a branch misprediction
is equal to the number of stages in the pipeline from the fetch stage to the execute stage. Modern microprocessors
tend to have quite long pipelines so that the misprediction delay is between 10 and 20 clock cycles. As a result, making
a pipeline longer increases the need for a more advanced branch predictor. The first time a conditional jump
instruction is encountered, there is not much information to base a prediction on. But the branch predictor keeps
records of whether branches are taken or not taken. When it encounters a conditional jump that has been seen several
times before, then it can base the prediction on the history. The branch predictor may, for example, recognize that the
conditional jump is taken more often than not, or that it is taken every second time. Static prediction is the simplest
branch prediction technique because it does not rely on information about the dynamic history of code executing.
Instead, it predicts the outcome of a branch based solely on the branch instruction. The early implementations of
SPARC and MIPS (two of the first commercial RISC architectures) used single-direction static branch prediction: they
always predict that a conditional jump will not be taken, so they always fetch the next sequential instruction. Only when
the branch or jump is evaluated and found to be taken, does the instruction pointer get set to a non-sequential
address. Both CPUs evaluate branches in the decode stage and have a single cycle instruction fetch. As a result, the
branch target recurrence is two cycles long, and the machine always fetches the instruction immediately after any
taken branch. Both architectures define branch delay slots in order to utilize these fetched instructions. A more
advanced form of static prediction presumes that backward branches will be taken and that forward branches will not.
A backward branch is one that has a target address that is lower than its own address. This technique can help with
prediction accuracy of loops, which are usually backward-pointing branches, and are taken more often than not
taken. Some processors allow branch prediction hints to be inserted into the code to tell whether the static prediction
should be taken or not taken. The Intel Pentium 4 accepts branch prediction hints, but this feature was abandoned in
later Intel processors. Static prediction is used as a fall-back technique in some processors with dynamic branch
prediction when dynamic predictors do not have sufficient information to use. Both the Motorola MPC7450 and the Intel
Pentium 4 use this technique as a fall-back.
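As a rough illustration of the "keep records of whether branches were taken" idea, here is a sketch of a generic two-bit saturating-counter predictor (the table size, hashing, and initial state are assumptions for the example, not any particular CPU's design):

# Two-bit saturating-counter branch predictor.  Counter states 0-1 predict
# "not taken", states 2-3 predict "taken"; the counter moves one step per outcome.
class TwoBitPredictor:
    def __init__(self, entries=1024):
        self.table = [1] * entries               # start weakly not-taken
        self.mask = entries - 1

    def predict(self, pc):
        return self.table[pc & self.mask] >= 2   # True means "predict taken"

    def update(self, pc, taken):
        i = pc & self.mask
        self.table[i] = min(3, self.table[i] + 1) if taken else max(0, self.table[i] - 1)

bp = TwoBitPredictor()
outcomes = [True] * 9 + [False]                  # a loop branch: taken 9 times, then exits
correct = 0
for taken in outcomes:
    correct += (bp.predict(0x400) == taken)
    bp.update(0x400, taken)
print(correct, "of", len(outcomes), "predicted correctly")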
C for Caches
What they teach: Oversimplified:
A CPU cache is a hardware cache used by the central processing unit (CPU) of a computer to
reduce the average cost (time or energy) to access data from the main memory. A cache is a
smaller, faster memory, located closer to a processor core, which stores copies of the data
from frequently used main memory locations. Most CPUs have a hierarchy of multiple cache
levels (L1, L2, often L3, and rarely even L4), with separate instruction-specific and data-specific
caches at level 1. Other types of caches exist (that are not counted towards the "cache size" of
the most important caches mentioned above), such as the translation lookaside buffer (TLB)
which is part of the memory management unit (MMU) which most CPUs have. Most modern
desktop and server CPUs have at least three independent caches: an instruction cache to
speed up executable instruction fetch, a data cache to speed up data fetch and store, and a
translation lookaside buffer (TLB) used to speed up virtual-to-physical address translation for
both executable instructions and data. A single TLB can be provided for access to both
instructions and data, or a separate Instruction TLB (ITLB) and data TLB (DTLB) can be provided.
The data cache is usually organized as a hierarchy of more cache levels (L1, L2, etc.; see also
multi-level caches below). However, the TLB cache is part of the memory management unit
(MMU) and not directly related to the CPU caches. Data is transferred between memory and
cache in blocks of fixed size, called cache lines or cache blocks. When a cache line is copied
from memory into the cache, a cache entry is created. The cache entry will include the
copied data as well as the requested memory location (called a tag). When the processor
needs to read or write a location in memory, it first checks for a corresponding entry in the
cache. The cache checks for the contents of the requested memory location in any cache
lines that might contain that address. If the processor finds that the memory location is in the
cache, a cache hit has occurred. However, if the processor does not find the memory location
in the cache, a cache miss has occurred. In the case of a cache hit, the processor
immediately reads or writes the data in the cache line. For a cache miss, the cache allocates a
new entry and copies data from main memory, then the request is fulfilled from the contents of
the cache.
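A toy model of the read path just described, assuming a 64-byte line and ignoring eviction and writes entirely:

# On a hit, serve the byte from the cached line; on a miss, allocate an entry
# and copy the whole line in from "main memory" before serving the request.
LINE_SIZE = 64                                   # bytes per cache line (arbitrary choice)
cache = {}                                       # tag -> copy of that line's data
memory = bytearray(1 << 20)                      # pretend main memory

def read_byte(addr):
    tag = addr // LINE_SIZE                      # which line this address falls in
    if tag not in cache:                         # cache miss: fill from memory
        base = tag * LINE_SIZE
        cache[tag] = bytes(memory[base:base + LINE_SIZE])
    return cache[tag][addr % LINE_SIZE]          # cache hit (possibly just created)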
D for Direct Mapped Cache
What they teach: Oversimplified:
In this cache organization, each location in
main memory can go in only one entry in the
cache. Therefore, a direct-mapped cache can
also be called a "one-way set associative"
cache. It does not have a placement policy as
such, since there is no choice of which cache
entry's contents to evict. This means that if two
locations map to the same entry, they may
continually knock each other out. Although
simpler, a direct-mapped cache needs to be
much larger than an associative one to give
comparable performance, and it is more
unpredictable. Let x be the block number in the
cache, y be the block number in memory, and n
be the number of blocks in the cache; then the
mapping is given by x = y mod n.
◦ Each address has a fixed line it can
belong to according to its index,
basically (see the sketch below):
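A small sketch of the x = y mod n mapping with made-up sizes (64-byte blocks, 256 lines):

# Direct-mapped placement: cache line index x = (memory block number y) mod n.
BLOCK_SIZE = 64
NUM_BLOCKS = 256

def direct_mapped_slot(addr):
    y = addr // BLOCK_SIZE            # memory block number
    x = y % NUM_BLOCKS                # the one cache line it may occupy (the "index")
    tag = y // NUM_BLOCKS             # stored to tell conflicting addresses apart
    return x, tag

# Two addresses exactly NUM_BLOCKS * BLOCK_SIZE bytes apart land on the same
# line (same index, different tag) and will keep knocking each other out:
print(direct_mapped_slot(0x0000), direct_mapped_slot(0x4000))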
E for Empirical Evaluation
What they teach: Oversimplified:
We look for two major values: latency and bandwidth. Latency is the
time for each instruction and bandwidth is the number of instructions
per unit time. In general, it is hard to improve on latency because the
speed of light delay cannot be reduced, or you can say “You cannot
bribe god”. On the other hand, bandwidth, also known as throughput
can be improved by spending more money. Amdahl’s law as taught
before is one way to measure improvement achieved by making
certain changes. Another way is benchmarks. Benchmarks are a set of
instructions which are used to stress test the CPU and measure its
performance. Various benchmarks are available, such as SPEC,
CloudSuite, and PARSEC. Each benchmark is different and suited for
different goals. A major issue with benchmarks is that they may be
outdated and are often not good representatives. For example, a CPU
designed to perform well on memory instructions at the cost of poor
performance on arithmetic will perform terribly on a benchmark
containing mostly arithmetic instructions. Also, a CPU might perform well
on one app but poorly on another. In such cases, the arithmetic mean (AM)
of execution times is a bad idea as it leads to contradictory comparisons;
the harmonic mean (HM) or geometric mean (GM) usually works well. Power
consumption and carbon emissions are also key parameters to keep in mind
while evaluating/rating a CPU.
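A small illustration, with made-up run times, of why the arithmetic mean of raw times misleads while the geometric mean of speedup ratios behaves sensibly:

# Two hypothetical machines on two programs, compared against a reference machine.
from statistics import geometric_mean

ref   = {"progA": 100, "progB": 100}     # reference run times in seconds
cpu_x = {"progA":  20, "progB": 400}
cpu_y = {"progA": 200, "progB":  40}

for name, times in (("X", cpu_x), ("Y", cpu_y)):
    ratios = [ref[p] / times[p] for p in ref]            # per-program speedups
    print(name, "mean time:", sum(times.values()) / 2,
          "GM of speedups:", round(geometric_mean(ratios), 2))
# Both machines are symmetric cases and get the same GM score (~1.12), while a
# plain average of raw times is dominated by whichever program happens to be slowest.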
F for Fully Associative Cache
What they teach: Oversimplified:
A fully associative cache contains a single set
with B ways, where B is the number of blocks. A
memory address can map to a block in any of
these ways. A fully associative cache is another
name for a B-way set associative cache with
one set. A fully associative cache permits data
to be stored in any cache block, instead of
forcing each memory address into one
particular block. When data is fetched from
memory, it can be placed in any unused block
of the cache. This way we’ll never have a
conflict between two or more memory
addresses which map to a single cache block.
If all the blocks are already in use, it’s usually
best to replace the least recently used one,
assuming that if it hasn't been used in a while, it
won’t be needed again anytime soon.
◦ No concept of indices, entire cache
belongs to everyone. Blocks be like:
G for Good Job so Far
What they think I mean: What I really mean:
◦ We have learnt so many concepts so
far in a very simple way. We surely
deserve a break on this slide, pat our
backs for making it this far in this
Computer Architecture crash course
and prepare for the upcoming topics.
I don't even know why you are still
reading this; you were supposed to
move to the next slide right away
because who even stops to read
unimportant long paragraphs
I couldn’t find a suitable concept for
the letter G, so let's do something
different here. This is a *different* kinda
assignment anyway.
H for Hazards
What they teach: Oversimplified:
In the domain of central processing unit (CPU) design, hazards
are problems with the instruction pipeline in CPU
microarchitectures when the next instruction cannot execute in
the following clock cycle,[1] and can potentially lead to
incorrect computation results. Three common types of hazards
are data hazards, structural hazards, and control hazards
(branching hazards).[2] There are several methods used to deal
with hazards, including pipeline stalls/pipeline bubbling, operand
forwarding, and in the case of out-of-order execution, the
scoreboarding method and the Tomasulo algorithm. Data
hazards occur when instructions that exhibit data dependence
modify data in different stages of a pipeline. Ignoring potential
data hazards can result in race conditions (also termed race
hazards). There are three situations in which a data hazard can
occur: RAW, WAW, WAR. A structural hazard occurs when two (or
more) instructions that are already in the pipeline need the same
resource. The result is that the instructions must be executed in series
rather than in parallel for a portion of the pipeline. Structural hazards
are sometimes referred to as resource hazards. A control hazard
occurs when the pipeline makes wrong decisions on branch
prediction and therefore brings instructions into the pipeline that
must subsequently be discarded. The term branch hazard also
refers to a control hazard.
◦ Control Hazards
◦ Structural Hazards
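A tiny sketch of how the three data-hazard kinds can be told apart from the register sets each instruction reads and writes (the example instruction pair is invented):

# Classify the data hazard between an earlier and a later instruction.
def data_hazards(first_reads, first_writes, second_reads, second_writes):
    hazards = []
    if first_writes & second_reads:
        hazards.append("RAW")          # read after write: true dependence
    if first_reads & second_writes:
        hazards.append("WAR")          # write after read: anti-dependence
    if first_writes & second_writes:
        hazards.append("WAW")          # write after write: output dependence
    return hazards

# add r1, r2, r3   followed by   sub r4, r1, r5   ->   RAW hazard on r1
print(data_hazards({"r2", "r3"}, {"r1"}, {"r1", "r5"}, {"r4"}))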
I for Interrupts
What they teach: Oversimplified:
In digital computers, an interrupt is a response by the processor to an event that needs attention from
the software. An interrupt condition alerts the processor and serves as a request for the processor to
interrupt the currently executing code when permitted, so that the event can be processed in a timely
manner. If the request is accepted, the processor responds by suspending its current activities, saving
its state, and executing a function called an interrupt handler (or an interrupt service routine, ISR) to
deal with the event. This interruption is temporary, and, unless the interrupt indicates a fatal error, the
processor resumes normal activities after the interrupt handler finishes. Interrupts are commonly used
by hardware devices to indicate electronic or physical state changes that require attention. Interrupts
are also commonly used to implement computer multitasking, especially in real-time computing.
Systems that use interrupts in these ways are said to be interrupt-driven. Interrupt signals may be issued
in response to hardware or software events. These are classified as hardware interrupts or software
interrupts, respectively. For any particular processor, the number of interrupt types is limited by the
architecture. A hardware interrupt is a condition related to the state of the hardware that may be
signaled by an external hardware device, e.g., an interrupt request (IRQ) line on a PC, or detected by
devices embedded in processor logic (e.g., the CPU timer in IBM System/370), to communicate that
the device needs attention from the operating system (OS)[3] or, if there is no OS, from the "bare-
metal" program running on the CPU. Such external devices may be part of the computer (e.g., disk
controller) or they may be external peripherals. For example, pressing a keyboard key or moving a
mouse plugged into a PS/2 port triggers hardware interrupts that cause the processor to read the
keystroke or mouse position. Hardware interrupts can arrive asynchronously with respect to the
processor clock, and at any time during instruction execution. Consequently, all hardware interrupt
signals are conditioned by synchronizing them to the processor clock, and acted upon only at
instruction execution boundaries. A software interrupt is requested by the processor itself upon
executing particular instructions or when certain conditions are met. Every software interrupt signal is
associated with a particular interrupt handler. A software interrupt may be intentionally caused by
executing a special instruction which, by design, invokes an interrupt when executed. Such instructions
function similarly to subroutine calls and are used for a variety of purposes, such as requesting
operating system services and interacting with device drivers (e.g., to read or write storage media).
Software interrupts may also be unexpectedly triggered by program execution errors. These interrupts
typically are called traps or exceptions. For example, a divide-by-zero exception will be "thrown" (a
software interrupt is requested) if the processor executes a divide instruction with divisor equal to zero.
Typically, the operating system will catch and handle this exception.
J for Jump Instructions
What they teach: Oversimplified:
In a CPU, the general flow of control is that an
instruction is executed and the PC
automatically moves to the next instruction in
the code. The jump instruction, however,
breaks this standard behavior and allows the
PC to jump to a specified location (within a
maximum distance from the current PC). The utility of this
instruction is that it allows function calls. Some
variants of it like jump and link are usually used
for function calls as the PC needs to return to
the main function after completing the
function call. The jump instruction usually takes
one parameter, which is the offset from the
current PC. Therefore, the new PC is given by
PC = PC + offset. The offset is usually restricted
to some maximum value as the entire
instruction needs to fit in 32 or 64 bits.
Literally, that’s all:
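A sketch of the PC = PC + offset idea; the signed 16-bit offset field is an assumed width for illustration:

# PC-relative jump: the new PC is PC + offset, and the offset must fit in the
# instruction's offset field (here assumed to be 16 signed bits).
OFFSET_BITS = 16
MAX_OFF = (1 << (OFFSET_BITS - 1)) - 1
MIN_OFF = -(1 << (OFFSET_BITS - 1))

def jump(pc, offset):
    if not (MIN_OFF <= offset <= MAX_OFF):
        raise ValueError("target is too far for the offset field")
    return pc + offset

print(hex(jump(0x1000, 0x40)))       # forward jump
print(hex(jump(0x1000, -0x20)))      # backward jump, e.g. the top of a loop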
K for Kernel Mode of CPU
What they teach: Oversimplified:
The system starts in kernel mode when it boots and after the
operating system is loaded, it executes applications in user
mode. There are some privileged instructions that can only be
executed in kernel mode. These are interrupt instructions, input
output management etc. If the privileged instructions are
executed in user mode, it is illegal and a trap is generated. The
mode bit is set to 0 in the kernel mode. It is changed from 0 to 1
when switching from kernel mode to user mode. In kernel mode,
the CPU may perform any operation allowed by its architecture;
any instruction may be executed, any I/O operation initiated,
any area of memory accessed, and so on. In the other CPU
modes, certain restrictions on CPU operations are enforced by
the hardware. Typically, certain instructions are not permitted
(especially those—including I/O operations—that could alter the
global state of the machine), some memory areas cannot be
accessed, etc. User-mode capabilities of the CPU are typically a
subset of those available in kernel mode, but in some cases, such
as hardware emulation of non-native architectures, they may be
significantly different from those available in standard kernel
mode.
Now CPU be like: You dare oppose me mortal
L for LRU Policy
What they teach: Oversimplified:
In computing, cache algorithms (also frequently called cache
replacement algorithms or cache replacement policies) are optimizing
instructions, or algorithms, that a computer program or a hardware-
maintained structure can utilize in order to manage a cache of
information stored on the computer. Caching improves performance
by keeping recent or often-used data items in memory locations that
are faster or computationally cheaper to access than normal memory
stores. When the cache is full, the algorithm must choose which items to
discard to make room for the new ones. LRU discards the least recently
used items first. This algorithm requires keeping track of what was used
when, which is expensive if one wants to make sure the algorithm
always discards the least recently used item. General implementations
of this technique require keeping "age bits" for cache-lines and tracking
the "Least Recently Used" cache-line based on age-bits. In such an
implementation, every time a cache-line is used, the age of all other
cache-lines changes. LRU is actually a family of caching algorithms with
members including 2Q by Theodore Johnson and Dennis Shasha, and
LRU/K by Pat O'Neil, Betty O'Neil and Gerhard Weikum. LRU, like many
other replacement policies, can be characterized using a state
transition field in a vector space, which decides the dynamic cache
state changes similar to how an electromagnetic field determines the
movement of a charged particle placed in it.
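A minimal LRU sketch using an ordered dictionary as the age bookkeeping (the capacity of 4 is arbitrary):

# On every access move the line to the "most recent" end; when the cache is
# full, evict from the "least recent" end.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.lines = OrderedDict()                # key -> data, ordered oldest -> newest

    def access(self, key, fetch):
        if key in self.lines:                     # hit: refresh its age
            self.lines.move_to_end(key)
            return self.lines[key]
        if len(self.lines) >= self.capacity:      # miss and full: evict the LRU line
            self.lines.popitem(last=False)
        self.lines[key] = fetch(key)              # fill from the next level
        return self.lines[key]

cache = LRUCache()
for addr in (1, 2, 3, 4, 1, 5):                   # accessing 5 evicts 2, the LRU line
    cache.access(addr, fetch=lambda a: f"block {a}")
print(list(cache.lines))                          # [3, 4, 1, 5]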
M for Moore’s law
What they teach: Oversimplified:
Moore's law is the observation that the number of transistors in a dense
integrated circuit (IC) doubles about every two years. Moore's law is an
observation and projection of a historical trend. Rather than a law of
physics, it is an empirical relationship linked to gains from experience in
production. The observation is named after Gordon Moore, the co-
founder of Fairchild Semiconductor and Intel (and former CEO of the
latter), who in 1965 posited a doubling every year in the number of
components per integrated circuit, and projected this rate of growth
would continue for at least another decade. In 1975, looking forward to
the next decade, he revised the forecast to doubling every two years,
a compound annual growth rate (CAGR) of 41%. While Moore did not
use empirical evidence in forecasting that the historical trend would
continue, his prediction held since 1975 and has since become known
as a "law". Moore's prediction has been used in the semiconductor
industry to guide long-term planning and to set targets for research and
development, thus functioning to some extent as a self-fulfilling
prophecy. Advancements in digital electronics, such as the reduction
in quality-adjusted microprocessor prices, the increase in memory
capacity (RAM and flash), the improvement of sensors, and even the
number and size of pixels in digital cameras, are strongly linked to
Moore's law. These step changes in digital electronics have been a
driving force of technological and social change, productivity, and
economic growth.
N for NOPS Instruction
What they teach: Oversimplified:
In computer science, a NOP, no-op, or NOOP (pronounced "no
op"; short for no operation) is a machine language instruction
and its assembly language mnemonic, programming language
statement, or computer protocol command that does nothing.
Some computer instruction sets include an instruction whose
explicit purpose is to not change the state of any of the
programmer-accessible registers, status flags, or memory. It often
takes a well-defined number of clock cycles to execute. In other
instruction sets, there is no explicit NOP instruction, but the
assembly language mnemonic NOP represents an instruction
which acts as a NOP. A NOP must not access memory, as that
could cause a memory fault or page fault. A NOP is most
commonly used for timing purposes, to force memory alignment,
to prevent hazards, to occupy a branch delay slot, to render
void an existing instruction such as a jump, as a target of an
execute instruction, or as a place-holder to be replaced by
active instructions later on in program development (or to
replace removed instructions when reorganizing would be
problematic or time-consuming). In some cases, a NOP can
have minor side effects; for example, on the Motorola 68000
series of processors, the NOP opcode causes a synchronization
of the pipeline.
You see what we did here? Very few people get
this
O for Optimization
What they teach: Oversimplified:
Even though we have achieved a lot of speedup
recently, there is still scope for improvement. Though
improvements in caches and other structures of the CPU
also have a significant impact on performance, the
biggest impact is seen in branch predictors, especially
when the predictor is already very good. Consider a
predictor with 98% accuracy which is improved to 99%
accuracy. This may look like a negligible improvement
but in reality, it is huge. This essentially drops the
number of mispredictions to half, which when
calculated is a major speedup. A similar thing
happens if we improve the accuracy from 99.98% to
99.99%. There are other places with scope for
improvement as well. The simple pipeline structure assumes
1 cycle is needed to fetch data from memory, which is
often not true. It may take tens or hundreds of cycles
which is very inefficient and good caches and cache
replacement policies can significantly improve
performance here as well.
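Rough arithmetic behind the 98% vs 99% claim; the misprediction penalty and branch frequency below are assumed numbers, not measurements:

# Going from 98% to 99% accuracy halves the mispredictions, so it halves the
# cycles wasted on branches (here: 20-cycle penalty, 1 branch per 5 instructions).
def cycles_per_instr(accuracy, branch_frac=0.2, penalty=20, base_cpi=1.0):
    return base_cpi + branch_frac * (1.0 - accuracy) * penalty

for acc in (0.98, 0.99, 0.9998, 0.9999):
    print(acc, round(cycles_per_instr(acc), 4))
# CPI drops from 1.08 to 1.04: the branch-related waste is cut in half.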
P for Pipelined processor
What they teach: Oversimplified:
In computer science, instruction pipelining is a technique for
implementing instruction-level parallelism within a single processor.
Pipelining attempts to keep every part of the processor busy with some
instruction by dividing incoming instructions into a series of sequential
steps (the eponymous "pipeline") performed by different processor units
with different parts of instructions processed in parallel. In a pipelined
computer, instructions flow through the central processing unit (CPU) in
stages. For example, it might have one stage for each step of the von
Neumann cycle: Fetch the instruction, fetch the operands, do the
instruction, write the results. A pipelined computer usually has "pipeline
registers" after each stage. These store information from the instruction
and calculations so that the logic gates of the next stage can do the
next step. This arrangement lets the CPU complete an instruction on
each clock cycle. It is common for even numbered stages to operate
on one edge of the square-wave clock, while odd-numbered stages
operate on the other edge. This allows more CPU throughput than a
multicycle computer at a given clock rate, but may increase latency
due to the added overhead of the pipelining process itself. Also, even
though the electronic logic has a fixed maximum speed, a pipelined
computer can be made faster or slower by varying the number of
stages in the pipeline. With more stages, each stage does less work,
and so the stage has fewer delays from the logic gates and could run
at a higher clock rate. For the purpose of this course, we consider the
5-stage pipeline whose stages are Instruction Fetch (IF), Instruction
Decode (ID), Execute (EX), Memory (MEM), and Write-Back (WB).
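A back-of-the-envelope sketch of the throughput gain, assuming one instruction enters the 5-stage pipeline per cycle and nothing ever stalls:

# With k stages and no stalls, n instructions take k + (n - 1) cycles in a
# pipeline, versus k * n cycles if each instruction runs all stages alone.
def pipelined_cycles(n, k=5):
    return k + (n - 1)

def unpipelined_cycles(n, k=5):
    return k * n

n = 1000
print(unpipelined_cycles(n), "->", pipelined_cycles(n), "cycles, speedup",
      round(unpipelined_cycles(n) / pipelined_cycles(n), 2))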
Q for ….. ummm…. How do I explain?
You know what? Let's make Q for Questions. If you have any questions so far, feel free to
ping any of us on MS Teams and we will try our best to resolve your doubts as soon as
possible. ☺
R for Read-stall
What they teach: Oversimplified:
A program often needs to read data from
memory which generally takes a lot of time.
Even with caches, the higher-level caches still
take a significant amount of time to bring
the data to the CPU. During this time, a simple
pipelined CPU is just stalled, executing NOP
instructions. These stalls are called read stalls. A
counterpart for writes, called write stalls, also
exists. The instruction which has issued a read
must wait for it to finish, and it often
takes a few tens to a few hundreds of cycles. The
exact amount depends on a variety of factors
such as the cache miss rate, the miss penalty at
each level, and also the type of program. A
cache thrashing program will generally have a
large number of read stall cycles.
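A rough sketch of how read-stall cycles are usually accounted for; every rate and penalty below is an assumed number for the example:

# Memory stall cycles ~ memory accesses x miss rate x miss penalty.
def stall_cycles(instructions, mem_frac=0.3, miss_rate=0.05, miss_penalty=100):
    return instructions * mem_frac * miss_rate * miss_penalty

n = 1_000_000
stalls = stall_cycles(n)
print(f"{stalls:.0f} stall cycles on top of {n} useful cycles "
      f"(CPI inflated by {stalls / n:.2f})")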
S for SPEC Benchmark
What they teach: Oversimplified:
The SPEC Benchmark is one of the most popular
benchmark tests used for evaluating performance
of a CPU. SPEC stands for Standard Performance
Evaluation Corporation. The benchmarks aim to
test "real-life" situations. There are several
benchmarks testing Java scenarios, from simple
computation (SPECjbb) to a full system with Java
EE, database, disk, and network (SPECjEnterprise).
The SPEC CPU suites test CPU performance by
measuring the run time of several programs such
as the compiler GCC, the chemistry program
gamess, and the weather program WRF. The
various tasks are equally weighted; no attempt is
made to weight them based on their perceived
importance. An overall score is based on a
geometric mean. Apart from this, various other
benchmarks are also available for evaluating
performance of a CPU.
T for Trap Instructions
What they teach: Oversimplified:
A trap instruction is a procedure call that
synchronously transfers control. It is a software
interrupt generated by the user program, or by an
error, when the operating system is needed to
perform a system call or some other operation. Thus, a
trap instruction is used to switch from the user mode
of the system to the kernel mode. A trap is also
generated during a context switch between
processes by the OS. During a trap, the privilege
level of the CPU is raised, and it is set up by the OS to
run OS code. For example, the stack changes from
user stack to kernel stack, CPU is granted access to
several protected data structures hidden from
users and the PC now points to some OS code,
depending upon the reason for the generation of the trap
and the arguments passed to it. In a nutshell, a trap is
responsible for handling all abnormal behavior.
U for Unconditional Branches
What they teach: Oversimplified:
The flow of a program is "go to the next
instruction” for most of the time during
execution. However, branch instructions break
this general flow. There are two types of
branches, conditional and unconditional.
Conditional branches check the truth value of
some condition and jump or don’t jump based
on that value. Unconditional branches or
unconditional jumps are essentially the
branches which are always taken, or in other
words, the branch instructions whose next PC is
fixed and independent of the state of CPU (i.e.
the values in the registers). These are usually
used for making function calls, and variants
such as the jump-and-link instruction are used to
jump to a subroutine and then return from it after
it is done.
V for Virtual Memory
What they teach: Oversimplified:
In computing, virtual memory, or virtual storage is a memory
management technique that provides an "idealized abstraction of the
storage resources that are actually available on a given machine"
which "creates the illusion to users of a very large (main) memory". The
computer's operating system, using a combination of hardware and
software, maps memory addresses used by a program, called virtual
addresses, into physical addresses in computer memory. Main storage,
as seen by a process or task, appears as a contiguous address space
or collection of contiguous segments. The operating system manages
virtual address spaces and the assignment of real memory to virtual
memory. Address translation hardware in the CPU, often referred to as
a memory management unit (MMU), automatically translates virtual
addresses to physical addresses. Software within the operating system
may extend these capabilities, utilizing, e.g., disk storage, to provide a
virtual address space that can exceed the capacity of real memory
and thus reference more memory than is physically present in the
computer. The primary benefits of virtual memory include freeing
applications from having to manage a shared memory space, ability to
share memory used by libraries between processes, increased security
due to memory isolation, and being able to conceptually use more
memory than might be physically available, using the technique of
paging or segmentation.
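A toy model of the translation the MMU performs; the 4 KiB page size and the page-table contents are invented for the illustration:

# Split the virtual address into a page number and an offset, look the page up,
# and glue the offset onto the physical frame it maps to.
PAGE_SIZE = 4096
page_table = {0: 7, 1: 3, 2: 9}           # virtual page number -> physical frame number

def translate(vaddr):
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        raise LookupError("page fault: the OS must bring the page in first")
    return page_table[vpn] * PAGE_SIZE + offset

print(hex(translate(0x1234)))             # virtual page 1 -> physical frame 3 -> 0x3234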
W for Write-back
What they teach: Oversimplified:
In the 5 staged pipeline, the final stage
is the WB or the write-back stage. The
job of this stage is to take the output
from the ALU or the Memory unit,
depending upon the type of
instruction, and write the value into
the target register as specified in the
instruction. The decision between ALU
or MEM is made using a MUX after the
latch register.
X for ….. You know right?
I am out of ideas now. How about X for Xtra Questions? ☺
Y for Yahoo! We are almost done!
What I want to say: Oversimplified:
As we arrive at the letter Y, it's not hard to see
that we are nearing the end of this crash
course, and there is just one more slide to
go. Honestly, there is nothing more to say
here. We couldn’t find anything for the
letter Y either, so this slide is just random
text from here on, because we have to fill
this side of the slide entirely in order to
maintain consistency throughout the
slides. I don’t know, congratulations I
guess? For making it through the entire
course. I think Y for Ye hamari pawri ho
rahi hai. ☺
Z for Zero Register
What they teach: Oversimplified:
The zero register is the special register
whose value is hardwired to store zero.
This register is often used for comparisons
with zero in branch instructions, or simply
to use zero anywhere. This makes the value
zero easily accessible at all times, without
needing to load it into some temporary
register. This is also used in the NOP instruction,
and although the exact instruction may
vary across architectures, ‘add $0 $0 $0’
can be used as the NOP instruction.
Created by: Me and the bois
We are evolving, just like the CPUs
• For the screen lovers, we have created a
telegram chat bot (coz why not?).
• Here is the link: CA Simplified bot
• Here is the link to video demo of the bot
(Available in our submission drive folder as well):
Demo Video
And finally, Thanks for reading ☺
What they teach: Oversimplified:
Merci. धन्यवाद| Shukriya. Gracias.
Shukran. Xièxiè. Abhari Ahe.
Thaagatchari. Terima kasih. Nandri.
Dhanyavaadaalu. Anugrihtaasmi.
Dhonnobad. 감사해요. Teşekkürler.
Dankie. Takk. 感謝. Grazie. Tatenda.
Asante. Ďakujem. Kösz. Au Kun. Met
Dank. Vd’aka. Choukran. Bohoma
Istuti. teşekkür ederim. Hvala. Npezié.
ευχαριστώ. Doh Jeh. Go raibh maith
agat. ਧੰਨਵਾਦ.
Thanks, Thanks, And Thanks. <3

More Related Content

What's hot

Cache Memory Computer Architecture and organization
Cache Memory Computer Architecture and organizationCache Memory Computer Architecture and organization
Cache Memory Computer Architecture and organizationHumayra Khanum
 
Cache memory ...
Cache memory ...Cache memory ...
Cache memory ...Pratik Farkya
 
Advanced computer architecture lesson 5 and 6
Advanced computer architecture lesson 5 and 6Advanced computer architecture lesson 5 and 6
Advanced computer architecture lesson 5 and 6Ismail Mukiibi
 
Cache Memory
Cache MemoryCache Memory
Cache MemorySubid Biswas
 
Cache memory
Cache memoryCache memory
Cache memoryAbir Rahman
 
Cache memory principles
Cache memory principlesCache memory principles
Cache memory principlesbit allahabad
 
Cache memory
Cache memoryCache memory
Cache memoryAnand Goyal
 
Cache memory ppt
Cache memory ppt  Cache memory ppt
Cache memory ppt Arpita Naik
 
Chapter 2 pc
Chapter 2 pcChapter 2 pc
Chapter 2 pcHanif Durad
 
Memory Organization
Memory OrganizationMemory Organization
Memory OrganizationKamal Acharya
 
Cache performance considerations
Cache performance considerationsCache performance considerations
Cache performance considerationsSlideshare
 
Tdt4260 miniproject report_group_3
Tdt4260 miniproject report_group_3Tdt4260 miniproject report_group_3
Tdt4260 miniproject report_group_3Yulong Bai
 
Lecture 3
Lecture 3Lecture 3
Lecture 3Mr SMAK
 
Aca lab project (rohit malav)
Aca lab project (rohit malav) Aca lab project (rohit malav)
Aca lab project (rohit malav) Rohit malav
 
Real-Time Scheduling Algorithms
Real-Time Scheduling AlgorithmsReal-Time Scheduling Algorithms
Real-Time Scheduling AlgorithmsAJAL A J
 

What's hot (20)

Cache design
Cache design Cache design
Cache design
 
Cache Memory Computer Architecture and organization
Cache Memory Computer Architecture and organizationCache Memory Computer Architecture and organization
Cache Memory Computer Architecture and organization
 
Cache memory ...
Cache memory ...Cache memory ...
Cache memory ...
 
Advanced computer architecture lesson 5 and 6
Advanced computer architecture lesson 5 and 6Advanced computer architecture lesson 5 and 6
Advanced computer architecture lesson 5 and 6
 
Cache Memory
Cache MemoryCache Memory
Cache Memory
 
Cache memory
Cache memoryCache memory
Cache memory
 
Cache memory
Cache memoryCache memory
Cache memory
 
Cache memory principles
Cache memory principlesCache memory principles
Cache memory principles
 
Cache memory
Cache memoryCache memory
Cache memory
 
Cache memory ppt
Cache memory ppt  Cache memory ppt
Cache memory ppt
 
Chapter 2 pc
Chapter 2 pcChapter 2 pc
Chapter 2 pc
 
Memory Organization
Memory OrganizationMemory Organization
Memory Organization
 
1.prallelism
1.prallelism1.prallelism
1.prallelism
 
Cache performance considerations
Cache performance considerationsCache performance considerations
Cache performance considerations
 
Tdt4260 miniproject report_group_3
Tdt4260 miniproject report_group_3Tdt4260 miniproject report_group_3
Tdt4260 miniproject report_group_3
 
Lecture 3
Lecture 3Lecture 3
Lecture 3
 
Cache memory
Cache memoryCache memory
Cache memory
 
Aca lab project (rohit malav)
Aca lab project (rohit malav) Aca lab project (rohit malav)
Aca lab project (rohit malav)
 
Real-Time Scheduling Algorithms
Real-Time Scheduling AlgorithmsReal-Time Scheduling Algorithms
Real-Time Scheduling Algorithms
 
Modern processors
Modern processorsModern processors
Modern processors
 

Similar to Computer architecture

Affect of parallel computing on multicore processors
Affect of parallel computing on multicore processorsAffect of parallel computing on multicore processors
Affect of parallel computing on multicore processorscsandit
 
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORS
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORSAFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORS
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORScscpconf
 
The effective way of processor performance enhancement by proper branch handling
The effective way of processor performance enhancement by proper branch handlingThe effective way of processor performance enhancement by proper branch handling
The effective way of processor performance enhancement by proper branch handlingcsandit
 
THE EFFECTIVE WAY OF PROCESSOR PERFORMANCE ENHANCEMENT BY PROPER BRANCH HANDL...
THE EFFECTIVE WAY OF PROCESSOR PERFORMANCE ENHANCEMENT BY PROPER BRANCH HANDL...THE EFFECTIVE WAY OF PROCESSOR PERFORMANCE ENHANCEMENT BY PROPER BRANCH HANDL...
THE EFFECTIVE WAY OF PROCESSOR PERFORMANCE ENHANCEMENT BY PROPER BRANCH HANDL...cscpconf
 
AN ATTEMPT TO IMPROVE THE PROCESSOR PERFORMANCE BY PROPER MEMORY MANAGEMENT F...
AN ATTEMPT TO IMPROVE THE PROCESSOR PERFORMANCE BY PROPER MEMORY MANAGEMENT F...AN ATTEMPT TO IMPROVE THE PROCESSOR PERFORMANCE BY PROPER MEMORY MANAGEMENT F...
AN ATTEMPT TO IMPROVE THE PROCESSOR PERFORMANCE BY PROPER MEMORY MANAGEMENT F...IJCSEA Journal
 
Os solved question paper
Os solved question paperOs solved question paper
Os solved question paperAnkit Bhatnagar
 
computer architecture and organization.pptx
computer architecture and organization.pptxcomputer architecture and organization.pptx
computer architecture and organization.pptxROHANSharma311906
 
Operating Systems - memory management
Operating Systems - memory managementOperating Systems - memory management
Operating Systems - memory managementMukesh Chinta
 
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docxCS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docxfaithxdunce63732
 
Pipeline Mechanism
Pipeline MechanismPipeline Mechanism
Pipeline MechanismAshik Iqbal
 
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...caijjournal
 
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...caijjournal
 
Computer Applications: An International Journal (CAIJ)
Computer Applications: An International Journal (CAIJ)Computer Applications: An International Journal (CAIJ)
Computer Applications: An International Journal (CAIJ)caijjournal
 
Pipelining in Computer System Achitecture
Pipelining in Computer System AchitecturePipelining in Computer System Achitecture
Pipelining in Computer System AchitectureYashiUpadhyay3
 
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...IDES Editor
 
shashank_hpca1995_00386533
shashank_hpca1995_00386533shashank_hpca1995_00386533
shashank_hpca1995_00386533Shashank Nemawarkar
 

Similar to Computer architecture (20)

Chap2 slides
Chap2 slidesChap2 slides
Chap2 slides
 
Affect of parallel computing on multicore processors
Affect of parallel computing on multicore processorsAffect of parallel computing on multicore processors
Affect of parallel computing on multicore processors
 
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORS
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORSAFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORS
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORS
 
The effective way of processor performance enhancement by proper branch handling
The effective way of processor performance enhancement by proper branch handlingThe effective way of processor performance enhancement by proper branch handling
The effective way of processor performance enhancement by proper branch handling
 
THE EFFECTIVE WAY OF PROCESSOR PERFORMANCE ENHANCEMENT BY PROPER BRANCH HANDL...
THE EFFECTIVE WAY OF PROCESSOR PERFORMANCE ENHANCEMENT BY PROPER BRANCH HANDL...THE EFFECTIVE WAY OF PROCESSOR PERFORMANCE ENHANCEMENT BY PROPER BRANCH HANDL...
THE EFFECTIVE WAY OF PROCESSOR PERFORMANCE ENHANCEMENT BY PROPER BRANCH HANDL...
 
1.prallelism
1.prallelism1.prallelism
1.prallelism
 
AN ATTEMPT TO IMPROVE THE PROCESSOR PERFORMANCE BY PROPER MEMORY MANAGEMENT F...
AN ATTEMPT TO IMPROVE THE PROCESSOR PERFORMANCE BY PROPER MEMORY MANAGEMENT F...AN ATTEMPT TO IMPROVE THE PROCESSOR PERFORMANCE BY PROPER MEMORY MANAGEMENT F...
AN ATTEMPT TO IMPROVE THE PROCESSOR PERFORMANCE BY PROPER MEMORY MANAGEMENT F...
 
Reconfigurable computing
Reconfigurable computingReconfigurable computing
Reconfigurable computing
 
Os solved question paper
Os solved question paperOs solved question paper
Os solved question paper
 
computer architecture and organization.pptx
computer architecture and organization.pptxcomputer architecture and organization.pptx
computer architecture and organization.pptx
 
Bt0070
Bt0070Bt0070
Bt0070
 
Operating Systems - memory management
Operating Systems - memory managementOperating Systems - memory management
Operating Systems - memory management
 
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docxCS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
 
Pipeline Mechanism
Pipeline MechanismPipeline Mechanism
Pipeline Mechanism
 
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...
 
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...
 
Computer Applications: An International Journal (CAIJ)
Computer Applications: An International Journal (CAIJ)Computer Applications: An International Journal (CAIJ)
Computer Applications: An International Journal (CAIJ)
 
Pipelining in Computer System Achitecture
Pipelining in Computer System AchitecturePipelining in Computer System Achitecture
Pipelining in Computer System Achitecture
 
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...
 
shashank_hpca1995_00386533
shashank_hpca1995_00386533shashank_hpca1995_00386533
shashank_hpca1995_00386533
 

Recently uploaded

High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxPoojaBan
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learningmisbanausheenparvam
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxvipinkmenon1
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and usesDevarapalliHaritha
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSCAESB
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZTE
 

Recently uploaded (20)

High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learning
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and uses
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
 

Computer architecture

  • 1. COMPUTER ARCHITECTURE Oversimplified by Arki-Tehcs Prabhanshu Katiyar- 190050088 Sibasis Nayak - 190050115 Gurnoor Singh - 190050045 Paarth Jain - 190050076 Sahasra Ranjan - 190050102
  • 2. A for Amdahl’s Law What they teach: Oversimplified: In computer architecture, Amdahl's law (or Amdahl's argument) is a formula which gives the theoretical speedup in latency of the execution of a task at fixed workload that can be expected of a system whose resources are improved. It is named after computer scientist Gene Amdahl, and was presented at the AFIPS Spring Joint Computer Conference in 1967. Amdahl's law is often used in parallel computing to predict the theoretical speedup when using multiple processors. For example, if a program needs 20 hours to complete using a single thread, but a one-hour portion of the program cannot be parallelized, therefore only the remaining 19 hours (p = 0.95) of execution time can be parallelized, then regardless of how many threads are devoted to a parallelized execution of this program, the minimum execution time cannot be less than one hour. Hence, the theoretical speedup is limited to at most 20 times the single thread performance. Amdahl's law is often conflated with the law of diminishing returns, whereas only a special case of applying Amdahl's law demonstrates law of diminishing returns. If one picks optimally (in terms of the achieved speedup) what is to be improved, then one will see monotonically decreasing improvements as one improves. If, however, one picks non-optimally, after improving a sub-optimal component and moving on to improve a more optimal component, one can see an increase in the return. Note that it is often rational to improve a system in an order that is "non-optimal" in this sense, given that some improvements are more difficult or require larger development time than others. Amdahl's law does represent the law of diminishing returns if on considering what sort of return one gets by adding more processors to a machine, if one is running a fixed-size computation that will use all available processors to their capacity. Each new processor added to the system will add less usable power than the previous one. Each time one doubles the number of processors the speedup ratio will diminish, as the total throughput heads toward the limit of 1/(1 − p).
  • 3. B for Branch Predictors What they teach: Oversimplified: In computer architecture, a branch predictor is a digital circuit that tries to guess which way a branch (e.g., an if–then– else structure) will go before this is known definitively. The purpose of the branch predictor is to improve the flow in the instruction pipeline. Branch predictors play a critical role in achieving high effective performance in many modern pipelined microprocessor architectures such as x86. Two-way branching is usually implemented with a conditional jump instruction. A conditional jump can either be "not taken" and continue execution with the first branch of code which follows immediately after the conditional jump, or it can be "taken" and jump to a different place in program memory where the second branch of code is stored. It is not known for certain whether a conditional jump will be taken or not taken until the condition has been calculated and the conditional jump has passed the execution stage in the instruction pipeline. Without branch prediction, the processor would have to wait until the conditional jump instruction has passed the execute stage before the next instruction can enter the fetch stage in the pipeline. The branch predictor attempts to avoid this waste of time by trying to guess whether the conditional jump is most likely to be taken or not taken. The branch that is guessed to be the most likely is then fetched and speculatively executed. If it is later detected that the guess was wrong, then the speculatively executed or partially executed instructions are discarded and the pipeline starts over with the correct branch, incurring a delay. The time that is wasted in case of a branch misprediction is equal to the number of stages in the pipeline from the fetch stage to the execute stage. Modern microprocessors tend to have quite long pipelines so that the misprediction delay is between 10 and 20 clock cycles. As a result, making a pipeline longer increases the need for a more advanced branch predictor. The first time a conditional jump instruction is encountered, there is not much information to base a prediction on. But the branch predictor keeps records of whether branches are taken or not taken. When it encounters a conditional jump that has been seen several times before, then it can base the prediction on the history. The branch predictor may, for example, recognize that the conditional jump is taken more often than not, or that it is taken every second time. Static prediction is the simplest branch prediction technique because it does not rely on information about the dynamic history of code executing. Instead, it predicts the outcome of a branch based solely on the branch instruction. The early implementations of SPARC and MIPS (two of the first commercial RISC architectures) used single-direction static branch prediction: they always predict that a conditional jump will not be taken, so they always fetch the next sequential instruction. Only when the branch or jump is evaluated and found to be taken, does the instruction pointer get set to a non-sequential address. Both CPUs evaluate branches in the decode stage and have a single cycle instruction fetch. As a result, the branch target recurrence is two cycles long, and the machine always fetches the instruction immediately after any taken branch. Both architectures define branch delay slots in order to utilize these fetched instructions. 
A more advanced form of static prediction presumes that backward branches will be taken and that forward branches will not. A backward branch is one that has a target address that is lower than its own address. This technique can help with prediction accuracy of loops, which are usually backward-pointing branches, and are taken more often than not taken. Some processors allow branch prediction hints to be inserted into the code to tell whether the static prediction should be taken or not taken. The Intel Pentium 4 accepts branch prediction hints, but this feature was abandoned in later Intel processors. Static prediction is used as a fall-back technique in some processors with dynamic branch prediction when dynamic predictors do not have sufficient information to use. Both the Motorola MPC7450 and the Intel Pentium 4 use this technique as a fall-back.
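  To make the dynamic-prediction idea concrete, here is a sketch of a 2-bit saturating-counter predictor, one common textbook scheme for "keeping records of whether branches are taken". The table size, the hash of the PC, and the example branch address are all assumptions for illustration, not a description of any particular CPU.

#include <stdint.h>
#include <stdio.h>

#define TABLE_SIZE 1024
static uint8_t counters[TABLE_SIZE];   /* 0,1 = predict not taken; 2,3 = predict taken */

int predict(uint32_t pc) {
    return counters[(pc >> 2) % TABLE_SIZE] >= 2;
}

void train(uint32_t pc, int taken) {
    uint8_t *c = &counters[(pc >> 2) % TABLE_SIZE];
    if (taken && *c < 3) (*c)++;       /* saturate at "strongly taken" */
    if (!taken && *c > 0) (*c)--;      /* saturate at "strongly not taken" */
}

int main(void) {
    uint32_t pc = 0x400120;            /* hypothetical branch address */
    int outcomes[] = {1, 1, 1, 0, 1, 1};   /* a loop branch: mostly taken */
    for (int i = 0; i < 6; i++) {
        printf("predict %s, actual %s\n",
               predict(pc) ? "taken" : "not taken",
               outcomes[i] ? "taken" : "not taken");
        train(pc, outcomes[i]);
    }
    return 0;
}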
  • 4. C for Caches What they teach: Oversimplified: A CPU cache is a hardware cache used by the central processing unit (CPU) of a computer to reduce the average cost (time or energy) to access data from the main memory. A cache is a smaller, faster memory, located closer to a processor core, which stores copies of the data from frequently used main memory locations. Most CPUs have a hierarchy of multiple cache levels (L1, L2, often L3, and rarely even L4), with separate instruction-specific and data-specific caches at level 1. Other types of caches exist (that are not counted towards the "cache size" of the most important caches mentioned above), such as the translation lookaside buffer (TLB) which is part of the memory management unit (MMU) which most CPUs have. Most modern desktop and server CPUs have at least three independent caches: an instruction cache to speed up executable instruction fetch, a data cache to speed up data fetch and store, and a translation lookaside buffer (TLB) used to speed up virtual-to-physical address translation for both executable instructions and data. A single TLB can be provided for access to both instructions and data, or a separate Instruction TLB (ITLB) and data TLB (DTLB) can be provided. The data cache is usually organized as a hierarchy of more cache levels (L1, L2, etc.; see also multi-level caches below). However, the TLB cache is part of the memory management unit (MMU) and not directly related to the CPU caches. Data is transferred between memory and cache in blocks of fixed size, called cache lines or cache blocks. When a cache line is copied from memory into the cache, a cache entry is created. The cache entry will include the copied data as well as the requested memory location (called a tag). When the processor needs to read or write a location in memory, it first checks for a corresponding entry in the cache. The cache checks for the contents of the requested memory location in any cache lines that might contain that address. If the processor finds that the memory location is in the cache, a cache hit has occurred. However, if the processor does not find the memory location in the cache, a cache miss has occurred. In the case of a cache hit, the processor immediately reads or writes the data in the cache line. For a cache miss, the cache allocates a new entry and copies data from main memory, then the request is fulfilled from the contents of the cache.
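  A rough sketch of the tag check described above: an address is split into a byte offset within the line, an index that selects a set, and a tag that is compared against what is stored there. The 64-byte lines and 256 sets are assumptions for illustration; real cache geometries vary.

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define LINE_BYTES 64
#define NUM_SETS   256

struct line { bool valid; uint64_t tag; };
static struct line cache[NUM_SETS];        /* one line per set, for simplicity */

bool access_cache(uint64_t addr) {
    uint64_t index = (addr / LINE_BYTES) % NUM_SETS;
    uint64_t tag   = (addr / LINE_BYTES) / NUM_SETS;
    if (cache[index].valid && cache[index].tag == tag)
        return true;                       /* cache hit */
    cache[index].valid = true;             /* miss: allocate an entry for this line */
    cache[index].tag   = tag;
    return false;
}

int main(void) {
    printf("first access:  %s\n", access_cache(0x12345678) ? "hit" : "miss");
    printf("second access: %s\n", access_cache(0x12345678) ? "hit" : "miss");
    return 0;
}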
  • 5. D for Direct Mapped Cache What they teach: Oversimplified: In this cache organization, each location in main memory can go in only one entry in the cache. Therefore, a direct-mapped cache can also be called a "one-way set associative" cache. It does not have a placement policy as such, since there is no choice of which cache entry's contents to evict. This means that if two locations map to the same entry, they may continually knock each other out. Although simpler, a direct-mapped cache needs to be much larger than an associative one to give comparable performance, and it is more unpredictable. Let x be the block number in the cache, y be the block number in memory, and n be the number of blocks in the cache; then the mapping is given by x = y mod n. ◦ Each address has a fixed line it can belong to, determined by its index.
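  A tiny worked example of the x = y mod n mapping and the conflict it causes: two memory blocks whose numbers differ by a multiple of n land on the same cache line and keep evicting each other. The numbers are illustrative only.

#include <stdio.h>

int main(void) {
    int n = 8;                 /* blocks in the cache */
    int a = 5, b = 13;         /* memory block numbers; 13 mod 8 == 5 mod 8 */
    printf("memory block %d -> cache line %d\n", a, a % n);
    printf("memory block %d -> cache line %d\n", b, b % n);
    printf("alternating accesses to blocks %d and %d miss every time (conflict misses)\n", a, b);
    return 0;
}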
  • 6. E for Empirical Evaluation What they teach: Oversimplified: We look at two major values: latency and bandwidth. Latency is the time for each instruction, and bandwidth is the number of instructions per unit time. In general, it is hard to improve latency because the speed-of-light delay cannot be reduced, or as you might say, "You cannot bribe god". On the other hand, bandwidth, also known as throughput, can be improved by spending more money. Amdahl's law, as taught before, is one way to measure the improvement achieved by making certain changes. Another way is benchmarks. Benchmarks are suites of programs used to stress-test the CPU and measure its performance. Various benchmarks are available, such as SPEC, CloudSuite and PARSEC. Each benchmark is different and suited to different goals. A major issue with benchmarks is that they may be outdated and are often not good representatives of real workloads. For example, a CPU designed to perform well on memory instructions at the cost of poor arithmetic performance will perform terribly on a benchmark containing mostly arithmetic instructions. Also, a CPU might perform well on one application but poorly on another. In such cases, the arithmetic mean (AM) of execution times is a bad idea, as it can lead to contradictory comparisons; the harmonic mean (HM) or geometric mean (GM) usually works better. Power consumption and carbon emissions are also key parameters to keep in mind while evaluating/rating a CPU.
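  A small illustration (with made-up execution times) of why the arithmetic mean of normalized times can be contradictory: depending on which CPU you normalize to, the AM declares a different winner, while the geometric mean of the ratios gives the same answer either way.

#include <stdio.h>
#include <math.h>

int main(void) {
    double A[] = {1.0, 100.0};     /* CPU A's times on two programs (seconds, invented) */
    double B[] = {10.0, 10.0};     /* CPU B's times on the same programs */

    double am_B_vs_A = (B[0] / A[0] + B[1] / A[1]) / 2.0;   /* 5.05: A looks faster */
    double am_A_vs_B = (A[0] / B[0] + A[1] / B[1]) / 2.0;   /* 5.05: B looks faster */
    double gm        = sqrt((B[0] / A[0]) * (B[1] / A[1])); /* 1.0 with either reference */

    printf("AM normalized to A: B is %.2fx slower (A 'wins')\n", am_B_vs_A);
    printf("AM normalized to B: A is %.2fx slower (B 'wins')\n", am_A_vs_B);
    printf("GM of ratios: %.2f (a tie, whichever reference you pick)\n", gm);
    return 0;
}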
  • 7. F for Fully Associative Cache What they teach: Oversimplified: A fully associative cache contains a single set with B ways, where B is the number of blocks. A memory address can map to a block in any of these ways. A fully associative cache is another name for a B-way set associative cache with one set. A fully associative cache permits data to be stored in any cache block, instead of forcing each memory address into one particular block. When data is fetched from memory, it can be placed in any unused block of the cache. This way we'll never have a conflict between two or more memory addresses that map to a single cache block. If all the blocks are already in use, it's usually best to replace the least recently used one, on the assumption that if a block hasn't been used in a while, it won't be needed again anytime soon. ◦ No concept of indices; the entire cache belongs to everyone, and a block can go anywhere.
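  A sketch of what "no index bits" means in practice: the lookup compares the tag against every block, and on a miss the line can go into any free block (or a victim chosen by the replacement policy). The sizes and the trivial "evict block 0" fallback are assumptions for illustration.

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define NUM_BLOCKS 8
#define LINE_BYTES 64

struct block { bool valid; uint64_t tag; };
static struct block cache[NUM_BLOCKS];

bool lookup(uint64_t addr) {
    uint64_t tag = addr / LINE_BYTES;       /* no index bits: the tag is the whole block address */
    for (int i = 0; i < NUM_BLOCKS; i++)
        if (cache[i].valid && cache[i].tag == tag)
            return true;                    /* hit, wherever the block happens to live */
    for (int i = 0; i < NUM_BLOCKS; i++)    /* miss: place the line in any free block */
        if (!cache[i].valid) {
            cache[i].valid = true;
            cache[i].tag = tag;
            return false;
        }
    cache[0].valid = true;                  /* all full: evict block 0 (a real cache would use LRU etc.) */
    cache[0].tag = tag;
    return false;
}

int main(void) {
    printf("%s\n", lookup(0x1000) ? "hit" : "miss");
    printf("%s\n", lookup(0x1000) ? "hit" : "miss");
    return 0;
}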
  • 8. G for Good Job so Far What they think I mean: What I really mean: ◦ We have learnt so many concepts so far in a very simple way. We surely deserve a break on this slide, a pat on our backs for making it this far in this Computer Architecture crash course, and some preparation for the upcoming topics. I don't even know why you are still reading this; you were supposed to move to the next slide right away, because who even stops to read unimportant long paragraphs? I couldn't find a suitable concept for the letter G, so let's do something different here. This is a *different* kind of assignment anyway.
  • 9. H for Hazards What they teach: Oversimplified: In the domain of central processing unit (CPU) design, hazards are problems with the instruction pipeline in CPU microarchitectures when the next instruction cannot execute in the following clock cycle, and can potentially lead to incorrect computation results. Three common types of hazards are data hazards, structural hazards, and control hazards (branching hazards). There are several methods used to deal with hazards, including pipeline stalls/pipeline bubbling, operand forwarding, and in the case of out-of-order execution, the scoreboarding method and the Tomasulo algorithm. Data hazards occur when instructions that exhibit data dependence modify data in different stages of a pipeline. Ignoring potential data hazards can result in race conditions (also termed race hazards). There are three situations in which a data hazard can occur: RAW (read after write), WAW (write after write), and WAR (write after read). A structural hazard occurs when two (or more) instructions that are already in the pipeline need the same resource. The result is that the instructions must be executed in series rather than in parallel for a portion of the pipeline. Structural hazards are sometimes referred to as resource hazards. A control hazard occurs when the pipeline makes a wrong decision on branch prediction and therefore brings instructions into the pipeline that must subsequently be discarded. The term branch hazard also refers to a control hazard. ◦ Control Hazards ◦ Structural Hazards
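  A sketch of the textbook forwarding check for RAW hazards in a 5-stage pipeline: if the instruction currently in EX/MEM or MEM/WB writes a register that the instruction in EX wants to read, the value is forwarded instead of stalling. The struct fields and the register numbers are invented for the illustration.

#include <stdio.h>
#include <stdbool.h>

struct pipe_reg { bool reg_write; int rd; };  /* does this stage write a register, and which one */

int forward_source(struct pipe_reg exmem, struct pipe_reg memwb, int rs) {
    if (exmem.reg_write && exmem.rd != 0 && exmem.rd == rs)
        return 2;                             /* forward from EX/MEM (the newest value) */
    if (memwb.reg_write && memwb.rd != 0 && memwb.rd == rs)
        return 1;                             /* forward from MEM/WB */
    return 0;                                 /* no hazard: read the register file normally */
}

int main(void) {
    struct pipe_reg exmem = { true, 5 };      /* e.g. add r5, r1, r2 currently in MEM */
    struct pipe_reg memwb = { true, 6 };      /* e.g. add r6, r3, r4 currently in WB  */
    printf("operand r5 comes from path %d\n", forward_source(exmem, memwb, 5)); /* 2 */
    printf("operand r7 comes from path %d\n", forward_source(exmem, memwb, 7)); /* 0 */
    return 0;
}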
  • 10. I for Interrupts What they teach: Oversimplified: In digital computers, an interrupt is a response by the processor to an event that needs attention from the software. An interrupt condition alerts the processor and serves as a request for the processor to interrupt the currently executing code when permitted, so that the event can be processed in a timely manner. If the request is accepted, the processor responds by suspending its current activities, saving its state, and executing a function called an interrupt handler (or an interrupt service routine, ISR) to deal with the event. This interruption is temporary, and, unless the interrupt indicates a fatal error, the processor resumes normal activities after the interrupt handler finishes. Interrupts are commonly used by hardware devices to indicate electronic or physical state changes that require attention. Interrupts are also commonly used to implement computer multitasking, especially in real-time computing. Systems that use interrupts in these ways are said to be interrupt-driven. Interrupt signals may be issued in response to hardware or software events. These are classified as hardware interrupts or software interrupts, respectively. For any particular processor, the number of interrupt types is limited by the architecture. A hardware interrupt is a condition related to the state of the hardware that may be signaled by an external hardware device, e.g., an interrupt request (IRQ) line on a PC, or detected by devices embedded in processor logic (e.g., the CPU timer in IBM System/370), to communicate that the device needs attention from the operating system (OS) or, if there is no OS, from the "bare-metal" program running on the CPU. Such external devices may be part of the computer (e.g., disk controller) or they may be external peripherals. For example, pressing a keyboard key or moving a mouse plugged into a PS/2 port triggers hardware interrupts that cause the processor to read the keystroke or mouse position. Hardware interrupts can arrive asynchronously with respect to the processor clock, and at any time during instruction execution. Consequently, all hardware interrupt signals are conditioned by synchronizing them to the processor clock, and acted upon only at instruction execution boundaries. A software interrupt is requested by the processor itself upon executing particular instructions or when certain conditions are met. Every software interrupt signal is associated with a particular interrupt handler. A software interrupt may be intentionally caused by executing a special instruction which, by design, invokes an interrupt when executed. Such instructions function similarly to subroutine calls and are used for a variety of purposes, such as requesting operating system services and interacting with device drivers (e.g., to read or write storage media). Software interrupts may also be unexpectedly triggered by program execution errors. These interrupts typically are called traps or exceptions. For example, a divide-by-zero exception will be "thrown" (a software interrupt is requested) if the processor executes a divide instruction with divisor equal to zero. Typically, the operating system will catch and handle this exception.
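  A very simplified model of the mechanism described above: at an instruction boundary the core checks a pending-interrupt flag, saves enough state to resume later, and jumps to the handler registered in a vector table. All names, the IRQ number, and the table layout are invented for illustration.

#include <stdio.h>
#include <stdbool.h>

typedef void (*isr_t)(void);
static isr_t vector_table[256];           /* one handler per interrupt number */
static volatile bool pending;
static volatile int  pending_irq;

void timer_isr(void) { printf("timer interrupt handled\n"); }

void maybe_take_interrupt(unsigned long *pc) {
    if (!pending) return;
    unsigned long saved_pc = *pc;         /* save state so the program can resume */
    pending = false;
    vector_table[pending_irq]();          /* run the interrupt service routine */
    *pc = saved_pc;                       /* resume the interrupted program */
}

int main(void) {
    vector_table[32] = timer_isr;
    unsigned long pc = 0x400000;
    pending = true; pending_irq = 32;     /* pretend a device raised IRQ 32 */
    maybe_take_interrupt(&pc);
    return 0;
}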
  • 11. J for Jump Instructions What they teach: Oversimplified: In a CPU, the general flow of control is that an instruction is executed and the PC automatically moves to the next instruction in the code. The jump instruction, however, breaks this standard behavior and allows the PC to jump to a specified location (within a maximum distance from the current PC). The utility of this instruction is that it allows function calls. Some variants of it, like jump and link, are usually used for function calls, as the PC needs to return to the calling code after the function call completes. The jump instruction usually takes one parameter, which is the offset from the current PC. Therefore, the new PC is given by PC = PC + offset. The offset is usually restricted to some maximum value, as the entire instruction needs to fit in 32 or 64 bits. Literally, that's all.
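  A sketch of the PC = PC + offset rule with a bound on the offset. The 16-bit immediate width is an assumption used only to show why the offset is limited; real ISAs use different widths and scalings.

#include <stdio.h>
#include <stdint.h>

#define OFFSET_BITS 16
#define MAX_OFFSET  ((1 << (OFFSET_BITS - 1)) - 1)
#define MIN_OFFSET  (-(1 << (OFFSET_BITS - 1)))

int64_t jump(int64_t pc, int32_t offset) {
    if (offset > MAX_OFFSET || offset < MIN_OFFSET) {
        printf("offset %d does not fit in %d bits\n", offset, OFFSET_BITS);
        return pc;                        /* a real assembler would reject this */
    }
    return pc + offset;                   /* PC = PC + offset */
}

int main(void) {
    int64_t pc = 0x1000;
    printf("jump +64:  new PC = 0x%llx\n", (unsigned long long)jump(pc, 64));
    printf("jump -128: new PC = 0x%llx\n", (unsigned long long)jump(pc, -128));
    return 0;
}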
  • 12. K for Kernel Mode of CPU What they teach: Oversimplified: The system starts in kernel mode when it boots, and after the operating system is loaded, it executes applications in user mode. There are some privileged instructions that can only be executed in kernel mode, such as interrupt instructions, input/output management, etc. If a privileged instruction is executed in user mode, it is illegal and a trap is generated. The mode bit is set to 0 in kernel mode and is changed from 0 to 1 when switching from kernel mode to user mode. In kernel mode, the CPU may perform any operation allowed by its architecture; any instruction may be executed, any I/O operation initiated, any area of memory accessed, and so on. In the other CPU modes, certain restrictions on CPU operations are enforced by the hardware. Typically, certain instructions are not permitted (especially those, including I/O operations, that could alter the global state of the machine), some memory areas cannot be accessed, etc. User-mode capabilities of the CPU are typically a subset of those available in kernel mode, but in some cases, such as hardware emulation of non-native architectures, they may be significantly different from those available in standard kernel mode. Now the CPU be like: "You dare oppose me, mortal."
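  A sketch of the mode-bit check, using the slide's convention (0 = kernel, 1 = user): a privileged operation proceeds in kernel mode and raises a trap in user mode. The specific "privileged I/O" operation is invented for illustration.

#include <stdio.h>

enum mode { KERNEL = 0, USER = 1 };
static enum mode mode_bit = KERNEL;       /* the machine boots in kernel mode */

void privileged_io(void) {
    if (mode_bit != KERNEL) {
        printf("trap: privileged instruction attempted in user mode\n");
        return;                           /* the OS would handle this trap */
    }
    printf("I/O operation performed\n");
}

int main(void) {
    privileged_io();                      /* allowed: still in kernel mode */
    mode_bit = USER;                      /* the OS switches to user mode to run an application */
    privileged_io();                      /* illegal: generates a trap */
    return 0;
}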
  • 13. L for LRU Policy What they teach: Oversimplified: In computing, cache algorithms (also frequently called cache replacement algorithms or cache replacement policies) are optimizing instructions, or algorithms, that a computer program or a hardware-maintained structure can utilize in order to manage a cache of information stored on the computer. Caching improves performance by keeping recent or often-used data items in memory locations that are faster or computationally cheaper to access than normal memory stores. When the cache is full, the algorithm must choose which items to discard to make room for the new ones. LRU discards the least recently used items first. This algorithm requires keeping track of what was used when, which is expensive if one wants to make sure the algorithm always discards the least recently used item. General implementations of this technique require keeping "age bits" for cache lines and tracking the "least recently used" cache line based on those age bits. In such an implementation, every time a cache line is used, the age of all other cache lines changes. LRU is actually a family of caching algorithms with members including 2Q by Theodore Johnson and Dennis Shasha, and LRU/K by Pat O'Neil, Betty O'Neil and Gerhard Weikum. LRU, like many other replacement policies, can be characterized using a state transition field in a vector space, which decides the dynamic cache state changes similar to how an electromagnetic field determines the movement of a charged particle placed in it.
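  A minimal sketch of the age-bit idea: touching a line resets its age, every other line ages by one, and the victim is the line with the largest age. A 4-way set and plain integer counters are assumptions; hardware usually approximates this more cheaply.

#include <stdio.h>

#define WAYS 4
static int age[WAYS];

void touch(int way) {
    for (int i = 0; i < WAYS; i++) age[i]++;
    age[way] = 0;                          /* most recently used */
}

int victim(void) {
    int v = 0;
    for (int i = 1; i < WAYS; i++)
        if (age[i] > age[v]) v = i;        /* largest age = least recently used */
    return v;
}

int main(void) {
    touch(0); touch(1); touch(2); touch(3); touch(0);
    printf("evict way %d\n", victim());    /* way 1: oldest since its last use */
    return 0;
}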
  • 14. M for Moore’s law What they teach: Oversimplified: Moore's law is the observation that the number of transistors in a dense integrated circuit (IC) doubles about every two years. Moore's law is an observation and projection of a historical trend. Rather than a law of physics, it is an empirical relationship linked to gains from experience in production. The observation is named after Gordon Moore, the co-founder of Fairchild Semiconductor and Intel (and former CEO of the latter), who in 1965 posited a doubling every year in the number of components per integrated circuit, and projected this rate of growth would continue for at least another decade. In 1975, looking forward to the next decade, he revised the forecast to doubling every two years, a compound annual growth rate (CAGR) of 41%. While Moore did not use empirical evidence in forecasting that the historical trend would continue, his prediction has held since 1975 and has since become known as a "law". Moore's prediction has been used in the semiconductor industry to guide long-term planning and to set targets for research and development, thus functioning to some extent as a self-fulfilling prophecy. Advancements in digital electronics, such as the reduction in quality-adjusted microprocessor prices, the increase in memory capacity (RAM and flash), the improvement of sensors, and even the number and size of pixels in digital cameras, are strongly linked to Moore's law. These step changes in digital electronics have been a driving force of technological and social change, productivity, and economic growth.
  • 15. N for NOPS Instruction What they teach: Oversimplified: In computer science, a NOP, no-op, or NOOP (pronounced "no op"; short for no operation) is a machine language instruction and its assembly language mnemonic, programming language statement, or computer protocol command that does nothing. Some computer instruction sets include an instruction whose explicit purpose is to not change the state of any of the programmer-accessible registers, status flags, or memory. It often takes a well-defined number of clock cycles to execute. In other instruction sets, there is no explicit NOP instruction, but the assembly language mnemonic NOP represents an instruction which acts as a NOP. A NOP must not access memory, as that could cause a memory fault or page fault. A NOP is most commonly used for timing purposes, to force memory alignment, to prevent hazards, to occupy a branch delay slot, to render void an existing instruction such as a jump, as a target of an execute instruction, or as a place-holder to be replaced by active instructions later on in program development (or to replace removed instructions when reorganizing would be problematic or time-consuming). In some cases, a NOP can have minor side effects; for example, on the Motorola 68000 series of processors, the NOP opcode causes a synchronization of the pipeline. You see what we did here? Very few people get this
  • 16. O for Optimization What they teach: Oversimplified: Even though we have achieved a lot of speedup recently, there is still scope for improvement. Though improvements in caches and other CPU structures also have a significant impact on performance, the biggest impact is seen in branch predictors, especially when the predictor is already very good. Consider a predictor with 98% accuracy that is improved to 99% accuracy. This may look like a negligible improvement, but in reality it is huge: it essentially halves the number of mispredictions, which, when calculated, is a major speedup. A similar thing happens if we improve the accuracy from 99.98% to 99.99%. There are other places with scope for improvement as well. The simple pipeline structure assumes one cycle is needed to fetch data from memory, which is often not true; it may take tens or hundreds of cycles, which is very inefficient, and good caches and cache replacement policies can significantly improve performance here as well.
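  A back-of-the-envelope version of the argument above: penalty cycles scale with (1 - accuracy), so going from 98% to 99% halves them. The branch count and the 15-cycle misprediction penalty are assumptions, used only to make the halving visible.

#include <stdio.h>

int main(void) {
    double branches = 200e6;               /* branches executed (illustrative) */
    double penalty  = 15.0;                /* cycles lost per misprediction (assumed) */
    double acc[] = {0.98, 0.99, 0.9998, 0.9999};
    for (int i = 0; i < 4; i++) {
        double wasted = branches * (1.0 - acc[i]) * penalty;
        printf("accuracy %.4f -> %.1f million penalty cycles\n", acc[i], wasted / 1e6);
    }
    return 0;
}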
  • 17. P for Pipelined processor What they teach: Oversimplified: In computer science, instruction pipelining is a technique for implementing instruction-level parallelism within a single processor. Pipelining attempts to keep every part of the processor busy with some instruction by dividing incoming instructions into a series of sequential steps (the eponymous "pipeline") performed by different processor units with different parts of instructions processed in parallel. In a pipelined computer, instructions flow through the central processing unit (CPU) in stages. For example, it might have one stage for each step of the von Neumann cycle: Fetch the instruction, fetch the operands, do the instruction, write the results. A pipelined computer usually has "pipeline registers" after each stage. These store information from the instruction and calculations so that the logic gates of the next stage can do the next step. This arrangement lets the CPU complete an instruction on each clock cycle. It is common for even numbered stages to operate on one edge of the square-wave clock, while odd-numbered stages operate on the other edge. This allows more CPU throughput than a multicycle computer at a given clock rate, but may increase latency due to the added overhead of the pipelining process itself. Also, even though the electronic logic has a fixed maximum speed, a pipelined computer can be made faster or slower by varying the number of stages in the pipeline. With more stages, each stage does less work, and so the stage has fewer delays from the logic gates and could run at a higher clock rate. For the purpose of this course, we consider the 5 stage pipeline whose stages are Instruction Fetch(IF), Instruction Decode(ID), Execute(EX), Memory(MEM) and Write-Back (WB).
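  An idealized cycle count for the 5-stage pipeline described above: with no stalls and one instruction completing per cycle once the pipeline is full, N instructions take about k + N - 1 cycles instead of k * N. The instruction count is illustrative; real pipelines lose some of this to hazards.

#include <stdio.h>

int main(void) {
    int k = 5;                              /* IF, ID, EX, MEM, WB */
    long long N = 1000000;                  /* instructions (illustrative) */
    long long unpipelined = (long long)k * N;
    long long pipelined   = k + N - 1;
    printf("unpipelined: %lld cycles\n", unpipelined);
    printf("pipelined:   %lld cycles (about %.1fx faster in the ideal case)\n",
           pipelined, (double)unpipelined / (double)pipelined);
    return 0;
}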
  • 18. Q for ….. ummm…. How do I explain? You know what? Let's make Q for Questions. If you have any questions so far, feel free to ping any of us on MS Teams and we will try our best to resolve your doubts as soon as possible. ☺
  • 19. R for Read-stall What they teach: Oversimplified: A program often needs to read data from memory, which generally takes a lot of time. Even with caches, the higher-level caches still take a significantly large amount of time to bring the data to the CPU. During this time, a simple pipelined CPU is just stalled, executing NOP instructions. These stalls are called read stalls. A counterpart for writes, called write stalls, also exists. The instruction that has issued a read must wait for it to finish, which often takes a few tens to a few hundreds of cycles. The exact amount depends on a variety of factors such as the cache miss rate, the miss penalty at each level, and also the type of program. A cache-thrashing program will generally have a large number of read-stall cycles.
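  A rough model of the stall cycles mentioned above: stall cycles ≈ memory accesses × miss rate × miss penalty. All the numbers below are assumptions, chosen only to show how quickly stalls dominate for a cache-unfriendly (thrashing) program.

#include <stdio.h>

int main(void) {
    double accesses = 100e6;                    /* loads/stores executed (illustrative) */
    double penalty  = 100.0;                    /* cycles to reach main memory (assumed) */
    double miss_rates[] = {0.02, 0.10, 0.40};   /* well-behaved ... thrashing */
    for (int i = 0; i < 3; i++)
        printf("miss rate %.0f%% -> %.0f million stall cycles\n",
               miss_rates[i] * 100.0, accesses * miss_rates[i] * penalty / 1e6);
    return 0;
}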
  • 20. S for SPEC Benchmark What they teach: Oversimplified: The SPEC benchmark is one of the most popular benchmark suites used for evaluating the performance of a CPU. SPEC stands for Standard Performance Evaluation Corporation. The benchmarks aim to test "real-life" situations. There are several benchmarks testing Java scenarios, from simple computation (SPECjbb) to a full system with Java EE, database, disk, and network (SPECjEnterprise). The SPEC CPU suites test CPU performance by measuring the run time of several programs such as the compiler GCC, the chemistry program gamess, and the weather program WRF. The various tasks are equally weighted; no attempt is made to weight them based on their perceived importance. An overall score is based on a geometric mean. Apart from this, various other benchmarks are also available for evaluating the performance of a CPU.
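  A sketch of the "overall score is a geometric mean" point: each benchmark contributes a ratio (roughly, reference time divided by measured time), and the overall score is the geometric mean of those ratios. The ratios below are made up, and real SPEC reporting has more detail (reference machine, base vs. peak) than this sketch shows.

#include <stdio.h>
#include <math.h>

int main(void) {
    double ratios[] = {8.0, 12.5, 6.3, 15.1};   /* one ratio per benchmark (invented) */
    int n = 4;
    double log_sum = 0.0;
    for (int i = 0; i < n; i++)
        log_sum += log(ratios[i]);              /* GM = exp(mean of logs) */
    printf("overall score (geometric mean) = %.2f\n", exp(log_sum / n));
    return 0;
}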
  • 21. T for Trap Instructions What they teach: Oversimplified: A trap instruction is a procedure call that synchronously transfers control. It is a software interrupt generated by the user program, or by an error, when the operating system is needed to perform a system call or some other operation. Thus, a trap instruction is used to switch from the user mode of the system to the kernel mode. A trap is also generated during a context switch between processes by the OS. During a trap, the privilege level of the CPU is raised, and it is set up by the OS to run OS code. For example, the stack changes from the user stack to the kernel stack, the CPU is granted access to several protected data structures hidden from users, and the PC now points to some OS code, depending upon the reason for the trap and the arguments passed to it. In a nutshell, the trap mechanism is responsible for handling all abnormal behavior.
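  A minimal model of the sequence the slide describes: save the user state, raise the privilege level, run the OS handler selected by the trap cause, then drop back to user mode. The cause numbers and handler names are invented for illustration.

#include <stdio.h>

enum cause { SYSCALL = 0, DIV_BY_ZERO = 1 };
static int privileged = 0;                   /* 0 = user, 1 = kernel */

void handle_syscall(void)  { printf("OS: servicing system call\n"); }
void handle_div_zero(void) { printf("OS: divide-by-zero, terminating process\n"); }

void trap(enum cause why) {
    int saved_mode = privileged;             /* save (part of) the user state */
    privileged = 1;                          /* raise privilege: kernel mode, kernel stack, OS code */
    if (why == SYSCALL) handle_syscall();    /* dispatch on the reason for the trap */
    else                handle_div_zero();
    privileged = saved_mode;                 /* return to the interrupted user program */
}

int main(void) {
    trap(SYSCALL);
    trap(DIV_BY_ZERO);
    return 0;
}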
  • 22. U for Unconditional Branches What they teach: Oversimplified: The flow of a program is "go to the next instruction" most of the time during execution. However, branch instructions break this general flow. There are two types of branches: conditional and unconditional. Conditional branches check the truth value of some condition and jump or don't jump based on that value. Unconditional branches, or unconditional jumps, are essentially the branches that are always taken, or in other words, the branch instructions whose next PC is fixed and independent of the state of the CPU (i.e., the values in the registers). These are usually used for making function calls, and variants such as the jump-and-link instruction are used to jump to a code segment and then return from it after it is done.
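  A sketch of the jump-and-link idea: the unconditional jump to the function also saves the address of the following instruction (the "link"), and the function jumps back to it when it is done. The addresses and the fixed 4-byte instruction size are assumptions for illustration.

#include <stdio.h>

static unsigned long pc = 0x1000;           /* program counter */
static unsigned long ra = 0;                /* return-address (link) register */

void jal(unsigned long target) {
    ra = pc + 4;                            /* remember where to come back to */
    pc = target;                            /* unconditional jump: always taken */
}

void jr_ra(void) {
    pc = ra;                                /* return from the function */
}

int main(void) {
    printf("before call: pc = 0x%lx\n", pc);
    jal(0x4000);                            /* call the function at 0x4000 */
    printf("in function: pc = 0x%lx, ra = 0x%lx\n", pc, ra);
    jr_ra();                                /* return */
    printf("after call:  pc = 0x%lx\n", pc);
    return 0;
}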
  • 23. V for Virtual Memory What they teach: Oversimplified: In computing, virtual memory, or virtual storage is a memory management technique that provides an "idealized abstraction of the storage resources that are actually available on a given machine" which "creates the illusion to users of a very large (main) memory". The computer's operating system, using a combination of hardware and software, maps memory addresses used by a program, called virtual addresses, into physical addresses in computer memory. Main storage, as seen by a process or task, appears as a contiguous address space or collection of contiguous segments. The operating system manages virtual address spaces and the assignment of real memory to virtual memory. Address translation hardware in the CPU, often referred to as a memory management unit (MMU), automatically translates virtual addresses to physical addresses. Software within the operating system may extend these capabilities, utilizing, e.g., disk storage, to provide a virtual address space that can exceed the capacity of real memory and thus reference more memory than is physically present in the computer. The primary benefits of virtual memory include freeing applications from having to manage a shared memory space, ability to share memory used by libraries between processes, increased security due to memory isolation, and being able to conceptually use more memory than might be physically available, using the technique of paging or segmentation.
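  A sketch of the address translation the MMU performs, using a single-level page table: the virtual page number indexes the table and the page offset passes through unchanged. The 4 KiB page size and the tiny table are assumptions; real systems use multi-level tables and a TLB.

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define PAGE_SIZE 4096
#define NUM_PAGES 16

struct pte { bool valid; uint64_t frame; };
static struct pte page_table[NUM_PAGES];

int translate(uint64_t vaddr, uint64_t *paddr) {
    uint64_t vpn    = vaddr / PAGE_SIZE;
    uint64_t offset = vaddr % PAGE_SIZE;
    if (vpn >= NUM_PAGES || !page_table[vpn].valid)
        return -1;                          /* page fault: the OS must step in */
    *paddr = page_table[vpn].frame * PAGE_SIZE + offset;
    return 0;
}

int main(void) {
    page_table[3] = (struct pte){ true, 42 };   /* virtual page 3 -> physical frame 42 */
    uint64_t pa;
    if (translate(3 * PAGE_SIZE + 0x10, &pa) == 0)
        printf("virtual 0x%x -> physical 0x%llx\n",
               3 * PAGE_SIZE + 0x10, (unsigned long long)pa);
    if (translate(9 * PAGE_SIZE, &pa) != 0)
        printf("page fault on unmapped page 9\n");
    return 0;
}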
  • 24. W for Write-back What they teach: Oversimplified: In the 5-stage pipeline, the final stage is the WB or write-back stage. The job of this stage is to take the output from the ALU or the memory unit, depending upon the type of instruction, and write the value into the target register specified by the instruction. The choice between the ALU result and the MEM result is made using a MUX after the pipeline latch register.
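  A sketch of that write-back multiplexer: a control signal picks either the ALU result or the value loaded from memory as the data written into the destination register. The signal name (mem_to_reg) and the example values are assumptions for illustration.

#include <stdio.h>
#include <stdint.h>

uint32_t writeback_mux(uint32_t alu_result, uint32_t mem_data, int mem_to_reg) {
    return mem_to_reg ? mem_data : alu_result;   /* the 2-to-1 MUX */
}

int main(void) {
    /* an ALU instruction (e.g. add) writes back the ALU result */
    printf("ALU instruction writes  %u\n", writeback_mux(7, 99, 0));
    /* a load instruction writes back the value fetched from memory */
    printf("load instruction writes %u\n", writeback_mux(7, 99, 1));
    return 0;
}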
  • 25. X for ….. You know right? I am out of ideas now. How about X for Xtra Questions?
  • 26. Y for Yahoo! We are almost done! What I want to say: Oversimplified: As we arrive at the letter Y, it's not hard to see that we are nearing the end of this crash course, and there is just one more slide to go. Honestly, there is nothing more to say here. We couldn't find anything for the letter Y either, so this slide is just random text from here on, because we have to fill this side of the slide entirely in order to maintain consistency throughout the slides. I don't know, congratulations I guess? For making it through the entire course. I think Y for "Ye hamari pawri ho rahi hai" (the "this is us having a party" meme). ☺
  • 27. Z for Zero Register What they teach: Oversimplified: The zero register is a special register whose value is hardwired to zero. This register is often used for comparisons with zero in branch instructions, or simply to provide the value zero wherever it is needed. This makes the value zero easily accessible at all times, without needing to load it into some temporary register. It is also used for the NOP instruction: although the exact encoding varies across architectures, 'add $0 $0 $0' can be used as the NOP instruction.
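  A sketch of a register file with a hardwired zero register: reads of register 0 always return 0 and writes to it are silently dropped, which is exactly why 'add $0 $0 $0' is a harmless NOP. The 32-register file is an assumption matching common RISC designs.

#include <stdio.h>
#include <stdint.h>

static uint32_t regs[32];

uint32_t read_reg(int r)          { return r == 0 ? 0 : regs[r]; }
void     write_reg(int r, uint32_t v) { if (r != 0) regs[r] = v; }   /* writes to $0 are ignored */

int main(void) {
    write_reg(0, 12345);                     /* attempt to overwrite $0: dropped */
    write_reg(5, read_reg(0) + 7);           /* $5 = 0 + 7, using $0 as the constant zero */
    printf("$0 = %u, $5 = %u\n", read_reg(0), read_reg(5));
    return 0;
}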
  • 28. Created by: Me and the bois
  • 29. We are evolving, just like the CPUs • For the screen lovers, we have created a telegram chat bot (coz why not?). • Here is the link: CA Simplified bot • Here is the link to video demo of the bot (Available in our submission drive folder as well): Demo Video
  • 30. And finally, Thanks for reading What they teach: Oversimplified: Merci. धन्यवाद। Shukriya. Gracias. Shukran. Xièxiè. Abhari Ahe. Thaagatchari. Terima kasih. Nandri. Dhanyavaadaalu. Anugrihtaasmi. Dhonnobad. 감사해요. Teşekkürler. Dankie. Takk. 感謝. Grazie. Tatenda. Asante. Ďakujem. Kösz. Au Kun. Met Dank. Vďaka. Choukran. Bohoma Istuti. Teşekkür ederim. Hvala. Npezié. ευχαριστώ. Doh Jeh. Go raibh maith agat. ਧੰਨਵਾਦ. Thanks, Thanks, And Thanks. <3