The x86 instruction set architecture began with Intel's 16-bit 8086 processor in the late 1970s and has since evolved through numerous extensions. It supports multiple execution modes, including 16-bit real mode, 32-bit protected mode, and 64-bit long mode. The instruction format consists of optional prefixes, opcode bytes, addressing fields, and immediate data. Operands are held in general-purpose registers or reached through memory addressing modes. Later x86 architectures, such as AMD64, widened the registers and added new instructions while maintaining backward compatibility.
2. Stored-program computer
Stores program instructions in electronic memory, where programs and data in memory can be treated
interchangeably or uniformly.
von Neumann architecture
• also known as the von Neumann model and Princeton architecture, after 1945 work by John von
Neumann and others in the First Draft of a Report on the EDVAC
• stores program data and instruction data in the same memory
• consists of:
processing unit (arithmetic logic unit and processor registers)
control unit (instruction register and program counter)
memory (for data and instructions)
external mass storage
input and output mechanisms
• instruction fetch and a data operation cannot occur concurrently because they share a common bus;
referred to as the von Neumann bottleneck which often limits system performance.
3. Harvard architecture
• Based on the Harvard Mark I
• Data and instruction are stored in entirely separate memory systems
• CPU can fetch next instruction and load or store data simultaneously and independently
4. Modified Harvard architecture
• loose separation between code and data
• contents of the instruction memory can be accessed as if it were data.
• implemented on most modern CPU architectures
Implementation Modifications
• Split-cache (or Almost-von-Neumann) architecture
• builds memory hierarchy with a CPU cache separating instructions and data;
• unifies all except small portions of the data and instruction address spaces, providing the von
Neumann model
• cache coherency issues matter since they can greatly affect performance
• Instruction-memory-as-data architecture
• Preserves Harvard memory separation, but provides special machine operations to access the
contents of the instruction memory as data.
• Data-memory-as-instruction architecture
• can execute instructions fetched from any memory segment
• can read an instruction and read a data value simultaneously if they're in separate memory segments
with independent data buses (like Harvard).
• when executing an instruction from one memory segment, the same memory segment cannot be
simultaneously accessed as data
5. Three characteristics to distinguish modified Harvard machines from pure Harvard and von Neumann machines:
Instruction and data memories occupy different address spaces
• Pure Harvard: separate address "zero" in instruction space and in data space
• Von Neumann: stores both instructions and data in a single address space
• Modified Harvard: separate address "zero" in instruction space and in data space
Instruction and data memories have separate hardware pathways to the central processing unit (CPU)
• Pure Harvard: separate pathways for instruction and data memories to CPU
• Von Neumann: unified address space
• Modified Harvard: separate access paths for CPU caches or other tightly coupled memories, but a unified address space covers the rest of the memory hierarchy
Instruction and data memories may be accessed in different ways
• Pure Harvard: stored instructions on a punched paper tape and data in electro-mechanical counters
• Von Neumann and Modified Harvard: provide uniform access to flash memory and SRAM
6. Basic properties of the x86 architecture
• General consensus suggests that x86 is a modified Harvard architecture.
• The x86 architecture has variable instruction length (typically 2 or 3 bytes; some instructions are a single byte, others up to
15 bytes).
• Primarily "CISC" design with emphasis on backward compatibility.
• The instruction set is not typical CISC, but an extended version of the simple eight-bit 8008 and 8080
architectures
• Byte-addressing is enabled and words are stored in memory with little-endian byte order (LSB first)
• Memory access to unaligned addresses is allowed for all valid word sizes
• Native integer sizes for arithmetic and memory addresses (or offsets) are 16, 32 or 64 bits depending on
architecture generation
• Multiple scalar values can be handled simultaneously via the SIMD unit (SSE, starting with the Pentium III)
• Floating point instructions and registers for floating point operations (on a separate co-processor prior to the
80486, built in ever since)
• SIMD (single instruction, multiple data) instructions work on (one or two) 128-bit words, each containing
two or four floating point numbers (each 64 or 32 bits wide respectively), or alternatively, 2, 4, 8 or 16
integers (each 64, 32, 16 or 8 bits wide respectively).
• Superscalar execution arrived with the Pentium; later designs (from the Pentium Pro onward) added extra
decoding steps that split most instructions into micro-operations, which are buffered and scheduled by a
control unit and executed, partly in parallel, by one of several execution units.
• Out-of-order and speculative execution uses branch prediction, register renaming, and memory
dependence prediction to allow execution of multiple x86 instructions simultaneously and not in the same
order as given in the instruction stream.
• Simultaneous multithreading
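The little-endian byte order mentioned in the list above can be illustrated with a minimal Python sketch. The helper name `to_little_endian` is hypothetical and only models how a word would be laid out in memory; it is not part of any x86 tooling.

```python
def to_little_endian(value: int, width_bytes: int) -> bytes:
    """Return the bytes of `value` in the order x86 stores them (LSB first)."""
    return value.to_bytes(width_bytes, byteorder="little")


word = to_little_endian(0x1234, 2)        # 16-bit word
dword = to_little_endian(0x12345678, 4)   # 32-bit doubleword

# The least significant byte sits at the lowest address.
print(word.hex(" "))
print(dword.hex(" "))
```

Reading the bytes back with `int.from_bytes(dword, "little")` recovers the original value.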
7. x86 REGISTERS
16-bit
• The original Intel 8086 and 8088 have fourteen 16-bit registers.
• Four are general-purpose registers (GPRs): AX, BX, CX, DX; Each can be accessed as two separate bytes
(the high byte and low byte)
• Two pointer registers have special roles: SP (stack pointer) points to the "top" of the stack, BP (base
pointer) is used to point anywhere on the stack.
• The address/index registers: SI, DI, BX and BP
• Four segment registers : CS, DS, SS and ES (used to form a memory address in segmented memory
mode)
• The FLAGS register contains, among others, carry flag (CF), overflow flag (OF) and zero flag (ZF).
• The instruction pointer (IP) points to the next instruction that will be fetched from memory and then
executed; it cannot be accessed directly by software.
• three special registers (GDTR, LDTR, IDTR) hold descriptor table addresses to support protected mode in
80286 and a fourth task register (TR) is used for task switching.
8. 32-bit
• 32-bit processor (starting with 80386) expanded the 16-bit GPRs, base and index registers, instruction
pointer, and FLAGS register to 32 bits (segment registers not affected)
• Represented by prefixing an "E" (for "extended") to the register names in x86 assembly language.
• The general-purpose, base, and index registers can all be used as the base in addressing modes, and all of
those registers except for the stack pointer can be used as the index in addressing modes.
• Two new segment registers (FS and GS) were added.
• the machine code format was expanded to accommodate expanded registers.
• A 32-bit control/status register (MXCSR) was added with the Streaming SIMD Extensions (SSE), starting with
the Pentium III.
64-bit
• 32-bit registers are expanded into 64-bit registers (introduced with AMD Opteron)
• addressing extended to 64 bits
• An R-prefix identifies the 64-bit registers (RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP, RFLAGS, RIP)
• eight additional 64-bit general registers (R8-R15) were also introduced (only usable in 64-bit mode, which
is one of the two modes only available in long mode)
• extra addressing mode allows memory references relative to RIP (the instruction pointer), to ease the
implementation of position-independent code, used in shared libraries in some operating systems.
Miscellaneous/special purpose
• 32-bit x86 processors (starting with the 80386) also include various special/miscellaneous registers:
• control registers (CR0 through 4, CR8 for 64-bit only)
• debug registers (DR0 through 3, plus 6 and 7)
• test registers (TR3 through 7; 80486 only)
• model-specific registers (MSRs, appearing with the Pentium)
9. 80-bit
• Available in all floating point units (FPU) also known as math co-processors
• They appear to the programmer as part of the CPU
• Co-processors and their host CPUs: 8087 (for the 8086, 8088, 80186, and 80188), 80287 (for the 80286),
80387 (for the 80386); built in starting with the 80486
• eight 80-bit wide registers: st(0) to st(7)
• each register holds numeric data in one of seven formats: 32-, 64-, or 80-bit floating point, 16-, 32-, or 64-
bit (binary) integer, and 80-bit packed decimal integer
• The Pentium MMX added eight 64-bit MMX integer registers (MM0 to MM7, which share their bits with the
lower 64 bits of the 80-bit-wide FPU stack registers).
128-bit
• SIMD registers XMM0–XMM15.
256-bit
• SIMD registers YMM0–YMM15.
• introduced with Intel's Sandy Bridge processors, SIMD registers widened to 256 bits; AVX (Advanced
Vector Extensions) instructions also introduced.
512-bit
• SIMD registers ZMM0–ZMM31. Used by Knights Corner (on Intel Xeon Phi co-processors)
10. General Purpose Registers (A, B, C and D)
Bits 63-0: R?X
Bits 31-0: E?X
Bits 15-0: ?X
Bits 15-8: ?H, bits 7-0: ?L
General Purpose
• AL/AH/AX/EAX/RAX: Accumulator
• BL/BH/BX/EBX/RBX: Base index (for use with arrays)
• CL/CH/CX/ECX/RCX: Counter (for use with loops and strings)
• DL/DH/DX/EDX/RDX: Extend the precision of the accumulator (e.g. combine 32-bit EAX and EDX for 64-bit
integer operations in 32-bit code)
R8-R15 (for 64-bit CPUs)
64-bit mode-only General Purpose Registers
(R8, R9, R10, R11, R12, R13, R14, R15)
Bits 63-0: ? (the full register, R8 to R15)
Bits 31-0: ?D
Bits 15-0: ?W
Bits 7-0: ?B
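The way the narrower register names alias the low bits of the full 64-bit register can be sketched in Python. This is only a model of the layout described above, using masks and shifts; the function name `accumulator_views` is made up for illustration.

```python
def accumulator_views(rax: int) -> dict:
    """Return the sub-register views of a 64-bit accumulator value."""
    rax &= 0xFFFFFFFFFFFFFFFF          # keep 64 bits
    return {
        "RAX": rax,                    # bits 63-0
        "EAX": rax & 0xFFFFFFFF,       # bits 31-0
        "AX": rax & 0xFFFF,            # bits 15-0
        "AH": (rax >> 8) & 0xFF,       # bits 15-8
        "AL": rax & 0xFF,              # bits 7-0
    }


for name, value in accumulator_views(0x1122334455667788).items():
    print(f"{name}: {value:#x}")
```

The same masks apply to the B, C and D registers; for R8 to R15 the ?D, ?W and ?B names select the low 32, 16 and 8 bits in the same way.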
11. Address/Index Registers
• SI/ESI/RSI: Source index for string operations.
• DI/EDI/RDI: Destination index for string operations.
Index Registers (S and D)
Bits 63-0: R?I
Bits 31-0: E?I
Bits 15-0: ?I
Bits 7-0: ?IL
Note: The ?IL registers are only available in 64-bit mode.
Stack Pointer Register
• SP/ESP/RSP: Stack pointer for top address of the stack.
• BP/EBP/RBP: Stack base pointer for holding the address of the current stack frame.
Pointer Registers (S and B)
Bits 63-0: R?P
Bits 31-0: E?P
Bits 15-0: ?P
Bits 7-0: ?PL
Note: The ?PL registers are only available in 64-bit mode.
12. Instruction Pointer Register
• IP/EIP/RIP: Instruction pointer. Holds the program counter, the address of the next instruction to be executed.
Instruction Pointer Register (I)
Bits 63-0: RIP
Bits 31-0: EIP
Bits 15-0: IP
Segment registers
• CS: Code
• DS: Data
• SS: Stack
• ES: Extra data
• FS: Extra data #2
• GS: Extra data #3
Segment Registers (C, D, S, E, F and G)
Bits 15-0: ?S
14. x86 INSTRUCTION SET ARCHITECTURE
• First introduced with Intel 8086 and 8088 16-bit CPUs.
• Used by Intel, AMD, Cyrix, NEC, and Zilog
• Inherited many characteristics and instructions from the previous generation of 8-bit CPUs such as the
8080.
• The modern x86 instruction set is a superset of the 8086 instructions plus a series of extensions to an
instruction set lineage that began with the Intel 8008 microprocessor.
• Nearly full binary backward compatibility (between the Intel 8086 chip through to the current generation of
x86 processors, with certain exceptions)
• In practice, software typically uses instructions that will execute on anything later than an Intel 80386 (or
fully compatible clone) processor, or else on anything later than an Intel Pentium (or compatible clone)
processor. In recent years, various software has come to require at least support for later specific
extensions to the instruction set, e.g., MMX or other SIMD extensions.
15. Basic Instruction Format
• most registers are expressed in opcodes using three or four bits to conserve encoding space;
• at most one operand to an instruction can be a memory location
• memory operand may also be the destination (or a combined source and destination), while the other
operand, the source, can be either register or immediate.
• The relatively small number of general registers (also inherited from its 8-bit ancestors) has made
register-relative addressing (using small immediate offsets) an important method of accessing
operands, especially on the stack, making such accesses as fast as register accesses, i.e. a one cycle
instruction throughput, in most circumstances where the accessed data is available in the top-level
cache.
16. • IA-32E Mode
sub-modes: Compatibility Mode (runs legacy protected-mode applications under a 64-bit operating system) and 64-Bit Mode (full access to the 64-bit address space)
REX Prefixes
REX prefixes are instruction-prefix bytes used in 64-bit mode. They do the following:
• Specify GPRs and SSE registers.
• Specify 64-bit operand size.
• Specify extended control registers.
Not all instructions require a REX prefix in 64-bit mode. A prefix is necessary only if an instruction references one of the extended
registers or uses a 64-bit operand. If a REX prefix is used when it has no meaning, it is ignored. Only one REX prefix is allowed
per instruction. If used, the prefix must immediately precede the opcode byte or the two-byte opcode escape prefix (if present).
Other placements are ignored. The instruction-size limit of 15 bytes still applies to instructions with a REX prefix.
• Instruction format for protected mode, real-address mode, and virtual-8086 mode
The Intel 64 and IA-32 architectures instruction encodings are subsets of the format shown. Instructions consist of optional
instruction prefixes (in any order), primary opcode bytes (up to three bytes), an addressing-form specifier (if required) consisting
of the ModR/M byte and sometimes the SIB (Scale-Index-Base) byte, a displacement (if required), and an immediate data field (if
required)
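As a rough illustration of the REX prefix described above, the sketch below assembles the prefix byte from its four flag bits. The bit layout (a fixed high nibble 0100 followed by W, R, X, B) is taken from Intel's documentation of the REX byte rather than from these slides, and the function name `rex_prefix` is hypothetical.

```python
def rex_prefix(w: int = 0, r: int = 0, x: int = 0, b: int = 0) -> int:
    """Build a REX prefix byte of the form 0100WRXB.

    w: 1 selects a 64-bit operand size
    r, x, b: extension bits that reach the extended registers via the
             ModR/M reg field, the SIB index field, and the ModR/M r/m
             (or SIB base) field respectively
    """
    for bit in (w, r, x, b):
        if bit not in (0, 1):
            raise ValueError("REX flag bits must be 0 or 1")
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b


print(hex(rex_prefix(w=1)))        # 64-bit operand size only
print(hex(rex_prefix(w=1, b=1)))   # 64-bit operand size plus an extended r/m register
```

All sixteen possible values fall in the range 0x40 to 0x4F, which is why a REX prefix occupies a single byte in front of the opcode.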
17. Mnemonics and opcodes
• Each x86 assembly instruction is represented by a mnemonic which, often combined with one or more
operands, translates to one or more bytes called an opcode;
NOP : 0x90
HLT : 0xF4
There are potential opcodes with no documented mnemonic which different processors may interpret
differently, making a program using them behave inconsistently or even generate an exception on some
processors. These opcodes often turn up in code writing competitions as a way to make the code smaller,
faster, more elegant or just show off the author's prowess.
Demonstrates how to find undocumented opcodes in x86 CPUs:
https://www.youtube.com/watch?v=KrksBdWcZgQ
18. Syntax
• x86 assembly language has two main syntax branches:
Intel syntax was originally used for documentation of the x86 platform and is dominant in the MS-DOS and
Windows world (many x86 assemblers use Intel syntax, including NASM, FASM, MASM, TASM, and YASM).
AT&T syntax is dominant in the Unix world, since Unix was created at AT&T Bell Labs.
Summary of the main differences between Intel syntax and AT&T syntax:
Parameter order
• AT&T: Source before the destination. Example: mov $5, %eax
• Intel: Destination before source. Example: mov eax, 5
Parameter size
• AT&T: Mnemonics are suffixed with a letter indicating the size of the operands: q for qword, l for long
(dword), w for word, and b for byte. Example: addl $4, %esp
• Intel: Derived from the name of the register that is used (e.g. rax, eax, ax, al imply q, l, w, b,
respectively). Example: add esp, 4
Sigils
• AT&T: Immediate values prefixed with a "$", registers prefixed with a "%".
• Intel: The assembler automatically detects the type of symbols; i.e., whether they are registers, constants
or something else.
Effective addresses
• AT&T: General syntax of DISP(BASE,INDEX,SCALE). Example: movl mem_location(%ebx,%ecx,4), %eax
• Intel: Arithmetic expressions in square brackets; additionally, size keywords like byte, word, or dword have
to be used if the size cannot be determined from the operands. Example: mov eax, [ebx + ecx*4 + mem_location]
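Both effective-address notations above describe the same computation, displacement + base + index * scale. A minimal Python sketch of that arithmetic for 32-bit addressing follows; the function name and the sample values are made up for illustration.

```python
def effective_address(disp: int, base: int, index: int, scale: int) -> int:
    """Compute disp + base + index * scale, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (disp + base + index * scale) & 0xFFFFFFFF


# Models mem_location(%ebx,%ecx,4) or [ebx + ecx*4 + mem_location]
# with mem_location = 0x1000, EBX = 0x20, ECX = 3.
print(hex(effective_address(0x1000, 0x20, 3, 4)))
```

With a scale of 4 the index register walks an array of doublewords, which is the usual reason for this addressing form.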
19. Execution modes
• Real mode (16-bit)
• Original operating mode of early generation x86 CPUs
• Protected mode (16-bit and 32-bit)
• 16-bit subset of instructions are available on the 16-bit x86 processors. These instructions are
available in real mode on all x86 processors, and in 16-bit protected mode (80286 onwards),
additional instructions relating to protected mode are available. On the 80386 and later, 32-bit
instructions (including later extensions) are also available in all modes, including real mode.
• The protected mode of the 80286 was extended to allow the 80386 to address up to 4 GB of memory.
• The 32-bit flat memory model of the 80386 helped drive large scale adoption of Windows 3.1
(which relied on protected mode), since Windows could now run many applications at once,
including DOS applications, by using virtual memory and simple multitasking.
• Virtual 8086 mode (16-bit)
• Virtual 8086 mode (VM86) made it possible to run one or more real mode programs in a protected
environment which emulated real mode, although some programs were not fully compatible.
• System Management Mode (16-bit)
• SMM, with some of its own special instructions, is available on some Intel i386SL, i486 and later
CPUs
• Long mode (64-bit)
• 64-bit instructions, and more registers, are also available. The instruction set is similar in each mode
but memory addressing and word size vary, requiring different programming strategies.
20. Segmented addressing (real, vm86, 80286 protected modes)
• uses a process known as segmentation to address memory
• Segmentation composes a memory address from two parts: a segment and an offset; the segment points to
the beginning of a 64 KB group of addresses and the offset determines how far from this beginning address
the desired address is.
• In segmented addressing, two registers are required for a complete memory address: one to hold the
segment, the other to hold the offset. In order to translate back into a flat address, the segment value is
shifted four bits left (equivalent to multiplication by 2^4 or 16) then added to the offset to form the full address,
which allows breaking the 64k barrier through clever choice of addresses, though it makes programming
considerably more complex.
Example:
DS = 0xDEAD, DX = 0xCAFE
memory address = 0xDEAD * 0x10 + 0xCAFE = 0xEB5CE.
Therefore, the CPU can address up to 1,048,576 bytes (1 MB) in real mode.
• By combining segment and offset values we find a 20-bit address.
• When referring to an address with a segment and an offset the notation of segment:offset is used, so in the
above example the flat address 0xEB5CE can be written as 0xDEAD:0xCAFE or as a segment and offset
register pair; DS:DX.
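The segment:offset translation worked through above can be checked with a few lines of Python. This is a sketch of the arithmetic only; the function name `linear_address` is hypothetical.

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode translation: shift the 16-bit segment left by 4 bits, add the 16-bit offset."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)


print(hex(linear_address(0xDEAD, 0xCAFE)))   # the DS:DX example from the slide
print(hex(linear_address(0xEB5C, 0x000E)))   # a different pair naming the same byte
```

Because segments start every 16 bytes but span 64 KB, many different segment:offset pairs refer to the same flat address.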
21. • There are some special combinations of segment registers and general registers that point to important
addresses:
CS:IP (CS is Code Segment, IP is Instruction Pointer)
points to the address where the processor will fetch the next byte of code.
SS:SP (SS is Stack Segment, SP is Stack Pointer)
points to the address of the top of the stack, i.e. the most recently pushed byte.
DS:SI (DS is Data Segment, SI is Source Index)
is often used to point to string data that is about to be copied to ES:DI.
ES:DI (ES is Extra Segment, DI is Destination Index)
is typically used to point to the destination for a string copy, as mentioned above.
• In 80286 protected mode (utilized by OS/2)
The 80286 had 16-bit address registers, limiting a single segment to 2^16 bytes (64 kilobytes) of addressable space.
In protected mode, the CPU can use 24-bit addressing to access 2^24 bytes of memory (16 megabytes).
• In protected mode, the segment selector can be broken down into three parts: a 13-bit index, a Table
Indicator bit that determines whether the entry is in the GDT or LDT and a 2-bit Requested Privilege
Level
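The three selector fields listed above can be pulled apart with shifts and masks. The sketch below assumes the usual layout, with the RPL in bits 1-0, the Table Indicator in bit 2 (0 for GDT, 1 for LDT) and the index in bits 15-3; the function name and sample selector values are illustrative only.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit protected-mode segment selector into its fields."""
    selector &= 0xFFFF
    return {
        "index": selector >> 3,                        # 13-bit descriptor table index
        "table": "LDT" if selector & 0x4 else "GDT",   # Table Indicator bit
        "rpl": selector & 0x3,                         # 2-bit Requested Privilege Level
    }


for sel in (0x0008, 0x002B, 0x000F):
    print(hex(sel), decode_selector(sel))
```

A 13-bit index means each descriptor table can hold up to 8192 entries.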
23. Stack instructions
PUSH src/immed
Decrements SP by the size of the operand (two or four, byte values are sign extended) and transfers one word
from source to the stack top (SS:SP).
POP dest
Transfers word at the current stack top (SS:SP) to the destination then increments SP by two to point to the
new stack top. CS is not a valid destination.
PUSHA
PUSHAD (386+)
Pushes all general purpose registers onto the stack in the following order: (E)AX, (E)CX, (E)DX, (E)BX, (E)SP,
(E)BP, (E)SI, (E)DI. The value of SP is the value before the actual push of SP.
POPA
POPAD (386+)
Pops the top 8 words off the stack into the 8 general purpose 16/32 bit registers. Registers are popped in the
following order: (E)DI, (E)SI, (E)BP, (E)SP, (E)BX, (E)DX, (E)CX and (E)AX. The (E)SP value popped from the stack
is actually discarded.
POPF
POPFD (386+)
Pops word / doubleword from stack into the Flags Register and then increments SP by 2 (for POPF) or 4 (for
POPFD).
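The stack behaviour described on this slide can be modelled in a few lines of Python. This is a simplified sketch, assuming 16-bit operands only and a single 64 KB stack segment; the class and function names (Stack16, pusha, popa) are hypothetical.

```python
class Stack16:
    """Toy model of a 16-bit stack inside one 64 KB segment (addressed via SS)."""

    def __init__(self, sp=0x0100):
        self.mem = bytearray(0x10000)
        self.sp = sp

    def push(self, value):
        self.sp = (self.sp - 2) & 0xFFFF                  # SP is decremented first
        self.mem[self.sp] = value & 0xFF                  # low byte at the lower address
        self.mem[(self.sp + 1) & 0xFFFF] = (value >> 8) & 0xFF

    def pop(self):
        value = self.mem[self.sp] | (self.mem[(self.sp + 1) & 0xFFFF] << 8)
        self.sp = (self.sp + 2) & 0xFFFF                  # SP is incremented afterwards
        return value


PUSHA_ORDER = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def pusha(stack, regs):
    original_sp = stack.sp                                # value of SP before any push
    for name in PUSHA_ORDER:
        stack.push(original_sp if name == "SP" else regs[name])


def popa(stack, regs):
    for name in reversed(PUSHA_ORDER):                    # DI, SI, BP, SP, BX, DX, CX, AX
        value = stack.pop()
        if name != "SP":                                  # the saved SP is discarded
            regs[name] = value
```

Note that POPA restores SP only as a side effect of the eight pops, not from the saved value.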
24. Integer ALU instructions
standard mathematical operations:
ADD dest,src Modifies flags: AF CF OF PF SF ZF
Adds "src" to "dest" and replaces the original contents of "dest". Both operands are binary.
SUB dest,src Modifies flags: AF CF OF PF SF ZF
The source is subtracted from the destination and the result is stored in the destination.
MUL src Modifies flags: CF OF (AF,PF,SF,ZF undefined)
Unsigned multiply of the accumulator by the source. If "src" is a byte value, then AL is used as the other
multiplicand and the result is placed in AX. If "src" is a word value, then AX is multiplied by "src" and DX:AX
receives the result. If "src" is a double word value, then EAX is multiplied by "src" and EDX:EAX receives the
result. The 386+ uses an early out algorithm which makes multiplying any size value in EAX as fast as in the
8 or 16 bit registers.
DIV src Modifies flags: (AF,CF,OF,PF,SF,ZF undefined)
Unsigned binary division of accumulator by source. If the source divisor is a byte value then AX is divided by
"src" and the quotient is placed in AL and the remainder in AH. If the source operand is a word value, then DX:AX
is divided by "src" and the quotient is stored in AX and the remainder in DX. If the source operand is a double
word value (386+), then EDX:EAX is divided by "src" and the quotient is stored in EAX and the remainder in EDX.
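As a rough illustration of how the word-sized MUL and DIV use the DX:AX register pair, the following Python sketch models both. The function names are hypothetical, and the divide-error cases are modelled as Python exceptions rather than as the processor's divide-error interrupt.

```python
def mul16(ax, src):
    """Unsigned 16-bit MUL: returns (DX, AX, CF) where CF mirrors OF."""
    product = (ax & 0xFFFF) * (src & 0xFFFF)
    dx, ax = product >> 16, product & 0xFFFF
    carry = int(dx != 0)          # CF and OF are set when the upper half is non-zero
    return dx, ax, carry


def div16(dx, ax, src):
    """Unsigned 16-bit DIV of DX:AX by src: returns (quotient in AX, remainder in DX)."""
    src &= 0xFFFF
    if src == 0:
        raise ZeroDivisionError("divide error")
    dividend = ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)
    quotient, remainder = divmod(dividend, src)
    if quotient > 0xFFFF:         # the quotient must fit in AX
        raise OverflowError("divide error: quotient does not fit in AX")
    return quotient, remainder
```

A common source of divide errors is forgetting to clear DX before a word-sized DIV, which the overflow check above makes visible.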
25. logical operators:
AND dest,src Modifies flags: CF OF PF SF ZF (AF undefined)
Performs a logical AND of the two operands replacing the destination with the result.
OR dest,src Modifies flags: CF OF PF SF ZF (AF undefined)
Logical inclusive OR of the two operands returning the result in the destination. Any bit set in either
operand will be set in the destination.
XOR dest,src Modifies flags: CF OF PF SF ZF (AF undefined)
Performs a bitwise exclusive OR of the operands and returns the result in the destination.
NEG dest Modifies flags: AF CF OF PF SF ZF
Subtracts the destination from 0 and saves the two's complement of "dest" back into "dest".
26. bitshift arithmetic and logical:
SAL dest,count Modifies flags: CF OF PF SF ZF (AF undefined)
SHL dest,count
.-. .---------------. .-.
|C|<----|7 <---------- 0|<----|0|
'-' '---------------' '-'
Shifts the destination left by "count" bits with zeroes shifted in on right. The Carry Flag contains the last bit
shifted out.
SAR dest,count Modifies flags: CF OF PF SF ZF (AF undefined)
.---------------. .-.
.--|7 ----------> 0|---->|C|
| '---------------' '-'
'---^
Shifts the destination right by "count" bits with the current sign bit replicated in the leftmost bit. The Carry
Flag contains the last bit shifted out.
SHR dest,count Modifies flags: CF OF PF SF ZF (AF undefined)
.-. .---------------. .-.
|0|---->|7 ----------> 0|---->|C|
'-' '---------------' '-'
Shifts the destination right by "count" bits with zeroes shifted in on the left. The Carry Flag contains the last
bit shifted out.
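The three shift diagrams above can be compared with a small Python model on 8-bit values. This is a sketch under stated assumptions: the function names are hypothetical, "count" is assumed to be at least 1 (the model returns CF=0 for a count of 0, whereas the real flag would be left unchanged), and only the result and the Carry Flag are modelled.

```python
def shl8(value, count):
    """SHL/SAL on an 8-bit value: returns (result, CF)."""
    value &= 0xFF
    carry = 0
    for _ in range(count):
        carry = (value >> 7) & 1            # bit 7 moves into CF
        value = (value << 1) & 0xFF         # a zero enters at bit 0
    return value, carry


def shr8(value, count):
    """SHR on an 8-bit value: returns (result, CF)."""
    value &= 0xFF
    carry = 0
    for _ in range(count):
        carry = value & 1                   # bit 0 moves into CF
        value >>= 1                         # a zero enters at bit 7
    return value, carry


def sar8(value, count):
    """SAR on an 8-bit value: returns (result, CF)."""
    value &= 0xFF
    carry = 0
    for _ in range(count):
        carry = value & 1                   # bit 0 moves into CF
        value = (value >> 1) | (value & 0x80)   # the sign bit is replicated
    return value, carry
```

The only difference between SHR and SAR is what enters on the left, which is why SAR preserves the sign of a two's complement value.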
27. rotate with and without carry:
RCL dest,count Modifies flags: CF OF
.-. .---------------.
.--|C|<----|7 <---------- 0|<-.
| '-' '---------------' |
'-----------------------------'
Rotates the bits in the destination to the left "count" times through the Carry Flag: each bit pushed out the left
side moves into the Carry Flag, and the previous Carry Flag value re-enters on the right. The Carry Flag holds the
last bit rotated out.
RCR dest,count Modifies flags: CF OF
.---------------. .-.
.->|7 ----------> 0|---->|C|--.
| '---------------' '-' |
'-----------------------------'
Rotates the bits in the destination to the right "count" times through the Carry Flag: each bit pushed out the right
side moves into the Carry Flag, and the previous Carry Flag value re-enters on the left. The Carry Flag holds the
last bit rotated out.
ROL dest,count Modifies flags: CF OF
.-. .---------------.
|C|<-.--|7 <---------- 0|<-.
'-' | '---------------' |
'---------------------'
Rotates the bits in the destination to the left "count" times with all data pushed out the left side re-entering on
the right. The Carry Flag will contain the value of the last bit rotated out.
ROR dest,count Modifies flags: CF OF
.---------------. .-.
.->|7 ----------> 0|--.->|C|
| '---------------' | '-'
'---------------------'
Rotates the bits in the destination to the right "count" times with all data pushed out the right side re-entering on
the left. The Carry Flag will contain the value of the last bit rotated out.
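To make the difference between rotating with and without carry concrete, here is a minimal Python model of ROL and RCL on 8-bit values. The function names are hypothetical and only the result and the Carry Flag are modelled; OF is ignored.

```python
def rol8(value, count):
    """ROL on an 8-bit value: returns (result, CF). CF only copies the bit rotated out."""
    value &= 0xFF
    carry = 0
    for _ in range(count):
        carry = (value >> 7) & 1
        value = ((value << 1) & 0xFF) | carry    # bit 7 re-enters at bit 0
    return value, carry


def rcl8(value, carry, count):
    """RCL on an 8-bit value: CF takes part in the rotation as a ninth bit."""
    value &= 0xFF
    for _ in range(count):
        new_carry = (value >> 7) & 1
        value = ((value << 1) & 0xFF) | carry    # the old CF re-enters at bit 0
        carry = new_carry
    return value, carry
```

Because RCL rotates a 9-bit quantity (8 data bits plus CF), nine rotations restore the original state, while ROL needs only eight.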
28. complement of BCD arithmetic instructions / others
AAA Modifies flags: AF CF (OF,PF,SF,ZF undefined)
Changes contents of AL to valid unpacked decimal. The high order nibble is zeroed.
AAD Modifies flags: SF ZF PF (AF,CF,OF undefined)
Used before dividing unpacked decimal numbers. Multiplies AH by 10 and then adds the result to AL. Sets AH to
zero. This instruction is also known to have an undocumented behavior.
AL := 10*AH+AL
AH := 0
AAM Modifies flags: PF SF ZF (AF,CF,OF undefined)
AH := AL / 10
AL := AL mod 10
Used after multiplication of two unpacked decimal numbers, this instruction adjusts an unpacked decimal
number. The high order nibble of each byte must be zeroed before using this instruction. This instruction is
also known to have an undocumented behavior.
AAS Modifies flags: AF CF (OF,PF,SF,ZF undefined)
Corrects result of a previous unpacked decimal subtraction in AL. High order nibble is zeroed.
DAA Modifies flags: AF CF PF SF ZF (OF undefined)
Corrects result (in AL) of a previous BCD addition operation. Contents of AL are changed to a pair of packed
decimal digits.
DAS Modifies flags: AF CF PF SF ZF (OF undefined)
Corrects result (in AL) of a previous BCD subtraction operation. Contents of AL are changed to a pair of
packed decimal digits.
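The adjust instructions are easier to follow with numbers. The sketch below models AAM, AAD and DAA in Python; the function names are hypothetical, and the DAA logic follows the commonly documented adjustment rule (add 6 if the low nibble overflowed, add 60h if the high nibble overflowed), stated here as an assumption rather than taken from the slides.

```python
def aam(al):
    """AAM: split a binary value in AL into two unpacked digits, returns (AH, AL)."""
    return al // 10, al % 10


def aad(ah, al):
    """AAD: combine two unpacked digits into a binary value, returns (AH, AL)."""
    return 0, (ah * 10 + al) & 0xFF


def daa(al, cf=0, af=0):
    """DAA: correct AL after adding two packed BCD bytes, returns (AL, CF, AF)."""
    old_al = al & 0xFF
    al = old_al
    if (al & 0x0F) > 9 or af:        # low digit overflowed
        al = (al + 6) & 0xFF
        af = 1
    else:
        af = 0
    if old_al > 0x99 or cf:          # high digit overflowed
        al = (al + 0x60) & 0xFF
        cf = 1
    else:
        cf = 0
    return al, cf, af
```

For example, adding the packed BCD bytes 38h and 45h gives 7Dh, which DAA corrects to 83h (38 + 45 = 83).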
29. Data manipulation instructions
data transfer instructions
MOV dest,src
Copies a byte or word from the source operand to the destination operand. If the destination is SS, interrupts
are disabled until after the next instruction, except on early buggy 808x CPUs. Some CPUs disable interrupts if
the destination is any of the segment registers.
XCHG dest,src
Exchanges contents of source and destination.
MOVSX dest,src
Copies the value of the source operand to the destination register with the sign extended.
MOVZX dest,src
Copies the value of the source operand to the destination register, zero-extending it to the destination size.
CMPXCHG dest,src (486+) Modifies flags: AF CF OF PF SF ZF
Compares the accumulator (8-32 bits) with "dest". If equal the "dest" is loaded with "src", otherwise the
accumulator is loaded with "dest".
CWD
Extends sign of word in register AX throughout register DX forming a doubleword quantity in DX:AX.
CDQ
Converts signed DWORD in EAX to a signed quad word in EDX:EAX by extending the high order bit of EAX
throughout EDX
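The extension and compare-and-exchange instructions on this slide can be sketched in Python as follows. The function names are hypothetical, and only the 8-to-16-bit and 16-bit cases are shown as representative examples.

```python
def movsx_8_to_16(src):
    """MOVSX: sign-extend an 8-bit value to 16 bits."""
    src &= 0xFF
    return (src | 0xFF00) if (src & 0x80) else src


def movzx_8_to_16(src):
    """MOVZX: zero-extend an 8-bit value to 16 bits."""
    return src & 0xFF


def cwd(ax):
    """CWD: returns (DX, AX) with DX filled with copies of the sign bit of AX."""
    ax &= 0xFFFF
    dx = 0xFFFF if (ax & 0x8000) else 0x0000
    return dx, ax


def cmpxchg(accumulator, dest, src):
    """CMPXCHG: returns (new accumulator, new dest, ZF)."""
    if accumulator == dest:
        return accumulator, src, 1      # equal: dest is loaded with src
    return dest, dest, 0                # not equal: accumulator is loaded with dest
```

CWD is typically used to prepare DX:AX before a signed division (IDIV) of a 16-bit value.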
30. string/array instructions
MOVS dest,src
MOVSB
MOVSW
MOVSD (386+)
Copies data from the location addressed by DS:SI (even if operands are given) to the destination at ES:DI and
updates SI and DI based on the size of the operand or instruction used. SI and DI are incremented when the
Direction Flag is cleared and decremented when the Direction Flag is set. Use with REP prefixes.
CMPS dest,src Modifies flags: AF CF OF PF SF ZF
CMPSB
CMPSW
CMPSD (386+)
Subtracts destination value from source without saving results. Updates flags based on the subtraction and
the index registers (E)SI and (E)DI are incremented or decremented depending on the state of the Direction
Flag. CMPSB inc/decrements the index registers by 1, CMPSW inc/decrements by 2, while CMPSD
increments or decrements by 4. The REP prefixes can be used to process entire data items.
31. SCAS string Modifies flags: AF CF OF PF SF ZF
SCASB
SCASW
SCASD (386+)
Compares the value at ES:DI (even if an operand is specified) with the accumulator and sets the flags as for a
subtraction of that value from the accumulator. DI is incremented/decremented based on the instruction format
(or operand size) and the state of the Direction Flag. Use with REP prefixes.
LODS src
LODSB
LODSW
LODSD (386+)
Transfers string element addressed by DS:SI (even if an operand is supplied) to the accumulator. SI is
incremented based on the size of the operand or based on the instruction used. If the Direction Flag is set
SI is decremented, if the Direction Flag is clear SI is incremented. Use with REP prefixes.
STOS dest
STOSB
STOSW
STOSD (386+)
Stores value in accumulator to location at ES:(E)DI (even if operand is given). (E)DI is
incremented/decremented based on the size of the operand (or instruction format) and the state of the
Direction Flag. Use with REP prefixes.
32. REP
Repeats execution of string instructions while CX != 0. After each string operation, CX is decremented; the
plain REP prefix does not test the Zero Flag. The combination of a repeat prefix and a segment override on CPUs
before the 386 may result in errors if an interrupt occurs before CX=0. The following code shows an instruction
that is susceptible to this and how to work around it:
again: rep movs byte ptr ES:[DI],ES:[SI] ; vulnerable instr.
jcxz next ; continue if REP successful
loop again ; interrupt goofed count
next:
REPE
REPZ
Repeats execution of string instructions while CX != 0 and the Zero Flag is set. CX is decremented and the
Zero Flag tested after each string operation. The combination of a repeat prefix and a segment override on
CPUs before the 386 may result in errors if an interrupt occurs before CX=0.
REPNE
REPNZ
Repeats execution of string instructions while CX != 0 and the Zero Flag is clear. CX is decremented and
the Zero Flag tested after each string operation. The combination of a repeat prefix and a segment override
on CPUs before the 386 may result in errors if an interrupt occurs before CX=0.
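The interaction between the repeat prefixes, CX and the Direction Flag can be illustrated with a small Python model of REP MOVSB and REPNE SCASB. This is a simplified sketch: the function names are hypothetical, DS and ES are assumed to refer to the same flat bytearray, and the model returns ZF=0 when CX is 0 on entry, whereas the real flag would be left unchanged.

```python
def rep_movsb(mem, si, di, cx, df=0):
    """REP MOVSB: copy CX bytes from mem[si] to mem[di], returns (SI, DI, CX)."""
    step = -1 if df else 1                 # DF=1 walks downwards through memory
    while cx != 0:
        mem[di] = mem[si]
        si = (si + step) & 0xFFFF
        di = (di + step) & 0xFFFF
        cx -= 1
    return si, di, cx


def repne_scasb(mem, al, di, cx, df=0):
    """REPNE SCASB: scan for AL while CX != 0 and ZF is clear, returns (DI, CX, ZF)."""
    step = -1 if df else 1
    zf = 0
    while cx != 0:
        zf = int(mem[di] == al)            # compare sets ZF on a match
        di = (di + step) & 0xFFFF          # DI is updated even when a match is found
        cx -= 1
        if zf:
            break
    return di, cx, zf
```

After a successful REPNE SCASB, DI points one element past the match, so the matching byte is at DI-1 when the Direction Flag is clear.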
33. Program flow
conditional jumps
Mnemonic Meaning Jump Condition
JA Jump if Above CF=0 and ZF=0
JAE Jump if Above or Equal CF=0
JB Jump if Below CF=1
JBE Jump if Below or Equal CF=1 or ZF=1
JC Jump if Carry CF=1
JCXZ Jump if CX Zero CX=0
JE Jump if Equal ZF=1
JG Jump if Greater (signed) ZF=0 and SF=OF
JGE Jump if Greater or Equal (signed) SF=OF
JL Jump if Less (signed) SF != OF
JLE Jump if Less or Equal (signed) ZF=1 or SF != OF
JNA Jump if Not Above CF=1 or ZF=1
JNAE Jump if Not Above or Equal CF=1
JNB Jump if Not Below CF=0
JNBE Jump if Not Below or Equal CF=0 and ZF=0
JNC Jump if Not Carry CF=0
JNE Jump if Not Equal ZF=0
JNG Jump if Not Greater (signed) ZF=1 or SF != OF
JNGE Jump if Not Greater or Equal (signed) SF != OF
JNL Jump if Not Less (signed) SF=OF
JNLE Jump if Not Less or Equal (signed) ZF=0 and SF=OF
JNO Jump if Not Overflow (signed) OF=0
JNP Jump if No Parity PF=0
JNS Jump if Not Signed (signed) SF=0
JNZ Jump if Not Zero ZF=0
JO Jump if Overflow (signed) OF=1
JP Jump if Parity PF=1
JPE Jump if Parity Even PF=1
JPO Jump if Parity Odd PF=0
JS Jump if Signed (signed) SF=1
JZ Jump if Zero ZF=1
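The jump conditions in the table are normally evaluated after a CMP, which sets the flags as for a subtraction. The Python sketch below computes those flags for a 16-bit CMP and evaluates a few of the conditions; the names cmp_flags and CONDITIONS are hypothetical, and only a subset of the table is included.

```python
def cmp_flags(a, b, bits=16):
    """Flags produced by CMP a,b (computed as a - b without storing the result)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "CF": int(a < b),                                   # unsigned borrow
        "ZF": int(result == 0),
        "SF": int(bool(result & sign)),
        "OF": int(bool((a ^ b) & (a ^ result) & sign)),     # signed overflow
    }


CONDITIONS = {
    "JA": lambda f: f["CF"] == 0 and f["ZF"] == 0,          # unsigned greater
    "JB": lambda f: f["CF"] == 1,                           # unsigned less
    "JE": lambda f: f["ZF"] == 1,
    "JG": lambda f: f["ZF"] == 0 and f["SF"] == f["OF"],    # signed greater
    "JL": lambda f: f["SF"] != f["OF"],                     # signed less
}

# 0xFFFF compared with 1: above when unsigned (65535 > 1), less when signed (-1 < 1).
flags = cmp_flags(0xFFFF, 1)
taken = {name: test(flags) for name, test in CONDITIONS.items()}
```

This is why the "above/below" mnemonics are used for unsigned values and the "greater/less" mnemonics for signed values.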
34. JCXZ label
JECXZ label (386+)
Causes execution to branch to "label" if register CX (ECX for JECXZ) is zero. Uses unsigned comparison.
JMP target
Unconditionally transfers control to "target". Jumps by default are within -32768 to 32767 bytes from the
instruction following the jump. NEAR and SHORT jumps cause the IP to be updated while FAR jumps
cause CS and IP to be updated.
LEAVE
Releases the local variables created by the previous ENTER instruction by restoring SP and BP to their
condition before the procedure stack frame was initialized.
ENTER locals, level
Modifies stack for entry to procedure for high level language.
Operand "locals" specifies the amount of storage to be allocated on the stack. "Level" specifies the nesting
level of the routine. Paired with the LEAVE instruction, this is an efficient method of entry and exit to
procedures.
35. LOOP label
Decrements CX by 1 and transfers control to "label" if CX is not Zero. The "label" operand must be within
-128 to +127 bytes of the instruction following the loop instruction.
LOOPE label
LOOPZ label
Decrements CX by 1 (without modifying the flags) and transfers control to "label" if CX != 0 and the Zero Flag
is set. The "label" operand must be within -128 to +127 bytes of the instruction following the loop instruction.
LOOPNZ label
LOOPNE label
Decrements CX by 1 (without modifying the flags) and transfers control to "label" if CX != 0 and the Zero Flag
is clear. The "label" operand must be within -128 to +127 bytes of the instruction following the loop instruction.
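Because LOOP decrements CX before testing it, starting a loop with CX = 0 does not skip the loop. The short Python sketch below counts the iterations, assuming the LOOP instruction sits at the end of the body and the body itself does not change CX; the function name is hypothetical.

```python
def loop_iterations(cx):
    """Count how many times a LOOP-terminated body runs for a given initial CX."""
    count = 0
    while True:
        count += 1                   # the body executes before LOOP is reached
        cx = (cx - 1) & 0xFFFF       # LOOP decrements CX (with 16-bit wraparound)
        if cx == 0:
            return count
```

This wraparound is the usual reason for placing a JCXZ in front of a loop whose count may be zero.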
INT num Modifies flags: TF IF
Initiates a software interrupt by pushing the flags, clearing the Trap and Interrupt Flags, pushing CS followed
by IP and loading CS:IP with the value found in the interrupt vector table. Execution then begins at the
location addressed by the new CS:IP
CALL destination
Pushes Instruction Pointer (and Code Segment for far calls) onto stack and loads Instruction Pointer with the
address of "destination". Code continues with execution at CS:IP.
RET/RETF/RETN nBytes
Transfers control from a procedure back to the instruction address saved on the stack. "nBytes" is an optional
number of bytes to release from the stack after the return address has been popped. Far returns pop the IP
followed by the CS, while near returns pop only the IP register.
36. Segment Register Instructions
The segment register instructions allow far pointers (segment addresses) to be loaded into the segment
registers.
LDS dest,src
Loads 32-bit pointer from memory source to destination register and DS. The offset is placed in the
destination register and the segment is placed in DS. To use this instruction the word at the lower memory
address must contain the offset and the word at the higher address must contain the segment. This simplifies
the loading of far pointers from the stack and the interrupt vector table.
LFS dest,src (386+)
Loads 32-bit pointer from memory source to destination register and FS. The offset is placed in the
destination register and the segment is placed in FS. To use this instruction the word at the lower memory
address must contain the offset and the word at the higher address must contain the segment. This simplifies
the loading of far pointers from the stack and the interrupt vector table.
LEA dest,src
Transfers offset address of "src" to the destination register.
LES dest,src
Loads 32-bit pointer from memory source to destination register and ES. The offset is placed in the destination
register and the segment is placed in ES. To use this instruction the word at the lower memory address must
contain the offset and the word at the higher address must contain the segment. This simplifies the loading of
far pointers from the stack and the interrupt vector table.
LSS dest,src (386+)
Loads 32-bit pointer from memory source to destination register and SS. The offset is placed in the destination
register and the segment is placed in SS. To use this instruction the word at the lower memory address must
contain the offset and the word at the higher address must contain the segment. This simplifies the loading of
far pointers from the stack and the interrupt vector table.
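The memory layout expected by LDS/LES/LFS/LSS (offset word at the lower address, segment word at the higher address) can be sketched with Python's struct module. The helper names are hypothetical; the real-mode linear address formula (segment * 16 + offset, wrapped to 20 bits) and the 4-byte spacing of interrupt vectors are assumptions about 8086 real mode that the slides do not state.

```python
import struct


def load_far_pointer(mem, address):
    """Read a 16:16 far pointer stored as little-endian offset word, then segment word."""
    offset, segment = struct.unpack_from("<HH", mem, address)
    return segment, offset


def real_mode_linear(segment, offset):
    """Real-mode linear address of segment:offset (assumed 20-bit address bus)."""
    return ((segment << 4) + offset) & 0xFFFFF


# Example: store a made-up handler address 0F00:1460 in the slot for vector 21h
# of a simulated interrupt vector table, then load it back as LDS/LES would.
ivt = bytearray(1024)
struct.pack_into("<HH", ivt, 4 * 0x21, 0x1460, 0x0F00)
segment, offset = load_far_pointer(ivt, 4 * 0x21)
```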
37. I/O INSTRUCTIONS
These instructions move data between the processor’s I/O ports and a register or memory.
IN accum,port
A byte, word or dword is read from "port" and placed in AL, AX or EAX respectively. If the port number is in the range of 0-255 it
can be specified as an immediate, otherwise the port number must be specified in DX. Valid port ranges on the PC are 0-1023,
though values through 65535 may be specified and recognized by third party vendors and PS/2's.
OUT port,accum
Transfers a byte in AL, word in AX or dword in EAX to the specified hardware port address. If the port number is in the range of
0-255 it can be specified as an immediate. If greater than 255 then the port number must be specified in DX. Since the PC only
decodes 10 bits of the port address, values over 1023 can only be decoded by third party vendor equipment and also map to the
port range 0-1023.
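The aliasing caused by 10-bit port decoding can be shown in one line of Python. This is a sketch of the behaviour described above for the original PC; the function name is hypothetical.

```python
def decoded_port(port):
    """Port address as seen by hardware that decodes only the low 10 address bits."""
    return port & 0x3FF


# Port 7F8h aliases onto port 3F8h when only 10 bits are decoded.
alias = decoded_port(0x7F8)
```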
INS dest,port
INSB
INSW
INSD (386+)
Loads data from port to the destination ES:(E)DI (even if a destination operand is supplied). (E)DI is adjusted by the size of the
operand and increased if the Direction Flag is cleared and decreased if the Direction Flag is set. For INSB, INSW, INSD no
operands are allowed and the size is determined by the mnemonic.
OUTS port,src
OUTSB
OUTSW
OUTSD (386+)
Transfers a byte, word or doubleword from "src" to the hardware port specified in DX. For instructions with no operands the "src"
is located at DS:SI and SI is incremented or decremented by the size of the operand or the size dictated by the instruction
format. When the Direction Flag is set SI is decremented, when clear, SI is incremented. Unlike OUT, the port number cannot be
given as an immediate; it is always taken from DX. Since the PC only decodes 10 bits of the port address, values over 1023 can
only be decoded by third party vendor equipment and also map to the port range 0-1023.
38. Flag Control (EFLAG) Instructions
The flag control instructions operate on the flags in the EFLAGS register
STC Modifies flags: CF
Sets the Carry Flag to 1.
STD Modifies flags: DF
Sets the Direction Flag to 1 causing string instructions to auto-decrement SI and DI instead of auto-increment
STI Modifies flags: IF
Sets the Interrupt Flag to 1, which enables recognition of all hardware interrupts. If an interrupt is generated by a hardware
device, an End of Interrupt (EOI) must also be issued to enable other hardware interrupts of the same or lower priority.
SAHF Modifies flags: AF CF PF SF ZF
Transfers bits 0-7 of AH into the Flags Register. This includes AF, CF, PF, SF and ZF.
LAHF
Copies bits 0-7 of the flags register into AH. This includes flags AF, CF, PF, SF and ZF; the other bits are undefined.
AH := SF ZF xx AF xx PF xx CF
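The AH layout shown above (SF in bit 7, ZF in bit 6, AF in bit 4, PF in bit 2, CF in bit 0) can be modelled in Python as follows. The names FLAG_BITS, lahf, sahf and decode_ah are hypothetical, and the undefined bits are simply copied through by lahf in this sketch.

```python
FLAG_BITS = {"CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7}
FLAG_MASK = sum(1 << bit for bit in FLAG_BITS.values())   # 0xD5


def lahf(flags_register):
    """LAHF: AH receives the low byte of the flags register."""
    return flags_register & 0xFF


def sahf(flags_register, ah):
    """SAHF: only SF, ZF, AF, PF and CF are replaced from AH."""
    return (flags_register & ~FLAG_MASK) | (ah & FLAG_MASK)


def decode_ah(ah):
    """Name the individual flag bits held in AH."""
    return {name: (ah >> bit) & 1 for name, bit in FLAG_BITS.items()}
```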
CLC Modifies flags: CF
Clears the Carry Flag.
CLD Modifies flags: DF
Clears the Direction Flag causing string instructions to increment the SI and DI index registers.
CLI Modifies flags: IF
Disables the maskable hardware interrupts by clearing the Interrupt flag. NMI's and software interrupts are not inhibited.
CLTS
Clears the Task Switched Flag in the Machine Status Word (CR0 on the 386+). This is a privileged operation and is
generally used only by operating system code.
39. Miscellaneous Instructions
The miscellaneous instructions provide such functions as loading an effective address, executing a “no-
operation,” and retrieving processor identification information.
NOP
This is a do nothing instruction. It results in occupation of both space and time and is most useful for
patching code segments. (This is the original XCHG AX,AX instruction)
XLAT translation-table
XLATB (masm 5.x)
Replaces the byte in AL with a byte from a user table addressed by BX. The original value of AL is the index
into the translate table. The best way to describe this is MOV AL,[BX+AL], although that is not a valid addressing mode.
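As a small illustration of the table lookup performed by XLAT, the following Python sketch converts a 4-bit value to its ASCII hex digit. The function name and the table contents are hypothetical; "table" stands for the bytes found at DS:BX.

```python
HEX_DIGITS = b"0123456789ABCDEF"     # translation table that BX would point to


def xlat(table, al):
    """XLAT: AL is replaced by the table byte at index AL."""
    return table[al & 0xFF]


# Translating the value 0Ah yields the ASCII code for the character 'A'.
ascii_code = xlat(HEX_DIGITS, 0x0A)
```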
CPUID
Processor Identification
40. OTHERS
Floating-point instructions
• instructions for a stack-based floating-point unit (FPU).
• The FPU instructions:
addition, subtraction, negation, multiplication, division, remainder, square roots,
integer truncation, fraction truncation, and scale by power of two.
The operations also include conversion instructions, which can load or store a value from memory in any of
the following formats: binary-coded decimal, 32-bit integer, 64-bit integer, 32-bit floating-point, 64-bit floating-
point or 80-bit floating-point (upon loading, the value is converted to the currently used floating-point mode).
• transcendental functions: sine, cosine, tangent, arctangent, exponentiation with the base 2 and logarithms to
bases 2, 10, or e.
• The stack register to stack register format of the instructions:
fop st, st(n) or fop st(n), st
where st is equivalent to st(0), and st(n) is one of the 8 stack registers (st(0),st(1),…, st(7)).
As with the integer instructions, the first operand is both the first source operand and the destination operand.
fsubr and fdivr should be singled out as first swapping the source operands before performing the
subtraction or division. The addition, subtraction, multiplication, division, store and comparison instructions
include instruction modes that pop the top of the stack after their operation is complete.
So, for example, faddp st(1), st performs the calculation st(1) = st(1) + st(0), then removes st(0) from
the top of stack, thus making what was the result in st(1) the top of the stack in st(0).
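The register-stack behaviour of faddp and the operand swap of fsubr can be sketched in Python. This is a toy model: the class FpuStack is hypothetical, Python floats stand in for the 80-bit registers, and the stack is a plain list whose element 0 plays the role of st(0).

```python
class FpuStack:
    """Toy model of the x87 register stack; st[0] is the top of the stack."""

    def __init__(self):
        self.st = []

    def fld(self, value):
        if len(self.st) == 8:                        # only 8 stack registers exist
            raise OverflowError("FPU stack overflow")
        self.st.insert(0, float(value))              # the new value becomes st(0)

    def faddp(self, n=1):
        self.st[n] = self.st[n] + self.st[0]         # st(n) = st(n) + st(0)
        self.st.pop(0)                               # pop: the old st(1) becomes st(0)

    def fsub(self, n=1):
        self.st[0] = self.st[0] - self.st[n]         # fsub st, st(n)

    def fsubr(self, n=1):
        self.st[0] = self.st[n] - self.st[0]         # fsubr st, st(n): operands swapped
```

For example, after loading 2.0 and then 3.0, faddp st(1), st leaves a single value 5.0 on the stack.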
41. SIMD instructions
• Modern x86 CPUs contain SIMD instructions, which largely perform the same operation in parallel on many
values encoded in a wide SIMD register.
• Various instruction technologies support different operations on different register sets, but taken as a
complete whole (from MMX to SSE4.2) they include general computations on integer or floating point
arithmetic (addition, subtraction, multiplication, shift, minimization, maximization, comparison, division or
square root).
So for example, paddw mm0, mm1 performs 4 parallel 16-bit (indicated by the w) integer adds (indicated by
the padd) of mm0 values to mm1 and stores the result in mm0.
• Streaming SIMD Extensions or SSE also includes a floating point mode in which only the very first value of
the registers is actually modified (expanded in SSE2).
• Some other unusual instructions have been added including a sum of absolute differences (used for motion
estimation in video compression, such as is done in MPEG) and a 16-bit multiply accumulation instruction
(useful for software-based alpha-blending and digital filtering).
• SSE (since SSE3) and 3DNow! extensions include addition and subtraction instructions for treating paired
floating point values like complex numbers.
These instruction sets also include numerous fixed sub-word instructions for shuffling, inserting and
extracting values within the registers. In addition there are instructions for moving data between
the integer registers and XMM (used in SSE)/FPU (used in MMX) registers.
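The paddw example from this slide can be modelled in Python by treating a 64-bit integer as four independent 16-bit lanes. This is a sketch only: the function name mirrors the mnemonic, and the wraparound (non-saturating) behaviour of each lane is the assumption being illustrated.

```python
def paddw(mm0, mm1):
    """Add four packed 16-bit lanes of two 64-bit values; each lane wraps independently."""
    result = 0
    for lane in range(4):
        shift = 16 * lane
        a = (mm0 >> shift) & 0xFFFF
        b = (mm1 >> shift) & 0xFFFF
        result |= ((a + b) & 0xFFFF) << shift    # a carry never crosses into the next lane
    return result


# The lowest lane overflows (FFFFh + 1) and wraps to 0 without disturbing its neighbour.
total = paddw(0x0001_0002_0003_FFFF, 0x0001_0001_0001_0001)
```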