This document provides an overview of floating point numbers and their representation. It discusses how floating point numbers are used to represent very large and small numbers with exponents. The IEEE 754 standard for floating point representation is described, including the use of sign-magnitude, biased exponents, normalization, denormalization and special values like infinity and NaN. Single and double precision floating point number formats are defined according to IEEE 754. Methods for converting between decimal and binary floating point values are demonstrated through examples.
This document provides an overview of floating point representation and arithmetic based on the IEEE 754 standard. It discusses topics such as normalized and denormalized values, special values like infinity and NaN, and examples using tiny 8-bit floating point formats to illustrate concepts like dynamic range and value distribution. The goal is to explain how computers represent inexact real numbers using a finite number of bits.
The document discusses floating point numbers and the IEEE 754 standard. It describes how floating point numbers represent numbers with fractions using a sign bit, exponent field, and fraction field. The IEEE 754 standard uses a biased exponent representation for normalized floating point values, along with special values like infinity and NaN. It also details denormalized numbers, which allow gradual underflow to zero.
Bca 2nd sem-u-1.8 digital logic circuits, digital component floting and fixed...Rai University
Digital Logic Circuits, Digital Component and Data Representation discusses floating point numbers and the IEEE 754 standard. It describes how floating point numbers use a sign bit, exponent field, and fraction field to represent values too large or small for integers. The standard uses biased exponent representation and defines special values like infinity, zero, and NaN. Floating point numbers can be normalized, denormalized, or have special values and are ordered by magnitude.
B.sc cs-ii-u-1.8 digital logic circuits, digital component floting and fixed ...Rai University
Digital Logic Circuits, Digital Component and Data Representation discusses floating point numbers and the IEEE 754 standard. It describes how floating point numbers use a sign bit, exponent field, and fraction field to represent values in scientific notation. It also summarizes the IEEE 754 standard for single and double precision floating point numbers, including how special values like infinity and NaN are represented.
Exponential notation can be used to represent very large and very small numbers in a normalized form. A floating point number uses a sign, exponent, and mantissa to represent values in a fixed number of bits. Common standards like IEEE 754 specify single and double precision formats that use 1 sign bit, 8 or 11 exponent bits, and 23 or 52 mantissa bits respectively. Calculations with floating point numbers require aligning the exponents before adding or multiplying the mantissas and adjusting the result exponent.
Fixed-point and floating-point numbers can be represented in computers using binary numbers. Floating-point numbers represent numbers in scientific notation with a sign, mantissa, and exponent. In 8-bit floating point, numbers use 1 bit for sign, 3 bits for exponent, and 4 bits for mantissa, such as 1.001 × 2^1 = 2.25. Larger precision formats such as 32-bit and 64-bit floating point according to the IEEE standard use more bits for exponent and mantissa.
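The 1/3/4-bit layout above can be decoded in a few lines. This is a hedged sketch, not the document's own code: it assumes an exponent bias of 3 (the usual excess-2^(k-1)-1 choice for a 3-bit field) and handles only normalized values, ignoring zeros, denormals, and special values.

```python
def decode_tiny_float(byte):
    """Decode an 8-bit float: 1 sign bit, 3 exponent bits (assumed bias 3), 4 mantissa bits."""
    sign = (byte >> 7) & 0x1
    exp = (byte >> 4) & 0x7
    frac = byte & 0xF
    # Normalized value: implicit leading 1 before the 4 stored mantissa bits.
    mantissa = 1 + frac / 16
    return (-1) ** sign * mantissa * 2 ** (exp - 3)

# 2.25 = 1.0010 (binary) x 2^1 -> sign=0, biased exponent=1+3=4 (100), mantissa bits=0010
print(decode_tiny_float(0b0_100_0010))  # 2.25
```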
The IEEE 754 standard defines the floating point representation of real numbers. It uses a sign-magnitude format that includes:
1) A sign bit to indicate positive or negative.
2) A biased exponent stored with an offset to allow negative exponents.
3) A mantissa (the significand) whose implicit leading 1 is not explicitly stored.
For single precision floats, this uses 32 bits broken into an 8-bit exponent and 23-bit mantissa. Doubles use 64 bits with an 11-bit exponent and 52-bit mantissa for greater precision and range.
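The single-precision layout just described can be pulled apart directly. A minimal sketch (normalized values only — zero, denormals, infinity, and NaN are not handled), using Python's `struct` module to obtain the hardware bit pattern for cross-checking:

```python
import struct

def decode_float32(bits):
    """Split a 32-bit pattern into sign, biased exponent, and fraction fields."""
    sign = bits >> 31
    exp = (bits >> 23) & 0xFF
    frac = bits & 0x7FFFFF
    # Normalized value: implicit leading 1, exponent bias 127.
    value = (-1) ** sign * (1 + frac / 2**23) * 2.0 ** (exp - 127)
    return sign, exp, frac, value

# Cross-check against the machine representation of 2.25
bits = struct.unpack(">I", struct.pack(">f", 2.25))[0]
sign, exp, frac, value = decode_float32(bits)
print(sign, exp - 127, value)  # 0 1 2.25
```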
Digital and Logic Design Chapter 1 binary_systemsImran Waris
This document discusses binary number systems and digital computing. It covers binary numbers, number base conversions between decimal, binary, octal and hexadecimal. It also discusses binary coding techniques like binary-coded decimal, signed magnitude representation, one's complement and two's complement representations for negative numbers.
Real numbers can be stored using floating point representation, which separates a real number into three parts: a sign bit, exponent, and mantissa. The exponent indicates the power of the base that the mantissa is multiplied by (base 10 in the document's worked examples). Common standards like IEEE 754 define single and double precision formats that allocate more bits for higher precision at the cost of range. Normalizing a floating point number involves determining the exponent by shifting the decimal point, converting the number to a mantissa with a single leading digit, and writing the sign, exponent, and mantissa according to the specified precision format.
Logic Circuits Design - "Chapter 1: Digital Systems and Information"Ra'Fat Al-Msie'deen
Logic Circuits Design: This material is based on chapter 1 of “Logic and Computer Design Fundamentals” by M. Morris Mano, Charles R. Kime and Tom Martin
The document discusses various number systems including decimal, binary, and signed binary numbers. It provides the following key points:
1) Decimal numbers use ten digits from 0-9 while binary only uses two digits, 0 and 1. Binary numbers represent values through place values determined by powers of two.
2) Conversions can be done between decimal and binary numbers through either summing the place value weights or repeated division/multiplication by two.
3) Binary arithmetic follows simple rules to add, subtract, multiply and divide numbers in binary representation.
4) Signed binary numbers use a sign bit to indicate positive or negative values, with the most common 2's complement form representing a negative number as the 2's complement of its positive counterpart.
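The two conversion methods in point 2 can be sketched directly — repeated division by two for decimal-to-binary, and summing place-value weights for the reverse. A small illustration, not taken from the slides:

```python
def dec_to_bin(n):
    """Convert a non-negative decimal integer to binary by repeated division by 2."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # each remainder becomes the next binary digit
        n //= 2
    return "".join(reversed(digits))  # remainders are produced least significant first

def bin_to_dec(s):
    """Convert a binary string to decimal by summing place-value weights (powers of 2)."""
    return sum(int(bit) * 2**i for i, bit in enumerate(reversed(s)))

print(dec_to_bin(13))      # 1101
print(bin_to_dec("1101"))  # 13
```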
The document discusses different methods for representing integers and fractional numbers in binary, including sign and modulus representation, one's complement, two's complement, fixed point representation, and floating point representation. It provides examples and activities to help understand how to convert between decimal and binary representations using these methods.
The document discusses different number systems used in computing, including binary, hexadecimal, and octal. It explains that computers internally use the binary number system to represent data and perform calculations. Hexadecimal provides a shorthand way to work with binary numbers, with each hex digit corresponding to four binary digits. The document also covers how to convert between decimal, binary, hexadecimal, and octal numbers. It provides examples of expanding numbers in different bases, as well as adding and subtracting binary numbers using complements.
The document discusses binary number representation and arithmetic. It explains decimal to binary conversion. It also describes signed number representation using sign-magnitude and one's complement and two's complement methods. The key advantages of two's complement are that addition can be performed using the same method for positive and negative numbers. Subtraction using two's complement is performed by adding the number to the complement of the subtrahend. Examples of binary addition and subtraction are provided to illustrate these concepts.
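The key point of that summary — subtraction performed by adding the two's complement of the subtrahend — can be shown in a few lines. A hedged 8-bit sketch (the bit width is an assumption for illustration):

```python
BITS = 8
MASK = (1 << BITS) - 1

def twos_complement(x):
    """Two's complement of x in 8 bits: invert all bits, then add 1."""
    return (~x + 1) & MASK

def subtract(a, b):
    """Compute a - b by adding the two's complement of b; the carry-out is discarded."""
    return (a + twos_complement(b)) & MASK

print(subtract(9, 5))  # 4
print(subtract(5, 9))  # 252, i.e. -4 in 8-bit two's complement
```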
This document provides an overview of data representation in computers. It discusses binary, decimal, hexadecimal, and floating point number systems. Binary numbers use only two digits, 0 and 1, and can represent values as sums of powers of two. Decimal uses ten digits from 0-9. Hexadecimal uses sixteen values from 0-9 and A-F. Negative binary integers can be represented using one's complement or two's complement methods. Two's complement avoids multiple representations of zero and is commonly used in computers. Converting between number bases involves expressing the value in one base using the digits of another.
To convert a binary number to octal:
1) Separate the binary number into groups of 3 digits from the right.
2) Convert each 3-digit group to its octal equivalent.
3) The octal number is the combination of each converted group from right to left.
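The three steps above can be sketched as a short function (an illustration, not the document's own code):

```python
def bin_to_octal(b):
    """Convert a binary string to octal by grouping bits in threes from the right."""
    b = b.zfill((len(b) + 2) // 3 * 3)  # left-pad so the length is a multiple of 3
    groups = [b[i:i + 3] for i in range(0, len(b), 3)]
    return "".join(str(int(g, 2)) for g in groups)  # each 3-bit group -> one octal digit

print(bin_to_octal("1101101"))  # 155
```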
Digital systems represent information using discrete binary values of 0 and 1 rather than continuous analog values. Binary numbers use a base-2 numbering system with place values that are powers of 2. There are various number systems like decimal, binary, octal and hexadecimal that use different number bases and represent the same number in different ways. Complements are used in binary arithmetic to perform subtraction by adding the 1's or 2's complement of a number. The 1's complement is obtained by inverting all bits, while the 2's complement is obtained by inverting all bits and adding 1.
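Subtraction by adding the 1's complement, as described above, needs one extra step compared with the 2's complement method: any carry out of the top bit is added back in (the "end-around carry"). A hedged 8-bit sketch of that variant:

```python
BITS = 8
MASK = (1 << BITS) - 1

def sub_via_ones_complement(a, b):
    """a - b by adding the 1's complement of b, then adding back the end-around carry."""
    total = a + (~b & MASK)        # invert all bits of b and add
    carry = total >> BITS          # carry out of the top bit, if any
    return (total + carry) & MASK  # end-around carry completes the subtraction

print(sub_via_ones_complement(13, 6))  # 7
```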
The document discusses different number systems including binary, octal, decimal, and hexadecimal. It explains that number systems have a radix or base, which determines the set of symbols used and their positional values. The key representations for binary numbers discussed are sign-magnitude, one's complement, and two's complement, which provide different methods for representing positive and negative numbers. The document provides examples of addition, subtraction, multiplication, and division operations in binary.
The document discusses floating point arithmetic and the IEEE 754 standard. It covers topics such as floating point representation including normalized and denormalized numbers, special values like infinity and NaN, rounding modes, and properties of floating point operations like addition and multiplication. Examples are provided to illustrate concepts like normalized encoding, the dynamic range of different representations, and answers to puzzles about floating point comparisons and operations.
Real numbers include whole numbers, rational numbers like fractions and decimals, and irrational numbers like pi. They can be positive, negative or zero. In computing, real numbers are represented using floating point notation, which stores numbers as a mantissa and exponent. The mantissa holds the significant digits of the number, while the exponent tracks the decimal place. Increasing the bit size of the mantissa improves accuracy, while increasing the exponent size expands the representable range of numbers.
The document discusses different number systems used in computing like binary, decimal, octal and hexadecimal. It explains that computers use the binary number system and each system has a base and set of digits. Decimal uses base 10 with 0-9 digits. Binary uses base 2 with 0-1 digits. Octal uses base 8 with 0-7 digits. Hexadecimal uses base 16 with 0-9 and A-F digits. It also provides examples of how to convert between decimal and these other number systems.
This document discusses floating point number representation in IEEE-754 format. It explains that floating point numbers consist of a sign bit, exponent, and mantissa. It describes single and double precision formats, which use excess-127 and excess-1023 exponent biases respectively. Examples are given of representing sample numbers in both implicit and explicit normalized forms using single and double precision formats.
The document discusses different number systems used to represent numeric values in computers, including binary, octal, hexadecimal, and decimal. It provides examples of converting between these number systems using techniques like repeated division and multiplying digits by their place values. Character encoding schemes like ASCII, EBCDIC, and Unicode are also covered, explaining how they allow computers to represent letters, punctuation, and other characters with binary values.
The document discusses various number systems including binary, decimal, octal and hexadecimal. It covers how to convert between these number systems using techniques like dividing by the base, tracking remainders, and grouping bits. Examples are provided for converting between the different systems. Common number prefixes like kilo, mega and giga are also explained in the context of computing.
This document provides an overview of Boolean algebra and logic gates. It begins by reviewing binary number systems, binary arithmetic, and binary codes. It then covers Boolean algebra, truth tables, and canonical and standard forms. It also discusses logic operations and logic gates, along with Karnaugh maps of up to 6 variables including don't-care conditions. Finally, it discusses sum-of-products and product-of-sums representations.
This document summarizes 6 different studies on floating point unit designs. The studies examined fully pipelined single-precision units, energy efficient designs, fused add-subtract units, optimized logarithmic architectures, unified rectangular designs, and an FPGA implementation. The studies described the architectures, advantages like performance and power improvements, and disadvantages like increased complexity. Overall, the document reviews optimizations for floating point units across different technologies.
The document discusses fixed-point arithmetic, which represents numbers using a fixed number of bits after the binary point. It compares fixed-point to integer and floating-point representations. It then covers notation for representing fixed-point numbers, converting between types, rounding methods, basic operations, and implementing common mathematical functions using fixed-point arithmetic. Several open-source fixed-point arithmetic libraries are also mentioned.
The document discusses computer arithmetic and binary numbers. It begins by explaining why computers use the binary number system instead of decimal. The key reasons are that electronic components can only represent two states, binary is simpler for circuit design, and arithmetic is possible with binary. The document then covers the basic arithmetic operations of addition, subtraction, multiplication, and division in binary. It provides rules for each operation and examples to illustrate how to perform the calculations in binary. Finally, it discusses complementary subtraction and the additive method for multiplication and division.
This document discusses computer arithmetic and floating point representation. It begins with an introduction to computer arithmetic and covers topics like addition, subtraction, multiplication, division and their algorithms. It then discusses floating point representation which uses scientific notation to represent real numbers. Key aspects covered include single and double precision formats, normalized and denormalized numbers, overflow and underflow, and biased exponent representation. Examples are provided to illustrate floating point addition and multiplication. The document also discusses floating point instructions in MIPS and the need for accurate arithmetic in floating point operations.
This document discusses computer arithmetic and hardware for signed-magnitude addition and subtraction. It contains the following key points:
1) Computer arithmetic refers to basic operations like addition, subtraction, multiplication, and division performed with operands. It provides examples of signed-magnitude addition and subtraction rules and the hardware used to perform these operations.
2) The hardware for signed-magnitude addition and subtraction includes an A register, B register, complementer, parallel adder, and mode control. It performs the operations by setting the registers and control signals.
3) Algorithms for signed 2's complement addition and subtraction are also presented, showing how numbers are added or subtracted based on their relative magnitudes stored in the registers.
Booth's multiplication algorithm was invented by Andrew D. Booth in 1951 while studying crystallography at Birkbeck College in London. It improves the speed of computer multiplication by reducing the number of additions or subtractions needed. The algorithm uses a grid with the multiplicand in the top row, the negative multiplicand in the middle row, and the multiplier in the bottom row. It then iteratively shifts and adds or subtracts based on the last two bits of the product to build up the final result in fewer steps than standard addition methods. Several examples are provided to demonstrate how the algorithm works.
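The shift-and-add/subtract iteration described above can be sketched compactly. This is a generic implementation of standard Booth recoding (examine the last bit of the multiplier together with one extra bit to its right, add or subtract the multiplicand, then arithmetic-shift right), not the grid presentation used in the document:

```python
def booth_multiply(m, q, bits=8):
    """Booth's algorithm: signed multiply, deciding add/subtract from the last two bits."""
    mask = (1 << bits) - 1
    A = 0           # accumulator (becomes the high half of the product)
    Q = q & mask    # multiplier (becomes the low half of the product)
    q_1 = 0         # the extra bit to the right of Q
    M = m & mask
    for _ in range(bits):
        pair = (Q & 1, q_1)
        if pair == (1, 0):
            A = (A - M) & mask   # subtract the multiplicand
        elif pair == (0, 1):
            A = (A + M) & mask   # add the multiplicand
        # arithmetic shift right of the combined A:Q:q_1 register
        q_1 = Q & 1
        Q = ((Q >> 1) | ((A & 1) << (bits - 1))) & mask
        A = (A >> 1) | (A & (1 << (bits - 1)))  # replicate the sign bit
    product = (A << bits) | Q
    if product & (1 << (2 * bits - 1)):      # interpret the 2*bits result as signed
        product -= 1 << (2 * bits)
    return product

print(booth_multiply(7, -3))   # -21
print(booth_multiply(-4, -5))  # 20
```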
This document discusses floating point arithmetic. It begins by introducing the IEEE floating point standard and topics like rounding and floating point operations. It then discusses how fractional binary numbers are represented and the numerical format used in floating point, including normalized and denormalized values. Special values like infinity, NaN, positive/negative zero are also covered. The document concludes by discussing properties of floating point addition, multiplication, and how they differ from mathematical ideals due to rounding errors and special values.
Digital logic circuits, digital component, floating and fixed point
1. Digital Logic Circuits, Digital Component and Data Representation
Course: MCA-I
Subject: Computer Organization and Architecture
Unit-1
2. The World is Not Just Integers
• Programming languages support numbers with fraction
– Called floating-point numbers
– Examples:
3.14159265… (π)
2.71828… (e)
0.000000001 or 1.0 × 10^–9 (seconds in a nanosecond)
86,400,000,000,000 or 8.64 × 10^13 (nanoseconds in a day)
(the last number is a large integer that cannot fit in a 32-bit integer)
• We use a scientific notation to represent
– Very small numbers (e.g. 1.0 × 10^–9)
– Very large numbers (e.g. 8.64 × 10^13)
– Scientific notation: ± d.f1f2f3f4 … × 10^(± e1e2e3)
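As a quick sanity check (an illustrative Python sketch, not part of the slides), the nanoseconds-in-a-day figure really does overflow a signed 32-bit integer:

```python
# Nanoseconds in a day: 86,400 s/day × 10^9 ns/s
ns_per_day = 86_400 * 10**9          # 8.64 × 10^13

# Range of a signed 32-bit integer
INT32_MAX = 2**31 - 1                # 2,147,483,647

# The value does not fit in 32 bits, but fits easily in 64 bits
print(ns_per_day > INT32_MAX)        # True
print(ns_per_day < 2**63 - 1)        # True
```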
3. Floating-Point Numbers
• Examples of floating-point numbers in base 10 …
– 5.341×10^3, 0.05341×10^5, –2.013×10^–1, –201.3×10^–3
• Examples of floating-point numbers in base 2 …
– 1.00101×2^23, 0.0100101×2^25, –1.101101×2^–3, –1101.101×2^–6
– Exponents are kept in decimal for clarity
– The binary number (1101.101)2 = 2^3 + 2^2 + 2^0 + 2^–1 + 2^–3 = 13.625
• Floating-point numbers should be normalized
– Exactly one non-zero digit should appear before the point
• In a decimal number, this digit can be from 1 to 9
• In a binary number, this digit should be 1
– Normalized FP Numbers: 5.341×10^3 and –1.101101×2^–3
– NOT Normalized: 0.05341×10^5 and –1101.101×2^–6
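The evaluation in the (1101.101)2 example can be sketched as a small helper (an illustrative function, not from the slides):

```python
def binary_to_decimal(s: str) -> float:
    """Evaluate a binary number with an optional binary point,
    e.g. '1101.101' -> 13.625."""
    int_part, _, frac_part = s.partition('.')
    value = float(int(int_part, 2)) if int_part else 0.0
    # Each fraction bit i (1-based) contributes bit × 2^-i
    for i, bit in enumerate(frac_part, start=1):
        value += int(bit) * 2.0**-i
    return value

print(binary_to_decimal('1101.101'))   # 13.625
```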
4. Floating-Point Representation
• A floating-point number is represented by the triple
– S is the Sign bit (0 is positive and 1 is negative)
• Representation is called sign and magnitude
– E is the Exponent field (signed)
• Very large numbers have large positive exponents
• Very small close-to-zero numbers have negative exponents
• More bits in exponent field increases range of values
– F is the Fraction field (fraction after binary point)
• More bits in fraction field improves the precision of FP numbers
Value of a floating-point number = (–1)^S × val(F) × 2^val(E)
Field layout: S | Exponent | Fraction
5. Next . . .
• Floating-Point Numbers
• IEEE 754 Floating-Point Standard
• Floating-Point Addition and Subtraction
• Floating-Point Multiplication
• MIPS Floating-Point Instructions
6. IEEE 754 Floating-Point Standard
• Found in virtually every computer invented since 1980
– Simplified porting of floating-point numbers
– Unified the development of floating-point algorithms
– Increased the accuracy of floating-point numbers
• Single Precision Floating Point Numbers (32 bits)
– 1-bit sign + 8-bit exponent + 23-bit fraction
• Double Precision Floating Point Numbers (64 bits)
– 1-bit sign + 11-bit exponent + 52-bit fraction
Single: S | Exponent (8 bits) | Fraction (23 bits)
Double: S | Exponent (11 bits) | Fraction (52 bits)
7. Normalized Floating Point Numbers
• For a normalized floating point number (S, E, F)
• Significand is equal to (1.F)2 = (1.f1f2f3f4…)2
– IEEE 754 assumes hidden 1. (not stored) for normalized numbers
– Significand is 1 bit longer than fraction
• Value of a Normalized Floating Point Number is
(–1)^S × (1.F)2 × 2^val(E)
= (–1)^S × (1.f1f2f3f4 …)2 × 2^val(E)
= (–1)^S × (1 + f1×2^–1 + f2×2^–2 + f3×2^–3 + f4×2^–4 …) × 2^val(E)
(–1)^S is 1 when S is 0 (positive), and –1 when S is 1 (negative)
Field layout: S | E | F = f1 f2 f3 f4 …
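The normalized-value formula can be sketched directly (the helper name and the single-precision bias of 127 are assumptions for illustration):

```python
def normalized_value(sign: int, exponent: int, fraction_bits: str,
                     bias: int = 127) -> float:
    """Value = (-1)^S × (1.F)_2 × 2^(E - bias) for a normalized float."""
    significand = 1.0                      # hidden leading 1. (not stored)
    for i, bit in enumerate(fraction_bits, start=1):
        significand += int(bit) * 2.0**-i  # f_i × 2^-i
    return (-1)**sign * significand * 2.0**(exponent - bias)

# Sign = 1, E = 124, F = 01...: -1.25 × 2^-3
print(normalized_value(1, 124, '01'))      # -0.15625
```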
8. Biased Exponent Representation
• How to represent a signed exponent? Choices are …
– Sign + magnitude representation for the exponent
– Two’s complement representation
– Biased representation
• IEEE 754 uses biased representation for the exponent
– Value of exponent = val(E) = E – Bias (Bias is a constant)
• Recall that exponent field is 8 bits for single precision
– E can be in the range 0 to 255
– E = 0 and E = 255 are reserved for special use (discussed later)
– E = 1 to 254 are used for normalized floating point numbers
– Bias = 127 (half of 254), val(E) = E – 127
– val(E=1) = –126, val(E=127) = 0, val(E=254) = 127
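The biased decoding for single precision can be sketched in a few lines (the function name is illustrative, not from the slides):

```python
BIAS = 127  # single precision

def val(e: int) -> int:
    """Decode a stored single-precision exponent field (1..254)."""
    assert 1 <= e <= 254, "0 and 255 are reserved for special values"
    return e - BIAS

print(val(1))    # -126  (most negative usable exponent)
print(val(127))  # 0
print(val(254))  # 127   (most positive usable exponent)
```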
9. Biased Exponent – Cont’d
• For double precision, exponent field is 11 bits
– E can be in the range 0 to 2047
– E = 0 and E = 2047 are reserved for special use
– E = 1 to 2046 are used for normalized floating point numbers
– Bias = 1023 (half of 2046), val(E) = E – 1023
– val(E=1) = –1022, val(E=1023) = 0, val(E=2046) = 1023
• Value of a Normalized Floating Point Number is
(–1)^S × (1.F)2 × 2^(E – Bias)
= (–1)^S × (1.f1f2f3f4 …)2 × 2^(E – Bias)
= (–1)^S × (1 + f1×2^–1 + f2×2^–2 + f3×2^–3 + f4×2^–4 …) × 2^(E – Bias)
10. Examples of Single Precision Float
• What is the decimal value of this Single Precision float?
1 | 01111100 | 01000000000000000000000
• Solution:
– Sign = 1 is negative
– Exponent = (01111100)2 = 124, E – Bias = 124 – 127 = –3
– Significand = (1.0100 … 0)2 = 1 + 2^–2 = 1.25 (1. is implicit)
– Value in decimal = –1.25 × 2^–3 = –0.15625
• What is the decimal value of?
0 | 10000010 | 01001100000000000000000
• Solution:
– Value in decimal = +(1.01001100 … 0)2 × 2^(130–127) = (1.01001100 … 0)2 × 2^3 = (1010.01100 … 0)2 = 10.375
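Both examples can be checked against the machine's own IEEE 754 decoding via Python's struct module (a verification sketch, not part of the slides):

```python
import struct

def decode_f32(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision float."""
    return struct.unpack('>f', bits.to_bytes(4, 'big'))[0]

# 1 | 01111100 | 0100...0  ->  -1.25 × 2^-3
print(decode_f32(0b1_01111100_01000000000000000000000))  # -0.15625

# 0 | 10000010 | 01001100...0  ->  (1.0100110)_2 × 2^3
print(decode_f32(0b0_10000010_01001100000000000000000))  # 10.375
```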
11. Examples of Double Precision Float
• What is the decimal value of this Double Precision float?
0 | 10000000101 | 0010101000 … 0 (52 fraction bits)
• Solution:
– Value of exponent = (10000000101)2 – Bias = 1029 – 1023 = 6
– Value of double float = (1.00101010 … 0)2 × 2^6 (1. is implicit) = (1001010.10 … 0)2 = 74.5
• What is the decimal value of?
1 | 01111111000 | 1000 … 0 (52 fraction bits)
• Do it yourself! (answer should be –1.5 × 2^–7 = –0.01171875)
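The same struct-based check works for the double format (a verification sketch; the bit patterns are assembled from the sign, exponent, and fraction fields above):

```python
import struct

def decode_f64(bits: int) -> float:
    """Interpret a 64-bit pattern as an IEEE 754 double-precision float."""
    return struct.unpack('>d', bits.to_bytes(8, 'big'))[0]

# 0 | 10000000101 | 0010101 0...0  ->  (1.0010101)_2 × 2^6
bits1 = (0 << 63) | (0b10000000101 << 52) | (0b0010101 << 45)
print(decode_f64(bits1))  # 74.5

# 1 | 01111111000 | 1 0...0  ->  -(1.1)_2 × 2^-7
bits2 = (1 << 63) | (0b01111111000 << 52) | (0b1 << 51)
print(decode_f64(bits2))  # -0.01171875
```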
13. Largest Normalized Float
• What is the Largest normalized float?
Single: 0 | 11111110 | 11111111111111111111111
Double: 0 | 11111111110 | 1111 … 1 (52 ones)
• Solution for Single Precision:
– Exponent – Bias = 254 – 127 = 127 (largest exponent for SP)
– Significand = (1.111 … 1)2 = almost 2
– Value in decimal ≈ 2 × 2^127 = 2^128 ≈ 3.4028 … × 10^38
• Solution for Double Precision:
– Value in decimal ≈ 2 × 2^1023 = 2^1024 ≈ 1.79769 … × 10^308
• Overflow: exponent is too large to fit in the exponent field
14. Smallest Normalized Float
• What is the smallest (in absolute value) normalized float?
Single: 0 | 00000001 | 00000000000000000000000
Double: 0 | 00000000001 | 0000 … 0 (52 zeros)
• Solution for Single Precision:
– Exponent – Bias = 1 – 127 = –126 (smallest exponent for SP)
– Significand = (1.000 … 0)2 = 1
– Value in decimal = 1 × 2^–126 = 1.17549 … × 10^–38
• Solution for Double Precision:
– Value in decimal = 1 × 2^–1022 = 2.22507 … × 10^–308
• Underflow: exponent is too small to fit in the exponent field
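For the double format, these bounds match Python's runtime constants (a quick check, not from the slides):

```python
import sys

# Largest and smallest normalized doubles, per IEEE 754
print(sys.float_info.max)   # ≈ 1.7976931348623157e+308 (just under 2^1024)
print(sys.float_info.min)   # ≈ 2.2250738585072014e-308 (= 2^-1022)

# Computed directly from the formulas on the slides:
largest = (2 - 2**-52) * 2.0**1023    # (1.111...1)_2 × 2^1023
print(largest == sys.float_info.max)  # True
print(2.0**-1022 == sys.float_info.min)  # True
```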
15. Zero, Infinity, and NaN
• Zero
– Exponent field E = 0 and fraction F = 0
– +0 and –0 are possible according to sign bit S
• Infinity
– Infinity is a special value represented with maximum E and F = 0
• For single precision with 8-bit exponent: maximum E = 255
• For double precision with 11-bit exponent: maximum E = 2047
– Infinity can result from overflow or division by zero
– +∞ and –∞ are possible according to sign bit S
• NaN (Not a Number)
– NaN is a special value represented with maximum E and F ≠ 0
– Results from exceptional situations, such as 0/0 or sqrt(negative)
– An operation on a NaN yields NaN: Op(X, NaN) = NaN
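These special values can be observed directly in Python (an illustrative sketch, not from the slides):

```python
import math
import struct

inf = float('inf')
nan = float('nan')

# Overflow to infinity: multiplying past the largest double
x = 1e308
print(x * 10)                 # inf

# NaN arises from exceptional cases and compares unequal to itself
print(math.isnan(inf - inf))  # True: inf - inf is exceptional
print(nan == nan)             # False: NaN is unordered

# +inf has maximum exponent field and zero fraction (double: E = 2047)
bits = struct.unpack('>Q', struct.pack('>d', inf))[0]
print(bits == 0b0_11111111111 << 52)  # True
```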
16. Denormalized Numbers
• IEEE standard uses denormalized numbers to …
– Fill the gap between 0 and the smallest normalized float
– Provide gradual underflow to zero
• Denormalized: exponent field E is 0 and fraction F ≠ 0
– The implicit 1. before the fraction now becomes 0. (not normalized)
• Value of a denormalized number (S, 0, F)
Single precision: (–1)^S × (0.F)2 × 2^–126
Double precision: (–1)^S × (0.F)2 × 2^–1022
[Number-line diagram: negative overflow below –2^128, normalized negatives down to –2^–126, denormals and gradual underflow around 0, normalized positives from 2^–126 up to 2^128, positive overflow beyond]
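Gradual underflow can be observed directly (an illustrative sketch for the double format, not from the slides):

```python
import struct

# Smallest positive denormal double: E = 0, F = 1 -> 2^-52 × 2^-1022 = 2^-1074
bits = 1
tiny = struct.unpack('>d', bits.to_bytes(8, 'big'))[0]
print(tiny == 2.0**-1074)   # True
print(tiny == 5e-324)       # True (its usual decimal spelling)

# Gradual underflow: halving the smallest denormal rounds to zero
print(tiny / 2 == 0.0)      # True
print(tiny > 0.0)           # True: denormals fill the gap just above zero
```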
17. Summary of IEEE 754 Encoding

Single-Precision      Exponent (8)   Fraction (23)   Value
Normalized Number     1 to 254       anything        ± (1.F)2 × 2^(E – 127)
Denormalized Number   0              nonzero         ± (0.F)2 × 2^–126
Zero                  0              0               ± 0
Infinity              255            0               ± ∞
NaN                   255            nonzero         NaN

Double-Precision      Exponent (11)  Fraction (52)   Value
Normalized Number     1 to 2046      anything        ± (1.F)2 × 2^(E – 1023)
Denormalized Number   0              nonzero         ± (0.F)2 × 2^–1022
Zero                  0              0               ± 0
Infinity              2047           0               ± ∞
NaN                   2047           nonzero         NaN
18. Floating-Point Comparison
• IEEE 754 floating point numbers are ordered
– Because the exponent uses a biased representation …
• Exponent value and its binary representation have the same ordering
– Placing the exponent before the fraction field orders the magnitude
• Larger exponent ⇒ larger magnitude
• For equal exponents, larger fraction ⇒ larger magnitude
• 0 < (0.F)₂ × 2^Emin < (1.F)₂ × 2^(E–Bias) < ∞ (Emin = 1 – Bias)
– Because the sign bit is most significant ⇒ quick test of signed <
• An integer magnitude comparator can compare the magnitudes
[Block diagram: an integer magnitude comparator takes X = (EX, FX) and Y = (EY, FY) and outputs X < Y, X = Y, X > Y]
19. Next . . .
• Floating-Point Numbers
• IEEE 754 Floating-Point Standard
• Floating-Point Addition and Subtraction
• Floating-Point Multiplication
• MIPS Floating-Point Instructions
20. Floating Point Addition Example
• Consider Adding (Single-Precision Floating-Point):
+ 1.11100100000000000000010₂ × 2^4
+ 1.10000000000000110000101₂ × 2^2
• Cannot add significands … Why?
– Because exponents are not equal
• How to make exponents equal?
– Shift the significand of the number with the lesser exponent right
– Difference between the two exponents = 4 – 2 = 2
– So, shift the second significand right by 2 bits and add 2 to its exponent:
1.10000000000000110000101₂ × 2^2
= 0.01100000000000001100001 01₂ × 2^4
21. Floating-Point Addition – cont'd
• Now, ADD the Significands:
+ 1.11100100000000000000010 × 2^4
+ 1.10000000000000110000101 × 2^2
+ 1.11100100000000000000010 × 2^4
+ 0.01100000000000001100001 01 × 2^4 (shift right)
= 10.01000100000000001100011 01 × 2^4 (result)
• Addition produces a carry bit, the result is NOT normalized
• Normalize result (shift right and increment exponent):
+ 10.01000100000000001100011 01 × 2^4
= + 1.00100010000000000110001 101 × 2^5
22. Rounding
• Single-precision requires only 23 fraction bits
• However, Normalized result can contain additional bits
1.00100010000000000110001 | 1 01 × 2^5
• Two extra bits are needed for rounding:
– Round bit: appears just after the normalized result (here R = 1)
– Sticky bit: appears after the round bit, the OR of all additional bits (here S = 1)
• Since RS = 11, increment the fraction to round to nearest:
1.00100010000000000110001 × 2^5
+ 1
= 1.00100010000000000110010 × 2^5 (Rounded)
23. Floating-Point Subtraction Example
• Sometimes, addition is converted into subtraction
– If the sign bits of the operands are different
• Consider Adding:
+ 1.00000000101100010001101 × 2^–6
– 1.00000000000000010011010 × 2^–1
+ 0.00001000000001011000100 01101 × 2^–1 (shift right 5 bits)
– 1.00000000000000010011010 × 2^–1
0 0.00001000000001011000100 01101 × 2^–1
1 0.11111111111111101100110 × 2^–1 (2's complement)
1 1.00001000000001000101010 01101 × 2^–1 (ADD)
– 0.11110111111110111010101 10011 × 2^–1 (2's complement)
• The 2's complement of the result is required because the result is negative
24. Floating-Point Subtraction – cont'd
+ 1.00000000101100010001101 × 2^–6
– 1.00000000000000010011010 × 2^–1
– 0.11110111111110111010101 10011 × 2^–1 (result is negative)
• Result should be normalized
• For subtraction, we can have leading zeros. To normalize, count the number of leading zeros, then shift the result left and decrement the exponent accordingly.
– 0.11110111111110111010101 1 0011 × 2^–1
– 1.11101111111101110101011 0011 × 2^–2 (Normalized)
• Guard bit: guards against loss of a fraction bit
• Needed for subtraction, when the result has a leading zero and must be normalized
25. Floating-Point Subtraction – cont'd
• Next, normalized result should be rounded
– 0.11110111111110111010101 1 0 011 × 2^–1
– 1.11101111111101110101011 0 011 × 2^–2 (Normalized)
• The guard bit has been shifted into the fraction; Round bit: R = 0, Sticky bit: S = 1
• Since R = 0, it is more accurate to truncate the result even if S = 1. We simply discard the extra bits.
– 1.11101111111101110101011 0 011 × 2^–2 (Normalized)
– 1.11101111111101110101011 × 2^–2 (Rounded to nearest)
• IEEE 754 representation of the result:
1 01111101 11101111111101110101011
26. Rounding to Nearest Even
• Normalized result has the form: 1.f₁ f₂ … f_l R S
– The round bit R appears after the last fraction bit f_l
– The sticky bit S is the OR of all remaining additional bits
• Round to Nearest Even: default rounding mode
• Four cases for RS:
– RS = 00 ⇒ Result is exact, no need for rounding
– RS = 01 ⇒ Truncate result by discarding RS
– RS = 11 ⇒ Increment result: add 1 to last fraction bit
– RS = 10 ⇒ Tie case (either truncate or increment result)
• Check the last fraction bit f_l (f23 for single precision, f52 for double)
• If f_l is 0 then truncate the result to keep the fraction even
• If f_l is 1 then increment the result to make the fraction even
27. Additional Rounding Modes
• IEEE 754 standard specifies four rounding modes:
1. Round to Nearest Even: described in previous slide
2. Round toward +Infinity: result is rounded up
Increment result if sign is positive and R or S = 1
3. Round toward -Infinity: result is rounded down
Increment result if sign is negative and R or S = 1
4. Round toward 0: always truncate result
• Rounding or incrementing the result might generate a carry
– This occurs when all fraction bits are 1
– A re-normalize step after rounding is required only in this case
28. Example on Rounding
• Round the following result using the IEEE 754 rounding modes:
–1.11111111111111111111111 1 0 × 2^–7 (Round bit R = 1, Sticky bit S = 0)
• Round to Nearest Even:
– Increment result since RS = 10 and f23 = 1
– Incremented result: –10.00000000000000000000000 × 2^–7
– Renormalize and increment exponent (because of carry)
– Final rounded result: –1.00000000000000000000000 × 2^–6
• Round towards +∞: truncate, since the result is negative
– Truncated result: –1.11111111111111111111111 × 2^–7
• Round towards –∞: increment, since the result is negative and R = 1
– Final rounded result: –1.00000000000000000000000 × 2^–6
• Round towards 0: always truncate
– Truncated result: –1.11111111111111111111111 × 2^–7
29. Floating Point Addition / Subtraction
1. Compare the exponents of the two numbers. Shift the smaller
number to the right until its exponent would match the larger
exponent.
2. Add / Subtract the significands according to the sign bits.
3. Normalize the sum, either shifting right and incrementing the
exponent or shifting left and decrementing the exponent
4. Round the significand to the appropriate number of bits, and
renormalize if rounding generates a carry
[Flowchart: Start → shift significand right by d = |EX – EY| → add significands when the signs of X and Y are identical, subtract when different (X – Y becomes X + (–Y)) → normalize: shift right by 1 if there is a carry, or shift left by the number of leading zeros in the case of subtraction → round: either truncate the fraction or add 1 to the least significant fraction bit → overflow or underflow? yes: Exception; no: Done]
30. Floating Point Adder Block Diagram
[Block diagram: an exponent subtractor computes d = |EX – EY| and max(EX, EY); the operands are swapped if needed and the significand of the smaller operand is shifted right by d; the significand adder/subtractor combines FX and FY according to the signs SX, SY and the add/subtract operation; a detect-carry / count-leading-zeros unit controls the shift right/left and increment/decrement of the exponent; rounding logic and sign computation produce the result SZ, EZ, FZ]
31. Next . . .
• Floating-Point Numbers
• IEEE 754 Floating-Point Standard
• Floating-Point Addition and Subtraction
• Floating-Point Multiplication
• MIPS Floating-Point Instructions
32. Floating Point Multiplication Example
• Consider multiplying (single-precision floating-point):
–1.110 1000 0100 0000 1010 0001₂ × 2^–4
× 1.100 0000 0001 0000 0000 0000₂ × 2^–2
• Unlike addition, we add the exponents of the operands
– Result exponent value = (–4) + (–2) = –6
• Using the biased representation: EZ = EX + EY – Bias
– EX = (–4) + 127 = 123 (Bias = 127 for single precision)
– EY = (–2) + 127 = 125
– EZ = 123 + 125 – 127 = 121 (value = –6)
• Sign bit of product can be computed independently
• Sign bit of product = SignX XOR SignY = 1 (negative)
33. Floating-Point Multiplication, cont'd
• Now multiply the significands:
(Multiplicand) 1.11010000100000010100001
(Multiplier) × 1.10000000001000000000000
111010000100000010100001
111010000100000010100001
1.11010000100000010100001
10.1011100011111011111100110010100001000000000000
• 24 bits × 24 bits ⇒ 48-bit product (double the number of bits)
• Multiplicand × 0 = 0 ⇒ zero rows are eliminated
• Multiplicand × 1 = multiplicand (shifted left)
34. Floating-Point Multiplication, cont'd
• Normalize Product:
–10.10111000111110111111001100... × 2^–6
Shift right and increment the exponent because of the carry bit:
= –1.010111000111110111111001100... × 2^–5
• Round to Nearest Even: (keep only 23 fraction bits)
1.01011100011111011111100 | 1 100... × 2^–5
Round bit = 1, Sticky bit = 1, so increment the fraction
Final result = –1.01011100011111011111101 × 2^–5
• IEEE 754 Representation:
1 01111010 01011100011111011111101
35. Floating Point Multiplication
1. Add the biased exponents of the two numbers, subtracting the
bias from the sum to get the new biased exponent
2. Multiply the significands. Set the result sign to positive if
operands have same sign, and negative otherwise
3. Normalize the product if necessary, shifting its significand right
and incrementing the exponent
4. Round the significand to the appropriate number of bits, and
renormalize if rounding generates a carry
[Flowchart: Start → biased exponent addition EZ = EX + EY – Bias; the result sign SZ = SX xor SY can be computed independently → multiply significands: since the operand significands 1.FX and 1.FY are ≥ 1 and < 2, their product is ≥ 1 and < 4, so normalization shifts right by at most 1 bit and increments the exponent → round: either truncate the fraction or add 1 to the least significant fraction bit → overflow or underflow? yes: Exception; no: Done]
36. Extra Bits to Maintain Precision
• Floating-point numbers are approximations for …
– Real numbers that they cannot represent
• Infinite variety of real numbers exist between 1.0 and 2.0
– However, exactly 2^23 fractions can be represented in single precision
– Exactly 2^52 fractions can be represented in double precision
• Extra bits are generated in intermediate results when …
– Shifting and adding/subtracting a p-bit significand
– Multiplying two p-bit significands (product is 2p bits)
• But when packing result fraction, extra bits are discarded
• Few extra bits are needed: guard, round, and sticky bits
• Minimize hardware but without compromising accuracy
37. Advantages of IEEE 754 Standard
• Used predominantly by the industry
• Encoding of exponent and fraction simplifies comparison
– Integer comparator used to compare magnitude of FP numbers
• Includes special exceptional values: NaN and ±∞
– Special rules are used such as:
• 0/0 is NaN, sqrt(–1) is NaN, 1/0 is ∞, and 1/∞ is 0
– Computation may continue in the face of exceptional conditions
• Denormalized numbers fill the gap
– Between the smallest normalized number 1.0 × 2^Emin and zero
– Denormalized numbers, with values 0.F × 2^Emin, are closer to zero
– Gradual underflow to zero
38. Floating Point Complexities
• Operations are somewhat more complicated
• In addition to overflow we can have underflow
• Accuracy can be a big problem
– Extra bits to maintain precision: guard, round, and sticky
– Four rounding modes
– Division by zero yields Infinity
– Zero divide by zero yields Not-a-Number
– Other complexities
• Implementing the standard can be tricky
– See text for description of 80x86 and Pentium bug!
• Not using the standard can be even worse
39. Accuracy can be a Big Problem
Value1      Value2      Value3      Value4      Sum
1.0E+30     -1.0E+30    9.5         -2.3        7.2
1.0E+30     9.5         -1.0E+30    -2.3        -2.3
1.0E+30     9.5         -2.3        -1.0E+30    0
• Adding double-precision floating-point numbers (Excel)
• Floating-point addition is NOT associative
• It produces different sums for the same data values
• Rounding errors occur when the difference in exponents is large
40. Next . . .
• Floating-Point Numbers
• IEEE 754 Floating-Point Standard
• Floating-Point Addition and Subtraction
• Floating-Point Multiplication
• MIPS Floating-Point Instructions
41. MIPS Floating Point Coprocessor
• Called Coprocessor 1 or the Floating Point Unit (FPU)
• 32 separate floating point registers: $f0, $f1, …, $f31
• FP registers are 32 bits for single precision numbers
• An even/odd register pair forms a double-precision register
• Use the even number for double precision registers
– $f0, $f2, $f4, …, $f30 are used for double precision
• Separate FP instructions for single/double precision
– Single precision: add.s, sub.s, mul.s, div.s (.s extension)
– Double precision: add.d, sub.d, mul.d, div.d (.d extension)
• FP instructions are more complex than the integer ones
– Take more cycles to execute
43. Separate floating point load/store instructions
lwc1: load word coprocessor 1
ldc1: load double coprocessor 1
swc1: store word coprocessor 1
sdc1: store double coprocessor 1
Better names can be used for the above instructions
l.s = lwc1 (load FP single), l.d = ldc1 (load FP double)
s.s = swc1 (store FP single), s.d = sdc1 (store FP double)
FP Load/Store Instructions
Instruction Meaning Format
lwc1 $f2, 40($t0) ($f2) = Mem[($t0)+40] 0x31 $t0 $f2 im16 = 40
ldc1 $f2, 40($t0) ($f2) = Mem[($t0)+40] 0x35 $t0 $f2 im16 = 40
swc1 $f2, 40($t0) Mem[($t0)+40] = ($f2) 0x39 $t0 $f2 im16 = 40
sdc1 $f2, 40($t0) Mem[($t0)+40] = ($f2) 0x3d $t0 $f2 im16 = 40
A general-purpose register is used as the base register
44. Moving data between general purpose and FP registers
mfc1: move from coprocessor 1 (to general purpose register)
mtc1: move to coprocessor 1 (from general purpose register)
Moving data between FP registers
mov.s: move single precision float
mov.d: move double precision float = even/odd pair of registers
FP Data Movement Instructions
Instruction Meaning Format
mfc1 $t0, $f2 ($t0) = ($f2) 0x11 0 $t0 $f2 0 0
mtc1 $t0, $f2 ($f2) = ($t0) 0x11 4 $t0 $f2 0 0
mov.s $f4, $f2 ($f4) = ($f2) 0x11 0 0 $f2 $f4 6
mov.d $f4, $f2 ($f4) = ($f2) 0x11 1 0 $f2 $f4 6
45. FP Convert Instructions
Instruction Meaning Format
cvt.s.w fd, fs to single from integer 0x11 0 0 fs5 fd5 0x20
cvt.s.d fd, fs to single from double 0x11 1 0 fs5 fd5 0x20
cvt.d.w fd, fs to double from integer 0x11 0 0 fs5 fd5 0x21
cvt.d.s fd, fs to double from single 0x11 1 0 fs5 fd5 0x21
cvt.w.s fd, fs to integer from single 0x11 0 0 fs5 fd5 0x24
cvt.w.d fd, fs to integer from double 0x11 1 0 fs5 fd5 0x24
Convert instruction: cvt.x.y
Convert to destination format x from source format y
Supported formats
Single precision float = .s (single precision float in FP register)
Double precision float = .d (double float in even-odd FP register)
Signed integer word = .w (signed integer in FP register)
46. FP Compare and Branch Instructions
Instruction Meaning Format
c.eq.s fs, ft cflag = ((fs) == (ft)) 0x11 0 ft5 fs5 0 0x32
c.eq.d fs, ft cflag = ((fs) == (ft)) 0x11 1 ft5 fs5 0 0x32
c.lt.s fs, ft cflag = ((fs) < (ft)) 0x11 0 ft5 fs5 0 0x3c
c.lt.d fs, ft cflag = ((fs) < (ft)) 0x11 1 ft5 fs5 0 0x3c
c.le.s fs, ft cflag = ((fs) <= (ft)) 0x11 0 ft5 fs5 0 0x3e
c.le.d fs, ft cflag = ((fs) <= (ft)) 0x11 1 ft5 fs5 0 0x3e
bc1f Label branch if (cflag == 0) 0x11 8 0 im16
bc1t Label branch if (cflag == 1) 0x11 8 1 im16
FP unit (co-processor 1) has a condition flag
Set to 0 (false) or 1 (true) by any comparison instruction
Three comparisons: equal, less than, less than or equal
Two branch instructions based on the condition flag
47. Example 1: Area of a Circle
.data
pi: .double 3.1415926535897924
msg: .asciiz "Circle Area = "
.text
main:
ldc1 $f2, pi # $f2,3 = pi
li $v0, 7 # read double (radius)
syscall # $f0,1 = radius
mul.d $f12, $f0, $f0 # $f12,13 = radius*radius
mul.d $f12, $f2, $f12 # $f12,13 = area
la $a0, msg
li $v0, 4 # print string (msg)
syscall
li $v0, 3 # print double (area)
syscall # print $f12,13
48. Example 2: Matrix Multiplication
void mm (int n, double x[n][n], double y[n][n], double z[n][n]) {
for (int i=0; i!=n; i=i+1)
for (int j=0; j!=n; j=j+1) {
double sum = 0.0;
for (int k=0; k!=n; k=k+1)
sum = sum + y[i][k] * z[k][j];
x[i][j] = sum;
}
}
• Matrices x, y, and z are n×n double precision float
• Matrix size is passed in $a0 = n
• Array addresses are passed in $a1, $a2, and $a3
• What is the MIPS assembly code for the procedure?
49. Address Calculation for 2D Arrays
• Row-Major Order: 2D arrays are stored as rows
• Calculate Address of: X[i][j]
= Address of X + (i×n+j)×8 (8 bytes per element)
[Diagram: rows 0 through i–1 occupy i×n elements; within row i, skip j elements to reach X[i][j]; each row holds n elements]
Address of Y[i][k] =
Address of Z[k][j] =
Address of Y + (i×n+k)×8
Address of Z + (k×n+j)×8
50. Matrix Multiplication Procedure – 1/3
• Initialize Loop Variables
mm: addu $t1, $0, $0 # $t1 = i = 0; for 1st loop
L1: addu $t2, $0, $0 # $t2 = j = 0; for 2nd loop
L2: addu $t3, $0, $0 # $t3 = k = 0; for 3rd loop
sub.d $f0, $f0, $f0 # $f0 = sum = 0.0
• Calculate the address of y[i][k] and load it into $f2, $f3
• Skip i rows (i×n) and add k elements
L3: mul $t4, $t1, $a0 # $t4 = i*size(row) = i*n
addu $t4, $t4, $t3 # $t4 = i*n + k
sll $t4, $t4, 3 # $t4 =(i*n + k)*8
addu $t4, $a2, $t4 # $t4 = address of y[i][k]
l.d $f2, 0($t4) # $f2 = y[i][k]
51. Matrix Multiplication Procedure – 2/3
• Similarly, calculate the address of z[k][j] and load its value
• Skip k rows (k×n) and add j elements
mul $t5, $t3, $a0 # $t5 = k*size(row) = k*n
addu $t5, $t5, $t2 # $t5 = k*n + j
sll $t5, $t5, 3 # $t5 =(k*n + j)*8
addu $t5, $a3, $t5 # $t5 = address of z[k][j]
l.d $f4, 0($t5) # $f4 = z[k][j]
• Now, multiply y[i][k] by z[k][j] and add it to $f0
mul.d $f6, $f2, $f4 # $f6 = y[i][k]*z[k][j]
add.d $f0, $f0, $f6 # $f0 = sum
addiu $t3, $t3, 1 # k = k + 1
bne $t3, $a0, L3 # loop back if (k != n)
52. Matrix Multiplication Procedure – 3/3
• Calculate address of x[i][j] and store sum
mul $t6, $t1, $a0 # $t6 = i*size(row) = i*n
addu $t6, $t6, $t2 # $t6 = i*n + j
sll $t6, $t6, 3 # $t6 =(i*n + j)*8
addu $t6, $a1, $t6 # $t6 = address of x[i][j]
s.d $f0, 0($t6) # x[i][j] = sum
• Repeat the outer loops: L2 (for j = …) and L1 (for i = …)
addiu $t2, $t2, 1 # j = j + 1
bne $t2, $a0, L2 # loop L2 if (j != n)
addiu $t1, $t1, 1 # i = i + 1
bne $t1, $a0, L1 # loop L1 if (i != n)
• Return:
jr $ra # return
53. Reference
Reference Books
• Computer Organization & Architecture, 7th ed., by William Stallings
• Computer System Architecture, by M. Morris Mano
• Digital Logic & Computer Design, by M. Morris Mano