CSUT111
Paradigm of Programming Language
Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind
Chapter 4
Data Types
Contents
• Introduction to the concept of a data type
• Characteristics of the primitive data types
• Designs of enumeration and subrange types
• Details of structured data types: arrays, associative arrays, records, tuples, lists, and unions
• Pointers and references
• Design issues and design choices of data types
Introduction
Data type
• A data type is a collection of data values and a set of predefined operations on those values.
• Examples: int, char
Descriptor
• collection of the attributes of a variable.
• During implementation, a descriptor is an
area of memory that stores the attributes of a
variable.
For static attributes:
• descriptors are required only at compile time.
• they are built by the compiler, as a part of the symbol table, and are used during compilation.
For dynamic attributes:
• the descriptor must be maintained during execution.
• the descriptor is used by the run-time system.
Variable
• A variable is not the same thing as an identifier.
• The identifier (name) is just one attribute of a variable.
• The value of a variable can change during execution.
Object
• The word object is often associated with the value of a variable and the space it occupies.
• Here, object is reserved exclusively for instances of user-defined abstract data types.
Primitive Data Types
Data types that are not defined in terms of other types
Data Types
• Numeric
– Integer
– Floating-point
– Decimal
– Complex
• Character
• Boolean
Numeric Types :
Numeric types still play a central role among the collections of types supported by contemporary languages.
Integer
The most common primitive numeric data type is integer.
Java, for example, includes four signed integer sizes: byte, short, int, and long.
Numeric Types : Floating point
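Floating-point data types model real numbers, but the representations are only approximations for many real values.
Most languages include two floating-point types, often called float and double.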
Precision is the accuracy of the fractional part of a
value, measured as the number of bits.
Range is a combination of the range of fractions and
the range of exponents.
The collection of values that can be represented by a
floating-point type is defined in terms of precision and
range
Figure: floating-point formats, single precision and double precision
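The effect of limited precision can be seen in a short sketch. Python's float is a double-precision value; the helper to_single below is a hypothetical name introduced only for this illustration, and it rounds a value to single precision by packing it into 4 bytes with the standard struct module.

```python
import struct

def to_single(x: float) -> float:
    """Round a double to the nearest single-precision value and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

x = 0.1                           # not exactly representable in binary
single = to_single(x)             # fewer fraction bits, so a larger error
single_error = abs(single - x)
sum_is_exact = (0.1 + 0.2 == 0.3) # False: each operand is only an approximation

print(f"double: {x:.20f}")
print(f"single: {single:.20f}")
print("0.1 + 0.2 == 0.3 ?", sum_is_exact)
```

Values such as 0.5 that are sums of powers of two survive the round trip unchanged, because they fit in the available fraction bits.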
Numeric Types : Complex
Fortran and Python support a complex data type
Complex values are represented as ordered pairs of
floating-point values
• Languages that support a complex data type usually provide special syntax for building such values.
• They also extend the basic arithmetic operations ('+', '−', '×', '÷') to act on them.
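As a small illustration in Python, one of the languages named above, a complex value can be built from an ordered pair or written with a literal, and the usual arithmetic operators work on it. The variable names are chosen only for this sketch.

```python
a = complex(3, 4)   # built from the ordered pair (real part, imaginary part)
b = 1 - 2j          # literal syntax: the suffix j marks the imaginary part

total = a + b       # component-wise addition
product = a * b     # (3 + 4j)(1 - 2j) = 3 - 6j + 4j + 8 = 11 - 2j
magnitude = abs(a)  # sqrt(3**2 + 4**2)

print(total, product, magnitude)
```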
Numeric Types : Decimal
• Decimal data types store a fixed number of decimal digits, with the decimal point at a fixed position in the value.
• Most larger computers designed for business applications have hardware support for decimal data types.
• Decimal types are the primary data types for business data processing and are therefore essential to COBOL (Common Business Oriented Language).
• C# and F# also have decimal data types.
Decimal types can precisely store decimal values, within a restricted range, which cannot be done with floating-point types. For example, 0.1 has an exact decimal representation but no exact binary floating-point representation.
• Decimal types are stored like character strings, using binary codes for the decimal digits.
• This representation is called binary coded decimal (BCD).
• They are stored one digit per byte, or packed two digits per byte.
• It takes four bits to code a decimal digit, so a six-digit coded decimal number requires 24 bits. The same number stored in binary needs only 20 bits.
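The packing arithmetic can be checked with a minimal sketch. The function to_packed_bcd is a hypothetical helper written for this illustration; Python's decimal module is used only to show exact decimal arithmetic and is not claimed to store its digits as BCD.

```python
from decimal import Decimal

def to_packed_bcd(digits: str) -> bytes:
    """Pack a string of decimal digits two per byte, four bits per digit."""
    if len(digits) % 2:
        digits = "0" + digits          # pad to an even number of digits
    return bytes((int(digits[i]) << 4) | int(digits[i + 1])
                 for i in range(0, len(digits), 2))

packed = to_packed_bcd("123456")        # three bytes: 0x12 0x34 0x56
bcd_bits = len(packed) * 8              # six digits * four bits
binary_bits = (999999).bit_length()     # bits for the largest six-digit number in binary

float_sum = 0.1 + 0.2                          # binary floating point: inexact
decimal_sum = Decimal("0.1") + Decimal("0.2")  # decimal arithmetic: exact

print(packed.hex(), bcd_bits, binary_bits, float_sum, decimal_sum)
```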
Boolean Types
Simplest of all types.
Their range of values has only two elements: one for true (1) and one for false (0).
C99 and C++ have a Boolean type but also allow numeric expressions to be used as if they were Boolean.
Java and C# have a Boolean data type and do not allow numeric expressions in its place.
Boolean types are often used to represent switches or flags in programs.
A Boolean value could be represented by a single bit, but because a single bit of memory cannot be accessed efficiently on many machines, Boolean values are often stored in the smallest efficiently addressable cell of memory, typically a byte.
Character Types
• Character data are stored as numeric codings.
• The most commonly used coding was ASCII, which uses the values 0 to 127 to code 128 different characters and is usually stored in 8 bits.
• Because the ASCII character set became inadequate, the Unicode Consortium published the UCS-2 standard, a 16-bit character set, in 1991.
• This character code is often called Unicode.
• Unicode includes the characters from most of the world’s natural languages.
• For example, Unicode includes the Cyrillic alphabet, as used in Serbia, and the Thai digits.
• The first 128 characters of Unicode are identical to those of ASCII.
• Java was the first widely used language to use the Unicode character set.
• Unicode was later adopted by JavaScript, Python, Perl, C#, and F#.
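A short Python sketch can confirm the relationship between ASCII and Unicode described above. The two sample characters (a Cyrillic letter and a Thai digit) are picked only for illustration.

```python
code_a = ord("A")                 # numeric code of the character 'A'

# Every one of the 128 ASCII codes maps to the same Unicode code point.
first_128_match = all(chr(i).encode("ascii")[0] == i for i in range(128))

cyrillic = "\u0416"               # Cyrillic capital letter ZHE
code_zhe = ord(cyrillic)
utf16_len = len(cyrillic.encode("utf-16-be"))   # fits in one 16-bit unit

thai_five = "\u0e55"              # Thai digit five
thai_value = int(thai_five)       # Python's int() accepts Unicode decimal digits

print(code_a, first_128_match, hex(code_zhe), utf16_len, thai_value)
```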
Character String Types
• The values consist of sequences of characters.
• Character string constants are used to label output
• The input and output of all kinds of data are often done in terms
of strings.
• Character strings also are an essential type for character
manipulation.
Design Issues
• Should strings be simply a special kind of character array or a
primitive type?
• Should strings have static or dynamic length?
Strings and Their Operations
1. Assignment
2. Catenation
3. Substring reference
4. Comparison
5. Pattern matching.
String functions in C and C++
1. strlen
2. strcpy
3. strcat
4. strcmp
The C library functions that move string data do not guard against overflowing the destination.
Example of the error:
strcpy(dest, src);
If the length of src is greater than that of dest, strcpy will write over the bytes that follow dest.
Because strcpy does not know the length of dest, it cannot ensure that the memory following it will not be overwritten.
To avoid this problem, C++ programmers should use the string class from the standard library rather than char arrays.
Java uses the String class for immutable strings and the StringBuffer class for mutable strings.
C#, F#, and Ruby include string classes.
Python includes strings as a primitive type.
Perl, JavaScript, Ruby, and PHP include built-in pattern-matching operations based on regular expressions, for example:
/[A-Za-z][A-Za-z\d]+/
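The name pattern can be tried out with Python's re module. This sketch assumes the intended pattern is a letter followed by one or more letters or digits, written [A-Za-z][A-Za-z\d]+; the function is_name is a hypothetical helper for this illustration.

```python
import re

# A letter followed by one or more letters or digits.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z\d]+")

def is_name(text: str) -> bool:
    """True when the whole string matches the name pattern."""
    return NAME_PATTERN.fullmatch(text) is not None

# search() finds the first matching substring anywhere in the text.
found = NAME_PATTERN.search("9abc12 !").group()

print(is_name("count2"), is_name("2count"), is_name("x"), found)
```

Because of the +, the pattern needs at least two characters, so a single letter does not match.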
String Length Operations
Design choices regarding the length of string values
Static Length
The length can be static and set when the string is created. Such a string is called a static length string.
EXAMPLE
1. Python strings
2. Java’s String class
3. the string class of the C++ standard library
4. Ruby’s built-in String class
5. the string class in the .NET class library available to C# and F#
Limited Dynamic Length
Strings can have varying length up to a declared and fixed maximum set
by the variable’s definition, called limited dynamic length strings.
It can store any number of characters between zero and the maximum.
EXAMPLE
strings in C and the C-style strings of C++
Dynamic Length
Strings can have varying length with no maximum. These are called dynamic length strings.
EXAMPLE
JavaScript
Perl
the standard C++ library
This option requires the overhead of dynamic storage allocation and deallocation but provides maximum flexibility.
Ada 95+ supports all three string length options.
Evaluation
Dealing with strings as arrays can be more cumbersome than dealing
with a primitive string type.
Addition of strings as a primitive type to a language is not costly in terms
of either language or compiler complexity.
Pattern matching and concatenation are essential and should be included for string type values.
Dynamic length strings are the most flexible, but the overhead of their implementation must be weighed against that additional flexibility.
Implementation of Character String Types
Figure: compile-time descriptor for static strings
Figure: run-time descriptor for limited dynamic strings
User-defined ordinal types
An ordinal type is one in which the range of possible values can be easily associated with the set of positive integers.
In Java, the primitive ordinal types are integer, char, and boolean.
User-defined ordinal types:
• Enumeration
• Subrange
Enumeration types
• An enumeration type is one in which all of the possible values, which are named constants, are provided, or enumerated, in the definition.
• Enumeration types provide a way of defining and grouping collections of named constants, which are called enumeration constants.
enum days {Mon, Tue, Wed, Thu, Fri, Sat, Sun};
In C++
enum colors {red = 1, blue = 1000, green = 100000};
In ML
datatype weekdays = Monday | Tuesday | Wednesday | Thursday | Friday
In C and Pascal
enum colors {red, blue, green, yellow, black};
colors myColor = blue, yourColor = red;
myColor++;
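For comparison, here is a minimal sketch of the same ideas using Python's standard enum module. The names Days and Colors mirror the examples above; the use of IntEnum for explicitly numbered constants is a choice made for this sketch.

```python
from enum import Enum, IntEnum

# Enumeration constants are listed in the definition; values start at 1.
Days = Enum("Days", "MON TUE WED THU FRI SAT SUN")

class Colors(IntEnum):
    """Constants with explicitly assigned values."""
    RED = 1
    BLUE = 1000
    GREEN = 100000

# A plain Enum member is not an integer, so it cannot be confused with one.
mon_equals_one = (Days.MON == 1)

# Stepping to the next constant uses the definition order.
members = list(Days)
after_mon = members[members.index(Days.MON) + 1]

print(mon_equals_one, after_mon, Colors.BLUE + 1)
```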
Subrange types
• A subrange type is a contiguous subsequence of an ordinal type. For example,
12..14 is a subrange of integer type.
• Subrange types were introduced by Pascal and are included in Ada.
Ada
type Days is (Mon, Tue, Wed, Thu, Fri, Sat, Sun);
subtype Weekdays is Days range Mon..Fri;
subtype Index is Integer range 1..100;
Subtypes are new names for possibly restricted versions of existing types
Day1 : Days;
Day2 : Weekdays;
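Python has no subrange types, but the range check that a subrange implies can be sketched with a small class. Subrange, its check method, and the DAYS list are hypothetical names written for this illustration; unlike Ada, the check here is a run-time call made explicitly by the programmer.

```python
class Subrange:
    """A contiguous range low..high of an ordinal type, checked at run time."""

    def __init__(self, low: int, high: int):
        self.low, self.high = low, high

    def check(self, value: int) -> int:
        """Return value unchanged if it lies in the range, else raise."""
        if not self.low <= value <= self.high:
            raise ValueError(f"{value} is outside {self.low}..{self.high}")
        return value

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

index = Subrange(1, 100)                                   # like Integer range 1..100
weekdays = Subrange(DAYS.index("Mon"), DAYS.index("Fri"))  # like Days range Mon..Fri

in_range = index.check(42)
try:
    weekdays.check(DAYS.index("Sat"))
    sat_allowed = True
except ValueError:
    sat_allowed = False

print(in_range, sat_allowed)
```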
Array types
• An array is a homogeneous aggregate (collection) of data elements.
• An individual element is identified by its position in the aggregate, relative to the first element.
• In many languages, such as C, C++, Java, Ada, and C#, all of the individual data elements of an array must be of the same type.
Design issues
• What types are legal for subscripts?
• Are subscripting expressions in element references range checked?
• When are subscript ranges bound?
• When does array allocation take place?
• Are jagged or rectangular multidimensional arrays allowed, or both?
• Can arrays be initialized when they have their storage allocated?
• What kinds of slices are allowed, if any?
Arrays and indices
Specific elements of an array are referenced by means of a
two-level syntactic mechanism
1. the aggregate name
2. a dynamic selector consisting of one or more items
known as subscripts or indices
Eg.
int A[10];
ans = A[2]+A[5];
Subscript bindings
The binding of the subscript type to an array variable is usually static, but the subscript value ranges are sometimes dynamically bound (for example, A[n] where n is an input).
In some languages, the lower bound of the subscript range is implicit.
For example:
• In C-based languages, the lower bound of all subscript ranges is fixed at 0.
• In Fortran 95+, it defaults to 1 but can be set to any integer literal.
In some other languages, the lower bounds of the subscript ranges must be specified by the programmer.
Array categories
Five categories of arrays, based on
• the binding to subscript ranges
• the binding to storage
• and from where the storage is allocated.
The five categories are:
• Static array
• Fixed stack-dynamic array
• Stack-dynamic array
• Fixed heap-dynamic array
• Heap-dynamic array
Static array
• The subscript ranges are statically bound and storage allocation is static (done before run time).
• Advantage
– Efficiency: no dynamic allocation or deallocation is required.
• Disadvantage
– The storage for the array is fixed for the entire execution time of the program.
Fixed stack-dynamic array
• The subscript ranges are statically bound, but the allocation is done at declaration elaboration time during execution.
• Advantage
– Space efficiency compared with static arrays: the storage is held only while the declaring block is active.
• Disadvantage
– Allocation and deallocation take time during execution.
Stack-dynamic array
• Both the subscript ranges and the storage allocation are dynamically bound at elaboration time.
• Once the subscript ranges are bound and the storage is allocated, however, they remain fixed during the lifetime of the variable.
• Advantage
– Flexibility compared with static and fixed stack-dynamic arrays.
– The size of an array need not be known until the array is about to be used.
Fixed heap-dynamic array
• The subscript ranges and the storage binding are both fixed after storage is allocated.
• Both the subscript ranges and the storage binding are done when the user program requests them during execution.
• The storage is allocated from the heap, rather than the stack.
• Advantage
– Flexibility: the array’s size always fits the problem.
• Disadvantage
– Allocation time from the heap is longer than allocation time from the stack.
Heap-dynamic array
• The binding of subscript ranges and storage allocation is dynamic and can change any number of times during the array’s lifetime.
• Advantage
– Flexibility: arrays can grow and shrink during program execution as the need for space changes.
• Disadvantage
– Allocation and deallocation take longer and may happen many times during execution of the program.
Heterogeneous arrays
Array initialization
Some languages provide the means to initialize arrays at the time their storage is allocated.
int list [] = {4, 5, 7, 83};
char name [] = "freddie";
char *names [] = {"Bob", "Jake", "Darcie"}; // C, C++
String[] names = {"Bob", "Jake", "Darcie"}; // Java
List : array (1..5) of Integer := (1, 3, 5, 7, 9); -- Ada
Bunch : array (1..5) of Integer := (1 => 17, 3 => 34, others => 0); -- Ada
Array operations
An array operation is one that operates on an array as a unit
Most common array operations :
1. Assignment
2. Catenation
3. Comparison for equality and inequality
4. Slices
Array operations
• The C-based languages do not provide any array operations, except through the methods of Java, C++, and C#.
• Perl supports array assignments but does not support comparisons.
• Ada allows array assignments and concatenations (&).
• Python provides array assignment, concatenation (+), membership (in), and comparison for identity (is) and equality (==).
• Ruby supports comparison (==) and concatenation.
• In Fortran 95+, the assignment, arithmetic, relational, and logical operators are all overloaded for arrays of any size or shape. Fortran 95+ also includes intrinsic, or library, functions for matrix multiplication, matrix transpose, and vector dot product.
• F# includes many array operators in its Array module. Among these are Array.append, Array.copy, and Array.length.
Rectangular and Jagged arrays
• A rectangular array is a multidimensional array in which all of the rows
have the same number of elements and all of the columns have the
same number of elements. Rectangular arrays model rectangular tables
exactly
• A jagged array is one in which the lengths of the rows
need not be the same
• C, C++, and Java support jagged arrays but not rectangular arrays. In
those languages, a reference to an element of a multidimensional
array uses a separate pair of brackets for each dimension.
• For example, myArray[3][7]
• Fortran, Ada, C#, and F# support rectangular arrays. (C# and F#
also support jagged arrays.)
• myArray[3, 7]
Slices
A slice of an array is some substructure of that array
It is important to realize that a slice is not a new data type.
Rather, it is a mechanism for referencing part of an array as a unit
Python
vector = [2, 4, 6, 8, 10, 12, 14, 16]
mat = [[1, 2, 3],[4, 5, 6],[7, 8, 9]]
vector[3:6] is a three-element array with the fourth through sixth elements of vector
mat[0][0:2] refers to the first and second element of the first row of
mat, which is [1, 2]
Perl
@list[1..5] = @list2[3, 5, 7, 9, 13];
Ruby supports slices with the slice method of its Array object
list = [2, 4, 6, 8, 10]
list.slice(2, 2) returns [6, 8].
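The Python slices described above can be run directly. The last two lines are additions for this sketch: a stepped slice, and a Python slice that selects the same elements as the Ruby call list.slice(2, 2).

```python
vector = [2, 4, 6, 8, 10, 12, 14, 16]
mat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

middle = vector[3:6]          # fourth through sixth elements
first_row_part = mat[0][0:2]  # first and second elements of the first row
every_other = vector[0:7:2]   # a third number gives the step

# Ruby's slice(start, count) corresponds to [start:start + count] in Python.
values = [2, 4, 6, 8, 10]
ruby_like = values[2:2 + 2]

print(middle, first_row_part, every_other, ruby_like)
```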
Implementation of Array Types
• Implementing arrays requires considerably more compile-time
effort than does implementing primitive types.
• address(list[k]) = address(list[0]) + k * element_size
Generalized formula
• address(list[k]) = address(list[lower_bound]) + ((k - lower_bound) *
element_size)
A compile-time descriptor for a one dimensional array
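The access function above can be written out as a small sketch. The function name element_address and the sample base address 1000 are assumptions made for illustration; real compilers fold the constant part of this computation at compile time.

```python
def element_address(base: int, k: int, element_size: int, lower_bound: int = 0) -> int:
    """Address of list[k] given the address of the first element."""
    return base + (k - lower_bound) * element_size

# Array of 4-byte elements starting at (hypothetical) address 1000.
zero_based = element_address(1000, 5, 4)                # list[5], lower bound 0
one_based = element_address(1000, 5, 4, lower_bound=1)  # list[5], lower bound 1

print(zero_based, one_based)
```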
Implementation of Array Types
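Multidimensional arrays can be stored in one of two ways:
• Row major order
• Column major order
For example, the matrix
3 4 7
6 2 5
1 3 8
is stored
• in row major order as 3, 4, 7, 6, 2, 5, 1, 3, 8
• in column major order as 3, 6, 1, 4, 2, 3, 7, 5, 8
For row major order:
location(a[i, j]) = address of a[0, 0] + ((((number of rows above the ith row) * (size of a row)) + (number of elements left of the jth column)) * element_size)
With all lower bounds equal to zero and n elements per row, this simplifies to:
location(a[i, j]) = address of a[0, 0] + (((i * n) + j) * element_size)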
A compile-time descriptor for a multidimensional array
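Both storage orders and the row major access function can be checked with a short sketch. The 3 by 3 matrix is the one implied by the two orderings listed above; the function name location and the base address 1000 are assumptions for illustration.

```python
matrix = [[3, 4, 7],
          [6, 2, 5],
          [1, 3, 8]]
rows, n = len(matrix), len(matrix[0])   # n = number of elements per row

# Row major: all of row 0, then row 1, and so on.
row_major = [matrix[i][j] for i in range(rows) for j in range(n)]
# Column major: all of column 0, then column 1, and so on.
column_major = [matrix[i][j] for j in range(n) for i in range(rows)]

def location(base: int, i: int, j: int, n: int, element_size: int) -> int:
    """Row major address of a[i, j] with zero lower bounds."""
    return base + ((i * n) + j) * element_size

addr = location(1000, 1, 2, n, 4)       # a[1, 2] with 4-byte elements
stored_value = row_major[(1 * n) + 2]   # same offset, counted in elements

print(row_major, column_major, addr, stored_value)
```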
Associative Arrays
An associative array is an unordered collection of data elements that are indexed by an equal number of values called keys.
In normal (non-associative) arrays, the indices never need to be stored.
In an associative array, the user-defined keys must be stored in the structure.
So each element of an associative array is in fact a pair of entities, a key and a value.
Associative arrays are also supported directly by Python, Ruby, and
Lua and by the standard class libraries of Java, C++, C#, and F#.
Design issue
the form of references to their elements.
Structure and operations
In Perl, associative arrays are called hashes, because in the implementation their elements are stored and retrieved with hash functions.
Every hash variable name must begin with a percent sign (%).
Each hash element consists of two parts: a key, which is a string, and a value, which is a scalar (number, string, or reference).
%salaries = ("Gary" => 75000, "Perry" => 57000, "Mary" => 55750, "Cedric" => 47850);
Referencing an element of an associative array:
In Perl, an individual element is referenced with the key in braces, and the variable name begins with $ because the element is a scalar.
$salaries{"Perry"} = 58850;
An element can be removed from the hash:
delete $salaries{"Gary"};
The entire hash can be emptied:
%salaries = ();
The size of a Perl hash is dynamic: it grows when an element is added and shrinks when an element is deleted.
Python’s associative arrays are called dictionaries. They are similar to Perl hashes, except that the values are all references to objects.
Ruby’s associative arrays are similar to those of Python, except that the keys can be any object, rather than just strings.
PHP’s arrays are both normal arrays and associative arrays. They can
be treated as either. The language provides functions that allow both
indexed and hashed access to elements
C# and F# support associative arrays through a .NET class.
Benefits of associative array over normal array
1. An associative array supports implicit hashing operation to access
elements and is very efficient.
2. associative arrays are ideal when the data to be stored is paired,
as with employee names and their salaries.
On the other hand, if every element of a list must be processed, it is
more efficient to use an array
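The Perl hash operations shown earlier have direct counterparts in a Python dictionary. This sketch reuses the same salary data; the variable names other than salaries are introduced only for the illustration.

```python
salaries = {"Gary": 75000, "Perry": 57000, "Mary": 55750, "Cedric": 47850}

salaries["Perry"] = 58850          # reference an element by key and assign to it
del salaries["Gary"]               # remove one element; the dictionary shrinks

has_gary = "Gary" in salaries      # membership test on keys
size_after_delete = len(salaries)
total = sum(salaries.values())     # process every stored value

salaries.clear()                   # empty the whole dictionary

print(has_gary, size_after_delete, total, salaries)
```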
Implementing associative arrays
• The implementation of Perl’s associative arrays is optimized for
fast lookups, but it also provides relatively fast reorganization
when array growth requires it.
• A 32-bit hash value is computed for each entry and is stored
with the entry, although an associative array initially uses only a
small part of the hash value.
• When an associative array must be expanded beyond its initial
size, the hash function need not be changed; rather, more bits of
the hash value are used.
• Only half of the entries must be moved when this happens.
• So although expansion of an associative array is not free, it is not as costly as might be expected.
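The claim that only half of the entries move can be illustrated with a toy model. This is not Perl's implementation: stand_in_hash is a hypothetical multiplicative hash chosen so the result is reproducible, and the table sizes (8 growing to 16 buckets) are assumptions.

```python
def stand_in_hash(i: int) -> int:
    """A 32-bit stand-in hash, computed once and stored with the entry."""
    return (i * 2654435761) % 2**32

def bucket_index(hash_value: int, bits: int) -> int:
    """Use only the low `bits` bits of the stored hash value."""
    return hash_value & ((1 << bits) - 1)

hashes = [stand_in_hash(i) for i in range(1024)]

old = [bucket_index(h, 3) for h in hashes]   # 8 buckets: low 3 bits
new = [bucket_index(h, 4) for h in hashes]   # 16 buckets: one more bit, same hash

# An entry either stays in bucket b or moves to bucket b + 8.
only_two_choices = all(n in (o, o + 8) for o, n in zip(old, new))
moved = sum(1 for o, n in zip(old, new) if o != n)

print(only_two_choices, moved, len(hashes))
```

Because the stored hash value is reused, no entry has to be rehashed; the one extra bit decides whether it stays or moves.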
Record types
A record is an aggregate of data elements in which the individual elements are identified by names and accessed through offsets from the beginning of the structure.
There is frequently a need in programs to model a collection of data in which the individual elements are not of the same type or size.
Example: information about a student, a patient, or a book
Are records and heterogeneous arrays the same?
No. The elements of a heterogeneous array are all references to data objects that reside in scattered locations, often on the heap.
The elements of a record are of potentially different sizes and reside in adjacent memory locations.
In C, C++, and C#, records are supported with the struct data type.
Structs are also included in ML and F#.
Fortran and COBOL also support records.
In Python and Ruby, records can be implemented as hashes, which themselves can be elements of arrays.
Design issues
1. What is the syntactic form of references to fields?
2. Are elliptical references allowed?
Definitions of records
The fundamental difference between a record and an array is that record elements (fields) are not referenced by indices. Instead, fields are named with identifiers, and references to the fields are made using these identifiers.
COBOL uses level numbers to show nested records; other languages use recursive definition.
01 EMP-REC.
   02 EMP-NAME.
      05 FIRST PIC X(20).
      05 MID PIC X(10).
      05 LAST PIC X(20).
   02 HOURLY-RATE PIC 99V99.
Definition of Records in Ada
• Record structures are indicated in an orthogonal way
(nested example)
type Emp_Name_Type is record
   First: String (1..20);
   Mid: String (1..10);
   Last: String (1..20);
end record;
type Emp_Rec_Type is record
   Emp_Name: Emp_Name_Type;
   Hourly_Rate: Float;
end record;
Definition of Records in C++
• Nested example (more similar to Ada)
struct Emp_Name_Type {
    string first;
    string middle;
    string last;
};
struct Emp_Rec_Type {
    Emp_Name_Type Emp_name;
    float hourly_rate;
};
References to record fields
• Record field references
– 1. COBOL
– field_name OF record_name_1 OF ... OF record_name_n
– 2. Others (dot notation)
– record_name_1.record_name_2. ... .record_name_n.field_name
• Fully qualified references must include all record names
• Elliptical references allow leaving out record names as long as the reference is unambiguous, for example in COBOL
• FIRST, FIRST OF EMP-NAME, and FIRST OF EMP-REC are elliptical references to the employee's first name
Operations on records
• Assignment is very common if the types are identical
• Ada allows record comparison
• Ada records can be initialized with aggregate literals
• COBOL provides MOVE CORRESPONDING
– Copies a field of the source record to the corresponding field in the
target record
Evaluation
• Records are used when collection of data values is heterogeneous
• Access to array elements is much slower than access to record fields,
because subscripts are dynamic (field names are static)
• Dynamic subscripts could be used with record field access, but it would
disallow type checking and it would be much slower
Implementation of Record types
• An offset address relative to the beginning of the record is associated with each field
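Field offsets can be inspected with Python's ctypes module, which lays out structures the way the platform's C compiler would. The classes EmpName and EmpRec below are modeled on the employee record used earlier; the fixed-size character arrays are an assumption standing in for the 20, 10, and 20 character name fields.

```python
import ctypes

class EmpName(ctypes.Structure):
    _fields_ = [("first", ctypes.c_char * 20),
                ("mid", ctypes.c_char * 10),
                ("last", ctypes.c_char * 20)]

class EmpRec(ctypes.Structure):
    _fields_ = [("name", EmpName),
                ("hourly_rate", ctypes.c_float)]

# Each field descriptor records its offset from the start of the record.
name_offsets = {field: getattr(EmpName, field).offset
                for field, _ in EmpName._fields_}
rate_offset = EmpRec.hourly_rate.offset   # may include alignment padding

print(name_offsets, rate_offset, ctypes.sizeof(EmpRec))
```

The offset of hourly_rate can be larger than the size of the name record, because the layout may insert padding so the float is properly aligned.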
Union Types
• A union is a type whose variables are allowed to store values of different types at different times during execution
• Design issues
1. Should type checking be required?
2. Should unions be embedded in records?
Discriminated versus Free unions
• Fortran, C, and C++ provide union constructs in which there is
no language support for type checking; the union in these
languages is called free union
• Type checking of unions requires that each union include a type indicator called a discriminant
– Supported by Ada
Ada Union Types
type Shape is (Circle, Triangle, Rectangle);
type Colors is (Red, Green, Blue);
type Figure (Form : Shape) is record
   Filled : Boolean;
   Color : Colors;
   case Form is
      when Circle => Diameter : Float;
      when Triangle =>
         Leftside, Rightside : Integer;
         Angle : Float;
      when Rectangle => Side1, Side2 : Integer;
   end case;
end record;
Ada Union Type Illustrated
• A discriminated union of three shape variables
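The idea of a discriminant can be sketched in Python with a class that stores a tag and refuses access to fields that do not belong to the current variant. The Figure class below is a hypothetical illustration modeled on the Ada record; Ada performs the corresponding check itself, while here it is written out by hand.

```python
class Figure:
    """A tagged (discriminated) union: `form` selects the valid fields."""

    VARIANTS = {
        "circle": ("diameter",),
        "triangle": ("leftside", "rightside", "angle"),
        "rectangle": ("side1", "side2"),
    }

    def __init__(self, form: str, **values):
        if set(values) != set(self.VARIANTS[form]):
            raise TypeError(f"wrong fields for form {form!r}")
        self.form = form               # the discriminant
        self._values = dict(values)

    def get(self, field: str):
        """Return a field only if it belongs to the current variant."""
        if field not in self.VARIANTS[self.form]:
            raise AttributeError(f"{field!r} is not valid when form is {self.form!r}")
        return self._values[field]

circle = Figure("circle", diameter=2.0)
diameter = circle.get("diameter")
try:
    circle.get("side1")                # a rectangle field on a circle
    mismatch_caught = False
except AttributeError:
    mismatch_caught = True

print(diameter, mismatch_caught)
```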
Evaluation of Union types
• Free unions are unsafe
– They do not allow type checking
• Java and C# do not support unions
– This reflects growing concerns for safety in programming languages
• Ada's discriminated unions are safe
Pointer and Reference Types
• A pointer type variable has a range of values that consists of memory
addresses and a special value, nil
• Provide the power of indirect addressing
• Provide a way to manage dynamic memory
• A pointer can be used to access a location in the area where storage is
dynamically created (usually called a heap)
Design issues : Pointer operations
• What are the scope of and lifetime of a pointer variable?
• What is the lifetime of a heap-dynamic variable?
• Are pointers restricted as to the type of value to which they can point?
• Are pointers used for dynamic storage management, indirect addressing,
or both?
• Should the language support pointer types, reference types, or both?
Pointer operations
• Two fundamental operations: assignment and dereferencing
• Assignment is used to set a pointer variable's value to some useful address
• Dereferencing yields the value stored at the location represented by the pointer's value
• Dereferencing can be explicit or implicit
• C++ uses an explicit operation via *
• j = *ptr
• sets j to the value located at ptr
Pointer Assignment Illustrated
• The assignment operation j = *ptr
Problems with Pointers
• Dangling pointers (dangerous)
– A pointer points to a heap-dynamic variable that has been deallocated
• Lost heap-dynamic variable
– An allocated heap-dynamic variable that is no longer accessible to the user program (often called garbage)
– Pointer p1 is set to point to a newly created heap-dynamic variable
– Pointer p1 is later set to point to another newly created heap-dynamic variable, so the first one can no longer be reached
– The process of losing heap-dynamic variables is called memory leakage
Pointers in Ada
• Some dangling pointers are disallowed because dynamic objects can
be automatically deallocated at the end of pointer's type scope
• The lost heap-dynamic variable problem is not eliminated by Ada
(possible with UNCHECKED_DEALLOCATION)
Pointers in C and C++
• Extremely flexible but must be used with care
• Pointers can point at any variable regardless of when or where it
was allocated
• Used for dynamic storage management and addressing
• Pointer arithmetic is possible
• Explicit dereferencing and address-of operators
Pointer Arithmetic in C and C++
• float list[100];
• float *p;
• p = list;
• *(p+5) is equivalent to list[5] and p[5]
• *(p+i) is equivalent to list[i] and p[i]
• Domain type need not be fixed (void *)
• void * can point to any type and can be type checked (cannot be
de-referenced)
Reference Types
• C++ includes a special kind of pointer type called a reference type that
is used primarily for formal parameters
– Advantages of both pass-by-reference and pass-by-value
• Java extends C++'s reference variables and allows them to replace pointers entirely
– References are references to objects, rather than being addresses
• C# includes both the references of Java and the pointers of C++; subprograms that use pointers must include the unsafe modifier
• Smalltalk, Python, Ruby, Lua: all variables are references; always implicitly dereferenced
Evaluation of Pointers
• Dangling pointers and dangling objects are problems as is heap
management
• Pointers are like goto's--they widen the range of cells that can be accessed
by a variable
• Pointers or references are necessary for dynamic data structures--so we
can't design a language without them
Representations of Pointers
• Large computers use single values
• Intel microprocessors use segment and offset
Dangling pointers
• Tombstone: an extra heap cell that is a pointer to the heap-dynamic variable
– The actual pointer variable points only at tombstones
– When a heap-dynamic variable is deallocated, the tombstone remains but is set to nil
– Costly in time and space; no popular language uses this approach
• Locks-and-keys: pointer values are represented as (key, address) pairs
– Heap-dynamic variables are represented as the variable plus a cell for an integer lock value
– When a heap-dynamic variable is allocated, a lock value is created and placed in the lock cell and in the key cell of the pointer. Used in UW-Pascal (a Pascal compiler)
• Best solution: take deallocation out of the hands of the programmer (implicit deallocation: Java; C# references)
Heap Management
• One of the design goals of LISP was that reclamation of unused cells should not be the task of the programmer (most LISP data consists of cells in linked lists)
• A very complex run-time process
• Single-size cells vs. variable-size cells
• Fundamental design question: When should deallocation be
performed?
• Two approaches to reclaim garbage
– Reference counters (eager approach): reclamation is gradual
– Mark-sweep (lazy approach): reclamation occurs when the list of available space becomes empty
Reference Counter
• Reference counters: maintain a counter in every cell that stores the number of pointers currently pointing at the cell
– Disadvantages: space required, execution time required to change
counters, complications for cells connected circularly
– Advantage: it is intrinsically incremental, so significant delays
in the application execution are avoided
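A toy model can show both the incremental reclamation and the complication with circularly connected cells. Cell, add_ref, drop_ref, and link are hypothetical names written for this sketch, and each cell holds at most one outgoing pointer to keep it small.

```python
class Cell:
    def __init__(self, name: str):
        self.name = name
        self.count = 0       # number of pointers currently pointing at this cell
        self.next = None     # at most one outgoing pointer in this sketch
        self.freed = False

def add_ref(cell: Cell) -> None:
    cell.count += 1

def drop_ref(cell: Cell) -> None:
    """Remove one pointer; reclaim the cell as soon as its count hits zero."""
    cell.count -= 1
    if cell.count == 0:
        cell.freed = True
        if cell.next is not None:
            drop_ref(cell.next)      # the freed cell no longer points onward

def link(src: Cell, dst: Cell) -> None:
    src.next = dst
    add_ref(dst)

# A simple chain: root -> a -> b. Dropping the root pointer frees both.
a, b = Cell("a"), Cell("b")
add_ref(a)
link(a, b)
drop_ref(a)

# A cycle: root -> x -> y -> x. Dropping the root pointer frees neither.
x, y = Cell("x"), Cell("y")
add_ref(x)
link(x, y)
link(y, x)
drop_ref(x)

print(a.freed, b.freed, x.freed, y.freed, x.count)
```

After the root pointer is dropped, x and y still point at each other, so their counters never reach zero even though the program can no longer reach them.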
Mark-Sweep
• The run-time system allocates storage cells as requested and disconnects
pointers from cells as necessary; mark-sweep then begins to gather garbage
– Every heap cell has an extra bit used by the collection algorithm
– All cells are initially set to garbage
– All pointers are traced into the heap, and reachable cells are marked as not garbage
– All garbage cells are returned to the list of available cells
– Disadvantages: in its original form, it was done too infrequently. When done, it caused significant delays in application execution.
– Contemporary mark-sweep algorithms avoid this by doing it more often; this is called incremental mark-sweep
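The marking and sweeping phases can be sketched on a toy heap. The heap is modeled as a dictionary from cell id to the ids that cell points to; this representation, the function name mark_sweep, and the sample data are assumptions made for illustration, not a description of any real collector.

```python
def mark_sweep(heap: dict, roots: list) -> list:
    """Reclaim unreachable cells from `heap`; return the ids that were freed."""
    marked = set()                 # every cell starts out presumed garbage
    pending = list(roots)
    while pending:                 # mark phase: trace pointers from the roots
        cell = pending.pop()
        if cell not in marked:
            marked.add(cell)       # reachable, so not garbage
            pending.extend(heap[cell])
    garbage = sorted(set(heap) - marked)
    for cell in garbage:           # sweep phase: return garbage to free space
        del heap[cell]
    return garbage

# Cells 1 and 2 are reachable; 3 and 4 point at each other; 5 is isolated.
heap = {1: [2], 2: [], 3: [4], 4: [3], 5: []}
freed = mark_sweep(heap, roots=[1])

print(freed, sorted(heap))
```

Cells 3 and 4 are reclaimed even though they point at each other, because marking depends only on reachability from the roots.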

Chapter 4.pptx

  • 1. CSUT111 Paradigm of Programming Language Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind
  • 2. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Chapter 4 Data Types
  • 3. • Introduction to the concept of a data type • Characteristics of the primitive data types • Designs of enumeration and subrange types • Details of structured data types—arrays, associative arrays, records, tuples, lists, and unions • Pointers and references • Design issues and design choices of data types Contents
  • 4. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Introduction Data type • It is a collection of data values • and a set of predefined operations on those values int char
  • 5. Descriptor • collection of the attributes of a variable. • During implementation, a descriptor is an area of memory that stores the attributes of a variable. For static attributes : • descriptors are required only at compile time. • are built by the compiler, as a part of the symbol table, and are used during compilation.
  • 6. For dynamic attributes: • the descriptor must be maintained during execution. • the descriptor is used by the run-time system.
  • 7. Variable • Is not an identifier • Identifier is just an attribute of variable • a variable is a value that can change
  • 8. Object • is associated with the value of a variable and the space it occupies. • Here object is reserved exclusively for instances of user-defined abstract data types
  • 9. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Primitive Data Types Data types that are not defined in terms of other types Data Types Numeric Integer Floating-point Decimal Complex Character Boolean
  • 10. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Numeric Types : Numeric types still play a central role among the collections of types Integer The most common primitive numeric data type is integer byte, short, int, & long
  • 11. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Numeric Types : Floating point Floating-point data types model real numbers float and double Precision is the accuracy of the fractional part of a value, measured as the number of bits. Range is a combination of the range of fractions and the range of exponents. The collection of values that can be represented by a floating-point type is defined in terms of precision and range
  • 13. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Numeric Types : Complex Fortran and Python support a complex data type Complex values are represented as ordered pairs of floating-point values • Complex data type usually provide special syntax for building such values • Extend the basic arithmetic operations ('+', '−', '×', '÷') to act on them.
  • 14. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Numeric Types : Decimal • Decimal data types store a fixed number of decimal digits, with the decimal point at a fixed position in the value. • Separate hardware support for decimal data types. • It is a primary data types for business data processing and are therefore essential to COBOL(Common Business Oriented Language). • C# and F# also have decimal data types. Decimal types precisely used to store decimal values, within a restricted range, which cannot be done with floating-point • Decimal types are stored like character strings, using binary codes for the decimal digits. • It is called binary coded decimal (BCD). • They are stored one digit per byte, or packed two digits per byte. • Four bits to code a decimal digit ,hence a six-digit coded decimal number requires 24 bits
  • 15. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Boolean Types Simplest of all types. Their range of values has only two elements: one for true (1)and one for false.(0) C99 and C++ have a Boolean type but also allow numeric expressions to be used as if they were Boolean Java and C# has Boolean data type Boolean types are often used to represent switches or flags in programs It is represented by a single bit But a single bit of memory cannot be accessed efficiently on many machines, they are often stored in the smallest efficiently addressable cell of memory, typically a byte.
  • 16. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Character Types • Character data are stored in numeric coding. • Commonly used coding was the 8-bit code ASCII. • It allows 256 different characters. • The ASCII character set became inadequate, in 1991, the Unicode Consortium published the UCS-2 standard, a 16-bit character set. • This character code is often called Unicode. • Unicode includes the characters from most of the world’s natural languages • Unicode includes the Cyrillic alphabet, as used in Serbia, and the Thai digits. • The first 128 characters of Unicode are identical to those of ASCII. • Java was the first widely used language to use the Unicode character set. • Later used into JavaScript, Python, Perl, C#, and F#.
  • 17. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Character String Types • The values consist of sequences of characters. • Character string constants are used to label output • The input and output of all kinds of data are often done in terms of strings. • Character strings also are an essential type for character manipulation.
  • 18. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Design Issues • Should strings be simply a special kind of character array or a primitive type? • Should strings have static or dynamic length?
  • 19. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Strings and Their Operations 1. Assignment 2. Catenation 3. Substring reference 4. Comparison 5. Pattern matching. String functions in C and C++ 1.strlen 2.strcpy 3.strcat 4.strcmp
  • 20. Moving string data do not guard against overflowing the destination. Error strcpy(dest, src); If length of src is more then dest, then strcpy will write over the remaining bytes that follow dest. As strcpy does not know the length of dest, it cannot ensure that the memory following it will not be overwritten
  • 21. To avoid C++ programmers should use the string class from the standard library, rather than char arrays Java uses String class – immutable strings And StringBuffer class – mutable strings C#, F# and Ruby include string classes Python includes strings as a primitive type Perl, JavaScript, Ruby, and PHP include built-in pattern-matching operations called regular expressions /[A-Za-z][A-Za-zd]+/
  • 22. String Length Operations Design choices regarding the length of string values
  • 23. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Static Length The length can be static and set when the string is created called a static length string EXAMPLE 1. strings of Python 2. Java’s String class 3. stringg class in C++ 4. Ruby’s built-in String class 5. in .NET class library available to C# and F#
  • 24. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Limited Dynamic Length Strings can have varying length up to a declared and fixed maximum set by the variable’s definition, called limited dynamic length strings. It can store any number of characters between zero and the maximum. EXAMPLE strings in C and the C-style strings of C++
  • 25. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Dynamic Length Strings can have varying length with no maximum called dynamic length strings. Ada 95+ supports all three string length options EXAMPLE JavaScript Perl standard C++ library functions. This option requires the overhead of dynamic storage allocation and deallocation but provides maximum flexibility.
  • 26. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Evaluation Dealing with strings as arrays can be more cumbersome than dealing with a primitive string type. Addition of strings as a primitive type to a language is not costly in terms of either language or compiler complexity. Pattern matching and concatenation are essential and should be included for string type values Although dynamic length strings are obviously the most flexible But the overhead of implementation must be compared against additional flexibility
  • 27. Implementation of Character String Types Compile-time descriptor for static strings Run-time descriptor for limited dynamic strings
  • 28. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind User defined Ordinal types The type in which the range of possible values can be easily associated with the set of positive integers In Java primitive ordinal types are integer, char, and boolean. User Defined Ordinal Types Enumeration Subrange
  • 29. Enumeration types • In which all of the possible values, which are named constants, are provided, or enumerated, in the definition. • It provide a way of defining and grouping collections of named constants, which are called enumeration constants enum days {Mon, Tue, Wed, Thu, Fri, Sat, Sun}; In C++ enum colors {red = 1, blue = 1000, green = 100000} In ML datatype weekdays = Monday | Tuesday | Wednesday | Thursday | Friday In C and Pascal enum colors {red, blue, green, yellow, black}; colors myColor = blue, yourColor = red; myColor++
  • 30. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Subrange types • A subrange type is a contiguous subsequence of an ordinal type. For example, 12..14 is a subrange of integer type. • Subrange types were introduced by Pascal and are included in Ada. Ada type Days is (Mon, Tue, Wed, Thu, Fri, Sat, Sun); subtype Weekdays is Days range Mon..Fri; subtype Index is Integer range 1..100; Subtypes are new names for possibly restricted versions of existing types Day1 : Days; Day2 : Weekdays;
  • 31. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Array types • An array is a homogeneous aggregate/collection of data elements. • an individual element is identified by its position in the aggregate, relative to the first element. • The individual data elements of an array are of the same type C, C++, Java, Ada, and C#
  • 32. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Design issues • What types are legal for subscripts? • Are subscripting expressions in element references range checked? • When are subscript ranges bound? • When does array allocation take place? • Are jagged or rectangular multidimensional arrays allowed, or both? • Can arrays be initialized when they have their storage allocated? • What kinds of slices are allowed, if any
  • 33. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Arrays and indices Specific elements of an array are referenced by means of a two-level syntactic mechanism 1. the aggregate name 2. a dynamic selector consisting of one or more items known as subscripts or indices Eg. int A[10]; ans = A[2]+A[5];
  • 34. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Subscript bindings The binding of the subscript type to an array variable is usually static, but the subscript value ranges are sometimes dynamically bound (A[n] and n is an input ). In some languages, the lower bound of the subscript range is implicit. For example C-based languages, the lower bound of all subscript ranges is fixed at 0 Fortran 95+ it defaults to 1 but can be set to any integer literal. In some other languages, the lower bounds of the subscript ranges must be specified by the programmer
  • 35. Array categories Five categories of arrays, based on • the binding to subscript ranges • the binding to storage • and from where the storage is allocated. Array categories Static array Fixed stack- dynamic array Stack-dynamic array Fixed heap- dynamic array Heap-dynamic array
  • 36. Static array • the subscript ranges are statically bound and storage allocation is static (compile time). • Advantage • efficiency: No dynamic allocation or deallocation is required. • Disadvantage • the storage for the array is fixed for the entire execution time of the program Fixed stack-dynamic array • the subscript ranges are statically bound, but the allocation is done at declaration elaboration time during execution. • Advantage • fixed stack-dynamic arrays over static arrays is space efficiency. • Disadvantage • More allocation and deallocation time.
  • 37. STACK-DYNAMIC ARRAY • both the subscript ranges and the storage allocation are dynamically bound at elaboration time. • Once the subscript ranges are bound and the storage is allocated, however, they remain fixed during the lifetime of the variable. • Advantage • stack-dynamic arrays over static and fixed stack-dynamic arrays is flexibility. • The size of an array need not be known until the array is about to be used. FIXED HEAP-DYNAMIC ARRAY • the subscript ranges and the storage binding are both fixed after storage is allocated. • both the subscript ranges and storage bindings are done when the user program requests them during execution • and the storage is allocated from the heap, rather than the stack. • advantage • flexibility—the array’s size always fits the problem. • disadvantage • allocation time from the heap is longer than allocation time from the stack
  • 38. HEAP-DYNAMIC ARRAY • the binding of subscript ranges and storage allocation is dynamic and can change any number of times during the array’s lifetime. • advantage • flexibility: Arrays can grow and shrink during program execution as the need for space changes. • disadvantage • allocation and deallocation take longer and may happen many times during execution of the program.
  • 39. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Heterogeneous arrays
  • 40. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Array initialization Initialize arrays at the time their storage is allocated int list [] = {4, 5, 7, 83}; char name [] = "freddie"; char *names [] = {"Bob", "Jake", "Darcie"}; // C, C++ String[] names = ["Bob", "Jake", "Darcie"]; //Java List : array (1..5) of Integer := (1, 3, 5, 7, 9); //Ada Bunch : array (1..5) of Integer := (1 => 17, 3 => 34, others => 0)
  • 41. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Array operations An array operation is one that operates on an array as a unit Most common array operations : 1. Assignment 2. Catenation 3. Comparison for equality and inequality 4. Slices
  • 42. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Array operations • The C-based languages do not provide any array operations, but uses methods of Java, C++, and C#. • Perl supports array assignments but does not support comparisons. • Ada allows array assignments and concatenations(&) • Python provides array assignment, concatenation(+), membership (in), comparison (is and ==) • Ruby supports comparison (==) and concatenation • Fortran 95+ assignment, arithmetic, relational, and logical operators are all overloaded for arrays of any size or shape. Also includes intrinsic, or library, functions for matrix multiplication, matrix transpose, and vector dot product • F# includes many array operators in its Array module. Among these are Array.append, Array.copy, and Array.length
  • 43. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Rectangular and Jagged arrays • A rectangular array is a multidimensional array in which all of the rows have the same number of elements and all of the columns have the same number of elements. Rectangular arrays model rectangular tables exactly • A jagged array is one in which the lengths of the rows need not be the same • C, C++, and Java support jagged arrays but not rectangular arrays. In those languages, a reference to an element of a multidimensional array uses a separate pair of brackets for each dimension. • For example, myArray[3][7] • Fortran, Ada, C#, and F# support rectangular arrays. (C# and F# also support jagged arrays.) • myArray[3, 7]
  • 44. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Slices A slice of an array is some substructure of that array It is important to realize that a slice is not a new data type. Rather, it is a mechanism for referencing part of an array as a unit Python vector = [2, 4, 6, 8, 10, 12, 14, 16] mat = [[1, 2, 3],[4, 5, 6],[7, 8, 9]] vector[3:6] is a three-element array with the fourth through sixth elements of vector mat[0][0:2] refers to the first and second element of the first row of mat, which is [1, 2] Perl @list[1..5] = @list2[3, 5, 7, 9, 13] Ruby supports slices with the slice method of its Array object list = [2, 4, 6, 8, 10] list.slice(2, 2) returns [6, 8].
  • 45. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Implementation of Array Types • Implementing arrays requires considerably more compile-time effort than does implementing primitive types. • address(list[k]) = address(list[0]) + k * element_size Generalized formula • address(list[k]) = address(list[lower_bound]) + ((k - lower_bound) * element_size) A compile-time descriptor for a one dimensional array
  • 46. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Implementation of Array Types Multidimensional arrays are • Row major • Column Major • row major order as 3, 4, 7, 6, 2, 5, 1, 3, 8 • Column major order as 3, 6, 1, 4, 2, 3, 7, 5, 8 location(a[i,j]) = address of a[0, 0] + ((((number of rows above the ith row) * (size of a row)) + (number of elements left of the jth column)) * element size) location(a[i, j]) = address of a[0, 0] + (((i * n) + j) * element_size)
  • 47. A compile-time descriptor for a multidimensional array
  • 48. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Associative Arrays An associative array is an unordered collection of data elements that are indexed by an equal number of values called keys. In Normal arrays, the indices never need to be stored In an associative array, the user-defined keys must be stored in the structure each element of an associative array is in fact a pair of entities, a key and a value Associative arrays are also supported directly by Python, Ruby, and Lua and by the standard class libraries of Java, C++, C#, and F#. Design issue the form of references to their elements.
  • 49. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Structure and operations Perl, associative arrays are called hashes, because in the implementation their elements are stored and retrieved with hash functions Every hash variable name must begin with a percent sign (%). Each hash element consists of two parts: a key, which is a string, and a value, which is a scalar (number, string, or reference). %salaries = ("Gary" => 75000, "Perry" => 57000, "Mary" => 55750, "Cedric" => 47850) Referencing associative array. in Perl- variable name begins with $ $salaries{"Perry"} = 58850;
  • 50. An element can be removed from the hash delete $salaries{"Gary"}; The entire hash can be emptied %salaries = (); The size of a Perl hash is dynamic: It grows when an element is added and shrinks when an element is deleted, Python’s associative arrays are called dictionaries similar to Perl, except the values are all references to objects. In Ruby it is similar to Python, except that the keys can be any object, rather than just strings. PHP’s arrays are both normal arrays and associative arrays. They can be treated as either. The language provides functions that allow both indexed and hashed access to elements C# and F# support associative arrays through a .NET class.
  • 51. Benefits of associative array over normal array 1. An associative array supports implicit hashing operation to access elements and is very efficient. 2. associative arrays are ideal when the data to be stored is paired, as with employee names and their salaries. On the other hand, if every element of a list must be processed, it is more efficient to use an array
  • 52. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Implementing associative arrays, • The implementation of Perl’s associative arrays is optimized for fast lookups, but it also provides relatively fast reorganization when array growth requires it. • A 32-bit hash value is computed for each entry and is stored with the entry, although an associative array initially uses only a small part of the hash value. • When an associative array must be expanded beyond its initial size, the hash function need not be changed; rather, more bits of the hash value are used. • Only half of the entries must be moved when this happens • although expansion of an associative array is not free, it is not as costly as might be expected.
  • 53. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Record types A record is an aggregate of data elements in which the individual elements are identified by names and accessed through offsets from the beginning of the structure There is frequently a need in programs to model a collection of data in which the individual elements are not of the same type or size Example : Information of Student, Patient, Books etc Are records and heterogeneous arrays same?? The elements of a heterogeneous array are all references to data objects that reside in scattered locations, often on the heap. The elements of a record are of potentially different sizes and reside in adjacent memory locations.
  • 54. C, C++, and C#, records are supported with the struct data type Structs are also included in ML and F# Fortran and COBOL supports records In Python and Ruby, records can be implemented as hashes, which themselves can be elements Design issues 1. What is the syntactic form of references to fields? 2. Are elliptical references allowed?
  • 55. Ranjana Shevkar, Assistant Professor, Modern College Ganeshkhind Definitions of records The fundamental difference between a record and an array is that record elements ( fields) , are not referenced by indices. But fields are named with identifiers, and references to the fields are made using these identifiers. COBOL uses level numbers to show nested records; others use recursive definition 01 EMP-REC. 02 EMP-NAME. 05 FIRST PIC X(20). 05 MID PIC X(10). 05 LAST PIC X(20). 02 HOURLY-RATE PIC 99V99.
  • 56. Definition of Records in Ada • Record structures are indicated in an orthogonal way (nested example) type Emp_Name_Type is record First: String (1..20); Mid: String (1..10); Last: String (1..20); end record; type Emp_Rec_Type is record Emp_Name: Emp_Name_Type; Hourly_Rate: Float; end record;
  • 57. Definition of Records in C++ • Nested example (more similar to Ada) structEmp_Name_Type { string first; stringmiddle; stringlast; }; structEmp_Rec_Type{ Emp_Name_TypeEmp_name; floathourly_rate; }
  • 58. • Record field references – 1. COBOL – field_name OF record_name_1OF ... OF record_name_n – 2. Others (dot notation) – record_name_1.record_name_2. ...record_name_n.field_name • Fully qualifiedreferences must includeall record names • Ellipticalreferences allowleaving out record namesas long as the reference is unambiguous, for examplein COBOL • FIRST,FIRSTOFEMP-NAME,and FIRSTof EMP-RECare ellipticalreferences to the employeeʼsfirst name References to record fields
  • 59. Operations on records • Assignment is very common if the types are identical • Ada allows record comparison • Ada records can be initialized with aggregate literals • COBOL provides MOVE CORRESPONDING – Copies each field of the source record to the field with the same name in the target record
  • 60. Evaluation • Records are used when the collection of data values is heterogeneous • Access to array elements is much slower than access to record fields, because subscripts are dynamic (field names are static) • Dynamic subscripts could be used with record field access, but it would disallow type checking and it would be much slower
  • 61. Implementation of Record types • An offset address, relative to the beginning of the record, is associated with each field
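The offset idea can be observed in C++ with the standard `offsetof` macro. The sketch below is an assumption-laden illustration, not how any particular compiler is documented to work: the record `Emp_Rec`, its fields, and the helper `read_rate_by_offset` are hypothetical names, and the actual offsets (including padding) depend on the platform.

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <cstring>   // std::memcpy

// A small record; the layout (including any padding) is chosen by the compiler.
struct Emp_Rec {
    char  grade;
    int   id;
    float hourly_rate;
};

// Offset of each field from the start of the record, fixed at compile time.
constexpr std::size_t kGradeOffset = offsetof(Emp_Rec, grade);
constexpr std::size_t kIdOffset    = offsetof(Emp_Rec, id);
constexpr std::size_t kRateOffset  = offsetof(Emp_Rec, hourly_rate);

// Reads hourly_rate as "base address of the record + field offset".
float read_rate_by_offset(const Emp_Rec& rec) {
    const unsigned char* base = reinterpret_cast<const unsigned char*>(&rec);
    float rate;
    std::memcpy(&rate, base + kRateOffset, sizeof rate);
    return rate;
}
```

Because the offsets are compile-time constants, a field access costs no more than an access to a simple variable plus a fixed displacement.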
  • 62. Union Types • A union is a type whose variables are allowed to store values of different types at different times during execution • Design issues 1. Should type checking be required? 2. Should unions be embedded in records?
  • 63. Discriminated versus Free unions • Fortran, C, and C++ provide union constructs in which there is no language support for type checking; the union in these languages is called a free union • Type checking of unions requires that each union include a type indicator called a discriminant – Supported by Ada
  • 64. Ada Union Types
      type Shape is (Circle, Triangle, Rectangle);
      type Colors is (Red, Green, Blue);
      type Figure (Form: Shape) is record
          Filled: Boolean;
          Color: Colors;
          case Form is
              when Circle => Diameter: Float;
              when Triangle =>
                  Leftside, Rightside: Integer;
                  Angle: Float;
              when Rectangle => Side1, Side2: Integer;
          end case;
      end record;
  • 65. Ada Union Type Illustrated • A discriminated union of three shape variables
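For comparison, a discriminated union can be approximated by hand in C++ by pairing a free union with a tag that is checked on access. This is a minimal sketch under stated assumptions: the names `CircleData`, `make_circle`, `make_rectangle`, and `diameter_of` are hypothetical, and unlike Ada the check is written by the programmer rather than enforced by the language.

```cpp
#include <stdexcept>

enum class Shape  { Circle, Triangle, Rectangle };
enum class Colors { Red, Green, Blue };

struct CircleData    { float diameter; };
struct TriangleData  { int left_side; int right_side; float angle; };
struct RectangleData { int side1; int side2; };

// `form` plays the role of the Ada discriminant.
struct Figure {
    Shape  form;
    bool   filled;
    Colors color;
    union {                    // only one variant is active at a time
        CircleData    circle;
        TriangleData  triangle;
        RectangleData rectangle;
    };
};

Figure make_circle(float diameter) {
    Figure f;
    f.form = Shape::Circle;
    f.filled = true;
    f.color = Colors::Red;
    f.circle.diameter = diameter;
    return f;
}

Figure make_rectangle(int side1, int side2) {
    Figure f;
    f.form = Shape::Rectangle;
    f.filled = false;
    f.color = Colors::Blue;
    f.rectangle.side1 = side1;
    f.rectangle.side2 = side2;
    return f;
}

// Checked access: the tag is tested before the variant is read.
float diameter_of(const Figure& f) {
    if (f.form != Shape::Circle) {
        throw std::logic_error("Figure is not a Circle");
    }
    return f.circle.diameter;
}
```

Nothing stops other code from reading `f.circle` directly without the check, which is exactly the weakness of a free union.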
  • 66. Evaluation of Union types • Free unions are unsafe – Do not allow type checking • Java and C# do not support unions – Reflective of growing concerns for safety in programming languages • Ada's discriminated unions are safe
  • 67. Pointer and Reference Types • A pointer type variable has a range of values that consists of memory addresses and a special value, nil • Provide the power of indirect addressing • Provide a way to manage dynamic memory • A pointer can be used to access a location in the area where storage is dynamically created (usually called a heap)
  • 68. Design issues : Pointer operations • What are the scope and lifetime of a pointer variable? • What is the lifetime of a heap-dynamic variable? • Are pointers restricted as to the type of value to which they can point? • Are pointers used for dynamic storage management, indirect addressing, or both? • Should the language support pointer types, reference types, or both?
  • 69. Pointer operations • Two fundamental operations: assignment and dereferencing • Assignment is used to set a pointer variable's value to some useful address • Dereferencing yields the value stored at the location represented by the pointer's value • Dereferencing can be explicit or implicit • C++ uses an explicit operation via * • j = *ptr • sets j to the value located at ptr
  • 70. Pointer Assignment Illustrated • The assignment operation j = *ptr
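The two operations can be sketched in a few lines of C++. The function name `assign_and_dereference` and the sample value 206 are assumptions chosen for illustration.

```cpp
// Returns the value that j receives from j = *ptr.
int assign_and_dereference() {
    int value = 206;     // the pointed-to variable (sample value)
    int* ptr = &value;   // assignment: ptr now holds the address of value
    int j = *ptr;        // explicit dereference: j gets the value stored there
    return j;
}
```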
  • 71. Problems with Pointers • Dangling pointers (dangerous) – A pointer points to a heap-dynamic variable that has been deallocated • Lost heap-dynamic variable – An allocated heap-dynamic variable that is no longer accessible to the user program (often called garbage) • Pointer p1 is set to point to a newly created heap-dynamic variable • Pointer p1 is later set to point to another newly created heap-dynamic variable • The process of losing heap-dynamic variables is called memory leakage
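Both problems can be reproduced in C++ with raw `new` and `delete`. The sketch below is illustrative only: `Tracked`, `leak_one_object`, and `dangling_pointer_setup` are hypothetical names, the live-object counter exists just to make the leak observable, and the dangling pointer is deliberately never dereferenced because doing so is undefined behavior.

```cpp
// Counts how many Tracked objects are currently allocated.
struct Tracked {
    static int live;
    Tracked()  { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Lost heap-dynamic variable: returns how many objects this call leaked.
int leak_one_object() {
    int before = Tracked::live;
    Tracked* p1 = new Tracked;   // p1 points to a new heap-dynamic variable
    p1 = new Tracked;            // p1 is reassigned; the first object is now garbage
    delete p1;                   // only the second object can still be freed
    return Tracked::live - before;
}

// Dangling pointer: p2 keeps the address of an object that was deallocated.
int dangling_pointer_setup() {
    int before = Tracked::live;
    Tracked* p1 = new Tracked;
    Tracked* p2 = p1;            // second pointer to the same object
    delete p1;                   // the object is deallocated
    p1 = nullptr;                // p1 is reset, but p2 still holds the old address
    (void)p2;                    // p2 is dangling; it must not be dereferenced
    return Tracked::live - before;
}
```

In this sketch `leak_one_object` leaves one object allocated and unreachable, while `dangling_pointer_setup` leaks nothing but leaves a pointer whose target no longer exists.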
  • 72. Pointers in Ada • Some dangling pointers are disallowed because dynamic objects can be automatically deallocated at the end of the pointer type's scope • The lost heap-dynamic variable problem is not eliminated by Ada (explicit deallocation is possible with UNCHECKED_DEALLOCATION)
  • 73. Pointers in C and C++ • Extremely flexible but must be used with care • Pointers can point at any variable regardless of when or where it was allocated • Used for dynamic storage management and addressing • Pointer arithmetic is possible • Explicit dereferencing and address-of operators
  • 74. Pointer Arithmetic in C and C++ • float list[100]; • float *p; • p = list; • *(p+5) is equivalent to list[5] and p[5] • *(p+i) is equivalent to list[i] and p[i] • Domain type need not be fixed (void *) • void * can point to any type and can be type checked (cannot be de-referenced)
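The equivalences on this slide can be checked directly. In the sketch below the function names and the fill values are assumptions; the array and pointer declarations follow the slide.

```cpp
// True when the pointer forms and the subscript form name the same element.
bool pointer_forms_agree(int i) {
    float list[100];
    for (int k = 0; k < 100; ++k) {
        list[k] = k * 0.5f;          // arbitrary contents for the check
    }
    float* p = list;                 // the array name converts to &list[0]
    return *(p + i) == list[i]       // *(p+i) is list[i]
        && p[i] == list[i]           // p[i] is list[i]
        && (p + i) == &list[i];      // p+i is the address of list[i]
}

// void* can hold the address of any object, but must be cast before use.
float read_through_void_pointer() {
    float x = 1.5f;
    void* vp = &x;                   // no cast needed to store the address
    // *vp would not compile: a void* cannot be dereferenced
    float* fp = static_cast<float*>(vp);
    return *fp;
}
```

Note that `p + i` advances by `i` elements, not `i` bytes; the compiler scales the offset by `sizeof(float)`.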
  • 75. Reference Types • C++ includes a special kind of pointer type called a reference type that is used primarily for formal parameters – Advantages of both pass-by-reference and pass-by-value • Java extends C++'s reference variables and allows them to replace pointers entirely – References are references to objects, rather than being addresses • C# includes both the references of Java and the pointers of C++; code that uses pointers must be marked with the 'unsafe' modifier • Smalltalk, Python, Ruby, Lua: all variables are references; always implicitly dereferenced
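A short C++ sketch of reference parameters, contrasted with value parameters. All function names here are hypothetical; the demo functions encode the final values of x and y as 10*x + y so the effect is easy to inspect.

```cpp
// Reference parameters are aliases for the caller's variables and are
// implicitly dereferenced: no * or & is needed in the body.
void swap_by_reference(int& a, int& b) {
    int tmp = a;
    a = b;
    b = tmp;
}

// Value parameters are copies, so the caller's variables are untouched.
void swap_by_value(int a, int b) {
    int tmp = a;
    a = b;
    b = tmp;
}

int demo_reference_swap() {
    int x = 1, y = 2;
    swap_by_reference(x, y);   // x and y are exchanged
    return x * 10 + y;
}

int demo_value_swap() {
    int x = 1, y = 2;
    swap_by_value(x, y);       // only the copies are exchanged
    return x * 10 + y;
}
```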
  • 76. Evaluation of Pointers • Dangling pointers and dangling objects are problems as is heap management • Pointers are like goto's--they widen the range of cells that can be accessed by a variable • Pointers or references are necessary for dynamic data structures--so we can't design a language without them
  • 77. Representations of Pointers • Large computers use single values • Intel microprocessors use segment and offset
  • 78. Dangling pointers • Tombstone: an extra heap cell that is a pointer to the heap-dynamic variable – The actual pointer variable points only at tombstones – When the heap-dynamic variable is deallocated, the tombstone remains but is set to nil – Costly in time and space; no popular language uses this • Locks-and-keys: pointer values are represented as (key, address) pairs – Heap-dynamic variables are represented as the variable plus a cell for an integer lock value – When a heap-dynamic variable is allocated, a lock value is created and placed in the lock cell and in the key cell of the pointer. Used in UW-Pascal (a Pascal compiler) • Best solution: take deallocation out of the hands of the programmer (implicit deallocation: Java; C# references)
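The locks-and-keys scheme can be simulated in C++. This is a toy model under several assumptions: the heap is a vector and an "address" is an index into it, cells are never reused, 0 is taken as the illegal lock value written on deallocation, and all names (`HeapCell`, `KeyedPtr`, `allocate`, `deallocate`, `dereference`) are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// A heap-dynamic variable together with its lock cell.
struct HeapCell {
    int lock;    // matches the key of every valid pointer to this cell
    int value;
};

// A pointer value is a (key, address) pair.
struct KeyedPtr {
    int key;
    std::size_t address;   // index into heap_cells in this simulation
};

std::vector<HeapCell> heap_cells;   // simulated heap; cells are never reused
int next_lock = 1;                  // source of fresh lock values

// Allocation creates a lock value and stores it in both lock and key.
KeyedPtr allocate(int value) {
    int lock = next_lock++;
    heap_cells.push_back(HeapCell{lock, value});
    return KeyedPtr{lock, heap_cells.size() - 1};
}

// Deallocation overwrites the lock with an illegal value (0 here).
void deallocate(KeyedPtr p) {
    heap_cells[p.address].lock = 0;
}

// Every dereference compares the pointer's key with the cell's lock.
int dereference(KeyedPtr p) {
    if (heap_cells[p.address].lock != p.key) {
        throw std::runtime_error("dangling pointer dereferenced");
    }
    return heap_cells[p.address].value;
}
```

Copying a `KeyedPtr` copies its key, so any number of pointers can share a cell, and all of them fail the check once the cell is deallocated.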
  • 80. Heap Management • One of the design goals of LISP was that reclamation of unused cells should not be the task of the programmer (most LISP data consists of cells in linked lists) • A very complex run-time process • Single-size cells vs. variable-size cells • Fundamental design question: When should deallocation be performed?
  • 81. Heap Management • Two approaches to reclaim garbage – Reference counters (eager approach): reclamation is gradual – Mark-sweep (lazy approach): reclamation occurs when the list of available space becomes empty
  • 82. Reference Counter • Reference counters: maintain a counter in every cell that stores the number of pointers currently pointing at the cell – Disadvantages: space required, execution time required to change counters, complications for cells connected circularly – Advantage: it is intrinsically incremental, so significant delays in the application execution are avoided
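C++'s `std::shared_ptr` is a library form of reference counting, so it can be used to observe both the counter and the circular-reference complication. The sketch below is illustrative: `Node` and the three function names are assumptions, and the cycle is broken by hand at the end only so the demo itself does not leak.

```cpp
#include <memory>

struct Node {
    std::shared_ptr<Node> next;   // a counted, owning pointer to another cell
};

// Copying a pointer increments the counter of the cell it points at.
long count_after_copy() {
    auto a = std::make_shared<Node>();   // counter is 1
    auto b = a;                          // counter is 2
    return a.use_count();
}

// Destroying a pointer decrements the counter.
long count_after_release() {
    auto a = std::make_shared<Node>();
    {
        auto b = a;                      // counter is 2 inside this block
    }                                    // b is destroyed; counter drops to 1
    return a.use_count();
}

// Two cells that point at each other keep each other's counter above zero.
bool cycle_keeps_cells_alive() {
    std::weak_ptr<Node> observer;        // non-owning; does not affect the count
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;                     // circular connection
        observer = a;
    }                                    // a and b are gone, yet both counters are 1
    bool still_alive = !observer.expired();
    if (auto n = observer.lock()) {
        n->next.reset();                 // break the cycle so the cells are freed
    }
    return still_alive;
}
```

The last function shows why pure reference counting cannot reclaim circular structures without extra help.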
  • 84. Mark-Sweep • The run-time system allocates storage cells as requested and disconnects pointers from cells as necessary; mark-sweep then begins to gather garbage – Every heap cell has an extra bit used by the collection algorithm – All cells are initially set to garbage – All pointers are traced into the heap, and reachable cells are marked as not garbage – All garbage cells are returned to the list of available cells – Disadvantages: in its original form, it was done too infrequently. When done, it caused significant delays in application execution. – Contemporary mark-sweep algorithms avoid this by doing it more often; this is called incremental mark-sweep
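The three phases can be sketched on a toy heap in C++. This is a simplified model, not a production collector: cells are vector elements, "pointers" are indices, the mark phase is recursive, and the names `Cell`, `Heap`, `collect`, and `demo_collect` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

struct Cell {
    bool marked = false;               // the extra bit used by the collector
    bool in_use = false;               // false: the cell is on the available list
    std::vector<std::size_t> refs;     // indices of the cells this cell points to
};

struct Heap {
    std::vector<Cell> cells;
    std::vector<std::size_t> roots;    // pointers into the heap from the program

    // Mark phase: trace from cell p and flag everything reachable.
    void mark(std::size_t p) {
        if (cells[p].marked) return;   // already visited (also stops on cycles)
        cells[p].marked = true;
        for (std::size_t q : cells[p].refs) mark(q);
    }

    // Runs one full collection and returns the number of cells reclaimed.
    std::size_t collect() {
        for (Cell& c : cells) c.marked = false;   // all cells start as garbage
        for (std::size_t r : roots) mark(r);      // reachable cells: not garbage
        std::size_t reclaimed = 0;
        for (Cell& c : cells) {                   // sweep phase
            if (c.in_use && !c.marked) {
                c.in_use = false;                 // return cell to available list
                c.refs.clear();
                ++reclaimed;
            }
        }
        return reclaimed;
    }
};

// Five cells: 0 and 1 are reachable from a root, 2 and 3 form an unreachable
// cycle, and 4 is unreferenced.
std::size_t demo_collect() {
    Heap h;
    h.cells.resize(5);
    for (Cell& c : h.cells) c.in_use = true;
    h.roots = {0};
    h.cells[0].refs = {1};
    h.cells[1].refs = {0};
    h.cells[2].refs = {3};
    h.cells[3].refs = {2};
    return h.collect();
}
```

Unlike reference counting, the unreachable cycle formed by cells 2 and 3 is reclaimed, because reachability is decided from the roots rather than from per-cell counters.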