3. Contents
• Introduction to the concept of a data type
• Characteristics of the primitive data types
• Designs of enumeration and subrange types
• Details of structured data types: arrays, associative
arrays, records, tuples, lists, and unions
• Pointers and references
• Design issues and design choices of data types
4. Ranjana Shevkar, Assistant Professor, Modern College
Ganeshkhind
Introduction
Data type
• It is a collection of data values
• and a set of predefined operations on those values
Examples: int, char
5. Descriptor
• collection of the attributes of a variable.
• During implementation, a descriptor is an
area of memory that stores the attributes of a
variable.
For static attributes:
• descriptors are required only at compile
time.
• they are built by the compiler, as a part of the
symbol table, and are used during
compilation.
6. For dynamic attributes:
• the descriptor must be maintained during
execution.
• the descriptor is used by the run-time
system.
7. Variable
• Is not an identifier
• Identifier is just an attribute of variable
• a variable is a value that can change
8. Object
• The word object is often associated with the value of a variable and the
space it occupies.
• Here, object is reserved exclusively for instances of
user-defined abstract data types
Primitive Data Types
Data types that are not defined in terms of other types
Data Types
Numeric
Integer
Floating-point
Decimal
Complex
Character
Boolean
Numeric Types :
Numeric types still play a central role among the
collections of types
Integer
The most common primitive numeric data type is integer
Java, for example, includes four signed integer sizes: byte, short, int, and long
Numeric Types : Floating point
Floating-point data types model real numbers
float and double
Precision is the accuracy of the fractional part of a
value, measured as the number of bits.
Range is a combination of the range of fractions and
the range of exponents.
The collection of values that can be represented by a
floating-point type is defined in terms of precision and
range
Numeric Types : Complex
Fortran and Python support a complex data type
Complex values are represented as ordered pairs of
floating-point values
• Languages with a complex data type usually provide special syntax for
building such values
• They also extend the basic arithmetic operations ('+', '−', '×', '÷')
to act on them.
Numeric Types : Decimal
• Decimal data types store a fixed number of decimal digits, with the
decimal point at a fixed position in the value.
• Some computers, especially those designed for business applications, have separate hardware support for decimal data types.
• They are the primary data types for business data processing and are
therefore essential to COBOL (Common Business Oriented Language).
• C# and F# also have decimal data types.
• Decimal types can precisely store decimal values, within a restricted
range, which cannot be done with floating-point.
• Decimal types are stored like character strings, using binary codes for the
decimal digits.
• This representation is called binary coded decimal (BCD).
• The digits are stored one per byte, or packed two digits per byte.
• It takes four bits to code a decimal digit, so a six-digit coded decimal number
requires 24 bits.
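The packed BCD storage described above can be sketched as follows. This is only an illustration: the helper names pack_bcd and unpack_bcd are hypothetical, and real decimal types also store a sign and the position of the decimal point, which are omitted here.

```python
def pack_bcd(digits: str) -> bytes:
    """Pack a string of decimal digits two per byte (packed BCD)."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected ASCII decimal digits only")
    if len(digits) % 2:              # pad to an even number of digits
        digits = "0" + digits
    # Each digit takes four bits: high nibble first, then low nibble.
    return bytes((int(hi) << 4) | int(lo)
                 for hi, lo in zip(digits[0::2], digits[1::2]))


def unpack_bcd(packed: bytes) -> str:
    """Recover the digit string from packed BCD."""
    return "".join(f"{b >> 4}{b & 0x0F}" for b in packed)


packed = pack_bcd("123456")
print(packed.hex())          # hexadecimal view of the three bytes
print(len(packed) * 8)       # number of bits used for six digits
print(unpack_bcd(packed))
```

Six digits occupy three bytes, which matches the 24 bits stated above.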
Boolean Types
Simplest of all types.
Their range of values has only two elements: one for true (1) and
one for false (0).
C99 and C++ have a Boolean type but also allow numeric expressions to
be used as if they were Boolean.
Java and C# have a Boolean data type.
Boolean types are often used to represent switches or flags
in programs.
A Boolean value could be represented by a single bit.
But because a single bit of memory cannot be accessed efficiently on many
machines, Boolean values are often stored in the smallest efficiently addressable cell
of memory, typically a byte.
Character Types
• Character data are stored as numeric codings.
• The most commonly used coding was ASCII, a 7-bit code that is usually stored in an 8-bit byte.
• ASCII codes 128 different characters (values 0 to 127); 8-bit extensions such as ISO 8859-1 allow 256.
• Because the ASCII character set became inadequate, in 1991 the Unicode
Consortium published the UCS-2 standard, a 16-bit character set.
• This character code is often called Unicode.
• Unicode includes the characters from most of the world’s natural languages.
• For example, Unicode includes the Cyrillic alphabet, as used in Serbia, and the Thai
digits.
• The first 128 characters of Unicode are identical to those of ASCII.
• Java was the first widely used language to use the Unicode character set.
• It was later adopted by JavaScript, Python, Perl, C#, and F#.
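A short sketch of how characters map to numeric codes, using Python's built-in ord and the standard unicodedata module. The helper name describe is hypothetical; the two non-ASCII characters are a Cyrillic letter and a Thai digit, chosen to match the examples above.

```python
import unicodedata


def describe(ch):
    """Return (code point, official Unicode name, True if in the ASCII range)."""
    code = ord(ch)
    return code, unicodedata.name(ch), code < 128


# "A" is coded identically in ASCII and Unicode; the other two
# characters have no ASCII code at all.
for ch in ("A", "\u0416", "\u0e51"):
    print(ch, describe(ch))
```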
Character String Types
• The values consist of sequences of characters.
• Character string constants are used to label output
• The input and output of all kinds of data are often done in terms
of strings.
• Character strings also are an essential type for character
manipulation.
Design Issues
• Should strings be simply a special kind of character array or a
primitive type?
• Should strings have static or dynamic length?
Strings and Their Operations
1. Assignment
2. Catenation
3. Substring reference
4. Comparison
5. Pattern matching.
String functions in C and C++
1. strlen
2. strcpy
3. strcat
4. strcmp
20. The C library functions that move string data do not guard against overflowing the
destination.
Error
strcpy(dest, src);
If src is longer than dest, strcpy will write over the
bytes that follow dest.
Because strcpy does not know the length of dest, it cannot ensure that the
memory following it will not be overwritten.
21. To avoid this problem
C++ programmers should use the string class from the standard
library, rather than char arrays
Java uses String class – immutable strings
And StringBuffer class – mutable strings
C#, F# and Ruby include string classes
Python includes strings as a primitive type
Perl, JavaScript, Ruby, and PHP include built-in pattern-matching
operations called regular expressions
/[A-Za-z][A-Za-z\d]+/
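The pattern above describes a typical name form: a letter followed by one or more letters or digits. A minimal sketch of the same pattern using Python's standard re module (the function name is_name is hypothetical):

```python
import re

# A letter, then one or more letters or digits (\d is the digit class).
NAME = re.compile(r"[A-Za-z][A-Za-z\d]+")


def is_name(text):
    """True if the whole string matches the name pattern."""
    return NAME.fullmatch(text) is not None


for candidate in ("sum2", "x1", "2sum", "x"):
    print(candidate, is_name(candidate))
```

Note that the + requires at least one character after the leading letter, so a single letter does not match.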
Static Length
The length can be static and set when the string is created; such a string is called a static
length string.
EXAMPLE
1. strings of Python
2. Java’s String class
3. string class in the C++ standard library
4. Ruby’s built-in String class
5. the string class in the .NET class library available to C# and F#
Limited Dynamic Length
Strings can have varying length up to a declared and fixed maximum set
by the variable’s definition, called limited dynamic length strings.
It can store any number of characters between zero and the maximum.
EXAMPLE
strings in C and the C-style strings of C++
Dynamic Length
Strings can have varying length with no maximum; these are called dynamic
length strings.
EXAMPLE
JavaScript
Perl
the standard C++ library
This option requires the overhead of dynamic storage allocation and
deallocation but provides maximum flexibility.
Ada 95+ supports all three string length options.
Evaluation
Dealing with strings as arrays can be more cumbersome than dealing
with a primitive string type.
Addition of strings as a primitive type to a language is not costly in terms
of either language or compiler complexity.
Pattern matching and concatenation are essential and should be
included for string type values
Although dynamic length strings are obviously the most flexible,
the overhead of their implementation must be weighed against that
additional flexibility.
27. Implementation of Character String Types
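Compile-time descriptor for
static strings: type name, length, and address of the first character
Run-time descriptor for
limited dynamic strings: type name, maximum length, current length, and address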
User-defined ordinal types
An ordinal type is one in which the range of possible values can be easily
associated with the set of positive integers.
In Java, the primitive ordinal types are integer, char, and boolean.
There are two kinds of user-defined ordinal types: enumeration and subrange.
29. Enumeration types
• An enumeration type is one in which all of the possible values, which are named constants, are provided, or
enumerated, in the definition.
• Enumeration types provide a way of defining and grouping collections of named constants, which
are called enumeration constants
enum days {Mon, Tue, Wed, Thu, Fri, Sat, Sun};
In C++
enum colors {red = 1, blue = 1000, green = 100000};
In ML
datatype weekdays = Monday | Tuesday | Wednesday | Thursday | Friday
In C and Pascal
enum colors {red, blue, green, yellow, black};
colors myColor = blue, yourColor = red;
myColor++
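For comparison, a sketch of the same colors enumeration using Python's standard enum module. The names Colors and successor are hypothetical; IntEnum is used here only so that the constants behave like the integer-backed constants of the C-style examples.

```python
from enum import IntEnum


class Colors(IntEnum):
    """Enumeration constants numbered 0, 1, 2, ... in definition order."""
    RED = 0
    BLUE = 1
    GREEN = 2
    YELLOW = 3
    BLACK = 4


def successor(color):
    """Counterpart of myColor++: the next constant in the enumeration.

    Raises ValueError past the last constant, so the result is always
    a legal Colors value.
    """
    return Colors(color + 1)


my_color = Colors.BLUE
my_color = successor(my_color)
print(my_color.name)
```

A design point worth noting: converting an out-of-range integer raises an error instead of producing a value with no name, which is the kind of range checking an enumeration type is meant to provide.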
Subrange types
• A subrange type is a contiguous subsequence of an ordinal type. For example,
12..14 is a subrange of integer type.
• Subrange types were introduced by Pascal and are included in Ada.
Ada
type Days is (Mon, Tue, Wed, Thu, Fri, Sat, Sun);
subtype Weekdays is Days range Mon..Fri;
subtype Index is Integer range 1..100;
Subtypes are new names for possibly restricted versions of existing types
Day1 : Days;
Day2 : Weekdays;
Array types
• An array is a homogeneous aggregate (collection) of data elements.
• An individual element is identified by its position in the aggregate,
relative to the first element.
• In many languages, such as C, C++, Java, Ada, and C#, the individual data
elements of an array are required to be of the same type.
Design issues
• What types are legal for subscripts?
• Are subscripting expressions in element references range checked?
• When are subscript ranges bound?
• When does array allocation take place?
• Are jagged or rectangular multidimensional arrays allowed, or both?
• Can arrays be initialized when they have their storage allocated?
• What kinds of slices are allowed, if any?
Arrays and indices
Specific elements of an array are referenced by means of a
two-level syntactic mechanism
1. the aggregate name
2. a dynamic selector consisting of one or more items
known as subscripts or indices
Eg.
int A[10];
ans = A[2]+A[5];
Subscript bindings
The binding of the subscript type to an array variable is usually
static, but the subscript value ranges are sometimes dynamically
bound (A[n] and n is an input ).
In some languages, the lower bound of the subscript range is
implicit.
For example:
• In the C-based languages, the lower bound of all subscript ranges is fixed
at 0.
• In Fortran 95+, it defaults to 1 but can be set to any integer literal.
In some other languages, the lower bounds of the subscript ranges
must be specified by the programmer.
35. Array categories
Five categories of arrays, based on
• the binding to subscript ranges
• the binding to storage
• and from where the storage is allocated.
Array categories:
1. Static array
2. Fixed stack-dynamic array
3. Stack-dynamic array
4. Fixed heap-dynamic array
5. Heap-dynamic array
36. Static array
• the subscript ranges are statically bound and storage allocation is static
(done before run time).
• Advantage
• efficiency: no dynamic allocation or deallocation is required.
• Disadvantage
• the storage for the array is fixed for the entire execution time of the program
Fixed stack-dynamic array
• the subscript ranges are statically bound, but the allocation is done at
declaration elaboration time during execution.
• Advantage
• space efficiency compared with static arrays: storage is held only while the declaring subprogram is active.
• Disadvantage
• allocation and deallocation take time at run time.
37. Stack-dynamic array
• both the subscript ranges and the storage allocation are dynamically bound at
elaboration time.
• Once the subscript ranges are bound and the storage is allocated, however, they remain
fixed during the lifetime of the variable.
• Advantage
• flexibility compared with static and fixed stack-dynamic arrays:
the size of an array need not be known until the array is about to be used.
Fixed heap-dynamic array
• the subscript ranges and the storage binding are both fixed after storage is allocated.
• both the subscript ranges and storage bindings are done when the user program
requests them during execution
• and the storage is allocated from the heap, rather than the stack.
• Advantage
• flexibility: the array’s size always fits the problem.
• Disadvantage
• allocation time from the heap is longer than allocation time from the stack
38. Heap-dynamic array
• the binding of subscript ranges and storage allocation is dynamic
and can change any number of times during the array’s lifetime.
• Advantage
• flexibility: arrays can grow and shrink during program execution
as the need for space changes.
• Disadvantage
• allocation and deallocation take longer and may happen many
times during execution of the program.
Array initialization
Some languages allow arrays to be initialized at the time their storage is allocated
int list [] = {4, 5, 7, 83}; // C, C++: the compiler sets the length to 4
char name [] = "freddie"; // C, C++: eight elements, including the null terminator
char *names [] = {"Bob", "Jake", "Darcie"}; // C, C++
String[] names = {"Bob", "Jake", "Darcie"}; // Java
List : array (1..5) of Integer := (1, 3, 5, 7, 9); -- Ada
Bunch : array (1..5) of Integer := (1 => 17, 3 => 34, others => 0); -- Ada
Array operations
An array operation is one that operates on an array as a unit
Most common array operations :
1. Assignment
2. Catenation
3. Comparison for equality and inequality
4. Slices
Array operations
• The C-based languages do not provide any array operations, except through the
methods of Java, C++, and C#.
• Perl supports array assignments but does not support comparisons.
• Ada allows array assignments and concatenations (&).
• Python provides array assignment, concatenation (+), membership (in),
and comparison (is and ==).
• Ruby supports comparison (==) and concatenation.
• In Fortran 95+, the assignment, arithmetic, relational, and logical operators are
all overloaded for arrays of any size or shape. It also includes intrinsic, or
library, functions for matrix multiplication, matrix transpose, and vector
dot product.
• F# includes many array operators in its Array module. Among these are
Array.append, Array.copy, and Array.length
Rectangular and Jagged arrays
• A rectangular array is a multidimensional array in which all of the rows
have the same number of elements and all of the columns have the
same number of elements. Rectangular arrays model rectangular tables
exactly
• A jagged array is one in which the lengths of the rows
need not be the same
• C, C++, and Java support jagged arrays but not rectangular arrays. In
those languages, a reference to an element of a multidimensional
array uses a separate pair of brackets for each dimension.
• For example, myArray[3][7]
• Fortran, Ada, C#, and F# support rectangular arrays. (C# and F#
also support jagged arrays.)
• myArray[3, 7]
Slices
A slice of an array is some substructure of that array
It is important to realize that a slice is not a new data type.
Rather, it is a mechanism for referencing part of an array as a unit
Python
vector = [2, 4, 6, 8, 10, 12, 14, 16]
mat = [[1, 2, 3],[4, 5, 6],[7, 8, 9]]
vector[3:6] is a three-element array with the fourth through sixth elements of vector
mat[0][0:2] refers to the first and second element of the first row of
mat, which is [1, 2]
Perl
@list[1..5] = @list2[3, 5, 7, 9, 13]
Ruby supports slices with the slice method of its Array object
list = [2, 4, 6, 8, 10]
list.slice(2, 2) returns [6, 8].
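The Python slices above can be tried directly. The stepped slice and the column extraction are additions for illustration; note that slicing a Python list produces a new list holding the selected elements.

```python
vector = [2, 4, 6, 8, 10, 12, 14, 16]
mat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

fourth_to_sixth = vector[3:6]        # subscripts 3, 4, 5
first_row_prefix = mat[0][0:2]       # first two elements of row 0
every_other = vector[0:7:2]          # a step of 2: subscripts 0, 2, 4, 6
second_column = [row[1] for row in mat]   # a column needs a comprehension

print(fourth_to_sixth)
print(first_row_prefix)
print(every_other)
print(second_column)
```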
Implementation of Array Types
• Implementing arrays requires considerably more compile-time
effort than does implementing primitive types.
Access formula for a one-dimensional array with lower bound 0
• address(list[k]) = address(list[0]) + k * element_size
Generalized formula for any lower bound
• address(list[k]) = address(list[lower_bound]) + ((k - lower_bound) *
element_size)
A compile-time descriptor for a one-dimensional array
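The generalized access formula can be checked with a small calculation. The function name element_address and the sample base address and element size are assumptions chosen only for illustration.

```python
def element_address(base, k, element_size, lower_bound=0):
    """Address of list[k], where base is the address of list[lower_bound]."""
    return base + (k - lower_bound) * element_size


# An array of 4-byte elements whose first element is at address 1000.
print(element_address(1000, 5, 4))                  # lower bound 0
print(element_address(1000, 5, 4, lower_bound=1))   # lower bound 1
```

With a lower bound of 0 the subtraction disappears, which gives the simpler first formula.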
Implementation of Array Types
Multidimensional arrays are stored in one of two ways:
• Row major order
• Column major order
For example, the matrix
3 4 7
6 2 5
1 3 8
is stored
• in row major order as 3, 4, 7, 6, 2, 5, 1, 3, 8
• in column major order as 3, 6, 1, 4, 2, 3, 7, 5, 8
For row major order with lower bounds of 0:
location(a[i, j]) = address of a[0, 0] + ((((number of rows above the ith row) * (size of a
row)) + (number of elements left of the jth column)) * element_size)
location(a[i, j]) = address of a[0, 0] + (((i * n) + j) * element_size)
where n is the number of elements per row.
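A sketch that reproduces both storage orders for a 3 by 3 matrix and checks the row major access formula. The matrix is the one implied by the two orders listed above; the function name location and the sample base address are hypothetical.

```python
matrix = [[3, 4, 7],
          [6, 2, 5],
          [1, 3, 8]]
rows, n = len(matrix), len(matrix[0])     # n = elements per row

# Row major: rows stored one after another.
row_major = [matrix[i][j] for i in range(rows) for j in range(n)]
# Column major: columns stored one after another.
column_major = [matrix[i][j] for j in range(n) for i in range(rows)]


def location(base, i, j, n, element_size):
    """Row major address of a[i, j], with lower bounds of 0."""
    return base + ((i * n) + j) * element_size


print(row_major)
print(column_major)
print(location(1000, 1, 2, n, 4))
```

Dividing the offset by the element size gives the position in the row major list, so row_major[(i * n) + j] is always matrix[i][j].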
Associative Arrays
An associative array is an unordered collection of data elements that are
indexed by an equal number of values called keys.
In normal arrays, the indices never need to be stored.
In an associative array, the user-defined keys must be stored in the structure.
Each element of an associative array is in fact a pair of entities, a key
and a value.
Besides Perl, associative arrays are supported directly by Python, Ruby, and
Lua and by the standard class libraries of Java, C++, C#, and F#.
Design issue
• What is the form of references to their elements?
Structure and operations
In Perl, associative arrays are called hashes, because in the implementation
their elements are stored and retrieved with hash functions.
Every hash variable name must begin with a percent sign (%).
Each hash element consists of two parts: a key, which is a string, and a
value, which is a scalar (number, string, or reference).
%salaries = ("Gary" => 75000, "Perry" => 57000, "Mary" => 55750, "Cedric" => 47850);
Referencing an element of an associative array
In Perl, the key is placed in braces and the variable name begins with $, because an individual element is a scalar:
$salaries{"Perry"} = 58850;
50. An element can be removed from the hash
delete $salaries{"Gary"};
The entire hash can be emptied
%salaries = ();
The size of a Perl hash is dynamic: It grows when an element is added
and shrinks when an element is deleted.
Python’s associative arrays, called dictionaries, are similar to those of Perl, except that the
values are all references to objects.
Ruby’s associative arrays are similar to those of Python; unlike Perl’s hashes, whose keys are strings, the keys can be any object.
PHP’s arrays are both normal arrays and associative arrays. They can
be treated as either. The language provides functions that allow both
indexed and hashed access to elements
C# and F# support associative arrays through a .NET class.
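The Perl operations shown above have direct counterparts on a Python dictionary. This sketch mirrors them step by step; the variable names remaining and size_before_clear are added only to record intermediate results.

```python
# Counterpart of the Perl hash %salaries.
salaries = {"Gary": 75000, "Perry": 57000, "Mary": 55750, "Cedric": 47850}

salaries["Perry"] = 58850        # reference an element by its key
perry = salaries["Perry"]

del salaries["Gary"]             # remove one element; the dictionary shrinks
remaining = sorted(salaries)
size_before_clear = len(salaries)

salaries.clear()                 # empty the whole dictionary

print(perry, remaining, size_before_clear, salaries)
```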
51. Benefits of associative array over normal array
1. An associative array is much better than a normal array when elements must be
searched for, because the implicit hashing operation used to access
elements is very efficient.
2. Associative arrays are ideal when the data to be stored is paired,
as with employee names and their salaries.
On the other hand, if every element of a list must be processed, it is
more efficient to use an array.
Implementing associative arrays
• The implementation of Perl’s associative arrays is optimized for
fast lookups, but it also provides relatively fast reorganization
when array growth requires it.
• A 32-bit hash value is computed for each entry and is stored
with the entry, although an associative array initially uses only a
small part of the hash value.
• When an associative array must be expanded beyond its initial
size, the hash function need not be changed; rather, more bits of
the hash value are used.
• Only half of the entries must be moved when this happens.
• So, although expansion of an associative array is not free, it is not as
costly as might be expected.
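The growth scheme described above can be sketched as a small hash table. This is a simplified illustration under stated assumptions, not Perl's actual implementation: the class name GrowableHash is hypothetical, CRC-32 from the standard zlib module stands in for the 32-bit hash function, and the growth trigger (more entries than buckets) is arbitrary.

```python
import zlib


def full_hash(key):
    """A 32-bit hash of the key (CRC-32 is a stand-in hash function)."""
    return zlib.crc32(key.encode("utf-8"))


class GrowableHash:
    def __init__(self, bits=2):
        self.bits = bits                       # hash bits currently in use
        self.buckets = [[] for _ in range(1 << bits)]
        self.count = 0
        self.last_moved = 0                    # entries moved by last growth

    def _index(self, h):
        return h & ((1 << self.bits) - 1)      # keep only the low bits

    def put(self, key, value):
        h = full_hash(key)
        bucket = self.buckets[self._index(h)]
        for entry in bucket:
            if entry[1] == key:
                entry[2] = value
                return
        bucket.append([h, key, value])         # the full hash is stored
        self.count += 1
        if self.count > len(self.buckets):
            self._grow()

    def get(self, key):
        h = full_hash(key)
        for stored_hash, stored_key, value in self.buckets[self._index(h)]:
            if stored_hash == h and stored_key == key:
                return value
        raise KeyError(key)

    def _grow(self):
        """Double the table by using one more bit of the stored hashes."""
        old_size = len(self.buckets)
        self.bits += 1
        self.buckets.extend([] for _ in range(old_size))
        self.last_moved = 0
        for i in range(old_size):
            staying = []
            for entry in self.buckets[i]:
                if entry[0] & old_size:        # newly used bit is 1: move
                    self.buckets[i + old_size].append(entry)
                    self.last_moved += 1
                else:                          # newly used bit is 0: stay
                    staying.append(entry)
            self.buckets[i] = staying


table = GrowableHash()
for n in range(100):
    table.put(f"key{n}", n)
print(len(table.buckets), table.get("key42"), table.last_moved)
```

Because the hash function never changes, no key is rehashed during growth. An entry moves only if its newly used hash bit is 1, which on average is the case for about half of the entries.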
Record types
A record is an aggregate of data elements
in which the individual elements are identified by names
and accessed through offsets from the beginning of the structure
There is frequently a need in programs to model a collection of data in
which the individual elements are not of the same type or size
Example : Information of Student, Patient, Books etc
Are records and heterogeneous arrays the same? No:
• The elements of a heterogeneous array are all references to data objects that
reside in scattered locations, often on the heap.
• The elements of a record are of potentially different sizes and reside in
adjacent memory locations.
54. In C, C++, and C#, records are supported with the struct data type.
Structs are also included in ML and F#.
Fortran and COBOL support records.
In Python and Ruby, records can be implemented as hashes, which
themselves can be elements of arrays.
Design issues
1. What is the syntactic form of references to fields?
2. Are elliptical references allowed?
Definitions of records
The fundamental difference between a record and an array is that record
elements (fields) are not referenced by indices. Instead, fields are named with
identifiers, and references to the fields are made using these identifiers.
COBOL uses level numbers to show nested records;
others use recursive definition
01 EMP-REC.
    02 EMP-NAME.
        05 FIRST PIC X(20).
        05 MID PIC X(10).
        05 LAST PIC X(20).
    02 HOURLY-RATE PIC 99V99.
56. Definition of Records in Ada
• Record structures are indicated in an orthogonal way
(nested example)
type Emp_Name_Type is record
    First: String (1..20);
    Mid: String (1..10);
    Last: String (1..20);
end record;
type Emp_Rec_Type is record
    Emp_Name: Emp_Name_Type;
    Hourly_Rate: Float;
end record;
57. Definition of Records in C++
• Nested example (more similar to Ada)
struct Emp_Name_Type {
    string first;
    string middle;
    string last;
};
struct Emp_Rec_Type {
    Emp_Name_Type Emp_name;
    float hourly_rate;
};
58. References to record fields
• Record field references
– 1. COBOL
– field_name OF record_name_1 OF ... OF record_name_n
– 2. Others (dot notation)
– record_name_1.record_name_2. ... .record_name_n.field_name
• Fully qualified references must include all record names
• Elliptical references allow leaving out record names as long as the reference
is unambiguous, for example in COBOL
• FIRST, FIRST OF EMP-NAME, and FIRST OF EMP-REC are elliptical references
to the employee's first name
Operations on records
• Assignment is very common if the types are identical
• Ada allows record comparison
• Ada records can be initialized with aggregate literals
• COBOL provides MOVE CORRESPONDING
– Copies each field of the source record to the field with the same name in the
target record
Evaluation
• Records are used when collection of data values is heterogeneous
• Access to array elements is much slower than access to record fields,
because subscripts are dynamic (field names are static)
• Dynamic subscripts could be used with record field access, but it would
disallow type checking and it would be much slower
Implementation of Record types
• An offset address, relative to the beginning of the record, is associated with
each field
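Field offsets can be observed with Python's standard ctypes module, which lays out structures the way a C compiler would. The class and field names below are hypothetical and loosely follow the employee record example; fixed-size character arrays stand in for the 20, 10, and 20 character name fields.

```python
import ctypes


class EmpName(ctypes.Structure):
    _fields_ = [("first", ctypes.c_char * 20),
                ("mid", ctypes.c_char * 10),
                ("last", ctypes.c_char * 20)]


class EmpRec(ctypes.Structure):
    _fields_ = [("name", EmpName),
                ("hourly_rate", ctypes.c_float)]


# Each field descriptor records its offset from the start of the record.
for field_name, _ in EmpName._fields_:
    print(field_name, getattr(EmpName, field_name).offset)
print("hourly_rate", EmpRec.hourly_rate.offset, "of", ctypes.sizeof(EmpRec))
```

The offset of hourly_rate may be larger than the size of the name record, because the compiler-style layout can insert padding so that the float is properly aligned.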
Union Types
• A union is a type whose variables are allowed to store values of different
types at different times during execution
• Design issues
1. Should type checking be required?
2. Should unions be embedded in records?
Discriminated versus Free unions
• Fortran, C, and C++ provide union constructs in which there is
no language support for type checking; the union in these
languages is called free union
• Type checking of unions requires that each union include a type
indicator called a discriminant; such a union is called a discriminated union
– Supported by Ada
64. Ada Union Types
type Shape is (Circle, Triangle, Rectangle);
type Colors is (Red, Green, Blue);
type Figure (Form : Shape) is record
    Filled : Boolean;
    Color : Colors;
    case Form is
        when Circle => Diameter : Float;
        when Triangle =>
            Leftside, Rightside : Integer;
            Angle : Float;
        when Rectangle => Side1, Side2 : Integer;
    end case;
end record;
65. Ada Union Type Illustrated
• A discriminated union of three shape variables
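The idea of a discriminant can be sketched in Python as a record that remembers which variant it holds and checks every access against it. This is an analogy to the Ada Figure type, not a translation: the class design, the field method, and the run-time TypeError are assumptions made for illustration.

```python
from enum import Enum


class Shape(Enum):
    CIRCLE = 1
    TRIANGLE = 2
    RECTANGLE = 3


# The variant fields that are legal for each value of the discriminant.
VARIANT_FIELDS = {
    Shape.CIRCLE: {"diameter"},
    Shape.TRIANGLE: {"leftside", "rightside", "angle"},
    Shape.RECTANGLE: {"side1", "side2"},
}


class Figure:
    def __init__(self, form, filled, color, **variant):
        if set(variant) != VARIANT_FIELDS[form]:
            raise TypeError(f"wrong fields for a {form.name} figure")
        self.form = form              # the discriminant
        self.filled = filled
        self.color = color
        self._variant = dict(variant)

    def field(self, name):
        """Return a variant field, checking it against the discriminant."""
        if name not in VARIANT_FIELDS[self.form]:
            raise TypeError(f"{name} is not a field of a {self.form.name} figure")
        return self._variant[name]


circle = Figure(Shape.CIRCLE, True, "Red", diameter=2.5)
print(circle.field("diameter"))
```

A free union would hand back whatever bits are stored when asked for side1 of a circle; here the discriminant makes that access an error.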
Evaluation of Union types
• Free unions are unsafe
– Do not allow type checking
• Java and C# do not support unions
– Reflective of growing concerns for safety in programming languages
• Ada's discriminated unions are safe
Pointer and Reference Types
• A pointer type variable has a range of values that consists of memory
addresses and a special value, nil
• Provide the power of indirect addressing
• Provide a way to manage dynamic memory
• A pointer can be used to access a location in the area where storage is
dynamically created (usually called a heap)
Design issues
• What are the scope of and lifetime of a pointer variable?
• What is the lifetime of a heap-dynamic variable?
• Are pointers restricted as to the type of value to which they can point?
• Are pointers used for dynamic storage management, indirect addressing,
or both?
• Should the language support pointer types, reference types, or both?
Pointer operations
• Two fundamental operations: assignment and dereferencing
• Assignment is used to set a pointer variable's value to some useful
address
• Dereferencing yields the value stored at the location represented by the
pointer's value
• Dereferencing can be explicit or implicit
• C++ uses an explicit operation via *
• j = *ptr
• sets j to the value located at ptr
71. Problems with Pointers
• Dangling pointers (dangerous)
– A pointer points to a heap-dynamic variable that has been deallocated
• Lost heap-dynamic variable
– An allocated heap-dynamic variable that is no longer accessible to
the user program (often called garbage)
– Example: pointer p1 is set to point to a newly created heap-dynamic variable
– Pointer p1 is later set to point to another newly created heap-dynamic
variable, so the first variable can no longer be reached
• The process of losing heap-dynamic variables is called memory
leakage
72. Pointers in Ada
• Some dangling pointers are disallowed because dynamic objects can
be automatically deallocated at the end of the pointer type's scope
• The lost heap-dynamic variable problem is not eliminated by Ada
(possible with UNCHECKED_DEALLOCATION)
73. Pointers in C and C++
• Extremely flexible but must be used with care
• Pointers can point at any variable regardless of when or where it
was allocated
• Used for dynamic storage management and addressing
• Pointer arithmetic is possible
• Explicit dereferencing and address-of operators
74. Pointer Arithmetic in C and C++
float list[100];
float *p;
p = list;
• *(p+5) is equivalent to list[5] and p[5]
• *(p+i) is equivalent to list[i] and p[i]
• Domain type need not be fixed (void *)
• void * can point to any type and can be type checked (cannot be
de-referenced)
75. Reference Types
• C++ includes a special kind of pointer type called a reference type that
is used primarily for formal parameters
– Advantages of both pass-by-reference and pass-by-value
• Java extends C++'s reference variables and allows them to
replace pointers entirely
– References are references to objects, rather than being
addresses
• C# includes both the references of Java and the pointers of C++;
methods that use pointers must include the 'unsafe' modifier
• Smalltalk, Python, Ruby, Lua: all variables are references;
always implicitly dereferenced
76. Evaluation of Pointers
• Dangling pointers and dangling objects are problems as is heap
management
• Pointers are like goto's--they widen the range of cells that can be accessed
by a variable
• Pointers or references are necessary for dynamic data structures--so we
can't design a language without them
Dangling pointers: solutions
• Tombstone: extra heap cell that is a pointer to the heap-dynamic
variable
– The actual pointer variable points only at tombstones
– When a heap-dynamic variable is deallocated, the tombstone remains but is
set to nil
– Costly in time and space; no popular languages use this
• Locks-and-keys: Pointer values are represented as (key, address) pairs
– Heap-dynamic variables are represented as the variable plus a cell for an
integer lock value
– When a heap-dynamic variable is allocated, a lock value is created and
placed in the lock cell and in the key cell of the pointer. Used in UW-Pascal
(a Pascal compiler)
• Best solution: take deallocation out of the hands of the programmer (implicit deallocation:
Java; C# references)
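The locks-and-keys approach can be simulated in a few lines. This is a sketch under stated assumptions: the class name SimulatedHeap, the starting address 1000, and the use of 0 as the illegal lock value are all hypothetical. A pointer is modeled as a (key, address) pair, and every access compares the key with the lock stored beside the variable.

```python
import itertools


class SimulatedHeap:
    ILLEGAL_LOCK = 0

    def __init__(self):
        self._cells = {}                          # address -> [lock, value]
        self._addresses = itertools.count(1000)
        self._locks = itertools.count(1)          # never yields ILLEGAL_LOCK

    def allocate(self, value):
        """Create a heap-dynamic variable and return a (key, address) pointer."""
        address, lock = next(self._addresses), next(self._locks)
        self._cells[address] = [lock, value]
        return (lock, address)

    def _check(self, pointer):
        key, address = pointer
        if self._cells[address][0] != key:
            raise RuntimeError("dangling pointer: key does not match lock")

    def dereference(self, pointer):
        self._check(pointer)
        return self._cells[pointer[1]][1]

    def deallocate(self, pointer):
        self._check(pointer)
        # Clearing the lock invalidates every copy of the pointer at once.
        self._cells[pointer[1]][0] = self.ILLEGAL_LOCK


heap = SimulatedHeap()
p = heap.allocate(42)
alias = p                       # a second pointer to the same variable
print(heap.dereference(alias))
heap.deallocate(p)
try:
    heap.dereference(alias)     # alias is now dangling
except RuntimeError as error:
    print(error)
```

The alias still holds the original address, but its key no longer matches the cleared lock, so the access is rejected instead of reading deallocated storage.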
80. Heap Management
• One of the design goals of LISP was that reclamation of unused cells would not be the task of the
programmer (most LISP data consists of cells in linked lists)
• A very complex run-time process
• Single-size cells vs. variable-size cells
• Fundamental design question: When should deallocation be
performed?
81. Heap Management
• Two approaches to reclaim garbage
– Reference counters (eager approach): reclamation is gradual
– Mark-sweep (lazy approach): reclamation occurs
when the list of available space becomes empty
82. Reference Counter
• Reference counters: maintain a counter in every cell that stores the
number of pointers currently pointing at the cell
– Disadvantages: space required, execution time required to change
counters, complications for cells connected circularly
– Advantage: it is intrinsically incremental, so significant delays
in the application execution are avoided
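A small simulation of reference counting, including the circular case listed among the disadvantages. The names Cell, add_ref, link, and release are hypothetical, and the freed list simply records which cells were reclaimed.

```python
class Cell:
    def __init__(self, name):
        self.name = name
        self.ref_count = 0       # number of pointers currently to this cell
        self.children = []       # cells this cell points to


freed = []                       # names of reclaimed cells, in order


def add_ref(cell):
    cell.ref_count += 1


def link(parent, child):
    """Make parent point to child."""
    parent.children.append(child)
    add_ref(child)


def release(cell):
    """Remove one pointer to cell; reclaim it eagerly when the count is 0."""
    cell.ref_count -= 1
    if cell.ref_count == 0:
        freed.append(cell.name)
        for child in cell.children:
            release(child)
        cell.children = []


# A simple chain: root pointer -> a -> b.
a, b = Cell("a"), Cell("b")
add_ref(a)
link(a, b)
release(a)                       # both cells are reclaimed at once

# A cycle: root pointer -> x <-> y.
x, y = Cell("x"), Cell("y")
add_ref(x)
link(x, y)
link(y, x)
release(x)                       # counts stay above 0; nothing is reclaimed

print(freed, x.ref_count, y.ref_count)
```

After the last outside pointer is removed, x and y still point at each other, so their counters never reach zero even though the program can no longer reach them.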
83. Mark-Sweep
• The run-time system allocates storage cells as requested and
disconnects pointers from cells as necessary; mark-sweep then begins
to gather garbage
– Every heap cell has an extra bit used by the collection algorithm
– All cells are initially set to garbage
– All pointers are traced into the heap, and reachable cells are marked as not
garbage
– All garbage cells are returned to the list of available cells
– Disadvantages: in its original form, it was done too infrequently. When done,
it caused significant delays in application execution.
– Contemporary mark-sweep algorithms avoid this by doing it more
often; this is called incremental mark-sweep
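The marking and sweeping steps listed above can be sketched on a toy heap. The representation is an assumption made for illustration: the heap is a dictionary from a cell name to the list of cells it points to, roots are the cells that program variables point to, and the function name mark_sweep is hypothetical.

```python
def mark_sweep(heap, roots):
    """Reclaim unreachable cells from heap; return their names."""
    # Step 1: every cell is initially assumed to be garbage.
    marked = {cell: False for cell in heap}

    # Step 2: trace all pointers from the roots and mark what is reachable.
    pending = list(roots)
    while pending:
        cell = pending.pop()
        if not marked[cell]:
            marked[cell] = True
            pending.extend(heap[cell])

    # Step 3: sweep; unmarked cells go back to the available space.
    garbage = [cell for cell in heap if not marked[cell]]
    for cell in garbage:
        del heap[cell]
    return garbage


heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"], "e": []}
reclaimed = mark_sweep(heap, roots=["a"])
print(reclaimed, sorted(heap))
```

Cells c and d point at each other but are unreachable from the roots, so they are reclaimed; tracing from the roots is what lets mark-sweep handle circular structures.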