2. Data Type
• A data type is a set
– When you declare that a variable has a
certain type, you are saying that
(1) the values that the variable can have
elements of a certain set, and
(2) there is a collection of operations that can
be applied to those values
2
3. Primitive Data Types
• Almost all programming languages provide a
set of primitive data types
– Those not defined in terms of other data types
– Integer, floating-point, Boolean, character
• Some primitive data types are merely
reflections of the hardware
• Others require only a little non-hardware
support for their implementation
3
4. Character String Types
• Values are sequences of characters
• Design issues:
– Is it a primitive type or just a special kind of array?
– Should the length of strings be static or dynamic?
– What kinds of string operations are allowed?
• Assignment and copying: what if have diff. lengths?
• Comparison (=, >, etc.)
• Catenation
• Substring reference: reference to a substring
• Pattern matching
4
5. Character String in C
• C and C++
– Not primitive; use char arrays, terminate with a
null character
– A library of functions to provide operations
instead of primitive operators
– Problems with C string library:
• Functions in library do not guard against overflowing
the destination, e.g., strcpy(src, dest);
What if src is 50 bytes and dest is 20 bytes?
• Java
– Primitive via the String and StringBuffer class
5
6. Character String Length Options
• Static: COBOL, Java’s String class
– Length is set when string is created
• Limited dynamic length: C and C++
– Length varying, up to a fixed maximum
– In C-based language, a special character is used to
indicate the end of a string’s characters, rather than
maintaining the length
• Dynamic (no maximum): Perl, JavaScript
• Ada supports all three string length options
6
7. User-Defined Ordinal Types
• Ordinal type: range of possible values can be
easily associated with positive integers
• Two common user-defined ordinal types
– Enumeration: All possible values, which are
named constants, are provided or enumerated in
the definition, e.g., C# example
enum days {mon, tue, wed, thu, fri, sat, sun};
– Subrange: An ordered contiguous subsequence of
an ordinal type, e.g., 12..18 is a subrange of
integer type
7
8. Array Types
• Array: an aggregate of homogeneous data
elements, in which an individual element is
identified by its position in the aggregate
– Array indexing (or subscripting): a mapping from
indices to elements
• Array index types:
– FORTRAN, C: integer only
– Pascal: any ordinal type (integer, Boolean, char)
– Java: integer types only
– C++, Perl, and Fortran do not specify range
checking, while Java, C# do
8
9. Categories of Arrays
• Static: subscript ranges are statically bound
and storage allocation is static (before run-
time)
– Advantage: efficiency (no dynamic allocation)
– C and C++ arrays that include static modifier
• Fixed stack-dynamic: subscript ranges are
statically bound, but the allocation is done
during execution
– Advantage: space efficiency
– C and C++ arrays without static modifier
9
10. Categories of Arrays (cont.)
• Stack-dynamic: subscript ranges and storage
allocation are dynamically bound at
elaboration time and remain fixed during
variable lifetime
– Advantage: flexibility (the size of an array need not
be known until the array is to be used), e.g., Ada
Get(List_Len); // input array size
declare
List: array (1..List_Len) of Integer;
begin
...
End // List array deallocated
10
11. Categories of Arrays (cont.)
• Fixed heap-dynamic: storage binding is dynamic but fixed
after allocation, allocated from heap.
– C and C++ through malloc
• Heap-dynamic: binding of subscript ranges and storage is
dynamic and can change.
– Advantage: flexibility (arrays can grow or shrink during
execution), e.g., Perl, JavaScript
– C#: through ArrayList; objects created without element and
added later with Add
ArrayList intList = new ArrayList();
intList.Add(nextOne);
11
12. Unions Types
• A union is a type whose variables are allowed
to store different types of values at different
times
• Design issues
– Should type checking be required?
– Should unions be embedded in records?
12
13. Discriminated vs. Free Unions
• In Fortran, C, and C++, no language-supported type
checking for union, called free union
• Most common way of remembering what is in a union is
to embed it in a structure
13
14. Discriminated vs. Free Unions
• Discriminated union: in order to type-checking
unions, each union includes a type indicator
called a discriminant
– Supported by Ada
– Can be changed only by assigning entire record,
including discriminant no inconsistent records
14
15. Evaluation of Unions
• Free unions are unsafe
– Do not allow type checking
• Java and C# do not support unions
– Reflective of growing concerns for safety in
programming language
• Ada’s descriminated unions are safe
15
16. Pointer and Reference Types
• A pointer type variable has a range of values
that consists of memory addresses and a
special value, nll
• Provide the power of indirect addressing
• Provide a way to manage dynamic memory
• A pointer can be used to access a location in
the area where storage is dynamically created
(heap)
16
17. Design Issues of Pointers
• What are the scope and lifetime of a pointer
variable?
• What is the lifetime of a heap-dynamic
variable?
• Are pointers restricted as to the type of value
to which they can point?
• Are pointers used for dynamic storage
management, indirect addressing, or both?
• Should the language support pointer types,
reference types, or both?
17
18. Pointer Operations
• Two fundamental operations: assignment and
dereferencing
• Assignment sets a pointer variable’s value to
some useful address. Address-Of Operator (&)
a=&b sets a to the address of b
• Dereferencing yields the value stored at the
location represented by the pointer’s value.
– Dereferencing can be explicit or implicit
– C and C++ uses an explicit operator *
j = *ptr
sets j to the value located at ptr
18
19. Dereferencing Operator (*) example using C++.
int number = 88;
int * pNumber ;
pNumber = &number; // Declare and assign the address of
variable number to pointer pNumber (0x22ccec)
cout << pNumber<< endl; // Print the content of the pointer
variable, which contain an address (0x22ccec)
cout << *pNumber << endl; // Print the value "pointed to" by the
pointer, which is an int (88)
*pNumber = 99; // Assign a value to where the pointer is pointed
to, NOT to the pointer variable
cout << *pNumber << endl; // Print the new value "pointed to" by
the pointer (99)
cout << number << endl; // The value of variable number changes
as well (99)
19
20. Problems with Pointers
• Dangling pointers (dangerous)
– A pointer points to a heap-dynamic variable that
has been deallocated
has pointer but no storage
– What happen when deferencing a dangling pointer?
• Lost heap-dynamic variable
– An allocated heap-dynamic variable that is no
longer accessible to the user program (often called
garbage)
has storage but no pointer
• The process of losing heap-dynamic variables is called
memory leakage
20
21. Reference Types
• A reference type variable refers to an object or a value in
memory, while a pointer refers to an address.
• C++ reference type variable: a constant pointer that is
implicitly dereferenced; primarily for formal parameters
initialized with address at definition, remain constant
int result = 0;
int &ref_result = result;
*ref_result = 100;
• Java uses reference variables to replace pointers entirely
– Not constants, can be assigned; reference to class instances
String str1;
str1 = “This is a string.”;
21
22. Expressions
•Expressions are the fundamental means of
specifying computations in a PL
– While variables are the means for specifying
storage
•Primary use of expressions: assignment
– Main purpose: change the value of a variable
– Essence of all imperative PLs: expressions change
contents of variables (computations change states)
•To understand expression evaluation, need to
know orders of operator and operand evaluation
– Dictated by associativity and precedence rules
22
23. Arithmetic Expressions
• Arithmetic expressions consist of operators,
operands, parentheses, and function calls.
– Unary, binary, ternary (e.g., _?_:_) operators
• Implementation involves:
– Fetching operands, usually from memory
– Executing arithmetic operations on the operands
• Design issues for arithmetic expressions
– Operator precedence/associativity rules?
– Order of operand evaluation and their side effects?
– Operator overloading?
– Type mixing in expressions?
23
24. Operator Precedence Rules
• Define the order in which “adjacent” operators
of different precedence levels are evaluated
– Based on the hierarchy of operator priorities
• Typical precedence levels
– parentheses
– unary operators
– ** (exponentiation, if the language supports it)
– *, /
– +, -
24
25. Conditional Expressions
• Conditional expressions by ternary operator ?:
– C-based languages (e.g., C, C++), e.g.,
average = (count == 0)? 0 : sum / count
– Evaluates as if written like
if (count == 0)
average = 0
else
average = sum /count
25
26. Operand Evaluation Order
• How operands in expressions are “evaluated”?
– Variables: fetch the value from memory
– Constants: sometimes fetched from memory;
sometimes in the machine language instruction
– Parenthesized expressions: evaluate all inside
operands and operators first.
– Side effects: A side effect of a function, called a
functional side effect, occurs when the function
changes either one of its parameters or a global
variable.
– e.g.,
b = a + fun1(); 26
27. int a = 5, b;
int fun1()
{
a = 17;
return 3;
}/* end of fun1 */
void main()
{
b = a + fun1(); // C language a = 20; Java a = 8
printf("%d", b);
}/* end of main*/
27
28. Overloaded Operators
int a,b;
float x,y;
…
b = a + 3;
y = x + 3.0;
• We wish to use the same operator ‘+’ to
operate on integers and floating-point numbers
– Let compiler make proper translation
28
29. Overloaded Operators
• Use of an operator for more than one purpose
is called operator overloading
• Some are common (e.g., + for int and float)
• Some are troublesome (e.g., * in C and C++)
– Loss of compiler error detection (omission of an
operand should be a detectable error)
– Some loss of readability
• C++/C# allow user-defined overloaded
operator
– Users can define nonsense operations
– Readability may suffer, even when operators make
sense, e.g., need to check operand types to know 29
30. Type Conversions
• A narrowing conversion is one that converts
an object to a type that cannot include all of
the values of the original type, e.g., float to int
– Not always safe
• A widening conversion is one in which an
object is converted to a type that can include at
least approximations to all of the values of the
original type, e.g., int to float
– Usually safe but may lose accuracy
30
31. Type Conversions: Mixed Mode
• A mixed-mode expression is one that has
operands of different types
– Need type conversion implicitly or explicitly
• Implicit type conversion by compiler: coercion
– Disadvantage: decrease in the type error detection
ability of the compiler
– In most languages, all numeric types are coerced
in expressions, using widening conversions
– In Ada, there are virtually no coercions in
expressions to minimize errors due to mixed-mode
expressions
31
32. Type Conversions
• Explicit type conversion by programmer:
casting in C-based languages, e.g.,
– C: (int) angle
– Ada: Float (Sum)
• Causes of errors in expressions
– Inherent limitations of arithmetic, e.g., division by
zero
– Limitations of computer arithmetic, e.g. overflow
• Errors often ignored by the run-time system
32
33. Relational Expressions
• Expressions using relational operators and
operands of various types; evaluate to Boolean
– Relational operators: compare values of 2
operands
– Operator symbols vary among languages (!=, /=,
~=, .NE., <>, #)
33
34. Boolean Expressions
• Expressions using Boolean operators and
Boolean operands, and evaluate to Boolean
– Boolean operands: Boolean variables, Boolean
constants, relational expressions
– Example operators:
FORTRAN 77 FORTRAN 90 C Ada
.AND. and && and
.OR. or || or
.NOT. not ! not
34
36. Assignment Statements
• The general syntax
<target_var> <assign_operator> <expression>
• The assignment operator
– = FORTRAN, BASIC, the C-based languages
– := ALGOLs, Pascal, Ada
• = can be bad when it is overloaded for the
relational operator for equality (that’s why the
C-based languages use == as the relational
operator)
36
37. Conditional Targets
• Conditional targets (Perl)
($flag ? $total : $subtotal) = 0
– Which is equivalent to
if ($flag){
$total = 0
} else {
$subtotal = 0
}
37
38. Compound Assignment Operators
• A shorthand method of specifying a
commonly needed form of assignment
• Introduced in ALGOL; adopted by C
• Example:
a = a + b
is written as
a += b
38
39. Unary Assignment Operators
• Unary assignment operators in C-based
languages combine increment and decrement
operations with assignment
• Examples:
– sum = ++count (count incremented, assigned to
sum)
– sum = count++ (count assigned to sum and then
incremented)
– count++ (count incremented)
– -count++ (count incremented then negated)
39
40. Assignment as an Expression
• In C, C++, and Java, the assignment statement
produces a result and can be used as operands
while ((ch = getchar())!= EOF){…}
– ch = getchar() is carried out; result is used as a
conditional value for the while statement
– Has expression side effect: a=b+(c=d/b)-1
– Multiple-target assignment: sum = count = 0;
– Hard to tell: if (x = y) and if (x == y)
40