Topics on Imperative Languages
Variables: names, bindings, type, scope (Sebesta Chapter 5)
Data Types (Sebesta Chapter 6)
Expressions (Sebesta Chapter 7)
Control Statements (Sebesta Chapter 8)
Subprograms: Concepts & Implementation (Sebesta Chapters 9 & 10)
Abstract Data Types (Sebesta Chapter 11)
Exception Handling (Sebesta Chapter 14)
Object Oriented languages (Sebesta Chapter 12)
CSC3403 Comparative Programming Languages Variables & Identifiers 1
Variable concepts
• Imperative languages map simply to von Neumann architecture
(linear memory, CPU, sequential execution)
• Variables correspond to memory cell(s)
– integers, floats (direct support for operations)
– record or structure
– (variables may be optimised to registers)
• Variable attributes
1. name        4. value
2. type        5. lifetime
3. location    6. scope
• related issues: named constants, initialisation
Names
• synonym: identifiers parent(amy,cathy). parent(bill,cathy).
• applies also to names of functions, constants, types, etc.
• issues
– Maximum length
∗ limited: truncate or error if identifier too long
∗ unlimited: storage/implementation considerations
(This is really an implementation issue.)
– valid chars—typically: 0-9 a-z A-Z _~; first char non-numeric
– case sensitive?
– special words
∗ keywords — depend on context ⇒ dangerous (see next...)
∗ reserved words—e.g. if then return
∗ predefined names (system/library definitions—may be redefined)
writeln printf cout map fold
Example: Keywords considered dangerous
1962: NASA Mariner 1 Venus probe lost
• Early FORTRAN compilers implemented keywords (not reserved words: DO may also begin a variable name)
• Early FORTRAN compilers ignored spaces
• Programmer meant to write:
DO 5 I = 1, 3
...body of loop...
5 CONTINUE
• Problem: period instead of comma
DO 5 I = 1. 3
which is interpreted as the assignment:
DO5I = 1.3
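• Result: no loop is generated and the body runs only once ⇒ program fails. Popular accounts blame this bug for the loss of Mariner 1, although that attribution is disputed.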
Name space
• identifiers can be partitioned into classes: e.g. variables, types, record
names, field names, enumerated constants, module names, etc...
• where ambiguity cannot occur, only require identifiers to be unique within
their name space
• a name space is defined by the class(es) of identifier(s) it contains
• E.g. allow field names and variable names to occupy separate name spaces,
but variables and constants should occupy the same name space.
struct s { int a; int s;};
int main(){ struct s s; s.s = 1; }
• Note: Modern OO languages also implement explicit programmer
specified namespaces; this is different from the language-defined
namespaces discussed here.
Variable location
• Each variable instance has a unique start address
• It will typically occupy a range of addresses
(Very few data types can be held in just 1 byte.)
• A variable instance is defined by its
1. name
2. declaration site — which subprogram/block
3. current block instance (only required if recursive call)
Example of variable instances
There are 3 declarations of var in this program, but as many as 9 instances!
#include <stdio.h>
int var;                       /* 1: global variable */
void f1(void){
  int var;                     /* 2: local variable */
  var = 42;
}
int f2(int var){               /* 3: parameter == local var */
  if (var==0) return 1;
  else return var * f2(var-1);
}
int main(void){
  f1();
  var = f2(6);
  printf("%d\n", var);
  return 0;
}
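The count of 9 instances can be checked with a small sketch. The counter `instances` and the wrapper `count_var_instances` are hypothetical names added for illustration only; they are not part of the program above.

```c
/* Hypothetical counter: how many instances of a variable named var
   have been created. Added for illustration only. */
static int instances = 0;

int var;                       /* declaration 1: one global instance */

void f1(void) {
    int var;                   /* declaration 2: one instance per call */
    instances++;
    var = 42;
    (void)var;                 /* silence "set but not used" warnings */
}

int f2(int var) {              /* declaration 3: one instance per activation */
    instances++;
    if (var == 0) return 1;
    return var * f2(var - 1);
}

/* Makes the same calls as the program above and returns the total. */
int count_var_instances(void) {
    instances = 1;             /* the global var exists for the whole run */
    f1();                      /* + 1 local instance */
    var = f2(6);               /* + 7 activations: var = 6, 5, ..., 0 */
    return instances;
}
```

The total is 1 global + 1 local in f1 + 7 activations of f2.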
Variable aliases
alias: 1 location but >1 name or access method
• explicit: Fortran EQUIVALENCE, C union
struct symtabEntry{ char *name;
                    int type;
                    union { int ival;
                            float fval;
                            char *sval;
                          } value;
                  };
Memory layout (byte offsets, with 4-byte int, float and pointers):
0-3: name
4-7: type
8-11: value (ival, fval and sval all occupy this one cell)
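A minimal sketch that checks the overlay property of the union. The two helper functions are hypothetical names introduced here; the size comparison assumes a common platform where the union is smaller than the sum of its member sizes.

```c
struct symtabEntry {
    char *name;
    int   type;
    union {                    /* the three members overlay one another */
        int    ival;
        float  fval;
        char  *sval;
    } value;
};

/* Returns 1 if all three union members start at the same address. */
int members_share_address(void) {
    struct symtabEntry e;
    void *pi = &e.value.ival;
    void *pf = &e.value.fval;
    void *ps = &e.value.sval;
    return pi == pf && pf == ps;
}

/* Returns 1 if the union is smaller than the sum of its member sizes,
   i.e. the members share storage instead of being laid out in sequence. */
int union_is_overlaid(void) {
    struct symtabEntry e;
    return sizeof e.value < sizeof(int) + sizeof(float) + sizeof(char *);
}
```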
Variable aliases ....
• indirect: via pointers
int x,*p,*q;
p=&x;
q=&x;
*p=42;
printf("meaning of life = %d\n", *q);
• indirect: call-by-reference parameters
void f(int *a, int *b) { (*a)++; (*b)++; }
...
f(&x, &x); /* a and b both refer to x */
...
Comments
• Aliasing is a bad idea; reduces program readability (but variant records
can be useful in rare system programming applications)
• Better dynamic memory systems and abundant computer memory have
reduced/removed the need for aliasing
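The call-by-reference alias can be sketched as follows, written so that the pointed-to integers are incremented (hence the parentheses in `(*a)++`). The two wrapper functions are hypothetical and exist only to make the effect observable.

```c
/* Increments the integers that a and b point to. */
void f(int *a, int *b) {
    (*a)++;        /* parentheses needed: *a++ would advance the pointer */
    (*b)++;
}

/* Calls f with both parameters aliased to the same variable. */
int call_with_alias(void) {
    int x = 0;
    f(&x, &x);     /* a and b are two names for x */
    return x;      /* x was incremented twice */
}

/* Calls f with two distinct variables, for comparison. */
int call_without_alias(void) {
    int x = 0, y = 0;
    f(&x, &y);
    return x;      /* x was incremented once */
}
```

Reading f alone, nothing suggests that one statement can affect the operand of the other; that depends entirely on the call site.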
Variable type
A variable’s type defines (for language-specified types):
• the range of values that the variable can store (implementation dependent)
• the physical storage requirements (implementation dependent)
• the set of legal operations on the variable (language defined)
Variable value
• each variable instance has a value
• may be automatically initialised (see later)
• a value is contained in a set of contiguous locations, called a cell
• r-value is value (contents of cell)
• l-value is address (location of cell)
Lifetime & Storage
• when is a memory cell allocated?
allocation ≡ location binding
• when is a memory cell deallocated?
• lifetime is time from allocation to deallocation (binding to unbinding)
• 4 classes of variable
– static variables
– stack-dynamic variables (automatic)
– explicit dynamic variables
– implicit dynamic variables
Static variables
• allocated before run time (usually link or load time)
• executable file contains space for static variables.
• the binding exists for the entire lifetime of the program
✓ easy & cheap to access: compiler can compute the address at compile time
✓ handy for global variables which are used throughout program
✓ useful for subprogram local variables (e.g. C static) when history is
required but variable is only needed in subprogram
✗ static parameters preclude recursion
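A minimal sketch of a C static local that keeps history between calls. The function names are hypothetical.

```c
/* Returns 1, 2, 3, ... on successive calls. */
int next_id(void) {
    static int last = 0;   /* allocated once; keeps its value between calls */
    last++;
    return last;
}

/* Same code with an automatic variable: the history is lost. */
int next_id_auto(void) {
    int last = 0;          /* re-created and re-initialised on every call */
    last++;
    return last;
}
```

The static variable has the lifetime of a global but is visible only inside next_id.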
Stack-dynamic variables
• allocated when code containing declaration is executed
• elaboration or instantiation is the name of the binding process
• variables are stored on the run-time stack (RTS)
• each block or subprogram invocation allocates a chunk of storage
(activation record instance) on the RTS
• subprogram exit ⇒ storage deallocation
✗ disadvantage: slower access, since address must be calculated at run time
✓ advantages:
– enables recursion
– lower average memory usage
– smaller executables
– smaller libraries
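Why stack allocation enables recursion can be seen by comparing two versions of one recursive function. This is a sketch with hypothetical names; the only difference between the two functions is the storage class of `digit`.

```c
/* Stack-dynamic local: each activation gets its own digit. */
int digit_sum(int n) {
    int digit;
    int rest;
    if (n == 0) return 0;
    digit = n % 10;
    rest = digit_sum(n / 10);
    return rest + digit;       /* digit still holds this activation's value */
}

/* Static local: all activations share one digit, so inner calls
   overwrite the value that the outer call still needs. */
int digit_sum_static(int n) {
    static int digit;
    int rest;
    if (n == 0) return 0;
    digit = n % 10;
    rest = digit_sum_static(n / 10);
    return rest + digit;       /* digit was overwritten by the inner calls */
}
```

For 123 the first version adds 1 + 2 + 3, while the second adds the last stored digit three times.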
Explicit dynamic variables
• explicitly created by programmer code
• not bound to variable — pointer variable holds address
struct stentry *stnode; // C++
stnode = new struct stentry; // allocate
...
delete stnode; // deallocate
• memory is allocated from the heap
• there is often a maximum heap size
✓ ideal for dynamic structures whose size is unknown at compile time (trees,
lists, etc.)
✗ opportunity for programmer errors (pointers: see later)
✗ cost of reference (indirect via pointer: 2 memory accesses)
✗ cost of allocation/deallocation (heap management)
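In C the same idea is expressed with malloc and free. The sketch below builds a list whose length is only known at run time; the node type and all function names are hypothetical.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Allocates a node on the heap and pushes it on the front of list.
   Returns the new head, or the old head unchanged if allocation fails. */
struct node *push(struct node *list, int value) {
    struct node *n = malloc(sizeof *n);   /* explicit allocation */
    if (n == NULL) return list;
    n->value = value;
    n->next = list;
    return n;
}

/* Adds up the values; every access goes through a pointer. */
int sum(const struct node *list) {
    int total = 0;
    for (; list != NULL; list = list->next) total += list->value;
    return total;
}

/* Frees every node; the lifetime of each cell ends here. */
void destroy(struct node *list) {
    while (list != NULL) {
        struct node *next = list->next;
        free(list);                       /* explicit deallocation */
        list = next;
    }
}

/* Builds the list 1..n, sums it, and releases the storage again. */
int build_and_sum(int n) {
    struct node *list = NULL;
    int i, total;
    for (i = 1; i <= n; i++) list = push(list, i);
    total = sum(list);
    destroy(list);
    return total;
}
```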
Implicit dynamic variables
• employed by dynamic languages like APL, scripting languages
• storage allocated at assignment time
✓ advantage: flexible
✗ disadvantage: costly, poorer error detection
Scope
• variable scope: from where can it be referenced? (where is it visible?)
• scope rules define how to resolve references to non-local variables
(there may be more than one non-local variable of the same name)
• a local variable is declared in current program unit
• each program unit introduces a new “scope”
– subprogram (a.k.a. procedure, function):
a named program unit, usually with parameters
– block: anonymous program unit
in C/C++: { .... }
– module: named collection of definitions (C++: namespace)
• two kinds
– static: compiler can resolve scope
– dynamic: scope resolved at run time
Static scoping rules
• to resolve reference to a variable
1. locate the variable’s declaration
2. retrieve variable’s attributes (type, address, etc)
• search algorithm
unit ⇐ referencing unit; found ⇐ False; done ⇐ False;
repeat
  if variable declared in unit
    then use this variable declaration
         found ⇐ True; done ⇐ True
    else if unit is not outermost
      then unit ⇐ staticParent(unit)
      else done ⇐ True
until done
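The search algorithm above can be sketched in C as a walk along the static parent chain. The `struct unit` representation, the `resolve` function and the three sample units are assumptions made for illustration; a real compiler would use its symbol table.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DECLS 4

/* A program unit: its local names and its static parent (hypothetical). */
struct unit {
    const char *name;
    const char *decls[MAX_DECLS];      /* unused slots are NULL */
    const struct unit *static_parent;  /* NULL for the outermost unit */
};

/* Returns 1 if var is declared locally in unit u. */
static int declares(const struct unit *u, const char *var) {
    for (int i = 0; i < MAX_DECLS && u->decls[i] != NULL; i++)
        if (strcmp(u->decls[i], var) == 0) return 1;
    return 0;
}

/* Walks from the referencing unit outwards. Returns the unit whose
   declaration is used, or NULL if the variable is not found. */
const struct unit *resolve(const struct unit *referencing, const char *var) {
    for (const struct unit *u = referencing; u != NULL; u = u->static_parent)
        if (declares(u, var)) return u;
    return NULL;
}

/* Sample units: sub2 is nested in sub1, which is nested in main. */
const struct unit main_unit = { "main", { "a" }, NULL };
const struct unit sub1_unit = { "sub1", { "b" }, &main_unit };
const struct unit sub2_unit = { "sub2", { "c" }, &sub1_unit };
```

Because the parent chain is fixed by the program text, a compiler can perform this search once at compile time.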
Static scoping rules ....
Equivalent recursive definition:
A variable (say x) is visible within a referencing unit if it is either
• declared within that unit, or
• is visible from the static parent of the unit.
The referencing environment of a unit is the union of
• all identifiers declared local to the unit, and
• identifiers in the static parent’s referencing environment with names
different to those declared in this unit.
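Nested scope
Program (comments show which of a b c d e each body can see)
procedure main;
  var a;
  procedure sub1;
    var b;
    procedure sub2;
      var c;
    begin            { sub2 sees a, b, c }
    end;
    procedure sub3;
      var d;
    begin            { sub3 sees a, b, d }
    end;
  begin              { sub1 sees a, b }
  end;
  procedure sub4;
    var e;
  begin              { sub4 sees a, e }
  end;
begin                { main sees a }
end;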
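Hidden variables
Program (each declaration of x is labelled; an inner x hides the outer ones)
procedure main;
  var x;             { xmain }
  procedure sub1;
    var x;           { x1: hides xmain }
    procedure sub2;
      var x;         { x2: hides x1 and xmain }
    begin
    end;
    procedure sub3;
      var x;         { x3: hides x1 and xmain }
    begin
    end;
  begin
  end;
  procedure sub4;
    var x;           { x4: hides xmain }
  begin
  end;
begin
end;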
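C has no nested procedures, but nested blocks hide outer variables in the same way. A minimal sketch, with hypothetical names:

```c
int x = 1;                     /* outermost x */

/* Records the x seen at three nesting levels and returns them packed
   as outer*100 + middle*10 + inner. */
int shadowing_demo(void) {
    int seen_outer = x;        /* no local x yet: refers to the global */
    int seen_middle, seen_inner;
    {
        int x = 2;             /* hides the global x */
        seen_middle = x;
        {
            int x = 3;         /* hides both outer declarations */
            seen_inner = x;
        }
        /* the innermost x is out of scope here; x refers to 2 again */
    }
    return seen_outer * 100 + seen_middle * 10 + seen_inner;
}
```

The hidden variables still exist and keep their values; they are merely not nameable from the inner blocks.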
Static scoping issues
• Aim: to restrict access to vars/procs only to those units that need them,
thus avoiding the chance of accidental programming errors caused by
improper access.
• Non-nested function declarations (as in C) allow unrestricted calling of
functions. (With some exceptions—see later.)
• Nested declarations allow procedure calls only to
✓ Child procedure (but not grandchild etc)
✗ Any ancestor procedure
• problems (see discussion Sebesta §5.8.3)
– access not restrictive enough (ancestors)
– overuse of global variables to provide resource sharing
– key question: How can two subprograms share local variables such
that no other subprograms can see them?
⇒ modules, classes, namespaces (see later)
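Case study: Scope in C
(f1, f3 and f4 can be called from any file; f2 only from within p2.c)
File p1.c:
int x;                  /* x1: global, visible in all files */
int main(void){
  int x;                /* x2: local, hides x1 */
  x=2;                  /* assigns x2 */
  return 0;
}
void f1(void){
  x=1;                  /* assigns x1 */
}
File p2.c:
static int x;           /* x3: visible only within p2.c */
static void f2(void){
  x=3;                  /* assigns x3 */
}
File p3.c:
void f3(void){
  static int x;         /* x4: visible only within f3 */
  x=4;                  /* assigns x4 */
}
extern int x;           /* x1: defined in p1.c */
void f4(void){
  x=1;                  /* assigns x1 */
}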
Dynamic scope
• APL, Snobol, early Lisp
• scope resolved at run time
• search dynamic parent chain
dynamic parent ≡ calling procedure
✓ callee automatically inherits caller’s locals
✗ cost of dynamic scope resolution
✗ difficulty of programming / debugging
Consider proc p() {a := a + 1; }
Q: Which instance of variable “a” is being updated?
A: Cannot tell from static structure of program!
You must trace through program execution to find out.
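C is statically scoped, so dynamic scoping can only be simulated. The sketch below keeps a run-time stack of name/value bindings and searches it from the most recent entry backwards, which mirrors the search of the dynamic parent chain. The binding stack and every function name are hypothetical, and the stack has no overflow check.

```c
#include <stddef.h>
#include <string.h>

#define MAX_BINDINGS 16

/* One entry per live variable instance, newest last. */
struct binding { const char *name; int value; };
static struct binding stack[MAX_BINDINGS];
static int top = 0;

static void push_binding(const char *name, int value) {
    stack[top].name = name;
    stack[top].value = value;
    top++;
}

static void pop_binding(void) { top--; }

/* Dynamic lookup: the most recently created binding wins. */
static int *lookup(const char *name) {
    for (int i = top - 1; i >= 0; i--)
        if (strcmp(stack[i].name, name) == 0) return &stack[i].value;
    return NULL;
}

/* proc p() { a := a + 1; }  Which a is updated depends on the caller. */
static void p(void) {
    int *a = lookup("a");
    if (a != NULL) (*a)++;
}

/* q declares its own a and then calls p, so p updates q's a. */
static int q(void) {
    int result;
    push_binding("a", 10);
    p();
    result = *lookup("a");
    pop_binding();
    return result;
}

/* Calls p directly and then through q.
   Returns (global a) * 100 + (the value of a that q saw). */
int dynamic_scope_demo(void) {
    int q_saw;
    top = 0;
    push_binding("a", 0);      /* the global a */
    p();                       /* updates the global a: 0 becomes 1 */
    q_saw = q();               /* updates q's a: 10 becomes 11 */
    return *lookup("a") * 100 + q_saw;
}
```

The body of p is identical in both calls; only the call history decides which instance of a it touches.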
Binding concepts
• a binding is an association
– variable attribute ←→ value of attribute
– operation ←→ symbol
– executable code ←→ physical location
– conceptual/abstract ←→ concrete/value
• binding times (from early to late)
– language design (operator symbols)
– language implementation (range of data types)
– compile time (var ←→ type)
– link time (address of vars, functions in executable image file)
– load time (physical variable addresses)
– run time (dynamic variable addresses, DLL function addresses)
Static and Dynamic Binding
• static binding: bound before run time
• dynamic binding: bound during run time
• consider variable attributes
– name is always statically determined
– value is always dynamically bound (else it is a constant!)
– location and lifetime can safely be either static or dynamic
– type and scope can be either static or dynamic.
Static binding is safer and more readable for these attributes.
Type bindings — motivation
Q: Why do we need types?
A: At reference time, a variable must have a type so that either
1. on “read” (e.g. variable access), cell contents can be interpreted correctly
2. on “write” (e.g. assignment to variable), correct conversions are made
E.g. consider
int a,b; float c;
...
c = a + b;
Key design questions
• how is the type determined?
– explicit vs. implicit declarations
• when does variable–type binding occur?
– static vs. dynamic
How is (static) type specified/determined?
• explicit declarations
– mandatory (C++, Pascal, most modern languages)
– optional ([uncommon] K&R C functions default to int)
• implicit declarations
– Fortran, Basic, PL/I
– Fortran: first letter convention: I J K L M N
– others: first use convention
– disadvantage: typographical errors undetected
coutner = counter + 1
• inference (implicit declaration supported by optional explicit declarations)
– ML, Miranda, Haskell, Gofer
– all types can be statically inferred
– comparison of declared vs inferred types finds errors
Dynamic type binding
• APL, shell languages (e.g. Perl, Tcl)
• automatic conversions required
✓ advantage: flexibility, generic subprograms possible, e.g.:
inc(a) {return a+1;} // pseudocode: a can be any numeric type
✗ disadvantages
– typographical errors not caught (see previous example)
– cost of dynamic type inference
– interpreted implementations usually required
• Note: advanced languages (Ada, C++, Haskell) provide polymorphic
type-checked subprograms
– provides generic subprograms with static type checking
CSC3403 Comparative Programming Languages Variables  Identifiers 28
Typing concepts
• Type error: A type error occurs when an operator (or function) is applied
to an operand (e.g. a variable, but in general any expression) whose type is
not compatible with the operator’s parameter type.
If this error is not detected by the language implementation system ⇒
program bug!
int i=1; double f,*p; p = (double *) i; *p = f;
• Type checking: Checking that an operator or subprogram is applied to
arguments of the correct type. (Assignment is considered a binary
operator.)
• Compatible type: Either a type which matches the operator’s definition or
a type which can be automatically converted, using language rules, into
the correct matching type.
• Coercion: Automatic conversion of types.
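A short sketch of coercion in C, showing that the point at which the automatic conversion happens changes the result. The function names are hypothetical.

```c
/* Both operands are int, so integer division happens first and only
   the int quotient is coerced to double on return. */
double half_coerced_late(int n)  { return n / 2; }

/* The literal 2.0 is a double, so n is coerced before the division. */
double half_coerced_early(int n) { return n / 2.0; }

/* Coercion can lose information silently: converting double to int
   discards the fractional part without any diagnostic being required. */
int truncate_silently(double d) {
    int i = d;
    return i;
}
```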
Strong typing
Definition: All type errors in a program can be detected
• no imperative language is strongly typed
• typical problems: variant records, non-checked type casts
• many languages (e.g. Java, C#, Ada) are almost strongly typed
• modern functional languages are strongly typed (static typing)
• coercion weakens strong typing
Comment: Strong typing is seen by most language researchers/designers as a
“good idea”, and most newer languages have very good type checking.
Type conversion examples
Consider “assignment” of a floating point to an integer.
C: coercion, type error
#include <stdio.h>
int i,j;
double a=999999999.5,
b=9999999999.5;
int main(){ i = a; j = b; printf("i=%d j=%d\n",i,j); return 0; }
execution ⇒ i=999999999 j=-2147483648
(b is out of range for a 32-bit int; this conversion is undefined behaviour in C, so other results are possible)
Haskell: strong typing: explicit conversion + runtime check
Hugs> :t truncate
truncate :: (RealFrac a, Integral b) => a -> b
Hugs> (truncate 999999999.5)::Int
999999999
Hugs> (truncate 9999999999.5)::Int
Program error: arithmetic overflow
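C can be made to behave like the Haskell example if the programmer writes the range check by hand. This is a sketch under the assumption that every int value is exactly representable as a double (true for 32-bit int with IEEE 754 doubles); both function names are hypothetical.

```c
#include <limits.h>

/* Converts d to int only if the truncated value is representable.
   Returns 1 and stores the result in *out on success; returns 0 on
   overflow, and also for NaN because both comparisons are then false. */
int checked_truncate(double d, int *out) {
    if (d > (double)INT_MIN - 1.0 && d < (double)INT_MAX + 1.0) {
        *out = (int)d;         /* in range: the conversion is well defined */
        return 1;
    }
    return 0;                  /* out of range: refuse instead of guessing */
}

/* Convenience wrapper: returns fallback when d cannot be converted. */
int truncate_or(double d, int fallback) {
    int result;
    return checked_truncate(d, &result) ? result : fallback;
}
```

The check costs two comparisons per conversion, which is the price the Haskell version pays implicitly.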
Type compatibility
• name type compatibility
– both variables appear in same declaration
or both variables declared with the same type name
– C++ uses name compatibility
int main(){ // C++
  struct point {int x,y;} p1,p2;
  struct {int a,b;} p3;
  struct point p4;
  p1.x = 10; p1.y = 10;
  p2=p1; // ok
  p3=p1; // ERROR: different type, although same structure
  p4=p1; // ok
}
Type compatibility ....
• structure type compatibility
– the types of both variables have the same structure
✓ less restrictive than name compatibility
✗ somewhat more complex to implement
• in general, type compatibility applies to expressions, not just variables
• more examples: Sebesta §5.7
Named constants
• e.g. const int x = 10; // C++
• a variable which is initialised and cannot be assigned to
• value & address bound at same time
• optimisers may keep in register
✓ advantage: readability, easy to modify code when it does not contain
“magic numbers”
• do not confuse with named literal
#define TABLESIZE 128
Initialisation
• variable can be initialised by declaration statement
• initialisation is dependent on kind of variable:
– static variable: value is stored in executable file
– dynamic variable: implicit assignment statement executed at run time
It is a shorthand rather than an extra feature.
The following two C fragments are semantically identical:
void f () {
  int i = 42;
  ...
}
and
void f () {
  int i;
  i = 42;
  ...
}
• some languages guarantee to automatically initialise (e.g. integers to 0)
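C itself is a mixed case: static variables are guaranteed to start at zero, while automatic variables without an initialiser are not. A sketch with hypothetical names that shows the guaranteed part and the difference between the two kinds of initialiser:

```c
/* Static variable with no initialiser: C guarantees it starts at 0. */
static int calls;

/* Static variable with an initialiser: the value 100 is in place before
   the program runs and is not re-applied on each call. */
int next_ticket(void) {
    static int ticket = 100;
    calls++;
    return ticket++;
}

/* Automatic variable: the initialiser is an assignment executed on
   every call, so the result is the same each time. */
int always_one(void) {
    int i = 0;
    i++;
    return i;
}

/* Reports how many times next_ticket has been called. */
int calls_so_far(void) { return calls; }
```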
A few conclusions
• these features are considered “good”
✓ static typing
✓ strong typing
✓ static scoping
✓ automatic dynamic memory management (garbage collection)
• these features are considered “poor” (or use with caution!)
✗ aliasing
✗ implicit variable declaration
✗ unrestricted use of dynamic binding
✗ dynamic scoping
✗ dynamic type of variables
✓ dynamic binding is acceptable if used with care (C++ dynamic function binding)

Variables: names, bindings, type, scope

  • 1.
    Topics on ImperativeLanguages Variables: names, bindings, type, scope Sebesta Chapter 5 Data Types Sebesta Chapter 6 Expressions Sebesta Chapter 7 Control Statements Sebesta Chapter 8 Subprograms: Concepts & Implementation Sebesta Chapters 9 & 10 Abstract Data Types Sebesta Chapter 11 Exception Handling Sebesta Chapter 14 Object Oriented languages Sebesta Chapter 12 CSC3403 Comparative Programming Languages Variables & Identifiers 1 Variable concepts • Imperative languages map simply to von Neuman architecture (linear memory, CPU, sequential execution) • Variables correspond to memory cell(s) – integers, floats (direct support for operations) – record or structure – (variables may be optimised to registers) • Variable attributes Œ name  value  type  lifetime Ž location ‘ scope • related issues: named constants, initialisation CSC3403 Comparative Programming Languages Variables & Identifiers 2 Names • synonym: identifiers parent(amy,cathy). parent(bill,cathy). • applies also to names of functions, constants, types, etc. • issues – Maximum length ∗ limited: truncate or error if identifer too long ∗ unlimited: storage/implementation considerations (This is really an implementation issue.) – valid chars—typically: 0-9 a-z A-Z _~; first char non-numeric – case sensitive? – special words ∗ keywords — depend on context ⇒ dangerous (see next...) ∗ reserved words—e.g. if then return ∗ predefined names (system/library definitions—may be redefined) writeln printf cout map fold CSC3403 Comparative Programming Languages Variables & Identifiers 3 Example: Keywords considered dangerous 1962: NASA Mariner 1 Venus probe lost • Early FORTRAN compilers implemented keywords • Early FORTRAN compilers ignored spaces • Programmer meant to write: DO 5 I = 1, 3 ...body of loop... 5 CONTINUE • Problem: period instead of comma DO 5 I = 1. 
3 which is interpreted as the assignment: DO5I = 1.3 • Result: loop not executed ⇒ program fails ⇒ Mariner 1 lost CSC3403 Comparative Programming Languages Variables & Identifiers 4
  • 2.
    Name space • identifierscan be partitioned into classes: e.g. variables, types, record names, field names, enumerated constants, module names, etc... • where ambiguity cannot occur, only require identifiers to be unique within their name space • a name space is defined by the class(es) of identifier(s) it contains • E.g. allow field names and variable names to occupy separate name spaces, but variables and constants should occupy the same name space. struct s { int a; int s;}; int main(){ struct s s; s.s = 1; } • Note: Modern OO languages also implement explicit programmer specified namespaces; this is different from the language-defined namespaces discussed here. CSC3403 Comparative Programming Languages Variables & Identifiers 5 Variable location • Each variable instance has a unique start address • It will typically occupy a range of addresses (Very few data types can be held in just 1 byte.) • A variable instance is defined by its 1. name 2. declaration site — which subprogram/block 3. current block instance (only required if recursive call) CSC3403 Comparative Programming Languages Variables & Identifiers 6 Example of variable instances There are 3 declarations of var in this program, but as many as 9 instances! #include <stdio.h> int var; /* 1: global variable */ int f1(){ int var; /* 2: local variable */ var = 42; } int f2(int var){ /* 3: parameter == local var */ if (var==0) return 1; else return var * f2(var-1); } main(){ f1(); var = f2(6); printf("%dn", var); } CSC3403 Comparative Programming Languages Variables & Identifiers 7 Variables Aliases alias: 1 location but >1 name or access method • explicit: Fortran EQUIVALENCE, C union struct symtabEntry{ char *name; int type; union { int ival; float fval; char *sval; } value; } Memory layout: 0 4 8 12 ival name type fval sval CSC3403 Comparative Programming Languages Variables & Identifiers 8
  • 3.
    Variables Aliases .... •indirect: via pointers int x,*p,*q; p=&x; q=&x; *p=42; printf("meaning of life = %dn", *q); • indirect: call-by-reference parameters void f(int *a, int *b) { *a++; *b++; } ... f(&x, &x); ... Comments • Aliasing is a bad idea; reduces program readability (but variant records can be useful in rare system programming applications) • Better dynamic memory systems and abundant computer memory has reduced/removed the need for aliasing CSC3403 Comparative Programming Languages Variables & Identifiers 9 Variable type A variable’s type defines (for language-specified types): • the range of values that the variable can store (implementation dependent) • the physical storage requirements (implementation dependent) • the set of legal operations on the variable (language defined) Variable value • each variable instance has a value • may be automatically initialised (see later) • a value is contained in set of contiguous locations — called a cell • r-value is value (contents of cell) • l-value is address (location of cell) CSC3403 Comparative Programming Languages Variables & Identifiers 10 Lifetime & Storage • when is a memory cell allocated? allocation ≡ location binding • when is a memory cell deallocated? • lifetime is time from allocation to deallocation (binding to unbinding) • 4 classes of variable – static variables – stack-dynamic variables (automatic) – explicit dynamic variables – implicit dynamic variables CSC3403 Comparative Programming Languages Variables & Identifiers 11 Static variables • allocated before run time (usually link or load time) • executable file contains space for static variables. • the binding exists for entire lifetime of the program easy cheap to access — compiler can compute the address at compile time handy for global variables which are used throughout program useful for subprogram local variables (e.g. 
C static) when history is required but variable is only needed in subprogram # static parameters preclude recursion CSC3403 Comparative Programming Languages Variables Identifiers 12
  • 4.
    Stack-dynamic variables • allocatedwhen code containing declaration is executed • elaboration or instantiation is the name of the binding process • variables are stored on the run-time stack (RTS) • each block or subprogram invocation allocates a chunk of storage (activation record instance) on the RTS • subprogram exit ⇒ storage deallocation # disadvantage: slower access — address must be calculated at run time advantages: – enables recursion – lower average memory usage – smaller executables – smaller libraries CSC3403 Comparative Programming Languages Variables Identifiers 13 Explicit dynamic variables • explicitly created by programmer code • not bound to variable — pointer variable holds address struct stentry *stnode; // C++ stnode = new struct stentry; // allocate ... delete stnode; // deallocate • memory is allocated from the heap • there is often a maximum heap size ideal for dynamic structures whose size is unknown at compile time (trees, lists, etc.) # opportunity for programmer errors (pointers—see later) # cost of reference (indirect via pointer: 2 memory accesses) # cost allocation/deallocation (heap magement). CSC3403 Comparative Programming Languages Variables Identifiers 14 Implicit dynamic variables • employed by dynamic languages like APL, scripting languages • storage allocated at assignment time advantage: flexible # disadvantage: costly, poorer error detection CSC3403 Comparative Programming Languages Variables Identifiers 15 Scope • variable scope: from where can it be referenced? (where is it visible?) • scope rules define how to resolve references to non-local variables (there may be 1 non-local variables of same name) • a local variable is declared in current program unit • each program unit introduces a new “scope” – subprogram (a.k.a. procedure, function) a named subprogram, usually with parameters – block anonymous program unit in C/C++: { .... } – module named collection of definitions. 
(C++: namespace) • two kinds – static: compiler can resolve scope – dynamic: scope resolved at run time CSC3403 Comparative Programming Languages Variables Identifiers 16
  • 5.
    Static scoping rules •to resolve reference to a variable 1. locate the variable’s declaration 2. retrieve variable’s attributes (type, address, etc) • search algorithm unit ⇐ referencing unit; found ⇐ False; done ⇐ False; repeat if variable declared in unit then use this variable declaration found ⇐ True; done ⇐ True else if unit is not outermost then unit ⇐ staticParent(unit) else done ⇐ True until done CSC3403 Comparative Programming Languages Variables Identifiers 17 Static scoping rules .... Equivalent recursive definition: A variable (say x) is visible within a referencing unit if it is either • declared within that unit, or • is visible from the static parent of the unit. The referencing environment of a unit is the union of • all identifiers declared local to the unit, and • identifiers in the static parent’s referencing environment with names different to those declared in this unit. CSC3403 Comparative Programming Languages Variables Identifiers 18 Nested scope Program a b c d e procedure main; var a; procedure sub1; var b; procedure sub2; var c; begin end; procedure sub3; var d; begin end; begin end; procedure sub4; var e; begin end; begin end; CSC3403 Comparative Programming Languages Variables Identifiers 19 Hidden variables Program xmain x1 x2 x3 x4 procedure main; var x; procedure sub1; var x; procedure sub2; var x; begin end; procedure sub3; var x; begin end; begin end; procedure sub4; var x; begin end; begin end; CSC3403 Comparative Programming Languages Variables Identifiers 20
  • 6.
    Static scoping issues •Aim: to restrict access to vars/procs only to those units that need them, thus avoiding the chance of accidental programming errors caused by improper access. • Non-nested function declarations (as in C) allow unrestricted calling of functions. (With some exceptions—see later.) • Nested declarations allow procedure calls only to Child procedure (but not granchild etc) # Any ancestor procedure • problems (see discussion Sebesta §5.8.3 ) – access not rescrictive enough (ancestors) – overuse of global variables to provide resource sharing – key question: How can two subprograms share local variables such that no other subprograms can see them? modules, classes, namespaces (see later) CSC3403 Comparative Programming Languages Variables Identifiers 21 Case study: Scope in C File Program x1 x2 x3 x4 f1,f3,f4 f2 p1.c int x; /* x1 */ main(){ int x; /* x2 */ x=2; } int f1(){ x=1; } p2.c static int x; /* x3 */ static int f2(){ x=3; } p3.c int f3(){ static int x; /* x4 */ x=4; } extern int x; /* x1 */ int f4(){ x=1; } CSC3403 Comparative Programming Languages Variables Identifiers 22 Dynamic scope • Apl, Snobol, early Lisp • scope resolved at run time • search dynamic parent chain dynamic parent ≡ calling procedure callee automatic inherits caller’s locals # cost of dynamic scope resolution # difficulty of programming / debugging Consider proc p() {a := a + 1; } Q: Which instance of variable “a” is being updated? A: Cannot tell from static structure of program! You must trace through program execution to find out. 
CSC3403 Comparative Programming Languages Variables Identifiers 23 Binding concepts • a binding is an association – variable attribute ←→ value of attribute – operation ←→ symbol – executable code ←→ physical location – conceptual/abstract ←→ concrete/value • binding times (from early to late) – language design (operator symbols) – language implementation (range of data types) – compile time (var ←→ type) – link time (address of vars, functions in executable image file) – load time (physical variable addresses) – run time (dynamic variable addresses, DLL function addresses) CSC3403 Comparative Programming Languages Variables Identifiers 24
  • 7.
    Static and DynamicBinding • static binding: bound before run time • dynamic binding: bound during run time • consider variable attributes – name is always statically determined – value is always dynamically bound (else it is a constant!) – location and lifetime can safely be either static or dynamic – type and scope can be either static or dynamic. Static binding is safer and more readable for these attributes. CSC3403 Comparative Programming Languages Variables Identifiers 25 Type bindings — motivation Q: Why do we need types? A: At reference time, a variable must have a type so that either 1. on “read” (e.g. variable access), cell contents can be interpreted correctly 2. on “write” (e.g. assignment to variable), correct conversions are made E.g. consider int a,b; float c; ... c = a + b; Key design questions • how is the type determined? – explicit vs. implicit declarations • when does variable–type binding occur? – static vs. dynamic CSC3403 Comparative Programming Languages Variables Identifiers 26 How is (static) type specified/determined? • explicit declarations – mandatory (C++, Pascal, most modern languages) – optional ([uncommon] KR C functions default to int) • implicit declarations – Fortran, Basic, PL/1 – Fortran: first letter convention: I J K L M N – others: first use convention – disadvantage: typographical errors undetected coutner = counter + 1 • inference (implicit declaration supported by optional explicit declarations) – ML, Miranda, Haskell, Gofer – all types can be statically inferred – comparison of declared vs inferred types finds errors CSC3403 Comparative Programming Languages Variables Identifiers 27 Dynamic type binding • APL, shell languages (e.g. 
Perl, Tcl) • automatic conversions required advantage: flexibility, generic subprograms possible, e.g.: int inc(a) {return a+1;} // a can be any numeric type # disadvantages – typographical errors not caught (see previous example) – cost of dynamic type inference – interpreted implementions usually required • Note: advanced languages (Ada, C++, Haskell) provide polymorphic type-checked subprograms – provides generic subprograms with static type checking CSC3403 Comparative Programming Languages Variables Identifiers 28
  • 8.
    Typing concepts
    • Type error
      A type error occurs when an operator (or function) is applied to an operand (e.g. a variable, but in general any expression) whose type is not compatible with the operator’s parameter type. If this error is not detected by the language implementation system ⇒ program bug!
        int i=1; double f,*p;
        p = (double *) i;
        *p = f;
    • Type checking
      Checking that an operator or subprogram is applied to arguments of the correct type. (Assignment is considered a binary operator.)
    • Compatible type
      Either a type which matches the operator’s definition or a type which can be automatically converted, using language rules, into the correct matching type.
    • Coercion
      Automatic conversion of types.

    Strong typing
    Definition: All type errors in a program can be detected
    • no imperative language is strongly typed
    • typical problems: variant records, non-checked type casts
    • many languages (e.g. Java, C#, Ada) are almost strongly typed
    • modern functional languages are strongly typed (static typing)
    • coercion weakens strong typing
    Comment: Strong typing is seen by most language researchers/designers as a “good idea”, and most newer languages have very good type checking.

    Type conversion examples
    Consider “assignment” of a floating point to an integer.
    C: coercion, type error
        #include <stdio.h>
        int i,j; double a=999999999.5, b=9999999999.5;
        int main(){ i = a; j = b; printf("i=%d j=%d\n",i,j);}
    execution ⇒ i=999999999 j=-2147483648
    (converting an out-of-range double to int is undefined behaviour in C; the value of j shown is what one implementation printed)
    Haskell: strong typing: explicit conversion + runtime check
        Hugs> :t truncate
        truncate :: (RealFrac a, Integral b) => a -> b
        Hugs> (truncate 999999999.5)::Int
        999999999
        Hugs> (truncate 9999999999.5)::Int
        Program error: arithmetic overflow

    Type compatibility
    • name type compatibility
    – both variables appear in the same declaration, or both variables are declared with the same type name
    – C++ uses name compatibility
        int main(){ // C++
          struct point {int x,y;} p1,p2;
          struct {int a,b;} p3;
          struct point p4;
          p1.x = 10; p1.y = 10;
          p2=p1; // ok
          // p3=p1; // ERROR: different type, even though the structure matches
          p4=p1; // ok
        }
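The type conversion examples contrast a silent C coercion with Haskell's explicit, run-time-checked `truncate`. As a rough sketch, the same discipline can be written by hand in C++. The function name `checked_truncate` is hypothetical, and the range test assumes a 32-bit `int`, so that both limits are exactly representable as `double`.

```cpp
#include <cmath>
#include <limits>
#include <stdexcept>

// Hypothetical helper: explicit double -> int conversion with a run-time check.
// Assumes int is 32 bits, so INT_MIN and INT_MAX convert to double exactly.
int checked_truncate(double d) {
    const double t = std::trunc(d);  // drop the fractional part, round toward zero
    const double lo = std::numeric_limits<int>::min();
    const double hi = std::numeric_limits<int>::max();
    // Written so that NaN (for which every comparison is false) is also rejected.
    if (!(t >= lo && t <= hi)) {
        throw std::range_error("value does not fit in int");
    }
    return static_cast<int>(t);      // well defined: t is known to be in range
}
```

With this helper, `checked_truncate(999999999.5)` returns 999999999, while `checked_truncate(9999999999.5)` throws `std::range_error` rather than producing an arbitrary value.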
  • 9.
    Type compatibility ....
    • structure type compatibility
    – the types of both variables have the same structure
    – advantage: less restrictive than name compatibility
    – disadvantage: somewhat more complex to implement
    • in general, type compatibility applies to expressions, not just variables
    • more examples: Sebesta §5.7

    Named constants
    • e.g. const int x = 10; // C++
    • a variable which is initialised and cannot be assigned to
    • value and address are bound at the same time
    • optimisers may keep it in a register
    • advantage: readability; code is easy to modify when it does not contain “magic numbers”
    • do not confuse with a named literal
        #define TABLESIZE 128

    Initialisation
    • a variable can be initialised by its declaration statement
    • initialisation is dependent on the kind of variable:
    – static variable: value is stored in the executable file
    – dynamic variable: implicit assignment statement executed at run time
      It is a shorthand rather than an extra feature. The following two C fragments are semantically identical:
        void f () {            void f () {
          int i = 42;            int i;
          ...                    i = 42;
        }                        ...
                               }
    • some languages guarantee automatic initialisation (e.g. integers to 0)

    A few conclusions
    • these features are considered “good”
    – static typing
    – strong typing
    – static scoping
    – automatic dynamic memory management (garbage collection)
    • these features are considered “poor” (or use with caution!)
    – aliasing
    – implicit variable declaration
    – unrestricted use of dynamic binding
    – dynamic scoping
    – dynamic type of variables
      (if used with care: C++ dynamic function binding)
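The Named constants slide warns against confusing a named constant with a named literal. A small C++ sketch of the difference follows. `TABLESIZE` comes from the slide; `table_size`, `table` and `address_of_constant` are hypothetical names introduced for illustration.

```cpp
#define TABLESIZE 128        // named literal: plain text substitution, no type, no address

const int table_size = 128;  // named constant: a typed object that cannot be assigned to

int table[table_size];       // the constant can be used wherever a compile-time value is needed

// A named constant has a location, so its address can be taken.
// The equivalent &TABLESIZE would not compile, because a literal has no address.
const int* address_of_constant() {
    return &table_size;
}
```

Both names give the same value, but only `table_size` obeys scope and type rules. An assignment such as `table_size = 256;` is rejected by the compiler.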