6 data types


Published on

Published in: Technology
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

6 data types

  1. 1. ICS 313 - Fundamentals of Programming Languages 16. Data Types6.1 IntroductionEvolution of Data TypesFORTRAN I (1957) - INTEGER, REAL, arrays…Ada (1983) - User can create a unique type for every categoryof variables in the problem space and have the system enforcethe typesA descriptor is the collection of the attributes of a variableDesign Issues for all data typesWhat is the syntax of references to variables?What operations are defined and how are they specified?
  2. 2. ICS 313 - Fundamentals of Programming Languages 26.2 Primitive Data TypesThose not defined in terms of other data typesIntegerAlmost always an exact reflection of the hardware, so the mapping istrivialThere may be as many as eight different integer types in a languageFloating PointModel real numbers, but only as approximationsLanguages for scientific use support at least two floating-point types;sometimes moreUsually exactly like the hardware, but not always; some languagesallow accuracy specs in code e.g. (Ada)type SPEED is digits 7 range 0.0..1000.0;type VOLTAGE is delta 0.1 range -12.0..24.0;See book for representation of floating point (p. 223)6.2 Primitive Data Types (continued)representation of floating point
  3. 3. ICS 313 - Fundamentals of Programming Languages 36.2 Primitive Data Types (continued)DecimalFor business applications (money)Store a fixed number of decimal digits (coded)Advantage: accuracyDisadvantages: limited range, wastes memoryBooleanCould be implemented as bits, but often as bytesAdvantage: readabilityCharacter TypesASCII character setUnicode6.3 Character String TypesValues are sequences of charactersDesign issuesIs it a primitive type or just a special kind of array?Is the length of objects static or dynamic?OperationsAssignmentComparison (=, >, etc.)CatenationSubstring referencePattern matching
  4. 4. ICS 313 - Fundamentals of Programming Languages 46.3 Character String Types (continued)ExamplesPascalNot primitive; assignment and comparison only (of packed arrays)Ada, FORTRAN 90, and BASICSomewhat primitiveAssignment, comparison, catenation, substring referenceFORTRAN has an intrinsic for pattern matchingAdaN := N1 & N2 (catenation)N(2..4) (substring reference)C and C++Not primitiveUse char arrays and a library of functions that provide operationsSNOBOL4 (a string manipulation language)PrimitiveMany operations, including elaborate pattern matching6.3 Character String Types (continued)Perl and JavaScriptPatterns are defined in terms of regular expressionsA very powerful facility!e.g.,/[A-Za-z][A-Za-zd]+/Java - String class (not arrays of char)Objects are immutableStringBuffer is a class for changeable string objectsString Length OptionsStatic - FORTRAN 77, Ada, COBOLe.g. (FORTRAN 90)CHARACTER (LEN = 15) NAME;Limited Dynamic Length - C and C++ actual length is indicatedby a null characterDynamic - SNOBOL4, Perl, JavaScript
  5. 5. ICS 313 - Fundamentals of Programming Languages 56.3 Character String Types (continued)Evaluation (of character string types)Aid to writabilityAs a primitive type with static length, they are inexpensive toprovide--why not have them?Dynamic length is nice, but is it worth the expense?ImplementationStatic length - compile-time descriptorLimited dynamic length - may need a run-time descriptor forlength (but not in C and C++)Dynamic length - need run-time descriptor;allocation/deallocation is the biggest implementation problem6.4 User-Defined Ordinal TypesAn ordinal type is one in which the range of possible valuescan be easily associated with the set of positive integersEnumeration TypesOne in which the user enumerates all of the possible values,which are symbolic constantsDesign Issue: Should a symbolic constant be allowed to be inmore than one type definition?ExamplesPascal - cannot reuse constants; they can be used for array subscripts,for variables, case selectors; NO input or output; can be comparedAda - constants can be reused (overloaded literals); disambiguate withcontext or type_name ‘ (one of them); can be used as in Pascal; CANbe input and outputC and C++ - like Pascal, except they can be input and output as integersJava does not include an enumeration type, but provides theEnumeration interface
  6. 6. ICS 313 - Fundamentals of Programming Languages 66.4 User-Defined Ordinal TypesEvaluation (of enumeration types):Aid to readability - e.g. no need to code a color as anumberAid to reliability - e.g. compiler can check:operations (don’t allow colors to be added)ranges of values (if you allow 7 colors and code them as the integers,1..7, 9 will be a legal integer (and thus a legal color))Subrange TypeAn ordered contiguous subsequence of an ordinal typeDesign Issue: How can they be used?6.4 User-Defined Ordinal TypesExamplesPascalSubrange types behave as their parent types; can be used as for variablesand array indicese.g. type pos = 0 .. MAXINT;AdaSubtypes are not new types, just constrained existing types (so they arecompatible); can be used as in Pascal, plus case constantse.g.subtype POS_TYPE is INTEGER range 0 ..INTEGERLAST;Evaluation of subrange typesAid to readabilityReliability - restricted ranges add error detectionImplementation of user-defined ordinal typesEnumeration types are implemented as integersSubrange types are the parent types with code inserted (by thecompiler) to restrict assignments to subrange variables
  7. 7. ICS 313 - Fundamentals of Programming Languages 76.5 ArraysAn array is an aggregate of homogeneous data elements inwhich an individual element is identified by its position in theaggregate, relative to the first elementDesign IssuesWhat types are legal for subscripts?Are subscripting expressions in element references rangechecked?When are subscript ranges bound?When does allocation take place?What is the maximum number of subscripts?Can array objects be initialized?Are any kind of slices allowed?Indexing is a mapping from indices to elementsmap(array_name, index_value_list) → an element6.5 Arrays (continued)Index SyntaxFORTRAN, PL/I, Ada use parenthesesMost other languages use bracketsSubscript Types:FORTRAN, C - integer onlyPascal - any ordinal type (integer, boolean, char, enum)Ada - integer or enum (includes boolean and char)Java - integer types onlyFour Categories of Arrays (based on subscript binding and bindingto storage)Static - range of subscripts and storage bindings are statice.g. FORTRAN 77, some arrays in AdaAdvantage: execution efficiency (no allocation or deallocation)
  8. 8. ICS 313 - Fundamentals of Programming Languages 86.5 Arrays (continued)Fixed stack dynamic - range of subscripts is statically bound, butstorage is bound at elaboration timee.g. Most Java locals, and C locals that are not staticAdvantage: space efficiencyStack-dynamic - range and storage are dynamic, but fixed from thenon for the variable’s lifetimee.g. Ada declare blocksdeclareSTUFF : array (1..N) of FLOAT;begin...end;Advantage: flexibility - size need not be known until the array is aboutto be used6.5 Arrays (continued)Heap-dynamic - subscript range and storage bindings aredynamic and not fixede.g. (FORTRAN 90)INTEGER, ALLOCATABLE, ARRAY (:,:) :: MAT(Declares MAT to be a dynamic 2-dim array)ALLOCATE (MAT (10, NUMBER_OF_COLS))(Allocates MAT to have 10 rows andNUMBER_OF_COLS columns)DEALLOCATE MAT(Deallocates MAT’s storage)In APL, Perl, and JavaScript, arrays grow and shrink as neededIn Java, all arrays are objects (heap-dynamic)
  9. 9. ICS 313 - Fundamentals of Programming Languages 96.5 Arrays (continued)Number of subscriptsFORTRAN I allowed up to threeFORTRAN 77 allows up to sevenOthers - no limitArray InitializationUsually just a list of values that are put in the array in the orderin which the array elements are stored in memoryExamplesFORTRAN - uses the DATA statement, or put the values in / ... / on the declarationC and C++ - put the values in braces; can let the compiler count them e.g.int stuff [] = {2, 4, 6, 8};Ada - positions for the values can be specified e.g.SCORE : array (1..14, 1..2) := (1 => (24, 10), 2 => (10, 7), 3 =>(12, 30), others =>(0, 0));Pascal does not allow array initialization6.5 Arrays (continued)Array OperationsAPL - many, see book (p. 240-241)ADAAssignment; RHS can be an aggregate constant or an array nameCatenation; for all single-dimensioned arraysRelational operators (= and /= only)FORTRAN 90Intrinsics (subprograms) for a wide variety of array operations (e.g., matrixmultiplication, vector dot product)SlicesA slice is some substructure of an array; nothing more than areferencing mechanismSlices are only useful in languages that have array operations
  10. 10. ICS 313 - Fundamentals of Programming Languages 106.5 Arrays (continued)Slice Examples:FORTRAN 90INTEGER MAT (1 : 4, 1 : 4)MAT(1 : 4, 1) - the first columnMAT(2, 1 : 4) - the second rowAda - single-dimensionedarrays onlyLIST(4..10)Implementation of ArraysAccess function maps subscriptexpressions to an address in the arrayRow major (by rows) or column majororder (by columns)6.6 Associative ArraysAn associative array is an unordered collection of data elements thatare indexed by an equal number of values called keysDesign IssuesWhat is the form of references to elements?Is the size static or dynamic?Structure and Operations in PerlNames begin with %Literals are delimited by parentheses, e.g.,%hi_temps = ("Monday" => 77, "Tuesday" => 79,…);Subscripting is done using braces and keys, e.g.,$hi_temps{"Wednesday"} = 83;Elements can be removed with delete, e.g.,delete $hi_temps{"Tuesday"};
  11. 11. ICS 313 - Fundamentals of Programming Languages 116.7 RecordsA record is a possibly heterogeneous aggregate of data elementsin which the individual elements are identified by namesDesign IssuesWhat is the form of references?What unit operations are defined?Record Definition SyntaxCOBOL uses level numbers to show nested records; others userecursive definitionsRecord Field ReferencesCOBOLfield_name OF record_name_1 OF ... OF record_name_nOthers (dot notation)record_name_1.record_name_2. ... .record_name_n.field_nameFully qualified references must include all record names6.7 Records (continued)Elliptical references allow leaving out record names aslong as the reference is unambiguousPascal provides a with clause to abbreviate referencesRecord OperationsAssignmentPascal, Ada, and C allow it if the types are identicalIn Ada, the RHS can be an aggregate constantInitializationAllowed in Ada, using an aggregate constantComparisonIn Ada, = and /=; one operand can be an aggregate constantMOVE CORRESPONDINGIn COBOL - it moves all fields in the source record to fields with the same names in thedestination record
  12. 12. ICS 313 - Fundamentals of Programming Languages 126.7 Records (continued)Comparing records and arraysAccess to array elements is much slower than access torecord fields, because subscripts are dynamic (fieldnames are static)Dynamic subscripts could be used with record fieldaccess, but it would disallow type checking and it wouldbe much slower6.8 UnionsA union is a type whose variables are allowed to store different type valuesat different times during executionDesign Issues for unionsWhat kind of type checking, if any, must be done?Should unions be integrated with records?Examples:FORTRAN - with EQUIVALENCENo type checkingPascal - both discriminated and nondiscriminated unions, e.g.type intreal = recordtagg : Boolean oftrue : (blint : integer);false : (blreal : real);end;
  13. 13. ICS 313 - Fundamentals of Programming Languages 136.8 Unions (continued)Problem with Pascal’s design: type checking is ineffectiveReasons why Pascal’s unions cannot be type checkedeffectively:User can create inconsistent unions (because the tag can beindividually assigned)var blurb : intreal;x : real;blurb.tagg := true; { it is an integer }blurb.blint := 47; { ok }blurb.tagg := false; { it is a real }x := blurb.blreal; {assigns an integer to a real}The tag is optional!Now, only the declaration and the second and last assignments are required to causetrouble6.8 Unions (continued)Ada - discriminated unionsReasons they are safer than Pascal:Tag must be presentIt is impossible for the user to create an inconsistent union (because tag cannot beassigned by itself--All assignments to the union must include the tag value,because they are aggregate values)C and C++ - free unions (no tags)Not part of their recordsNo type checking of referencesJava has neither records nor unionsEvaluation - potentially unsafe in most languages (not Ada)
  14. 14. ICS 313 - Fundamentals of Programming Languages 146.9 SetsA set is a type whose variables can store unorderedcollections of distinct values from some ordinal typeDesign IssueWhat is the maximum number of elements in any setbase type?ExamplesPascalNo maximum size in the language definition(not portable, poor writability if max is too small)Operations: in, union (+), intersection (*), difference (-), =, <>, superset(>=), subset (<=)6.9 Sets (continued)Ada - does not include sets, but defines in as setmembership operator for all enumeration typesJava includes a class for set operationsEvaluationIf a language does not have sets, they must be simulated,either with enumerated types or with arraysArrays are more flexible than sets, but have much slowerset operationsImplementationUsually stored as bit strings and use logical operations forthe set operations
  15. 15. ICS 313 - Fundamentals of Programming Languages 156.10 PointersA pointer type is a type in which the range of values consistsof memory addresses and a special value, nil (or null)UsesAddressing flexibilityDynamic storage managementDesign IssuesWhat is the scope and lifetime of pointer variables?What is the lifetime of heap-dynamic variables?Are pointers restricted to pointing at a particular type?Are pointers used for dynamic storage management, indirectaddressing, or both?Should a language support pointer types, reference types, orboth?6.10 Pointers (continued)Fundamental Pointer Operations:Assignment of an address to a pointerReferences (explicit versus implicit dereferencing)The assignment operation j = *ptr
  16. 16. ICS 313 - Fundamentals of Programming Languages 166.10 Pointers (continued)Problems with pointersDangling pointers (dangerous)A pointer points to a heap-dynamic variable that has been deallocatedCreating one (with explicit deallocation):Allocate a heap-dynamic variable and set a pointer to point at itSet a second pointer to the value of the first pointerDeallocate the heap-dynamic variable, using the first pointerLost Heap-Dynamic Variables (wasteful)A heap-dynamic variable that is no longer referenced by any programpointerCreating one:Pointer p1 is set to point to a newly created heap-dynamic variablep1 is later set to point to another newly created heap-dynamicvariableThe process of losing heap-dynamic variables is called memory leakage6.10 Pointers (continued)Examples:Pascal: used for dynamic storage management onlyExplicit dereferencing (postfix ^)Dangling pointers are possible (dispose)Dangling objects are also possibleAda: a little better than PascalSome dangling pointers are disallowed because dynamicobjects can be automatically deallocated at the end of pointerstype scopeAll pointers are initialized to nullSimilar dangling object problem (but rarely happens, becauseexplicit deallocation is rarely done)
  17. 17. ICS 313 - Fundamentals of Programming Languages 176.10 Pointers (continued)C and C++Used for dynamic storage management and addressingExplicit dereferencing and address-of operatorCan do address arithmetic in restricted formsDomain type need not be fixed (void * )e.g. float stuff[100];float *p;p = stuff;*(p+5) is equivalent to stuff[5] and p[5]*(p+i) is equivalent to stuff[i] and p[i](Implicit scaling)void * - Can point to any type and can be type checked (cannot bedereferenced)6.10 Pointers (continued)FORTRAN 90 PointersCan point to heap and non-heap variablesImplicit dereferencingPointers can only point to variables that have the TARGETattributeThe TARGET attribute is assigned in the declaration, as in:INTEGER, TARGET :: NODEA special assignment operator is used for non-dereferencedreferencese.g.REAL, POINTER :: ptr (POINTER is an attribute)ptr => target (where target is either a pointer or a non-pointerwith the TARGET attribute))This sets ptr to have the same value as target
  18. 18. ICS 313 - Fundamentals of Programming Languages 186.10 Pointers (continued)C++ Reference TypesConstant pointers that are implicitly dereferencedUsed for parametersAdvantages of both pass-by-reference and pass-by-valueJava - Only referencesNo pointer arithmeticCan only point at objects (which are all on the heap)No explicit deallocator (garbage collection is used)Means there can be no dangling referencesDereferencing is always implicit6.10 Pointers (continued)Evaluation of pointersDangling pointers and dangling objects are problems, asis heap managementPointers are like gotosthey widen the range of cells that can be accessed by a variablePointers or references are necessary for dynamic datastructuresso we cant design a language without them
  19. 19. ICS 313 - Fundamentals of Programming Languages 197. Expressions and Assignment Statements