computer notes - Reference variables ii


We must use the c-> syntax because we get a pointer from the queue. The object is still alive because it was created on the heap. One should be careful about transient objects that are stored by reference in data structures.

Published in: Education, Technology

Data Structures
Lecture No. 18

Reference Variables

In the last lecture we discussed reference variables and saw three examples: call by value, call by reference and call by pointer. We saw how the stack is used when a function is called in each of these ways. The arguments passed to the function and its local variables are pushed onto the stack.

One important point to note is that in this course we are using C/C++, but the usage of the stack is similar in most computer languages, such as FORTRAN and Java. The syntax we are using here is C++ specific; for example, when calling by pointer we pass the address of a variable using the & sign. In Java, the native data types like int and float are passed by value, while objects are handled through references (strictly speaking, the reference itself is passed by value). In FORTRAN, every parameter is traditionally passed by reference. In PASCAL, you can pass a parameter by value or by reference, as in C++. You might have heard of ALGOL; this language provided another way of passing parameters, called call by name. These kinds of topics are covered in subjects like Study of Computer Languages or Compiler Theory.

It is recommended that, while you are doing your degree, you study other computer languages and compare them from different aspects. Java is quite popular nowadays and is quite similar in syntax to C++. Maybe you can study it as your next language and compare its different aspects with C/C++. Concepts such as how a program is loaded into memory to become a process, how functions are called and the role of the stack are similar in all major languages.

We have discussed what goes on inside the stack, behind the scenes, when variables are passed by reference. There are a few important things to take care of while using reference variables:

One should be careful about transient objects that are stored by reference in data structures.
We know that the local variables of a function are created on the call stack. These variables are created inside the function, remain in memory while control is inside the function and are destroyed when the function exits. An activation record comprises the function call parameters, the return address and the local variables. The activation record remains on the stack while the function is executing and is destroyed once control returns from the function. Let's see the following code that stores and retrieves objects in a queue:

    void loadCustomer( Queue & q)
    {
        Customer c1("irfan");
        Customer c2("sohail");
        q.enqueue( c1 );
        q.enqueue( c2 );
    }

Given above is a small function loadCustomer( Queue &), which accepts a parameter of type Queue by reference. Inside the function body, we first create two Customer objects, c1 and c2, initialized with the string values "irfan" and "sohail" respectively. Then we put these objects in the queue q using the enqueue() method, and finally the function returns.

The objects created inside the above function are c1 and c2. As local variables are created on the stack, these objects are also created on the stack, no matter how large their data members are. In the Bank example in the previous lecture, we saw that for each customer we have the name (32 characters maximum), arrival time (int type, 4 bytes), transaction time (int type) and departure time (int type). So the size of a Customer object is 32 + 3 * 4 = 44 bytes. Our c1 and c2 objects are created on the stack and occupy 44 bytes each. It is important to mention here that we refer to each 44-byte allocation by the name of the object: the allocated 44 bytes are bound to the name c1 or c2.

Another significant point here is that the function enqueue() accepts the Customer object by reference. See the code below of the serviceCustomer() function, which is executed after loadCustomer().

    void serviceCustomer( Queue & q)
    {
        Customer c = q.dequeue();
        cout << c.getName() << endl;
    }

serviceCustomer(Queue &) also accepts one parameter of type Queue by reference. In the first statement, it takes one element out of the queue and uses it to initialize the newly created object c. The element taken out refers to c1, because c1 was inserted first. Strictly speaking, this statement is a copy-initialization: c is constructed directly from the value returned by dequeue(), rather than being built by the default (parameterless) constructor and then assigned.
In the next statement, the c.getName() function call gets the name of the customer in order to print it. What do you think about it? Will this name be printed or not? Do you see any problem in its execution? In short, this statement cannot be relied upon to work: it may print garbage or crash the program. To see the problem, we have to understand the mechanism: where the objects were created, what was pushed onto the stack, and what happened to those objects when loadCustomer() returned.

The objects c1 and c2 were created locally in the loadCustomer() function, so they were created on the stack. After creating the objects, we added their addresses, not the objects themselves, to the queue q. When loadCustomer() returned, the local objects c1 and c2 were destroyed, but their addresses were still in the queue q. Some time later, serviceCustomer() is called. The address of the object is retrieved from the queue and used for another newly created local object c, but when we try to call getName() on what used to be the c1 object through its retrieved address, we run into the problem: the object at that address no longer exists.
This shows that, although the use of references relieves us of the burden of copying objects, storing references to transient objects can create problems, because a transient object (an object created on the stack) is destroyed when the function finishes executing.

The question arises: what can we do if we do not want the objects created in a function to be destroyed? The answer is dynamic memory allocation. All the variables or objects created in a function that we want to access later are created on the memory heap (sometimes called the free store) using dynamic memory allocation functions or operators like new. The heap is an area in computer memory that is allocated dynamically. You should remember that all objects created using the new operator have to be explicitly destroyed using the delete operator.

Let's see the modified code of the loadCustomer() function, where the objects created in the function are not transient; that is, they are created on the heap to be used later in the program, outside the body of loadCustomer().

    void loadCustomer( Queue & q)
    {
        Customer * c1 = new Customer("irfan");
        Customer * c2 = new Customer("sohail");
        q.enqueue( c1 ); // enqueue takes pointers
        q.enqueue( c2 );
    }

This time, we create the same two objects using the new operator and assign their starting addresses to the pointers c1 and c2. Nameless objects (objects accessed only through pointers) are called anonymous objects. Here c1 and c2 are pointers to the objects, not the actual objects themselves as they were previously. The starting addresses held in c1 and c2 are then queued using the enqueue() method. As the objects lie on the heap, there will not be any problem, and the objects will still be accessible after loadCustomer() returns. There is a slightly tricky point to understand here.
Although the objects are created on the heap, the pointer variables c1 and c2 are created on the stack, and they will definitely be destroyed when the loadCustomer() activation record is destroyed. You should understand the difference between the pointer variables and the actual objects created on the heap. The pointer variables c1 and c2 were only used to store the starting addresses of the objects inside loadCustomer(); once the function has returned, the pointer variables no longer exist. But as the starting addresses of the objects were put in the queue, they remain available for later use after being retrieved with the dequeue() operation. These dynamic objects will live in memory (on the heap) until they are explicitly deleted.

By the way, there is another heap, the heap data structure, which we are going to cover later in this course. For the moment, look at the layout of computer memory and the heap as we previously saw it in this course. The heap is an area of memory given to a process by the operating system when the process performs dynamic memory allocation.
[Fig 18.1: Memory Organization. Left: processes in computer memory (Process 1 (Browser), Process 2 (Dev-C++), Process 3 (Word), Process 4 (Excel), Windows OS). Right: the sections of one process (Code, Static Data, Stack, Heap).]

On the left of the picture, we can see different processes in the computer memory. When we zoom into one of the processes, we see the picture on the right: first there is a section for code, then for static data and then for the stack. The stack grows in the downward direction. You can see the heap section at the end, which grows upward. An interesting question arises here: why does the stack grow downward and the heap upward? Growing toward each other lets the two sections share the free space that lies between them. Think about an endless recursive call of a function to itself. For every invocation, there will be an activation record on the stack, so the stack keeps growing until it runs into the heap section and overwrites it. On the other hand, if your program performs dynamic memory allocation endlessly, the heap grows in the upward direction until it overwrites the stack section and destroys it.

You might have already understood the idea that if a process contains some destructive code, it will not harm any other process; it only causes its own destruction. By the way, a lot of viruses exploit stack overflows to change memory contents and cause further damage to the system.

Consider that we allocate an array of 100 Customer objects dynamically. As each object is 44 bytes, the size of the memory allocated on the heap will be 4400 bytes (44 * 100). To explain the allocation mechanism on the stack and heap, let's see the figure below, where the objects are created dynamically.

[Figure: objects created dynamically on the heap. Recoverable labels: c1, c2; "heap grows upwards"; address 688; Customer("sohail") -> c2.]
The objects are shown in this figure by the names of the customers inside them. Actually, there are three more int type variables inside each object. You can see that the object with the string "irfan" occupies memory addresses 600 to 643 and the object with the customer name "sohail" occupies addresses 644 to 687. When these objects are inserted in the queue, only their starting addresses are inserted, as shown in the figure below.

[Figure: the loadCustomer() activation record on the stack. The pointer c1 at address 1068 holds 600; the pointer c2 at address 1072 holds 644.]
loadCustomer() is being executed. It contains two pointers, c1 and c2, holding the addresses 600 and 644 respectively. The enqueue(elt) method is called and the parameter values (which are actually addresses) 600 and 644 are inserted in the queue. Because the objects have been allocated on the heap, there will be no issue with them. The pointer variables c1 and c2, which we used to store the addresses, are destroyed when the function returns. But the queue q, which was passed by reference to loadCustomer(), is still there and contains the starting addresses of the Customer objects. Those are valid addresses of valid objects, so they can be used later in the program to access the customer objects. See the function below:

    void serviceCustomer( Queue & q)
    {
        Customer* c = q.dequeue();
        cout << c->getName() << endl;
        delete c; // the object in heap dies
    }

You can see that we take one pointer out of the queue and, in the second line, call the getName() method of the Customer object with c->. We use the -> operator because what we take out of the queue is a pointer. This time we can be sure that the method will execute successfully, because the object was created dynamically inside loadCustomer(). The last statement inside the function is delete, which deallocates the object.

So now we understand that we should not store references to transient objects in a data structure that outlives them. If we want to use the objects later, we create them on the heap and keep their addresses. There is another point to mention here: if an object has already been deallocated and we access it (call any of its members), it may cause the program to crash. A pointer to an object that has already been deallocated or released is called a dangling pointer.
The const Keyword

The const keyword is used to declare that something is constant. The exact meaning depends on where it occurs, but it generally means that something is to be held constant. There can be constant functions, constant variables, constant parameters and so on.

References are typically implemented as pointers internally; in effect, they behave like constant pointers. You cannot perform the kind of arithmetic manipulation on references that you normally do with pointers. You may remember that when we wrote the header file for the binary tree class, we used the const keyword many times.

The const keyword is often used in function signatures. The function signature, as written in the function prototype, is where we mention the function name, its parameters, its return type and so on.

Here are some common uses of the const keyword.

1. The const keyword appears before a function parameter. E.g., in a chess program:

    int movePiece(const Piece & currentPiece)

The function movePiece() above takes one parameter, which is passed by reference. By writing const, we are saying that the parameter must remain constant for the life of the function. If we try to change its value, for example if the parameter appears on the left side of an assignment, the compiler will generate an error. This also means that if the parameter is passed on to another function, that function must not change it either.

Use of const with reference parameters is very common. This is puzzling: why are we passing something by reference and then making it constant, i.e., promising not to change it? Doesn't passing by reference mean we want to change it? Think about it, and consult your C++ book and the internet. We will discuss the answer in the next lecture.
Tips

  • The arithmetic operations we perform on pointers cannot be performed on references.
  • Reference variables must be declared and initialized in one statement.
  • To avoid a dangling reference, don't return a reference to a local (transient) variable from a function.
  • In functions that return a reference, return global, static or dynamically allocated variables.
  • Reference data types are used like ordinary variables, without any dereference operator. We normally use the arrow operator (->) with pointers.
  • const objects cannot be assigned any other value.
  • If an object is declared as const in a function, then any further functions called from this function cannot change the value of the const object.