Successfully reported this slideshow.
Upcoming SlideShare
×

# [Yidlug] programming languages differences, the underlying implementation

477 views

Published on

Presented BY: Yoel Halberstam Sr Software Engineer, DBA and Web Admin
Yoel is familiar with assembly language, C, C++, c#, java, JavaScript, PHP, VB, vb.net, batch file scripting, shell scripting, and bash

Description: Many of us are frustrated when dealing with a language outside the scope of one's experience, this is especially true when it comes down to differences that are merely a result from the underlying implementation or ideology.

YIDLUG is a Yiddish speaking Linux (and other open source software) Users Group,
check us out at http://www.meetup.com/yidlug

Published in: Technology
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Be the first to comment

### [Yidlug] programming languages differences, the underlying implementation

1. 1. Virtual Memory Layout Operating System Reserved .data Break (Heap) .text .bss S t a c k 0 FFFFFFFF
2. 2. How The Sections Work .text .data Instruction1 Instruction2 OpCode Operand1 Operand2 ADD 01010101 11110000 a = 1 b = 1 a + b 00000001 00000001 Variable a @(01010101) Variable b @ (11110000) Code
3. 3. Indirection .text .dataInstruction1 Instruction2 OpCode Operand1 Operand2 ADD 11110011 11111111 a = 1 b = 1 c* = &a d* = &b c* + d* 00000001 00000001 Variable a @(01010101) Variable b @ (11110000)Code 01010101Variable c @ (11110011) 11110000Variable d @ (11111111) I n d i r e c t
4. 4. Example A “C” “Array” myArray = array() i = 1 myArray[i] = 0 00000001 i @(01010101) myArray @ (11110000) Code myArray[i] == myArray[1] == myArray + 1 @ (11110001) 00000000 Index value (indirection)
5. 5. Back To Memory a = 0 p* =&a 11110001 Whole_Memory_Array Pointer p @(01010101) Code a =Actual pointed variable = Whole_Memory_Array[p]@ (11110001) 00000000 Pointer value (indirection)
6. 6. Intro To Binary Switch = Bit 8 Switches = 8 Bits = Byte 1 = On 0 = Off We can have only 0 or 1 Is there any way to get real numbers?
7. 7. Lets Take an Example From Regular (Decimal) Numbers • 0 • 1 • 2 • 3 • 4 • 5 • 6 • 7 • 8 • 9 • ????? We have only 9 numerals So how do we proceed?
8. 8. Lets Take an Example From Regular (Decimal) Numbers • 0 • 1 • 2 • 3 • 4 • 5 • 6 • 7 • 8 • 9 • ????? The answer is combination • 10 • 11 • 12 • … • 20 • 21 • …. • 30 • …. • 99 • ????? • 100 • 101 • …. • 200 • …..
9. 9. Back To Binary • 0 • 1 • ??? The answer is combination • 10 • 11 • ????? • 100 • 101 • 110 • 111 • ?????? • 1000 • 1001 • 1010 • 1100 • 1101 • 1110 • 1111 • ??????
10. 10. Binary Compared To Decimal • 0 • 1 • 2 • 3 • 4 • 5 • 6 • 7 • 8 • 9 • …… • 00000000 • 00000001 • 00000010 • 00000011 • 00000100 • 00000101 • 00000110 • 00000111 • 00001000 • 00001001 • …….. • 0 • 1 • 2 • 3 • 4 • 5 • 6 • 7 • 10 • 11 • …… Decimal Binary Octal
11. 11. Negative Numbers • Rule 1: Negative has a 1 in front Ex: 100000001 • Rule 2: The 2’s complement 1. The 1’s Complement – Xor all bits Ex: 000000001 - (Decimal “1”) Xor: 111111110 1. The 2’s Complement – Add 1 Ex: 000000001 - (Decimal “1”) Xor: 111111110 Add 1: 111111110 - (Decimal “-1”)
12. 12. Converting Between Smaller And Larger Types byte a = 1 short b = a Code 00000001 0000000000000001 Correct unsigned byte a = 255 short b = a 11111111 0000000011111111 Correct byte a = -1 short b = a 11111111 0000000011111111 Wrong 11111111 1111111111111111Correct Positive Values Negative Values
13. 13. Identifiers Turned Into Memory Addresses 1. The identifiers that are being turned into memory addresses: – Global Variables – Functions – Labels 1. The identifiers that are NOT being turned into memory addresses, (are only used to measure the size to reserve): – Custom types – Struct – Class names 1. The identifiers that are used as offsets: – Array index – Class (and struct) field (NOT function) member – Local variables
14. 14. Variables • Size of the reserved memory is based on the type • There might be the following types 1. Built-in type – which is just a group of bytes, based on the compiler and/or platform 2. Custom type – which is just a series of built in types 3. Array (in C and C++) – which is just a series of the same built-in type (but no one keeps track of the length, so it is the programmers job) 4. Pointer – system defined to hold memory addresses 5. Reference – a pointer without the possibility to access the memory address 6. Handle – a value supplied by the operating system (like the index of an array) 7. Typedef and Define – available in C and C++ to rename existing types Variables are a “higher language – human readable” name for a memory address
15. 15. Labels • There is no size associated with it • Location – In assembly it might be everywhere • In fact in assembly it is generally the only way to declare a variable, function, loop, or “else” • In Dos batch it is the only way to declare a function – In C and C++ it might be only in a function – but is only recommended to break out of a nested loop – In VB it is used for error handling “On error goto” – In Java it might only be before a loop Labels are a “higher language – human readable” name for a memory address
16. 16. Label Sample In Assembly .data .int var1: 1 var2: 10 .text .global start: mov var1 %eax call myfunc jmp myfunc myfunc: mov var2 %ebx add %eax %ebx ret
17. 17. Label Sample In Java (Or C) outerLabel: while(1==1) { while(2==2) { //Java syntax break outerLabel; //C syntax (not the same as before, as it will cause the loop again) goto outerLabel; } }
18. 18. Boolean Type • False == 0 (all switches are of) • True == 1 (switch is on, and also matches Boolean algebra) • All other numbers are also considered true (as there are switches on) • There are languages that require conversion between numbers and Boolean (and other are doing it behind the scenes)) However TWO languages are an exception
19. 19. Why Some Languages Require conversion •Consider the following C code, all of them are perfectly valid: if(1==1) //true if(1) //true as well if(a) //Checks if “a” is non-zero if(a==b) //Compares “a” to “b” if(a=b) //Sets “a” to the value of “b”, and then //checks “a” if it is non-zero, Is this by //intention or typo? •However in Java the last statement would not compile, as it is not a Boolean operation •For C there is some avoidance by having constants to the left, ie. if(20==a) Instead of if(a==20) Because if(20=a) Is a syntax error While if(a=20) Is perfectly valid
20. 20. Implicit Conversion • However some languages employ implicit conversion • JavaScript considers – If (1) : true – If (0) : false – If (“”) : false – If (“0”) : true • Php Considers • If (“0”) : false • If (array()) : false
21. 21. Boolean In Visual Basic • True in VB = -1 • To see why let us see it in binary – 1 = 00000001 – 1’s complement = 1111111110 – 2’s complement = 1111111111 • So all switches are on But why different than all others? To answer that we need to understand the difference between Logical and Bitwise operators, and why do Logical operators short circuit?
22. 22. Logical vs bitwise • Logical – Step 1 – Check the left side for true – Step 2 – If still no conclusion check the right side – Step 3 – Compare both sides and give the answer • Bitwise – Step 1 – Translate both sides into binary – Step 2 – Compare both sides bit per bit – Step 3 – Provide the soltuion
23. 23. Example Bitwise vs Logical • Example 1 If 1==1 AND 2==2 Logical And Bitwise And Step 1: 1==1 is true 1==1 is true=00000001 Step 2: 2 ==2 is true 2==2 is true=00000001 Step 3: True 00000001 And True 00000001 = True 00000001
24. 24. More Examples
25. 25. Bitwise vs Logical Operators Operator C Basic VB.Net (Logical) Logical AND && N/A AndAlso Logical OR || N/A OrElse Logical NOT ! N/A N/A (Bitwise) Bitwise AND & AND AND Bitwise OR | OR OR Bitwise XOR ^ XOR XOR Bitwise NOT ~ NOT NOT
26. 26. Back To VB 1 = 00000001 = True NOT = 11111110 = True -1 = 11111111 = True NOT = 00000000 = False
27. 27. Boolean In Bash Shell Scripting # if(true) then echo “works”; fi # works # # if(false) then echo “works”; fi # # if(test 1 –eq 1) then echo “works”; fi # works # # if(test 1 –eq 2) then echo “works”; fi # # echo test 1 –eq 1 # # test 1 –eq 1 # echo \$? # 0 # test 1 –eq 2 # echo \$? # 1 # # true # echo #? # 0 # false # echo #? # 1
28. 28. Simple Function Call Stack Code Function func1() { func2(); } Function func2() { return; } Base Pointer Stack Pointer
29. 29. Simple Function Call – Assembly Code Stack Code Function func1() { func2(); } Function func2() { //Some Code return; } Base Pointer Stack Pointer Assembly Code Func1: #label Jmp func2 Func2: #label Push BasePointer BasePointer = StackPointer #some code Pop BasePointer 9997 9995
30. 30. Simple Function Call – Step 1 Stack Code Function func1() { func2(); } Function func2() { //Some Code return; } Base Pointer Stack Pointer Assembly Code Func1: #label Jmp func2 Func2: #label Push BasePointer BasePointer = StackPointer #some code Pop BasePointer 9997 9994 9997
31. 31. Simple Function Call – Step 2 Stack Code Function func1() { func2(); } Function func2() { //Some Code return; } Base Pointer Stack Pointer Assembly Code Func1: #label Jmp func2 Func2: #label Push BasePointer BasePointer = StackPointer #some code Pop BasePointer 9997 9994
32. 32. Simple Function Call – Step 3 Stack Code Function func1() { func2(); } Function func2() { //Some Code return; } Base Pointer Stack Pointer Assembly Code Func1: #label Jmp func2 Func2: #label Push BasePointer BasePointer = StackPointer #some code Pop BasePointer 9995 9997
33. 33. Function Call Stack Code Function func1() { func2(5, 10); } Function func2(int x, int y) { int a; int[10] b; //Some code return; } Base Pointer Stack Pointer
34. 34. Function Call – Assembly Code Stack Code Function func1() { func2(5, 10); } Function func2(int x, int y) { int a; int[10] b; //Some code return; } Base Pointer Stack Pointer Assembly Code Func1: #label Push 10 Push 5 Jmp func2 Func2: #label Push BasePointer BasePointer = StackPointer Sub 1 Sub 10 #some code Add 10 Add 1 Pop BasePointer 9997 9995
35. 35. Function Call – Step 1 Stack Base Pointer Stack Pointer 9993 9997 Code Function func1() { func2(5, 10); } Function func2(int x, int y) { int a; int[10] b; //Some code return; } Assembly Code Func1: #label Push 10 Push 5 Jmp func2 Func2: #label Push BasePointer BasePointer = StackPointer Sub 1 Sub 10 #some code Add 10 Add 1 Pop BasePointer 10 5
36. 36. Function Call – Step 2 Stack Base Pointer Stack Pointer 9992 9997 Code Function func1() { func2(5, 10); } Function func2(int x, int y) { int a; int[10] b; //Some code return; } Assembly Code Func1: #label Push 10 Push 5 Jmp func2 Func2: #label Push BasePointer BasePointer = StackPointer Sub 1 Sub 10 #some code Add 10 Add 1 Pop BasePointer 10 5 9997
37. 37. Function Call – Step 3 Stack Base Pointer Stack Pointer 9992 Code Function func1() { func2(5, 10); } Function func2(int x, int y) { int a; int[10] b; //Some code return; } Assembly Code Func1: #label Push 10 Push 5 Jmp func2 Func2: #label Push BasePointer BasePointer = StackPointer Sub 1 Sub 10 #some code Add 10 Add 1 Pop BasePointer 10 5 9997
38. 38. Function Call – Step 4 Stack Base Pointer Stack Pointer 9992 Code Function func1() { func2(5, 10); } Function func2(int x, int y) { int a; int[10] b; //Some code return; } Assembly Code Func1: #label Push 10 Push 5 Jmp func2 Func2: #label Push BasePointer BasePointer = StackPointer Sub 1 Sub 10 #some code Add 10 Add 1 Pop BasePointer 10 5 9997 9981
39. 39. Function Call – Step 5 Stack Base Pointer Stack Pointer 9992 Code Function func1() { func2(5, 10); } Function func2(int x, int y) { int a; int[10] b; //Some code return; } Assembly Code Func1: #label Push 10 Push 5 Jmp func2 Func2: #label Push BasePointer BasePointer = StackPointer Sub 1 Sub 10 #some code Add 10 Add 1 Pop BasePointer 10 5 9997
40. 40. Function Call – Step 6 Stack Base Pointer Stack Pointer 9993 9997 Code Function func1() { func2(5, 10); } Function func2(int x, int y) { int a; int[10] b; //Some code return; } Assembly Code Func1: #label Push 10 Push 5 Jmp func2 Add 2 Func2: #label Push BasePointer BasePointer = StackPointer Sub 1 Sub 10 #some code Add 10 Add 1 Pop BasePointer 10 5
41. 41. Function Call – Step 6 Stack Base Pointer Stack Pointer 9995 9997 Code Function func1() { func2(5, 10); } Function func2(int x, int y) { int a; int[10] b; //Some code return; } Assembly Code Func1: #label Push 10 Push 5 Jmp func2 Add 2 StackPointer Func2: #label Push BasePointer BasePointer = StackPointer Sub 1 StackPointer Sub 10 StackPointer #some code Add 10 StackPointer Add 1 StackPointer Pop BasePointer
42. 42. Function call - summary 1. Arguments are copied (passed by value) on the stack from right to left (In the C calling convention) 2. The return address is pushed 3. The old base pointer is pushed 4. We make place for local variables (but is not intialized and contans “Garbage”) Call 1. Remove the space from the local variables (everything there is destroyed) 2. The old base pointer is restored 3. The return address is poped and jumped to. 4. Parameters are destroyed Return 1. All access to local variables or parameters are relative to the base pointer 2. i.e the compiler does not transalate “localVar” to “FF55”, but to “BasePointer - 1” Access
43. 43. The Heap Operating System Reserved .data Break (Heap) .text .bss S t a c k 0 FFFFFFFF C Code Int* myPtr = NULL; myPtr = alloc(sizeof(int)); If(myPtr == NULL) exit(1); //Some code free(myPtr); C++ Code Int* myPtr = NULL; myPtr = new(sizeof(int)); If(myPtr == NULL) exit(1); //Some code delete(myPtr); myPtr Points here after malloc and new ?????
44. 44. Heap vs Stack Summary • All types of objects can be allocated both on the stack and on the heap • To use the heap we must have a pointer and then call “malloc” or “new” • Objects created on the stack are automatically destroyed when the function exists (unless with the keyword static in C which actually creates in the data section) • Objects created on the heap, must be explicitly freed with “free”, or “malloc” – IF the pointer is destroyed then we are stuck (or if we just forgat)
45. 45. Copying .data a = 1 b = a c* = &a d* = c 00000001 00000001 Variable a @(01010101) Variable b @ (11110000)Code 01010101Variable c @ (11110011) 01010101Variable d @ (11111111) I n d i r e c t
46. 46. Copying And Then Changing .data a = 1 b = a b = 2 //Not affecting a c* = &a d* = c 00000001 00000010 Variable a @(01010101) Variable b @ (11110000)Code 01010101Variable c @ (11110011) 01010101Variable d @ (11111111) I n d i r e c t
47. 47. Copying Pointers And Then Changing Value .data a = 1 b = a b = 2 //Not affecting a c* = &a d* = c (*d) = 3 00000011 00000010 Variable a @(01010101) Variable b @ (11110000)Code 01010101Variable c @ (11110011) 01010101Variable d @ (11111111) I n d i r e c t
48. 48. Copying Pointers And Then Changing Pointer Value .data a = 1 b = a b = 2 //Not affecting a c* = &a d* = c (*d) = 3 d = &b (*d) = 4 00000011 00000100 Variable a @(01010101) Variable b @ (11110000)Code 01010101Variable c @ (11110011) 01010101Variable d @ (11111111) I n d i r e c t
49. 49. Copying Summary • Copying value variables, just copy the value, any change after to one does NOT affect to the other • Copying pointer variables, both point to the same value variable – Any change to the value variable via one of the pointers IS reflected to the other – Any changes to the pointer value of one is NOT reflected by the other pointer, and also not affect the original value variable
50. 50. Pointer vs Reference • Problems with pointers, stack and heap so far: – A pointer can be directly (by intention or accidently) changed to access any memory – Failure to free the allocated heap results in a memory leak – Forgetting to free the allocated heap till the function returns, then there is no longer a way to free it – Creating a large object on the stack is not worth, and of course not to pass by value
51. 51. References • Reference is the same as a pointer but without the ability to change the memory address, only possible to change pointing from one object of the same type to another Code object myVar = new object(); object myVar2 = myVar;//myVar2 points to the same object as myVar myVar = new object();//Now myVar points to a new object, while myVar2 points to the old • A primitive variable is in general a value type and cannot be NULL • A object is generally a reference and is NULL as long there is no call to “new()” • Now the compiler can keep track of what is referenced by what
52. 52. Garbage Collection • Garbage Collection means to free heap memory no longer in use • In COM and VB6 (based on COM) it is done by keeping a count of the variables that still reference the object – For each new reference it is incremented, and for each out of scope or null or just change in reference it is decremented – When the reference count reaches 0 it is deleted – Problem if two objects just reference each other, they will never be deleted • In Java and .Net the JVM and CLR keep track of objects that are not referenced by real references
53. 53. Null Pointer And DB Null • A pointer or reference not pointing anywhere is pointing to null • In C NULL is a constant with the value 0 • 0 == Null (in C) and null == null • NULL + “SomeString” = “SomeString” – (but not in Linq To SQL!!! WATCH OUT) • In Database it is different NULL is nothing • NULL = NULL is also NULL, use IS NULL • NULL + “someString” is NULL • WHERE NOT IN(NULL) is corrupting everything • SQL Server has an option “SET ANSI NULLS OFF”, and MySQl has an operator <=> to compare Nulls
54. 54. NaN • Division of an integer by 0 is an error • Division of a floating point by 0 results in Nan • Check for Nan by checking • This does not work for a database NULL IEEE NaN == NaN = false NaN != NaN = false Java NaN == NaN = false NaN != NaN = true If((myVar == myVar) == false) If(myVar.IsNaN()) NULL == NULL = NULL NULL != NULL = NULL
55. 55. Characters
56. 56. String
57. 57. String Quoting
58. 58. Pass By Value vs Pass By Reference
59. 59. Immutable Strings