C tutorial



Learn C in simple steps, as given in the PPT.
An easy way to learn C and build your skills in computer programming.

Published in: Education
  • In my first summer internship my task was to convert a numeric program from FORTRAN to C, knowing nothing about either language; I only knew Pascal and BASIC. The result was that I sort of learned C, but in a voodoo way, never really being sure what was going on, but getting some things to work, at least some of the time. Since then I’ve gradually worked out how things really work and make sense, and become proficient. But this was a slow process. Today I hope to help shortcut some of that process. How this is going to proceed: I have limited time here so I will go kind of fast. I may mention some things in passing that I won’t fully explain. However, afterwards you will have the .ppt file and the notes. All the stuff I mention in passing is in the slides and the notes. This way, even if you don’t “get” all the details the first time, you can go back and refer to the slides and notes.
  • Gcc compiler options: -Wall tells the compiler to enable a broad set of commonly useful warnings (despite the name, it does not turn on every warning gcc has). These warnings will often identify simple mistakes. -g tells the compiler to generate debugging information. If you don’t supply a -o option to set an output filename, it will create an executable called “a.out”. ./? To run a program in the current directory use ./program. ‘.’ is the current directory. Otherwise the shell will only run programs found in the directories listed in your “PATH” variable, which usually holds “standard” paths like /bin and /usr/bin. What if it doesn’t work? If your program compiles but doesn’t work, there are many debugging tools at your disposal. “strace” traces all system calls made by your program. This will tell you when you call into the OS, e.g. open(), read(), write(), etc. “ltrace” traces all standard library calls made by your program. “gdb” is a source-level debugger. With this tool you can analyse core dumps, and step through your program line by line as it runs. “valgrind” runs your program on a simulated CPU and is useful for catching memory errors such as use of freed memory, bad pointers, etc. Another common way to trace a program is “printf debugging”, where you put printf statements throughout your code to see what’s going on.
  • What do the <> mean? Include directives use <x.h> (brackets) to indicate that the compiler should look in a “standard” place, such as /usr/include/… They use “x.h” (double quotes) to indicate that it should look first in the current directory. Can your program have more than one .c file? A ‘.c’ file is called a “module”. Many programs are composed of several ‘.c’ files and libraries that are ‘linked’ together during the compile process.
  • Why preprocess? Having two phases to a process with different properties often helps to make a very flexible yet robust system. The underlying compiler is very simple and its behavior is easily understood. But because of that simplicity it can be hard to use. The preprocessor lets you “customize” the way your code “looks and feels” and avoid redundancy using a few simple facilities, without increasing the complexity of the underlying compiler. Macros can be used to make your code look clean and easy to understand. They can also be used for “evil”. What are continued lines? Continued lines are logical lines that need to be very long. When a line ends with a ‘\’ immediately followed by the newline, the preprocessor removes both and joins the two lines into one. This is often used when defining macros with #define, because a macro definition must be all on one logical line. Be careful! The ‘\’ should be the very last character on the line. The C standard only joins lines when nothing follows the ‘\’; a trailing space after it may stop the lines from being joined, or at best produce a compiler warning.
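As a small illustration of continued lines, here is a sketch of a macro definition spread over several physical lines. The macro name CLAMP and the helper clamp_percent() are made up for this example and do not come from the slides.

```c
/* Hypothetical macro spread over three physical lines.
 * Each backslash is the very last character on its line, so the
 * preprocessor joins the three lines into one logical line. */
#define CLAMP(v, lo, hi)          \
    ((v) < (lo) ? (lo)            \
     : ((v) > (hi) ? (hi) : (v)))

/* Hypothetical helper that uses the macro: limit v to 0..100. */
int clamp_percent(int v)
{
    return CLAMP(v, 0, 100);
}
```

Calling clamp_percent(150) from main() would give 100, and clamp_percent(-5) would give 0.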
  • What’s 72? ASCII is the coding scheme that maps bytes to characters; 72 is the code for ‘H’. Type ‘man ascii’ at the shell to get a listing of the mapping. Not always… Types such as int vary in size depending on the architecture. On most 32 bit and 64 bit platforms, an int is 4 bytes (32 bits). On many 8 or 16 bit platforms, an int is 2 bytes (16 bits). To be safe, it’s best to specify sizes exactly, e.g. int32_t, int16_t, int8_t. These are defined in <stdint.h>, which <inttypes.h> also includes. Signed? Signedness is a common source of problems in C. It’s always clearest to specify signedness (and size) explicitly using the [u]int[8,16,32]_t types. Signedness can introduce surprises when casting types because of “sign extension”. For example, whether plain char is signed is up to the compiler; it is signed with gcc on x86 (although char is normally used as if it were unsigned 0-255). If you cast a signed char to int, it will be a signed integer and the sign will be extended. If you then cast that to unsigned, it turns a small negative number into an enormous value (high bit set). That is, where char is a signed 8 bit type, (uint32_t)(int)(char)(128) == 4294967168 (not 128!). One easy solution to this is to always use uint8_t rather than char for byte streams.
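The sign extension surprise can be reproduced in a few lines. This is only a sketch: the function names widen_signed() and widen_unsigned() are invented for the example, and it uses int8_t and uint8_t so that the result does not depend on whether plain char is signed on your compiler.

```c
#include <stdint.h>

/* Hypothetical helper: widen a SIGNED byte to 32 bits.
 * The byte is first converted to int (sign extended), then to unsigned. */
uint32_t widen_signed(int8_t b)
{
    return (uint32_t)(int)b;
}

/* Hypothetical helper: widen an UNSIGNED byte to 32 bits.
 * No sign extension happens, so the value is preserved. */
uint32_t widen_unsigned(uint8_t b)
{
    return (uint32_t)b;
}
```

The bit pattern 0x80 is -128 as an int8_t and 128 as a uint8_t. widen_signed(-128) gives 4294967168, while widen_unsigned(0x80) gives 128.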
  • Symbol table: The “Symbol Table” is the mapping from symbol to address. You can list it from a compiled program using the nm utility, as long as the symbols have not been stripped (compiling with -g adds further debugging information). Declaration and definition mean slightly different things in C. Declaration is mapping a name to a type, but not to a memory location. Definition maps a name and type to a memory location (which could be the body of a function). Definitions can assign an initial value. Declarations can point to something that never actually gets defined, or is defined in a library external to your program. What names are legal? Symbols in C (function and variable names) must start with an alphabetic character or ‘_’. They may contain numeric digits as well but cannot start with a digit. Common naming strategies are ‘CamelCase’, ‘lowerUpper’, and ‘under_bar’. Some people use complicated schemes but I prefer under_bar. extern, static, and const are modifiers of variable declarations. ‘extern’ means the variable being declared is defined elsewhere, possibly in some other program module (.c file). ‘static’ at file level means that the variable cannot be accessed from other modules; inside a function it means the variable keeps its value between calls. ‘const’ means the variable’s value cannot be changed (directly).
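A short sketch may make the declaration/definition split and the three modifiers more concrete. All names here (shared_counter, module_step, max_items, bump) are made up for illustration.

```c
/* Declaration: gives the name a type but reserves no memory.
 * This line could live in a header shared by several .c files. */
extern int shared_counter;

/* Definition: reserves memory and sets the initial value. */
int shared_counter = 0;

/* static at file level: visible only inside this .c file. */
static int module_step = 2;

/* const: the value may not be changed through this name. */
const int max_items = 4;

/* Function declaration (prototype), then its definition. */
int bump(void);

int bump(void)
{
    /* Advance the counter by module_step, but never past max_items. */
    if (shared_counter + module_step <= max_items) {
        shared_counter += module_step;
    }
    return shared_counter;
}
```

Successive calls to bump() return 2, then 4, and then stay at 4 because of the max_items limit.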
  • Returns nothing.. “void” is used to denote that nothing is returned, or to indicate untyped data. What if you define a new “char b”? What happens to the old one? If you define a variable in your local scope that has the same name as a variable in an enclosing scope, the local version takes precedence. Are definitions allowed in the middle of a block? Older versions of C and some derivatives like NesC only allow definitions at the beginning of a scope; d would be illegal. Modern C (C99 and later, gcc 3.0+) allows this, which can make the code easier to read as the definitions are closer to use. However, note that definitions are not allowed in certain places, such as directly after a jump target, because a label must be followed by a statement and (before C23) a definition is not a statement, e.g. { goto target; /* … */ target: int z; /* error */ /* … */ }
  • Despite being syntactically correct, if (x=y) is generally to be avoided because it is confusing. Doing complicated things with ++ and -- is also not generally recommended. It is almost never important from an efficiency perspective and it just makes the code harder to understand.
  • Need braces? Braces are only needed if your block contains more than one statement; a single statement can stand alone. However, braces, even when not necessary, can increase the clarity of your code, especially when nesting ‘if’ statements. X ? Y : Z.. There is a shortcut to if then else that can be done in an expression: the ‘?:’ syntax. (cond ? alt1 : alt2) is an expression returning alt1 if cond, and alt2 if not cond. There is no truly equivalent form to this construct. Short circuit evaluation. Sometimes when evaluating an if conditional, the result is known before the evaluation completes. For example, consider if (X || Y || Z). If X is true, then Y and Z don’t matter. C will not evaluate Y and Z if X is true. This only matters in cases where there are side effects of evaluating the rest of the expression, for example if it contains function calls, or invalid operations. The converse evaluation rules apply to &&; if (X && Y && Z) will only evaluate Y and Z if X is true. Note, this is why it’s OK to write: char *x; if ((x != NULL) && (*x == ‘Z’)) … because if x is NULL (and therefore an invalid pointer) the second clause dereferencing x will not be run. Note that this means that it’s not always possible to replace the && or || in an if() with a function call, because all args to a function are evaluated before calling the function. Detecting brace errors. Many text editors such as emacs include features that help you detect brace errors. Emacs will remind you of the opening brace whenever you close a brace, and will auto-indent your code according to braces; wrong indenting gives you a hint that something is wrong. In emacs, <tab> will autoindent the line you are on. Keeping your code properly indented is a good way to detect a lot of different kinds of error.
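Short circuit evaluation is easy to observe with a function that counts how often it is called. The sketch below uses invented names (noisy_true, starts_with_z, evaluations_for_or, evaluations_for_and) and is meant only to illustrate the rule described in the note.

```c
#include <stddef.h>

/* Counts how many times noisy_true() has been evaluated. */
static int evaluations = 0;

/* Hypothetical helper: always true, with a visible side effect. */
static int noisy_true(void)
{
    evaluations++;
    return 1;
}

/* Safe even when x is NULL: the dereference is skipped
 * because the first clause is already false. */
int starts_with_z(const char *x)
{
    return (x != NULL) && (*x == 'Z');
}

/* With ||, a true first operand means the second is never evaluated. */
int evaluations_for_or(void)
{
    evaluations = 0;
    int result = noisy_true() || noisy_true();
    (void)result;
    return evaluations;
}

/* With &&, a true first operand forces evaluation of the second. */
int evaluations_for_and(void)
{
    evaluations = 0;
    int result = noisy_true() && noisy_true();
    (void)result;
    return evaluations;
}
```

Here evaluations_for_or() returns 1 and evaluations_for_and() returns 2, and starts_with_z(NULL) returns 0 without crashing.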
  • Variables declared ‘static’ inside of a function are allocated only once, rather than being allocated in the stack frame when the function is called. Static variables therefore retain their value from one call to the next, but may cause problems for recursive calls, or if two threads call the function concurrently. Java users: Just to be aware: recursion can behave differently in Java when the argument is an object, because the object itself is not copied; every call refers to the same object. In Java, this recursive function gets its own private copy of the argument only when a primitive type like int is being passed.
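A minimal sketch of a static local variable, using a made-up function next_id(). The variable id is allocated once and keeps its value between calls, which is exactly why the function is not safe to call from two threads at the same time.

```c
/* Hypothetical helper: hand out 1, 2, 3, ... on successive calls. */
int next_id(void)
{
    /* Initialized once, before the first call; NOT reset on later calls. */
    static int id = 0;

    id++;
    return id;
}
```

Three calls in a row return 1, 2 and 3. Without the ‘static’ keyword every call would return 1.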
  • What about other languages? Some languages such as Scheme implement optimized “tail recursion”, which reuses the current stack frame when the recursive call constitutes the last use of the current stack frame. In these cases, recursion costs no more than iteration.
  • Java: In Java, primitive types like int are passed by value, but all objects are passed by reference (basically by pointer, but without any explicit awareness of the address). In C++, things operate like in C except that there is a syntax to pass arguments by reference, declaring the argument to be a “reference” type. For example, int f(int &x); C does not support this; you have to do it “manually” with pointers. But it’s not really fundamentally different anyway.
  • How should pointers be initialized? Always initialize pointers to 0 (or NULL). NULL is never a valid pointer value, but it is known to be invalid and means “no pointer set”.
  • Packing? The layout of a structure in memory is actually architecture-dependent. The order of the fields will always be preserved; however, some architectures will pad fields so that they are word-aligned. Word alignment is important for some processor architectures because they do not support unaligned memory access, e.g. accessing a 4 byte int that is not aligned to a 4 byte boundary. It is also usually more efficient to process values on word boundaries. So in general, struct layouts are architecture-specific. For instance, on x86 platforms gcc aligns each field to its natural size, so struct { uint16_t x; uint32_t y; }; will have a hidden two bytes of padding between x and y. However, on an Atmel ATmega128 this padding will not be present. One way to address this is to predict it, and explicitly add the padding fields. Another solution (a gcc extension) is to add __attribute__((packed)) after the structure declaration to force the compiler to generate code to handle unaligned accesses. This code will run slower but this is often the easiest solution to dealing with cross-platform compatibility. The other common problem with cross-platform compatibility is endianness. x86 machines are little-endian; many modern processors such as the PXA (XScale) can select their endianness. The functions ntohl, htonl, etc. can be used to convert to a standard network byte order (which is big endian).
    Why a zero length array? A zero length array (a gcc extension; C99 spells the standard form “data[]”) is a typed “pointer target” that begins at the byte after the end of the struct. The zero length array takes up no space and does not change the result of sizeof(). This is very useful for referring to data following the struct in memory, for example if you have a packet header and a packet payload following the header: struct hdr { int src; int dst; uint8_t data[0]; }; If you have a buffer of data containing the packet and the payload: uint8_t buf[200]; int buf_len; /* cast the buffer to the header type (if it’s long enough!) */ if (buf_len >= (int)sizeof(struct hdr)) { struct hdr *h = (struct hdr *)buf; /* now, h->data[0] is the first byte of the payload */ }
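To see padding on your own machine, you can ask the compiler where it placed each field. The sketch below assumes gcc or clang (for the packed attribute); the struct and function names are invented for the example, and the padded result depends on your platform.

```c
#include <stddef.h>
#include <stdint.h>

/* Natural layout: the compiler may insert padding after x. */
struct padded_pair {
    uint16_t x;
    uint32_t y;
};

/* Packed layout (gcc/clang extension): no padding is inserted. */
struct packed_pair {
    uint16_t x;
    uint32_t y;
} __attribute__((packed));

/* Number of hidden bytes between x and y in the natural layout.
 * Typically 2 on x86; may be 0 on small microcontrollers. */
size_t padded_gap(void)
{
    return offsetof(struct padded_pair, y) - sizeof(uint16_t);
}

/* Number of hidden bytes between x and y in the packed layout. */
size_t packed_gap(void)
{
    return offsetof(struct packed_pair, y) - sizeof(uint16_t);
}
```

packed_gap() is 0 and sizeof(struct packed_pair) is 6 wherever the attribute is supported; padded_gap() shows whatever your architecture chose.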
  • char x[] and char *x are subtly different. First, what’s the same: in most expressions both will evaluate to an address of type char *. In both cases, x[n] == *(x+n) and x == &(x[0]). The difference lies in the fact that char x[10] allocates memory for the data itself, and both &x and x evaluate to the base address of the array, whereas char *x allocates only space for a pointer, and x simply evaluates to the value of that pointer. Thus, in the case of char x[10], &x and x are the same address (although their types differ: &x is a pointer to the whole array). However this is almost never true for char *x, because x contains the address of the memory where the characters are stored, &(x[0]), and &x is the address of the pointer itself!
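The difference shows up clearly in sizeof and in the address-of operator. This is a small sketch with invented function names; the pointer size it reports depends on your platform.

```c
#include <stddef.h>

/* sizeof an array is the size of all its elements. */
size_t array_bytes(void)
{
    char x[10];
    return sizeof(x);
}

/* sizeof a pointer is just the size of an address. */
size_t pointer_bytes(void)
{
    char *x = NULL;
    return sizeof(x);
}

/* For an array, &x and x are the same address. */
int array_address_matches(void)
{
    char x[10];
    return (void *)&x == (void *)x;
}

/* For a pointer, &x is where the pointer itself lives,
 * while x is the address it points to. */
int pointer_address_matches(void)
{
    char storage[10];
    char *x = storage;
    return (void *)&x == (void *)x;
}
```

array_bytes() is 10, pointer_bytes() equals sizeof(char *), array_address_matches() is 1 and pointer_address_matches() is 0.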
  • %m: When %m is included in a printf format string, it prints a string naming the most recent error (this is a glibc extension, not standard C). A global variable ‘errno’ holds the most recent error value to have occurred, and the function strerror() converts that to a string. %m is shorthand for printf(“%s”, strerror(errno)). Emstar tips: Emstar includes glib which includes various useful functions and macros. One of these is g_new0(). g_new0(typename, count) is equivalent to (typename *)calloc(count, sizeof(typename)); Emstar also provides elog() and elog_raw(), logging functions that integrate with emstar’s logging facilities. elog() works like printf, but takes one additional loglevel argument. You do not need to include a ‘\n’ in elog() messages as it is implied. elog_raw() takes a buffer and a length and dumps the bytes out in a prettyprinted fashion. Emstar includes the useful “buf_t” library that manages automatically growing buffers. This is often the easiest way to allocate memory.
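In portable code you can get the effect of %m by calling strerror(errno) yourself. A minimal sketch follows; the helper name describe_open_failure() is made up, and the exact wording of the message depends on your C library.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: try to open a file for reading.
 * Returns NULL on success, or a human-readable reason on failure. */
const char *describe_open_failure(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f != NULL) {
        fclose(f);
        return NULL;
    }

    /* Save errno right away: later library calls may overwrite it. */
    int saved_errno = errno;
    return strerror(saved_errno);
}
```

For a path that does not exist, a call such as printf("%s\n", describe_open_failure(path)) would typically print something like “No such file or directory”.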
  • Reference counting is one approach to tracking allocated memory. However, there is no perfect scheme. The main problem with reference counting is that correctly handling cyclically linked structures is very difficult, and likely to result in mis-counting in one direction or the other. Garbage collection is another solution; this is used by Java and other languages. However, this often causes performance problems and requires a very strict control over typing in the language, which C can’t easily provide by its nature of having direct access to the low levels. Emstar uses a static reference model, where there is exactly one canonical reference to each object, and that reference will be automatically cleared when the object is destroyed. The only requirements are: the reference must be in stable allocation (static or dynamic, but you can’t move the memory) and you must always access the object via that single reference. We have found this scheme to be quite workable, as long as you are cognizant of these requirements.
  • What to do when malloc fails? Except in certain cases, it’s going to be very difficult to recover from this state because lots of functions will start failing and the correct way to handle these errors is not obvious. If you continue without your memory, it is likely that your program will get into some very seriously broken state. The main case where recovery IS possible is the case that the malloc() that failed is part of a single too-large transaction that can fail independently of the rest of the system. For example if the user asks your program to allocate a zillion bytes and you can’t, just abort that request and move on to the next request. But if your system is just generally out of memory, there’s not much you can do. In emstar, processes automatically respawn and the system is generally designed to handle module failures, so a slow memory leak that causes you to exit and restart may allow a fairly graceful recovery. Memmove: memmove() is written so that it will copy in the correct order to be able to shift data within a buffer, that is, it works even if the source and destination buffers overlap. This is not true of the memcpy() function. Use pointers as implied in-use flags! One clever way to reduce the number of state variables is to avoid the use of “in use” flags when there is also a pointer that can be NULL. That is, if the item is not allocated (and therefore a NULL pointer), it’s not in use. If there’s a pointer there, then you assume that it is valid, initialized, etc. By avoiding this redundancy of state variables, you avoid the various cases where the two variables are out of sync: pointer there but not in use, or in use but no pointer.
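Shifting data toward the front of a buffer is the classic case of overlapping source and destination, where memmove() is the right call. The helper drop_front() below is a made-up example.

```c
#include <string.h>

/* Hypothetical helper: discard the first n bytes of a buffer holding
 * len bytes and shift the remainder to the front.
 * Returns the number of bytes that remain. */
size_t drop_front(char *buf, size_t len, size_t n)
{
    if (n >= len) {
        return 0;               /* nothing left to keep */
    }

    /* Source (buf + n) and destination (buf) overlap,
     * so memmove() is required; memcpy() would be undefined here. */
    memmove(buf, buf + n, len - n);
    return len - n;
}
```

Starting from the six bytes “abcdef”, drop_front(buf, 6, 2) returns 4 and leaves “cdef” at the start of the buffer.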
  • The difference between a macro and a static inline. These two constructs have some properties in common: they must be included in any module that uses them, they can’t be linked to, and they are generally inlined into the program. The key difference between them is that the macro is processed by the preprocessor whereas the static inline conforms to all of the semantics of any other function. The reason it must be included in each file is that it’s declared static, and the reason it’s more efficient is that the compiler will usually inline it. Static inlines cause their arguments to be evaluated before being applied whereas macros are a pure text substitution with no evaluation. Arguments to static inlines are type-checked whereas arguments to macros can be any type and are substituted as raw text. Static inline invocations can invoke statements AND return a value, whereas a macro can EITHER return a value (if it’s an expression) OR invoke statements. A good example of the differences is the implementation of sqr(): #define SQR(a) ((a)*(a)) static inline float sqr(float a) { return a*a; } SQR will evaluate its argument twice, i.e. SQR(x++) expands to ((x++)*(x++)), which modifies x twice in one expression; the result is undefined, and in practice you may get something like x*(x+1). sqr will evaluate its argument once, i.e. sqr(x++) will compute x*x. SQR can compute the square of any numeric quantity, whereas sqr can only handle floats. However, sqr will return a more helpful error message if you attempt to square a struct. Since macros can do text manipulation, one nice way to use them is to use macros to generate static inlines. More on C constants. Check K&R for the details on the modifiers that allow you to correctly type constants. For example, 45L is a long int. Watch out for a leading 0: it means the constant will be interpreted as octal. These issues most often crop up when you are trying to specify a constant that is larger than the default int type. Enums: For defining sets of integer values, enums are preferable. Enums group related names into one type and can automatically number your values sequentially. But macros are preferred for constants that define tuning parameters (that don’t want uniqueness or sequential values) and for other types like floats and strings. Why expressions in parens: If you leave off the parens then your expression may introduce precedence surprises when combined in other expressions. For example: #define VALUE 2+45 int c = VALUE*3; What was meant was (2+45)*3, but what we got was 2+(45*3). It’s also a good idea to put parens around args of a macro, to avoid similar problems if one of the args is an expression: #define SQR(x) ((x)*(x)) Why use do{}while(0)? Multi-statement macros should be enclosed in do{}while(0) to avoid surprises when the macro is called as one statement in an if (): if DBG expanded to two bare statements, then if (fail) DBG(“help\n”); would cause only the first statement to be conditional.
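The macro pitfalls above can be checked with a short sketch. The names below (next_value, BAD_VALUE, GOOD_VALUE, SWAP_INT and the small wrapper functions) are invented for the example. To count evaluations without the undefined behavior of SQR(x++), the argument is a function call with a side effect.

```c
#define SQR(a) ((a) * (a))

static inline float sqr(float a)
{
    return a * a;
}

/* Counts how many times next_value() has been called. */
static int calls = 0;

static int next_value(void)
{
    calls++;
    return calls;
}

/* The macro pastes its argument in twice, so the call happens twice. */
int macro_eval_count(void)
{
    calls = 0;
    (void)SQR(next_value());
    return calls;
}

/* The inline function evaluates its argument exactly once. */
int inline_eval_count(void)
{
    calls = 0;
    (void)sqr((float)next_value());
    return calls;
}

/* Missing parentheses: BAD_VALUE * 3 becomes 2 + 45 * 3. */
#define BAD_VALUE 2 + 45
#define GOOD_VALUE (2 + 45)

int unparenthesized(void) { return BAD_VALUE * 3; }
int parenthesized(void)   { return GOOD_VALUE * 3; }

/* Multi-statement macro wrapped in do { } while (0) so that it
 * behaves as ONE statement, even in an if/else without braces. */
#define SWAP_INT(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

int swap_if(int cond, int a, int b)
{
    if (cond)
        SWAP_INT(a, b);
    else
        a = -1;
    return a;
}
```

macro_eval_count() returns 2 and inline_eval_count() returns 1; unparenthesized() is 137 while parenthesized() is 141. Without the do { } while (0) wrapper, the if/else in swap_if() would not even compile.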
  • Why return -1? Return values in C are usually 0 or positive for success and negative to signal failure. The errno codes (man errno) list some meanings that can be used for error return codes. When a pointer is returned, NULL signifies a failure. Errno can be set to indicate a reason for the failure (by just assigning to errno).
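A sketch of this return convention, using a made-up function parse_digit(): 0 means success, -1 means failure, and errno carries the reason. EINVAL is one of the standard errno codes.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper: convert one character '0'..'9' to its value.
 * Returns 0 on success and stores the value in *out.
 * Returns -1 on failure and sets errno to EINVAL. */
int parse_digit(char c, int *out)
{
    if (out == NULL || c < '0' || c > '9') {
        errno = EINVAL;     /* record the reason for the failure */
        return -1;
    }

    *out = c - '0';
    return 0;
}
```

A caller would typically write: if (parse_digit(c, &value) < 0) { /* handle the error, e.g. print strerror(errno) */ }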
  • The recursive solution uses more stack space, but only a limited amount (max. 32 recursions), and it is easier to read and understand. Readability is valuable and should not be sacrificed for optimality unless there is a good reason to optimize. So I vote for recursion in this case.
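One possible recursive solution of this kind is sketched below. It is not taken from the slides; the name fast_pow() is made up (and deliberately differs from the standard library’s pow()). It halves the exponent at every step, so a 32 bit exponent needs at most about 32 nested calls.

```c
#include <stdint.h>

/* Sketch: compute x raised to exp using about log2(exp) recursive calls.
 * Relies on x^exp == (x^(exp/2))^2, times one extra x when exp is odd. */
float fast_pow(float x, uint32_t exp)
{
    /* base case */
    if (exp == 0) {
        return 1.0f;
    }

    /* "recursive" case: solve the half-size problem once, then square it */
    float half = fast_pow(x, exp / 2);

    if (exp % 2 == 0) {
        return half * half;
    }
    return x * half * half;
}
```

For example, fast_pow(2.0f, 10) makes five calls (exp = 10, 5, 2, 1, 0) instead of the eleven calls the simple version would make.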

    1. 1. A Quick Introduction to C Programming Lewis Girod CENS Systems Lab July 5, 2005 http://lecs.cs.ucla.edu/~girod/talks/c-tutorial.ppt
    2. 2. or, What I wish I had known about C during my first summer internship. With extra info in the NOTES
    3. 3. High Level Question: Why is Software Hard?
        Answer(s):
        • Complexity: Every conditional (“if”) doubles the number of paths through your code, every bit of state doubles the possible states
          - Solution: reuse code with functions, avoid duplicate state variables
        • Mutability: Software is easy to change.. Great for rapid fixes .. And rapid breakage .. always one character away from a bug
          - Solution: tidy, readable code, easy to understand by inspection. Avoid code duplication; physically the same -> logically the same
        • Flexibility: Programming problems can be solved in many different ways. Few hard constraints -> plenty of “rope”.
          - Solution: discipline and idioms; don’t use all the rope
    4. 4. Writing and Running Programs
            #include <stdio.h>
            /* The simplest C Program */
            int main(int argc, char **argv)
            {
                printf("Hello World\n");
                return 0;
            }
        1. Write text of program (source code) using an editor such as emacs, save as file e.g. my_program.c
        2. Run the compiler to convert program from source to an “executable” or “binary”:
            $ gcc -Wall -g my_program.c -o my_program
        3-N. Compiler gives errors and warnings; edit source file, fix it, and re-compile
            $ gcc -Wall -g my_program.c -o my_program
            tt.c: In function `main':
            tt.c:6: parse error before `x'
            tt.c:5: parm types given both in parmlist and separately
            tt.c:8: `x' undeclared (first use in this function)
            tt.c:8: (Each undeclared identifier is reported only once
            tt.c:8: for each function it appears in.)
            tt.c:10: warning: control reaches end of non-void function
            tt.c: At top level:
            tt.c:11: parse error before `return'
        N. Run it and see if it works
            $ ./my_program
            Hello World
            $
        Callouts (see notes): -Wall -g ? · ./ ? · What if it doesn’t work?
    5. 5. C Syntax and Hello World
            #include <stdio.h>
            /* The simplest C Program */
            int main(int argc, char **argv)
            {
                printf("Hello World\n");
                return 0;
            }
        • #include inserts another file. “.h” files are called “header” files. They contain stuff needed to interface to libraries and code in other “.c” files.
        • /* … */ is a comment. The compiler ignores this.
        • The main() function is always where your program starts running.
        • Blocks of code (“lexical scopes”) are marked by { … }
        • printf: Print out a message. ‘\n’ means “new line”.
        • return 0: Return ‘0’ from this function.
        Callouts (see notes): Can your program have more than one .c file? · What do the < > mean?
    6. 6. A Quick Digression About the Compiler
        Compilation occurs in two steps: “Preprocessing” and “Compiling”.
        Source file:
            #include <stdio.h>
            /* The simplest C Program */
            int main(int argc, char **argv)
            {
                printf("Hello World\n");
                return 0;
            }
        Preprocess: In Preprocessing, source code is “expanded” into a larger form that is simpler for the compiler to understand. Any line that starts with ‘#’ is a line that is interpreted by the Preprocessor.
        • Include files are “pasted in” (#include)
        • Macros are “expanded” (#define)
        • Comments are stripped out ( /* */ , // )
        • Continued lines are joined ( \ )
        After preprocessing:
            __extension__ typedef unsigned long long int __dev_t;
            __extension__ typedef unsigned int __uid_t;
            __extension__ typedef unsigned int __gid_t;
            __extension__ typedef unsigned long int __ino_t;
            __extension__ typedef unsigned long long int __ino64_t;
            __extension__ typedef unsigned int __nlink_t;
            __extension__ typedef long int __off_t;
            __extension__ typedef long long int __off64_t;
            extern void flockfile (FILE *__stream) ;
            extern int ftrylockfile (FILE *__stream) ;
            extern void funlockfile (FILE *__stream) ;
            int main(int argc, char **argv)
            {
                printf("Hello World\n");
                return 0;
            }
        Compile: The compiler then converts the resulting text into binary code the CPU can run directly (my_program).
        Callouts (see notes): Why? · ?
    7. 7. OK, We’re Back.. What is a Function?
        A Function is a series of instructions to run. You pass Arguments to a function and it returns a Value.
        “main()” is a Function. It’s only special because it always gets called first when you run your program.
            #include <stdio.h>
            /* The simplest C Program */
            int main(int argc, char **argv)   /* return type (or void), then the Function Arguments */
            {
                printf("Hello World\n");      /* calling a Function */
                return 0;                     /* returning a value */
            }
        Calling a Function: “printf()” is just another function, like main(). It’s defined for you in a “library”, a collection of functions you can call from your program.
    8. 8. What is “Memory”?
        Memory is like a big table of numbered slots where bytes can be stored.
        The number of a slot is its Address.
        One byte Value can be stored in each slot.
        Some “logical” data values span more than one slot, like the character string “Hello\n”
            Addr   Value
            0
            1
            2
            3
            4      ‘H’ (72)
            5      ‘e’ (101)
            6      ‘l’ (108)
            7      ‘l’ (108)
            8      ‘o’ (111)
            9      ‘\n’ (10)
            10     ‘\0’ (0)
            11
            12
        A Type names a logical meaning to a span of memory. Some simple types are:
            char        a single character (1 slot)
            char [10]   an array of 10 characters
            int         signed 4 byte integer
            float       4 byte floating point
            int64_t     signed 8 byte integer
        Callouts (see notes): 72? · not always… · Signed?…
    9. 9. What is a Variable?
        A Variable names a place in memory where you store a Value of a certain Type.
        You first Define a variable by giving it a name and specifying the type, and optionally an initial value.
            char x;        /* Initial value of x is undefined */
            char y='e';    /* Type is single character (char), Name is y, Initial value is 'e' */
        The compiler puts them somewhere in memory.
            Symbol   Addr   Value
                     0
                     1
                     2
                     3
            x        4      ?
            y        5      ‘e’ (101)
                     6
                     7
                     8
                     9
                     10
                     11
                     12
        Callouts (see notes): symbol table? · declare vs define? · What names are legal? · extern? static? const?
    10. 10. Multi-byte Variables
        Different types consume different amounts of memory. Most architectures store data on “word boundaries”, or even multiples of the size of a primitive data type (int, char)
            char x;
            char y='e';
            int z = 0x01020304;
        0x means the constant is written in hex. An int consumes 4 bytes.
            Symbol   Addr   Value
                     0
                     1
                     2
                     3
            x        4      ?
            y        5      ‘e’ (101)
                     6      (padding)
                     7      (padding)
            z        8      4
                     9      3
                     10     2
                     11     1
                     12
    11. 11. Lexical Scoping
        Every Variable is Defined within some scope. A Variable cannot be referenced by name (a.k.a. Symbol) from outside of that scope.
        Lexical scopes are defined with curly braces { }.
        The scope of Function Arguments is the complete body of the function.
        The scope of Variables defined inside a function starts at the definition and ends at the closing brace of the containing block.
        The scope of Variables defined outside a function starts at the definition and ends at the end of the file. Called “Global” Vars.
            void p(char x)       /* void: (Returns nothing) */
            {
                                 /* p,x */
                char y;
                                 /* p,x,y */
                char z;
                                 /* p,x,y,z */
            }
                                 /* p */
            char z;
                                 /* p,z */
            void q(char a)
            {
                char b;
                                 /* p,z,q,a,b */
                {
                    char c;
                                 /* p,z,q,a,b,c */
                }
                char d;
                                 /* p,z,q,a,b,d (not c) */
            }
                                 /* p,z,q */
        Callouts (see notes): char b? (inside the inner block) · legal? (at “char d;”)
    12. 12. Expressions and Evaluation
        Expressions combine Values using Operators, according to precedence.
            1 + 2 * 2    ->  1 + 4  ->  5
            (1 + 2) * 2  ->  3 * 2  ->  6
        Symbols are evaluated to their Values before being combined.
            int x=1;
            int y=2;
            x + y * y  ->  x + 2 * 2  ->  x + 4  ->  1 + 4  ->  5
        Comparison operators are used to compare values. In C, 0 means “false”, and any other value means “true”.
            int x=4;
            (x < 5)  ->  (4 < 5)  ->  <true>
            (x < 4)  ->  (4 < 4)  ->  0
            ((x < 5) || (x < 4))  ->  (<true> || (x < 4))  ->  <true>
        In the last line, (x < 4) is not evaluated because the first clause was true.
    13. 13. Comparison and Mathematical Operators
            ==  equal to                  +   plus        &   bitwise and
            <   less than                 -   minus       |   bitwise or
            <=  less than or equal        *   mult        ^   bitwise xor
            >   greater than              /   divide      ~   bitwise not
            >=  greater than or equal     %   modulo      <<  shift left
            !=  not equal                                 >>  shift right
            &&  logical and
            ||  logical or
            !   logical not
        The rules of precedence are clearly defined but often difficult to remember or non-intuitive. When in doubt, add parentheses to make it explicit. For oft-confused cases, the compiler will give you a warning “Suggest parens around …”; do it!
        Beware division:
        • If both arguments are integers, the result will be an integer (the fraction is discarded): 5 / 10 -> 0 whereas 5 / 10.0 -> 0.5
        • Integer division by 0 will cause a FPE
        Don’t confuse & and &&.. 1 & 2 -> 0 whereas 1 && 2 -> <true>
    14. 14. Assignment Operators
            x = y   assign y to x          x += y   assign (x+y) to x
            x++     post-increment x       x -= y   assign (x-y) to x
            ++x     pre-increment x        x *= y   assign (x*y) to x
            x--     post-decrement x       x /= y   assign (x/y) to x
            --x     pre-decrement x        x %= y   assign (x%y) to x
        Note the difference between ++x and x++:
            int x=5;                       int x=5;
            int y;                         int y;
            y = ++x;                       y = x++;
            /* x == 6, y == 6 */           /* x == 6, y == 5 */
        Don’t confuse = and ==! The compiler will warn “suggest parens”.
            int x=5;                       int x=5;
            if (x==6) /* false */          if (x=6) /* always true */
            {                              {
                /* ... */                      /* x is now 6 */
            }                              }
            /* x is still 5 */             /* ... */
        Callouts (see notes): recommendation
    15. 15. A More Complex Program: pow
        “if” statement:
            /* if evaluated expression is not 0 */
            if (expression) {
                /* then execute this block */
            }
            else {
                /* otherwise execute this block */
            }
        Program:
            #include <stdio.h>
            #include <inttypes.h>
            float pow(float x, uint32_t exp)
            {
                /* base case */
                if (exp == 0) {
                    return 1.0;
                }
                /* “recursive” case */
                return x*pow(x, exp - 1);
            }
            int main(int argc, char **argv)
            {
                float p;
                p = pow(10.0, 5);
                printf("p = %f\n", p);
                return 0;
            }
        Tracing “pow()”:
        • What does pow(5,0) do?
        • What about pow(5,1)?
        • “Induction”
        Challenge: write pow() so it requires log(exp) iterations
        Callouts (see notes): Need braces? · X?Y:Z · Short-circuit eval? · detecting brace errors
    16. 16. The “Stack”
        Recall lexical scoping. If a variable is valid “within the scope of a function”, what happens when you call that function recursively? Is there more than one “exp”?
        Yes. Each function call allocates a “stack frame” where Variables within that function’s scope will reside.
            #include <stdio.h>
            #include <inttypes.h>
            float pow(float x, uint32_t exp)
            {
                /* base case */
                if (exp == 0) {
                    return 1.0;
                }
                /* “recursive” case */
                return x*pow(x, exp - 1);
            }
            int main(int argc, char **argv)
            {
                float p;
                p = pow(5.0, 1);
                printf("p = %f\n", p);
                return 0;
            }
        Stack frames (the stack grows upward in this picture):
            pow():   float x        5.0
                     uint32_t exp   0          Return 1.0
            pow():   float x        5.0
                     uint32_t exp   1          Return 5.0
            main():  int argc       1
                     char **argv    0x2342
                     float p        undefined, then 5.0
        Callouts (see notes): static · Java?
    17. 17. Iterative pow(): the “while” loop
        Problem: “recursion” eats stack space (in C). Each loop must allocate space for arguments and local variables, because each new call creates a new “scope”.
        Solution: “while” loop.
            loop:
            if (condition) {               while (condition) {
                statements;                    statements;
                goto loop;                 }
            }
        Program:
            float pow(float x, uint exp)
            {
                int i=0;
                float result=1.0;
                while (i < exp) {
                    result = result * x;
                    i++;
                }
                return result;
            }
            int main(int argc, char **argv)
            {
                float p;
                p = pow(10.0, 5);
                printf("p = %f\n", p);
                return 0;
            }
        Callouts (see notes): Other languages?
    18. 18. The “for” loop
        The “for” loop is just shorthand for this “while” loop structure.
        With “while”:
            float pow(float x, uint exp)
            {
                float result=1.0;
                int i;
                i=0;
                while (i < exp) {
                    result = result * x;
                    i++;
                }
                return result;
            }
        With “for”:
            float pow(float x, uint exp)
            {
                float result=1.0;
                int i;
                for (i=0; (i < exp); i++) {
                    result = result * x;
                }
                return result;
            }
        In both versions, main() is the same:
            int main(int argc, char **argv)
            {
                float p;
                p = pow(10.0, 5);
                printf("p = %f\n", p);
                return 0;
            }
    19. 19. Referencing Data from Other Scopes
        So far, in all of our examples, all of the data values we have used have been defined in our lexical scope.
            float pow(float x, uint exp)
            {
                float result=1.0;
                int i;
                for (i=0; (i < exp); i++) {
                    result = result * x;
                }
                return result;
            }
            int main(int argc, char **argv)
            {
                float p;
                p = pow(10.0, 5);
                printf("p = %f\n", p);
                return 0;
            }
        Annotation: Nothing in this scope (the body of pow()) uses any of these variables (those defined in main()).
    20. Can a function modify its arguments?
        What if we wanted to implement a function pow_assign() that modified its argument, so that these are equivalent:

            float p = 2.0;                 float p = 2.0;
            /* p is 2.0 here */            /* p is 2.0 here */
            p = pow(p, 5);                 pow_assign(p, 5);
            /* p is 32.0 here */           /* p is 32.0 here */

        Would this work?

            void pow_assign(float x, uint exp)
            {
              float result=1.0;
              int i;
              for (i=0; (i < exp); i++) {
                result = result * x;
              }
              x = result;
            }
    21. NO! Remember the stack!
        In C, all arguments are passed as values. But what if the argument is the address of a variable?

            void pow_assign(float x, uint exp)
            {
              float result=1.0;
              int i;
              for (i=0; (i < exp); i++) {
                result = result * x;
              }
              x = result;
            }

            {
              float p=2.0;
              pow_assign(p, 5);
            }

        Stack at the end of pow_assign(p, 5) (the stack grows with each call; newest frame listed first):
            pow_assign: float x = 2.0, then 32.0; uint32_t exp = 5; float result = 1.0, then 32.0
            caller:     float p = 2.0 (unchanged)
        Margin note: Java/C++?
    22. Passing Addresses
        Recall our model for variables stored in memory. What if we had a way to find out the address of a symbol, and a way to reference that memory location by address?

            Symbol   Addr    Value
                     0..3
            char x   4       ‘H’ (72)
            char y   5       ‘e’ (101)
                     6..12

            address_of(y) == 5
            memory_at[5] == 101

            void f(address_of_char p)
            {
              memory_at[p] = memory_at[p] - 32;
            }

            char y = 101;      /* y is 101 */
            f(address_of(y));  /* i.e. f(5) */
            /* y is now 101-32 = 69 */
    23. “Pointers”
        This is exactly how “pointers” work.
        • A “pointer type”: pointer to char (char *)
        • “address of” or reference operator: &
        • “memory_at” or dereference operator: *

        Pseudocode from the previous slide:

            void f(address_of_char p)
            {
              memory_at[p] = memory_at[p] - 32;
            }

            char y = 101;      /* y is 101 */
            f(address_of(y));  /* i.e. f(5) */
            /* y is now 101-32 = 69 */

        The same thing in C:

            void f(char * p)
            {
              *p = *p - 32;
            }

            char y = 101;  /* y is 101 */
            f(&y);         /* i.e. f(5) */
            /* y is now 101-32 = 69 */

        Pointers are used in C for many other purposes:
        • Passing large objects without copying them
        • Accessing dynamically allocated memory
        • Referring to functions
    24. Pointer Validity
        A valid pointer is one that points to memory that your program controls. Using invalid pointers will cause non-deterministic behavior, and will often cause Linux to kill your process (SEGV or Segmentation Fault).
        There are two general causes for these errors:
        • Program errors that set the pointer value to a strange number
        • Use of a pointer that was at one time valid, but later became invalid
        Margin note: How should pointers be initialized?

        Will ptr be valid or invalid?

            char * get_pointer()
            {
              char x=0;
              return &x;
            }

            {
              char * ptr = get_pointer();
              *ptr = 12; /* valid? */
            }
    25. Answer: Invalid!
        A pointer to a variable allocated on the stack becomes invalid when that variable goes out of scope and the stack frame is “popped”. The pointer will point to an area of the memory that may later get reused and rewritten.

            char * get_pointer()
            {
              char x=0;
              return &x;
            }

            {
              char * ptr = get_pointer();
              *ptr = 12; /* valid? */
              other_function();
            }

        But now, ptr points to a location that’s no longer in use, and will be reused the next time a function is called!
        [Stack diagram: char * ptr at address 100 holds 101, the address where char x lived inside get_pointer(). After the return, that slot receives the value 12 and is later reused for another function’s variable (labelled int average, value 456603).]
    26. More on Types
        We’ve seen a few types at this point: char, int, float, char *
        Types are important because:
        • They allow your program to impose logical structure on memory
        • They help the compiler tell when you’re making a mistake
        In the next slides we will discuss:
        • How to create logical layouts of different types (structs)
        • How to use arrays
        • How to parse C type names (there is a logic to it!)
        • How to create new types using typedef
    27. Structures
        struct: a way to compose existing types into a structure

            #include <sys/time.h>          /* struct timeval is defined in this header */

            /* declare the struct */
            struct my_struct {             /* structs define a layout of typed fields */
              int counter;
              float average;
              struct timeval timestamp;    /* structs can contain other structs */
              uint in_use:1;               /* fields can specify specific bit widths */
              uint8_t data[0];             /* Why? */
            };

            /* define an instance of my_struct */
            struct my_struct x = {
              .in_use = 1,
              .timestamp = {
                .tv_sec = 200
              }
            };

        A newly-defined structure is initialized using this syntax (C99 designated initializers; the slide originally used the older GNU “field: value” spelling). All unset fields are 0.

            x.counter = 1;
            x.average = sum / (float)(x.counter);

        Fields are accessed using ‘.’ notation.

            struct my_struct * ptr = &x;
            ptr->counter = 2;
            (*ptr).counter = 3; /* equiv. */

        A pointer to a struct. Fields are accessed using ‘->’ notation, or (*ptr).counter
        Margin note: Packing?
    28. Arrays
        Arrays in C are composed of a particular type, laid out in memory in a repeating pattern. Array elements are accessed by stepping forward in memory from the base of the array by a multiple of the element size.

            /* define an array of 5 chars */
            char x[5] = {'t','e','s','t','\0'};

        Brackets specify the count of elements. Initial values optionally set in braces.

            /* accessing element 0 */
            x[0] = 'T';

        Arrays in C are 0-indexed (here, 0..4)

            /* pointer arithmetic to get elt 3 */
            char elt3 = *(x+3); /* x[3] */

        x[3] == *(x+3) == ‘t’ (NOT ‘s’!)

            /* x[0] evaluates to the first element;
             * x evaluates to the address of the
             * first element, or &(x[0]) */

        What’s the difference between char x[] and char *x?

            Symbol       Addr   Value
            char x [0]   100    ‘t’
            char x [1]   101    ‘e’
            char x [2]   102    ‘s’
            char x [3]   103    ‘t’
            char x [4]   104    ‘\0’

            /* 0-indexed for loop idiom */
            #define COUNT 10
            char y[COUNT];
            int i;
            for (i=0; i<COUNT; i++) {
              /* process y[i] */
              printf("%c\n", y[i]);
            }

        For loop that iterates from 0 to COUNT-1. Memorize it!
    29. How to Parse and Define C Types
        At this point we have seen a few basic types, arrays, pointer types, and structures. So far we’ve glossed over how types are named.

            int x;          /* int */                         typedef int T;
            int *x;         /* pointer to int */              typedef int *T;
            int x[10];      /* array of ints */               typedef int T[10];
            int *x[10];     /* array of pointers to int */    typedef int *T[10];
            int (*x)[10];   /* pointer to array of ints */    typedef int (*T)[10];

        typedef defines a new type.
        C type names are parsed by starting at the declared name (x or T above) and working outwards according to the rules of precedence:

            int *x[10];     x is an array of pointers to int
            int (*x)[10];   x is a pointer to an array of int

        Arrays are the primary source of confusion. When in doubt, use extra parens to clarify the expression.
    30. Function Types
        The other confusing form is the function type. For example, qsort (a sort function in the standard library):

            void qsort(void *base, size_t nmemb, size_t size,
                       int (*compar)(const void *, const void *));

        The last argument is a comparison function.

            /* function matching this type: */
            int cmp_function(const void *x, const void *y);

        const means the function is not allowed to modify memory via this pointer.

            /* typedef defining this type: */
            typedef int (*cmp_type) (const void *, const void *);

            /* rewrite qsort prototype using our typedef */
            void qsort(void *base, size_t nmemb, size_t size, cmp_type compar);

        size_t is an unsigned integer type. void * is a pointer to memory of unknown type.
        For more details: $ man qsort
    31. Dynamic Memory Allocation
        So far all of our examples have allocated variables statically by defining them in our program. This allocates them in the stack.
        But, what if we want to allocate variables based on user input or other dynamic inputs, at run-time? This requires dynamic allocation.

            int * alloc_ints(size_t requested_count)
            {
              int * big_array;
              big_array = (int *)calloc(requested_count, sizeof(int));
              if (big_array == NULL) {
                printf("can't allocate %zu ints: %m\n", requested_count);
                return NULL;
              }

              /* now big_array[0] .. big_array[requested_count-1] are
               * valid and zeroed. */
              return big_array;
            }

        • sizeof() reports the size of a type in bytes
        • calloc() allocates memory for N elements of size k
        • Returns NULL if can’t alloc
        • It’s OK to return this pointer. It will remain valid until it is freed with free()
        For details: $ man calloc
        Margin notes: %m? (a glibc extension to printf that prints the error message for the current errno); Emstar tips
    32. Caveats with Dynamic Memory
        Dynamic memory is useful. But it has several caveats:
        • Whereas the stack is automatically reclaimed, dynamic allocations must be tracked and free()’d when they are no longer needed. With every allocation, be sure to plan how that memory will get freed. Losing track of memory is called a “memory leak”. (Margin note: Reference counting)
        • Whereas stack variables normally drop out of scope when their space is reclaimed, it is easy to accidentally keep a pointer to dynamic memory that has been freed. Whenever you free memory you must be certain that you will not try to use it again. It is safest to erase any pointers to it.
        • Because dynamic memory always uses pointers, there is generally no way for the compiler to statically verify usage of dynamic memory. This means that errors that are detectable with static allocation are not detectable with dynamic allocation.
    33. Some Common Errors and Hints
        sizeof() can take a variable reference in place of a type name. This guarantees the right allocation, but don’t accidentally allocate the sizeof() the pointer instead of the object!

            /* allocating a struct with malloc() */
            struct my_struct *s = NULL;
            s = (struct my_struct *)malloc(sizeof(*s)); /* NOT sizeof(s)!! */
            if (s == NULL) {
              fprintf(stderr, "no memory!\n");
              exit(1);
            }
            memset(s, 0, sizeof(*s));

            /* another way to initialize an alloc'd structure: */
            struct my_struct init = {
              .counter = 1,
              .average = 2.5,
              .in_use = 1
            };

            /* memmove(dst, src, size) (note, arg order like assignment) */
            memmove(s, &init, sizeof(init));

            /* when you are done with it, free it! */
            free(s);
            s = NULL;

        • malloc() allocates n bytes
        • Always check for NULL.. Even if you just exit(1). (Why?)
        • malloc() does not zero the memory, so you should memset() it to 0.
        • memmove is preferred because it is safe for shifting buffers (Why?)
        • Use pointers as implied in-use flags!
    34. Macros
        Macros can be a useful way to customize your interface to C and make your code easier to read and less redundant. However, when possible, use a static inline function instead.
        What’s the difference between a macro and a static inline function?
        Macros and static inline functions must be included in any file that uses them, usually via a header file.
        Common uses for macros:

            /* Macros are used to define constants */
            #define FUDGE_FACTOR 45.6
            #define MSEC_PER_SEC 1000
            #define INPUT_FILENAME "my_input_file"

        Float constants must have a decimal point, else they are type int.

            /* Macros are used to do constant arithmetic */
            #define TIMER_VAL (2*MSEC_PER_SEC)

        Put expressions in parens. (Why?)

            /* Macros are used to capture information from the compiler */
            #define DBG(args...) \
              do { \
                fprintf(stderr, "%s:%s:%d: ", \
                        __FUNCTION__, __FILE__, __LINE__); \
                fprintf(stderr, args); \
              } while (0)

            /* ex. DBG("error: %d", errno); */

        • Multi-line macros need \ at the end of each continued line
        • args... grabs rest of args
        • Enclose multi-statement macros in do{}while(0) (Why?)
        Margin notes: More on C constants? enums
    35. Macros and Readability
        Sometimes macros can be used to improve code readability… but make sure what’s going on is obvious.

            /* often best to define these types of macro right where they are used */
            #define CASE(str) if (strncasecmp(arg, str, strlen(str)) == 0)

            void parse_command(char *arg)
            {
              CASE("help") {
                /* print help */
              }
              CASE("quit") {
                exit(0);
              }
            }

            /* and un-define them after use */
            #undef CASE

        The same function with the macro expanded by hand:

            void parse_command(char *arg)
            {
              if (strncasecmp(arg, "help", strlen("help")) == 0) {
                /* print help */
              }
              if (strncasecmp(arg, "quit", strlen("quit")) == 0) {
                exit(0);
              }
            }

        Macros can be used to generate static inline functions. This is like a C version of a C++ template. See emstar/libmisc/include/queue.h for an example of this technique.
    36. Using “goto”
        Some schools of thought frown upon goto, but goto has its place. A good philosophy is to always write code in the most expressive and clear way possible. If that involves using goto, then goto is not bad.
        An example is jumping to an error case from inside complex logic. The alternative is deeply nested and confusing “if” statements, which are hard to read, maintain, and verify. Often additional logic and state variables must be added, just to avoid goto.

            goto try_again;
            goto fail;
    37. Unrolling a Failed Initialization using goto
        Nested “if” version:

            state_t *initialize()
            {
              /* allocate state struct */
              state_t *s = g_new0(state_t, 1);
              if (s) {
                /* allocate sub-structure */
                s->sub = g_new0(sub_t, 1);
                if (s->sub) {
                  /* open file */
                  s->sub->fd =
                    open("/dev/null", O_RDONLY);
                  if (s->sub->fd >= 0) {
                    /* success! */
                  }
                  else {
                    free(s->sub);
                    free(s);
                    s = NULL;
                  }
                }
                else {
                  /* failed! */
                  free(s);
                  s = NULL;
                }
              }
              return s;
            }

        “goto” version:

            state_t *initialize()
            {
              /* allocate state struct */
              state_t *s = g_new0(state_t, 1);
              if (s == NULL) goto free0;

              /* allocate sub-structure */
              s->sub = g_new0(sub_t, 1);
              if (s->sub == NULL) goto free1;

              /* open file */
              s->sub->fd =
                open("/dev/null", O_RDONLY);
              if (s->sub->fd < 0) goto free2;

              /* success! */
              return s;

             free2:
              free(s->sub);
             free1:
              free(s);
             free0:
              return NULL;
            }
    38. High Level Question: Why is Software Hard?
        Answer(s):
        • Complexity: Every conditional (“if”) doubles the number of paths through your code, every bit of state doubles possible states
          - Solution: reuse code paths, avoid duplicate state variables
        • Mutability: Software is easy to change.. Great for rapid fixes .. And rapid breakage .. always one character away from a bug
          - Solution: tidy, readable code, easy to understand by inspection. Avoid code duplication; physically the same → logically the same
        • Flexibility: Programming problems can be solved in many different ways. Few hard constraints → plenty of “rope”.
          - Solution: discipline and idioms; don’t use all the rope
    39. Addressing Complexity
        • Complexity: Every conditional (“if”) doubles the number of paths through your code, every bit of state doubles possible states
          - Solution: reuse code paths, avoid duplicate state variables

        reuse code paths

            On receive_packet:
              if queue full, drop packet
              else push packet, call run_queue

            On transmit_complete:
              state=idle, call run_queue

            Run_queue:
              if state==idle && !queue empty
                pop packet off queue
                start transmit, state = busy

        On input, change our state as needed, and call Run_queue. In all cases, Run_queue handles taking the next step…
    40. Addressing Complexity
        • Complexity: Every conditional (“if”) doubles the number of paths through your code, every bit of state doubles possible states
          - Solution: reuse code paths, avoid duplicate state variables

        avoid duplicate state variables

        With a duplicate state variable (transmit_busy):

            int transmit_busy;
            msg_t *packet_on_deck;

            int start_transmit(msg_t *packet)
            {
              if (transmit_busy) return -1;

              /* start transmit */
              packet_on_deck = packet;
              transmit_busy = 1;

              /* ... */
              return 0;
            }

        With packet_on_deck as the only state:

            msg_t *packet_on_deck;

            int start_transmit(msg_t *packet)
            {
              if (packet_on_deck != NULL) return -1;

              /* start transmit */
              packet_on_deck = packet;

              /* ... */
              return 0;
            }

        Why return -1?
    41. Addressing Mutability
        • Mutability: Software is easy to change.. Great for rapid fixes .. And rapid breakage .. always one character away from a bug
          - Solution: tidy, readable code, easy to understand by inspection. Avoid code duplication; physically the same → logically the same

        Tidy code.. Indenting, good formatting, comments, meaningful variable and function names. Version control.. Learn how to use CVS
        Avoid duplication of anything that’s logically identical.

        Duplicated fields:

            struct pkt_hdr {
              int source;
              int dest;
              int length;
            };
            struct pkt {
              int source;
              int dest;
              int length;
              uint8_t payload[100];
            };

        Header reused:

            struct pkt_hdr {
              int source;
              int dest;
              int length;
            };
            struct pkt {
              struct pkt_hdr hdr;
              uint8_t payload[100];
            };

        Otherwise when one changes, you have to find and fix all the other places.
    42. Solutions to the pow() challenge question
        Recursive:

            float pow(float x, uint exp)
            {
              float result;

              /* base case */
              if (exp == 0) return 1.0;

              /* x^(2*a) == x^a * x^a */
              result = pow(x, exp >> 1);
              result = result * result;

              /* x^(2*a+1) == x^(2*a) * x */
              if (exp & 1) result = result * x;

              return result;
            }

        Iterative:

            float pow(float x, uint exp)
            {
              float result = 1.0;
              int bit;
              for (bit = sizeof(exp)*8-1;
                   bit >= 0; bit--) {
                result *= result;
                /* 1u keeps the shift unsigned, so bit 31 is well defined */
                if (exp & (1u << bit)) result *= x;
              }
              return result;
            }

        Which is better? Why?