Data Structure in C# By: Shahzad Sarwar
Scope
This presentation covers data structures for C# programmers. Basic data structures such as queues and stacks are discussed in terms of the implementations available to C#. Advanced data structures such as the B+ tree are not covered. To explore self-built data structures, see the following URLs:
http://www.brpreiss.com/books/opus6/html/page10.html
http://en.wikibooks.org/wiki/Computer_Science:Data_Structures:All_Chapters
Asymptotic analysis
Asymptotic analysis examines how the efficiency of a data structure changes as the data structure's size approaches infinity. Consider searching: with an unsorted array, a search takes time proportional to the number of elements in the array. With binary search trees or skip lists, the time required is sub-linear. When searching large amounts of data, the choice of data structure can make a difference in the application's performance that is measurable in seconds or even minutes.
Big-Oh notation is the notation commonly used in asymptotic analysis. The performance of searching an array is written O(n): the number of steps required to search an array grows linearly as the size of the array grows.
Examples: O(log2 n), O(n log2 n), O(n^2), O(2^n).
The smaller the term inside the parentheses is for large values of n, the better the performance of the data structure's operation.
Array (Linear, Direct Access, Homogeneous Data Structure)
The elements occupy contiguous memory. All elements must be of the same type. Any element can be accessed directly by its index.
Declaration in C#: bool[] booleanArray;
Allocation: booleanArray = new bool[10];
Generically: arrayName = new arrayType[allocationSize];
Representation
Resizing an array
    // Create an integer array with three elements
    int[] fib = new int[3];
    fib[0] = 1;
    fib[1] = 1;
    fib[2] = 2;
    // Redimension fib to a 10-element array
    int[] temp = new int[10];
    // Copy the fib array to temp
    fib.CopyTo(temp, 0);
    // Assign temp to fib
    fib = temp;
An algorithm called binary search can search a sorted array in O(log n) running time. The Array class contains a static BinarySearch() method. Binary search is commonly described as a recursive algorithm.
Searching an unsorted n-element one-dimensional array is O(n). A k-dimensional array with n elements in each dimension has a search running time of O(n^k).
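To make the O(log n) claim concrete, here is a minimal sketch of binary search over a sorted int array. The class and method names are illustrative assumptions, not part of the .NET class library; in practice the built-in Array.BinarySearch() does the same job. Each pass halves the remaining range, which is where the logarithmic running time comes from.

```csharp
using System;

public static class BinarySearchDemo
{
    // Returns the index of target in a SORTED array, or -1 if it is absent.
    // (Illustrative helper; Array.BinarySearch is the library equivalent.)
    public static int IndexOf(int[] sorted, int target)
    {
        int low = 0;
        int high = sorted.Length - 1;
        while (low <= high)
        {
            // Midpoint computed this way avoids overflow on very large arrays
            int mid = low + (high - low) / 2;
            if (sorted[mid] == target) return mid;
            if (sorted[mid] < target) low = mid + 1;   // discard the left half
            else high = mid - 1;                       // discard the right half
        }
        return -1;
    }

    public static void Main()
    {
        int[] fib = { 1, 2, 3, 5, 8, 13 };
        Console.WriteLine(IndexOf(fib, 5));   // 3
        Console.WriteLine(IndexOf(fib, 4));   // -1
    }
}
```

The loop form is used here instead of recursion only to keep the sketch short; both forms perform the same comparisons.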
Array Search
Searching a non-sorted array takes linear time: the time required for the algorithm to run is linearly proportional to the length of the array.
The CompareCustomDataType class below performs a comparison between two Student instances. It implements the IComparer interface and provides a Compare function that compares two Student instances by their Name property:
    using System;
    using System.Collections;

    public class CompareCustomDataType : IComparer
    {
        public int Compare(Object x, Object y)
        {
            // A null reference sorts before any non-null value; two nulls are equal
            if (x == null) return (y == null) ? 0 : -1;
            if (y == null) return 1;
            Student xStudent = (Student) x;
            Student yStudent = (Student) y;
            return String.Compare(xStudent.Name, yStudent.Name);
        }
    }
Code (Visual Basic)
    'Create an array of 5 students
    Dim myArray(4) As Student
    myArray(0) = New Student("Scott", 24)
    myArray(1) = New Student("Jisun", 23)
    myArray(2) = New Student("Spacey", 21)
    myArray(3) = New Student("John", 23)
    myArray(4) = New Student("Josh", 20)
    'Sort the array
    Array.Sort(myArray, New CompareCustomDataType)
    'Determine if a particular student exists in the array
    Dim studentTemp As New Student()
    studentTemp.Name = "Jisun"
    Dim location As Integer = Array.BinarySearch(myArray, studentTemp, New CompareCustomDataType)
    If location < 0 Then
        'Student Jisun does not exist in the array
    Else
        'Student Jisun exists in the array at position location
    End If
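Since the rest of the deck is in C#, the same sort-then-search steps might look as follows in C#. This is a sketch under assumptions: the Student class is never defined in the slides, so the one below is hypothetical, and the comparer is a renamed stand-in that compares names ordinally so the result does not depend on the current culture.

```csharp
using System;
using System.Collections;

// Hypothetical Student type; the slides use it but never show its definition.
public class Student
{
    public string Name;
    public int Age;
    public Student(string name, int age) { Name = name; Age = age; }
}

// Orders students by Name (ordinal comparison, an assumption made for determinism).
public class StudentNameComparer : IComparer
{
    public int Compare(object x, object y)
    {
        if (x == null) return (y == null) ? 0 : -1;
        if (y == null) return 1;
        return String.Compare(((Student)x).Name, ((Student)y).Name, StringComparison.Ordinal);
    }
}

public static class StudentSearchDemo
{
    // Sorts the array by name, then binary-searches it.
    // Returns the position of the student, or a negative number if absent.
    public static int FindByName(Student[] students, string name)
    {
        IComparer byName = new StudentNameComparer();
        Array.Sort(students, byName);
        return Array.BinarySearch(students, new Student(name, 0), byName);
    }

    public static void Main()
    {
        Student[] students =
        {
            new Student("Scott", 24), new Student("Jisun", 23), new Student("Spacey", 21),
            new Student("John", 23), new Student("Josh", 20)
        };
        int location = FindByName(students, "Jisun");
        if (location < 0)
            Console.WriteLine("Student Jisun does not exist in the array");
        else
            Console.WriteLine("Student Jisun exists in the array at position " + location);
    }
}
```

The same comparer instance must be used for both the sort and the search; BinarySearch only gives meaningful results on an array sorted by that same ordering.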
The ArrayList
A heterogeneous, self-redimensioning array. (At no point do we have to concern ourselves with redimensioning the ArrayList.)
    ArrayList countDown = new ArrayList();
    countDown.Add(5);
    countDown.Add(4);
    countDown.Add(3);
    countDown.Add(2);
    countDown.Add(1);
    countDown.Add("blast off!");
    countDown.Add(new ArrayList());
Representation
Advantages of Generics
The main advantages of generics:
Type safety: a developer using a generic class such as TypeSafeList can only add elements that are of the specified type or are derived from it. For example, trying to add a string to a TypeSafeList declared to hold integers results in a compile-time error.
Performance: generics remove the need to type check at run time and eliminate the cost associated with boxing and unboxing.
Reusability: generics break the tight coupling between a data structure and the application for which it was created. This provides a higher degree of reuse for data structures.
Queue
FIFO (first in, first out).
The System.Collections.Queue class provides the Enqueue() and Dequeue() methods, the Peek() method, and the Contains() method.
By default, the Queue's initial capacity is 32 elements, although this is customizable when calling the Queue's constructor.
Since the Queue is maintained with an object array, variables of any type can be queued up.
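A short illustration of the FIFO behaviour described on this slide, using only the Queue members named there. The QueueDemo class and its ServeAll helper are illustrative names, not library members.

```csharp
using System;
using System.Collections;

public static class QueueDemo
{
    // Enqueues every item, then dequeues them all.
    // The returned array is in arrival order (first in, first out).
    public static object[] ServeAll(params object[] items)
    {
        Queue queue = new Queue();
        foreach (object item in items) queue.Enqueue(item);

        object[] served = new object[queue.Count];
        for (int i = 0; i < served.Length; i++) served[i] = queue.Dequeue();
        return served;
    }

    public static void Main()
    {
        Queue queue = new Queue();
        queue.Enqueue("first");
        queue.Enqueue(2);                       // any object can be queued
        Console.WriteLine(queue.Peek());        // looks at "first" without removing it
        Console.WriteLine(queue.Contains(2));   // True
        Console.WriteLine(queue.Dequeue());     // removes and returns "first"
        Console.WriteLine(queue.Count);         // 1
    }
}
```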
Stack
LIFO (last in, first out).
The System.Collections.Stack class provides the Push(item) and Pop() methods.
The default capacity is 10 elements, although this is customizable when calling the Stack's constructor.
Since the Stack is maintained with an object array, variables of any type can be pushed onto it.
Like the ArrayList, when the Stack's internal array needs to be resized, its capacity is automatically doubled.
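A matching sketch for the stack: items come back in the reverse of the order in which they were pushed. StackDemo and PopAll are illustrative names only.

```csharp
using System;
using System.Collections;

public static class StackDemo
{
    // Pushes every item, then pops them all.
    // The returned array is in reverse arrival order (last in, first out).
    public static object[] PopAll(params object[] items)
    {
        Stack stack = new Stack();
        foreach (object item in items) stack.Push(item);

        object[] popped = new object[stack.Count];
        for (int i = 0; i < popped.Length; i++) popped[i] = stack.Pop();
        return popped;
    }

    public static void Main()
    {
        Stack stack = new Stack();
        stack.Push("bottom");
        stack.Push("top");
        Console.WriteLine(stack.Peek());   // looks at "top" without removing it
        Console.WriteLine(stack.Pop());    // removes and returns "top"
        Console.WriteLine(stack.Count);    // 1
    }
}
```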
Analysis
The Limitations of Ordinal Indexing
Finding an employee by social security number in an unsorted employee array means searching through all of the elements, an O(n) operation. A somewhat better approach is to sort the employees by their social security numbers, which reduces the asymptotic search time to O(log n). By contrast, reading or writing an array element at a known index is O(1).
Ideally, we would like to access an employee's records in O(1) time. One way to accomplish this is to build a huge array, with an entry for each possible social security number value. The disadvantage of this approach is its extreme waste: there are a total of 10^9, which is one billion (1,000,000,000), different social security numbers. For a company with 1,000 employees, only 0.0001% of this array would be utilized.
Analysis
Compressing Ordinal Indexing with a Hash Function
One option would be to reduce the social security number span by using only the last four digits of an employee's social security number.
Analysis
The mathematical transformation of the nine-digit social security number to a four-digit number is called hashing. An array that uses hashing to compress its index space is referred to as a hash table. A hash function is a function that performs this hashing. For the social security number example, our hash function, H, can be described as follows:
H(x) = last four digits of x
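The hash function H can be written in one line, assuming the social security number is held as an int without dashes (an assumption; the slides do not say how it is stored). The class name is illustrative. The example also shows that two different numbers can land on the same four-digit index, which is the price of shrinking one billion possible values into 10,000 slots.

```csharp
using System;

public static class SsnHashDemo
{
    // H(x) = last four digits of x
    public static int H(int ssn)
    {
        return ssn % 10000;
    }

    public static void Main()
    {
        Console.WriteLine(H(123456789));   // 6789
        Console.WriteLine(H(987656789));   // 6789: a different number, same index
    }
}
```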
Hash Table
The System.Collections.Hashtable class. Items are added to the Hashtable using the Add() method.
    // Create the Hashtable
    Hashtable ages = new Hashtable();
    // Add some values to the Hashtable, indexed by a string key
    ages.Add("Scott", 25);
    ages.Add("Sam", 6);
    ages.Add("Jisun", 25);
    // Access a particular key
    if (ages.ContainsKey("Scott"))
    {
        int scottsAge = (int) ages["Scott"];
        Console.WriteLine("Scott is " + scottsAge.ToString());
    }
    else
        Console.WriteLine("Scott is not in the hash table...");
    // Step through all items in the Hashtable
    foreach (string key in ages.Keys)
        Console.WriteLine("Value at ages[\"" + key + "\"] = " + ages[key].ToString());
List
List<> is in essence an ArrayList, with the difference that only one type of object can be stored. This improves the program's speed and readability.
    List<int> intList = new List<int>();
    intList.Add(45);
    intList.Add(34);
    int listValue = intList[0];
LinkedList<> consists of "nodes" that carry three values: the object being stored, a reference to the previous node, and a reference to the next node.
    LinkedList<int> list = new LinkedList<int>();
    list.AddLast(6);
Values are read through list.First.Value or list.Last.Value.
Dictionary<,>
The Dictionary structure is like a List<> structure, but the index can be set: instead of the indices being generated automatically (0, 1, 2, 3, etc.), they are set by the user and can be of any type.
    Dictionary<string, int> myDictionary = new Dictionary<string, int>();
    myDictionary.Add("one", 1);
    myDictionary.Add("twenty", 20);
    int myInt = myDictionary["one"];
    // List<[same type as index]>
    List<string> keyList = new List<string>(myDictionary.Keys);
    for (int i = 0; i < keyList.Count; i++)
    {
        // A new name is needed here: myInt is already declared in the enclosing scope
        int currentValue = myDictionary[keyList[i]];
    }
HashSet
A set data structure is a collection with no duplicate values. It is more or less modeled after a mathematical set, which has only unique elements. Since the .NET HashSet is modeled after a mathematical set, set operations such as UnionWith, IntersectWith, IsSubsetOf, and IsSupersetOf are provided.
One resulting difference between a HashSet and a List is that the HashSet's Add method returns a Boolean: true if the element was added and false if it was not (because it was already present).
Although a HashSet is a collection of elements, it does not implement the IList interface. Instead it implements only the ICollection interface. This means that elements inside a HashSet cannot be accessed through indices, only through an enumerator (which is an iterator).
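A brief sketch of these points using HashSet<int> from System.Collections.Generic. The HashSetDemo class and its Common helper are illustrative names; the set methods called are the library's own.

```csharp
using System;
using System.Collections.Generic;

public static class HashSetDemo
{
    // Returns the elements present in both arrays, using set intersection.
    public static HashSet<int> Common(int[] a, int[] b)
    {
        HashSet<int> result = new HashSet<int>(a);
        result.IntersectWith(b);
        return result;
    }

    public static void Main()
    {
        HashSet<int> evens = new HashSet<int>();
        bool addedFirst = evens.Add(2);        // true: 2 was not yet in the set
        bool addedAgain = evens.Add(2);        // false: duplicates are rejected
        evens.UnionWith(new int[] { 4, 6 });   // evens now holds 2, 4, 6

        HashSet<int> shared = Common(new int[] { 1, 2, 3, 4 }, new int[] { 2, 4, 6 });
        Console.WriteLine(addedFirst + " " + addedAgain);
        Console.WriteLine(shared.IsSubsetOf(evens));

        // Elements are reached by enumeration, not by index
        foreach (int n in shared) Console.WriteLine(n);
    }
}
```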
Special DS
SynchronizedCollection<T> generic class: provides a thread-safe collection that contains objects of a type specified by the generic parameter as elements.
SynchronizedKeyedCollection<K, T> generic class: provides a thread-safe collection that contains objects of a type specified by a generic parameter and that are grouped by keys.
For both classes: any public static (Shared in Visual Basic) members of the type are thread safe. Any instance members are not guaranteed to be thread safe.
Special DS
SynchronizedReadOnlyCollection<T> generic class: provides a thread-safe, read-only collection that contains objects of a type specified by the generic parameter as elements. Any public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.
SortedDictionary<TKey, TValue> generic class: represents a collection of key/value pairs that are sorted on the key.
Trees
A binary tree is a special kind of tree in which every node has at most two children. The .NET Framework does not contain a binary tree class.
Improving the Search Time with Binary Search Trees (BSTs)
A binary search tree is a special kind of binary tree designed to improve the efficiency of searching through the contents of a binary tree. For any node n, every descendant node's value in the left subtree of n is less than the value of n, and every descendant node's value in the right subtree of n is greater than the value of n.
Tree Traversal
A traversal visits the nodes of a tree in a systematic manner.
Preorder traversal: a node is visited before its descendants.
Postorder traversal: a node is visited after its descendants.
Inorder traversal: a node is visited after its left subtree and before its right subtree.
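The binary search tree property and the three traversal orders can be tried out with a small hand-written tree. This is a minimal sketch: BstNode and BstDemo are illustrative types, integer values are assumed, and duplicate values are simply ignored on insertion.

```csharp
using System;
using System.Collections.Generic;

// Illustrative node type for a binary search tree of ints.
public class BstNode
{
    public int Value;
    public BstNode Left, Right;
    public BstNode(int value) { Value = value; }
}

public static class BstDemo
{
    // Inserts value below root, keeping smaller values left and larger values right.
    public static BstNode Insert(BstNode root, int value)
    {
        if (root == null) return new BstNode(value);
        if (value < root.Value) root.Left = Insert(root.Left, value);
        else if (value > root.Value) root.Right = Insert(root.Right, value);
        return root;   // equal values are ignored
    }

    public static BstNode Build(params int[] values)
    {
        BstNode root = null;
        foreach (int v in values) root = Insert(root, v);
        return root;
    }

    // Node first, then its descendants.
    public static void Preorder(BstNode n, List<int> output)
    {
        if (n == null) return;
        output.Add(n.Value);
        Preorder(n.Left, output);
        Preorder(n.Right, output);
    }

    // Left subtree, then the node, then the right subtree.
    public static void Inorder(BstNode n, List<int> output)
    {
        if (n == null) return;
        Inorder(n.Left, output);
        output.Add(n.Value);
        Inorder(n.Right, output);
    }

    // Descendants first, then the node.
    public static void Postorder(BstNode n, List<int> output)
    {
        if (n == null) return;
        Postorder(n.Left, output);
        Postorder(n.Right, output);
        output.Add(n.Value);
    }

    public static void Main()
    {
        BstNode root = Build(50, 30, 70, 20, 40);
        List<int> order = new List<int>();
        Inorder(root, order);
        foreach (int v in order) Console.Write(v + " ");   // 20 30 40 50 70
        Console.WriteLine();
    }
}
```

For the tree built in Main, preorder gives 50, 30, 20, 40, 70 and postorder gives 20, 40, 30, 70, 50. An inorder traversal of a binary search tree always yields the values in ascending order.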
Graph
Graphs are composed of a set of nodes and edges, just like trees, but with graphs there are no rules for the connections between nodes. With graphs there is no concept of a root node, nor is there a concept of parents and children. Rather, a graph is just a collection of interconnected nodes. A tree is a special case of a graph.
Graphs are a collection of nodes and edges, but there are different kinds of edges:
Directed versus undirected edges
Weighted versus unweighted edges
Graphs with bidirectional edges are said to be undirected graphs. Graphs that use unidirectional edges are said to be directed graphs.
Graph
Weighted edges associate some cost with the connection from one node to another.
If self-edges are allowed, the maximum number of edges for a directed graph with n nodes is n^2. A graph with close to the maximum number of edges is said to be dense. If a graph has significantly fewer than n^2 edges, the graph is said to be sparse.
For a connected, undirected graph, there exists some subset of the edges that connects all the nodes and does not introduce a cycle. Such a subset of edges forms a tree called a spanning tree. A minimum spanning tree is a spanning tree with minimum total cost.
Examples
Usages of Graph: Finding a minimum spanning tree
Usages of Graph: Computing the shortest path from a source node to all other nodes.  Dijkstra's Algorithm
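Dijkstra's algorithm can be sketched in a few lines for a small graph. The sketch below rests on assumptions the slides do not state: the graph is stored as an adjacency matrix, weights are non-negative integers, and a missing edge is marked with a NoEdge constant. All names are illustrative. It uses the simple O(n^2) form of the algorithm, which repeatedly picks the closest unfinished node and relaxes its outgoing edges.

```csharp
using System;

public static class DijkstraDemo
{
    public const int NoEdge = -1;   // marker for "no edge from u to v" (an assumption)

    // weights[u, v] is the non-negative cost of the edge u -> v, or NoEdge.
    // Returns the shortest distance from source to every node;
    // unreachable nodes keep the value int.MaxValue.
    public static int[] ShortestDistances(int[,] weights, int source)
    {
        int n = weights.GetLength(0);
        int[] dist = new int[n];
        bool[] done = new bool[n];
        for (int i = 0; i < n; i++) dist[i] = int.MaxValue;
        dist[source] = 0;

        for (int step = 0; step < n; step++)
        {
            // Pick the closest reachable node that is not finished yet
            int u = -1;
            for (int i = 0; i < n; i++)
                if (!done[i] && dist[i] != int.MaxValue && (u == -1 || dist[i] < dist[u]))
                    u = i;
            if (u == -1) break;   // every remaining node is unreachable
            done[u] = true;

            // Relax each edge leaving u
            for (int v = 0; v < n; v++)
            {
                if (weights[u, v] == NoEdge || done[v]) continue;
                int candidate = dist[u] + weights[u, v];
                if (candidate < dist[v]) dist[v] = candidate;
            }
        }
        return dist;
    }

    public static void Main()
    {
        int N = NoEdge;
        // Directed edges: 0->1 (4), 0->2 (1), 1->3 (5), 2->1 (2)
        int[,] weights =
        {
            { N, 4, 1, N },
            { N, N, N, 5 },
            { N, 2, N, N },
            { N, N, N, N }
        };
        int[] dist = ShortestDistances(weights, 0);
        Console.WriteLine(string.Join(" ", dist));   // 0 3 1 8
    }
}
```

Going through node 2 makes node 1 reachable at cost 3 instead of the direct cost 4, which is the kind of improvement the relaxation step finds.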
References:
http://msdn.microsoft.com/en-us/library/ms364091(VS.80).aspx
http://www.brpreiss.com/books/opus6/html/page10.html
http://en.wikibooks.org/wiki/Computer_Science:Data_Structures:All_Chapters
A soft copy will be available at: http://softarchitect.wordpress.com
For further discussion, join http://tech.groups.yahoo.com/group/SoftArchitect/
