Safwan M. Shatnawi, Fawzi Albalooshi, Khaleel Rababah / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 2, Issue 4, July-August 2012, pp.1638-1644

Generating Timetable and Students Schedule Based on Data Mining Techniques

Safwan M. Shatnawi, Administrative and Technical Programs, College of Applied Studies, University of Bahrain
Fawzi Albalooshi, Quality Assurance Authority for Education and Training (QAAET), The Higher Education Review Unit (HERU)
Khaleel Rababah, Administrative and Technical Programs, College of Applied Studies, University of Bahrain

Abstract
In this paper we introduce a new technique for generating an academic programme timetable. It is based on a clustering method that uses data mining techniques to form student clusters dynamically; a heuristic function is then used to find the optimal solution. The clusters generated from the FP-tree serve as an initial solution to the timetable problem, and we then generate more optimized clusters using a color mapping algorithm. As a result, the students' timetables are also generated. All hard constraints are satisfied, and soft constraints are considered while generating the timetable and the student schedules. We tested our approach against real data obtained from the College of Applied Studies at the University of Bahrain.

Keywords: Automated Timetabling, Course Scheduling, Data Mining, FP-Tree, Cluster, Color Mapping

1. INTRODUCTION
Course scheduling is a challenging, time-consuming problem facing institutions, and it belongs to the NP-complete class of problems. Our main challenge is to automatically timetable the courses of two-year associate degree programs so that students belonging to different programs can easily register for courses with no timetable clashes in the semester they are studying. The hard constraints are: a student can only be scheduled to one event in a time slot; event rooms meet all required features and their capacity is respected; and no more than one event is allowed per room and per timeslot. The soft constraints include: avoid scheduling classes in the last timeslot of the day; avoid scheduling more than two classes in a row for a student; and avoid scheduling a single class in a day for a student.

A general survey paper [1] investigates examination and course timetabling, providing up-to-date information and citations for further research and possible implementations of automated timetabling in educational settings. A more comprehensive survey [2] concludes that no general polynomially bounded algorithm for solving timetabling problems is known. However, there has been large interest in applying meta-heuristic-based algorithms, which are general-purpose algorithms. These algorithms were mainly based on three techniques: graph coloring, constraint programming, and integer programming. The author further categorizes the algorithms into one-staged, two-staged, and algorithms that allow relaxations. In one-staged optimization algorithms, hard and soft constraints are allowed to be violated, and the algorithm then searches for a solution that satisfies both as much as possible; examples of such algorithms are those reported in [3] and [4]. Two-staged optimization algorithms attempt to satisfy the hard constraints first; the soft constraints are then considered for optimization as much as possible. Examples of such algorithms are those proposed in [5] and [6]. The third class of algorithms allows relaxations of some constraints without violating the hard ones. Examples are proposed in [7] and [8]. The survey in [9] concentrates on the common solution approaches, including graph coloring, integer programming, network flow techniques, tabu search, rule-based approaches, and constraint logic programming.

A 3-phase approach that became the winning entry in the International Timetabling Competition is presented in [10]. First, the timetable is created using graph coloring and a maximum matching algorithm. In the second phase, an attempt is made to satisfy more soft constraints using simulated annealing (SA), and in the third phase the solution quality is improved by swapping individual events between time slots using SA. Burke, an active researcher in this field who has published many articles on timetabling, presents a generally applicable approach for constructing examination and course timetables using a generic hyper-heuristic approach [11]. The authors test their approach on a simulated benchmark proposed by the Meta-heuristic Network [12] for course timetabling problems and compare it with other state-of-the-art approaches. Reference [13] investigates automated timetabling to come up with a solution for the Purdue University class timetabling problem. He investigates a number of common solutions and presents an iterative forward search algorithm that
became the basis of his winning entry in the 2007 timetabling competition [14] organized by the timetabling research community.

Clustering is another approach used for timetabling; clustering is the process of finding classes of objects that share common characteristics [15]. It is mainly based on splitting the events into clusters or groups, where each cluster is scheduled in the same timeslots without conflicts. Clustering methods satisfy the soft constraints using additional optimizing rules to obtain a good solution. To use this approach, the courses are grouped into fixed clusters at early stages of the algorithm; the clusters are formed manually based on some predefined rules, which leads to a poor-quality timetable [16].

An FP-tree is a compressed representation of the input data (transactions). It is constructed by reading the data set one transaction at a time and mapping each transaction onto a path in the FP-tree [15]. The FP-tree technique has been used in many applications, such as clustering and document organization. In this paper we customize an FP-tree algorithm to dynamically generate clusters from the students' transactions (the courses to be registered for the coming semester).

2. WEEKLY TIME TABLE
The week consists of five study days, each starting at eight in the morning and ending at five in the evening. Lectures and laboratory sessions are regularly timetabled either on Sunday (U), Tuesday (T), and Thursday (H) sessions or on Monday (M) and Wednesday (W) sessions. A course can have one or more sections depending on the number of students expected to register for it, and each section can take up to 25 students. For example, if 50 students are expected to register for a course, two sections of the course need to be offered. They can be at the same or different timings. A section is timetabled at fixed times over the study week. For example, a course with three weekly lectures can be timetabled nine to ten on UTH, so that the sessions start and end at the same time on different days of the week. Weekly contact hours for a course section can vary, but the norm is five (two lecture and three laboratory sessions) or three (three lecture sessions).

3. DATA PREPROCESSING
3.1 Students Advising and Courses Statistics Generation System
In previous work [16] we developed an online advising system that can be used by students, advisors, and course timetable planners. Students are given informative advice on which courses to register for in the next semester and are informed of their remaining graduation requirements; advisors are able to see students' progress towards their graduation requirements; and timetable planners are given statistics on course and section requirements for the coming semester.

The main source of information in that work was the official university registration system. Students' academic records and achievements were extracted from it and processed for advising purposes. It was important to maintain two new tables: the first holds details of the courses belonging to a program and the other holds details of the courses' prerequisites. Essential information for timetabling that was also generated in the same previous work is the course requirements for a coming semester based on existing students' achievements; for more details please refer to [16].

In our earlier work, other information was extracted from the university registration system for further processing; in this paper we use the resulting information for the timetable generation approach described here.

To obtain the set of courses a student will enroll in for a given next semester, we apply the following procedure. For each student, we take the student's transcript, including the courses the student is currently enrolled in, and compare it against the program study plan. This gives the list of courses not yet registered and failed courses that need to be retaken. Then, for each course in the resulting list, we check the student's transcript to find whether the student has satisfied the prerequisites for that course. The end result is the list of courses and the number of students expected to enroll in them in the following semester.

3.2 Preparing students data to FP-Tree Algorithm
To prepare the data for the FP-tree algorithm, the courses each student is to register for next semester are considered as a transaction. Our transactions are therefore the sets of courses to be registered next by all students, with each student representing one transaction. The following models our approach:
Let C = {c1, c2, ..., cn} be the set of all courses offered by the college.
Let S = {s1, s2, ..., sn} be the set of all students enrolled in the college.
Let T = {t1, t2, ..., tn} be the set of all transactions in the college, where each ti = {si, cj, ...} consists of a student si belonging to S together with one or more courses cj belonging to C that are to be registered by si.
Before starting our modified FP-tree algorithm, we prepare the students' data by comparing each student's transcript to his/her study plan. As a result, we obtain the list of courses he/she can register for in the following semester. Each student represents a transaction tj for the FP-tree algorithm, and each course a student can register for represents an item ci in that transaction. Then we calculate the number of students who are to register for each course; the list
of courses is then sorted in descending order and called the header table in the FP-tree algorithm. Next, the courses in every transaction tj are sorted in descending order of the number of students to register for ci next semester, that is, according to the header table index. The transactions are then ready to be processed by the FP-tree algorithm. The following example demonstrates the process of preparing the students' data:
Let S = {20100001, 20100002, 20100003, 20100004, 20100005} be the list of students in the college.
Let C = {C1, C2, C3, C4, C5, C6} be the list of courses to be registered in the following semester.
Let T = { {20100001, C1, C2, C3}, {20100002, C2, C3, C4}, {20100003, C2, C3, C6}, {20100004, C1, C2, C6}, {20100005, C2, C5} } be the list of courses to be registered by each student.
Next, all transactions in T are reordered according to the header table index. As a result, we obtain the following:
T = { {20100001, C2, C3, C1}, {20100002, C2, C3, C4}, {20100003, C2, C3, C6}, {20100004, C2, C1, C6}, {20100005, C2, C5} }.
Now the data is ready to be processed by the FP-tree algorithm. Table 1 shows the FP-tree header table for this example.

Table 1. FP-Tree Header Table (left: course counts; right: sorted in descending order of count)
Course  Count      Course  Count
C1      2          C2      5
C2      5          C3      3
C3      3          C1      2
C4      1          C6      2
C5      1          C4      1
C6      2          C5      1

3.3 Information Representation
FP-Tree structure: Students' transactions are represented as a tree structure, where each node in the tree contains the course to register for, the number of students who will register for this course, a link to the parent of the node, and an array of links to the node's children. The original FP-tree does not care about the owner of the transaction (the student). We modify this structure because we need to keep track of who will register for each course in order to avoid conflicts in the later processing of the FP-tree results; so we add to each FP-tree node an array of the students who will register for the course.
Weekly time slots structure: As mentioned in Section 2, the week is represented by 39 slots, distributed as 27 one-hour slots for UTH and 12 one-and-a-half-hour slots for MW. We represent the week slots as two matrices: the first matrix for the UTH slots and the second matrix for the MW slots. Since a lab session spans 3 hours, we divide the UTH slots into 3 parts and the MW slots into 3 parts as well.
Courses structure: Each course structure contains the course code, credit hours, classes, and lab session information.
Students' information: For the sake of our modified FP-tree algorithm, we need only the student ID to process the transactions; however, we store basic student information such as name and program name to generate meaningful reports.
Timeslots: The aforementioned slots are grouped into fixed groups, where each group will be mapped to one level generated by applying our modified algorithm.

4. TIME TABLE GENERATION PROCESS
4.1 Phase I: FP-Tree generating
The algorithm scans the transactions and builds the FP-tree structure described earlier; Figure 1 describes the FP-tree algorithm used to generate the FP-tree structure. To form the clusters, we process the FP-tree generated by the original FP-tree algorithm in order to minimize the number of clusters it produces. The algorithm used to cluster the FP-tree is given in Figure 2 and is described in detail in [17]. After modifying the FP-tree we obtain the clusters displayed in Figure 3.

1) FP-Tree Results and analysis
After applying the FP-tree algorithm, we obtain clusters, where each cluster consists of a group of courses. The clusters are not balanced: the number of courses in each cluster (its levels) is not equal, the number of students registered in each course inside a cluster is not equal, and the same course can be found in different clusters with different attributes. Let N be the average number of students in each section, where N > 10 and N < room size. Then any course in a cluster can have M students, where M falls under one of the following cases:
M < 10: this section cannot be opened, since the number of students is less than the minimum allocation for a section. However, we may still find another cluster that has the same course and merge the two course instances to open a section; this requires the algorithm to search for another instance of the course to merge with.
10 < M < N: this section can be opened and no more processing is required.

1. Scan the transaction database for the first time.
   1.1 Find frequent items (single item patterns) and sort them into a list L in decreasing support count.
2. For each transaction ti ∈ T, order its frequent items ci ∈ C according to the order in L.
   2.1 The algorithm makes a second scan over the database to construct the FP-tree by
reading each transaction in step 2 as a branch on the tree.
Figure 1: FP-Tree Algorithm

M > N: this section should be split into two or more sections; again, all hard constraints have to be satisfied before splitting this section.

The generated clusters come from courses that are distributed over the FP-tree levels, where all courses in a given level can be scheduled in the same time slots without clashes. However, as mentioned above, the students in each course may be divided into two or more sections. As a result, we need to move the generated sections into multiple levels to fulfill the business soft constraint, which states that no more than two sections should be scheduled in the same time slots; the newly generated sections must be moved into different levels without causing any clashes. Another issue with the generated clusters is that the number of clusters may exceed the available week slots, so again we need to minimize the number of clusters without causing any clashes. Hence the FP-tree algorithm clusters the students, but its output is not the optimal solution to the problem, or even a solution to it; in other words, the FP-tree algorithm just redefines the problem in one way or another.

A. void cluster (fptree) {
   1. For each Node in FP-tree, starting from the root, level by level, do:
      1.1. For each CH1 in Node's children do
           a. For each CH2 in Node's children do
                If CH1 != CH2 and CH1->item == CH2->item then Move (CH2, CH1)
           b. Next CH2
      1.2. Next CH1
      1.3. For each CH in Node's children do:
           1.3.1. Let source = CH
           1.3.2. Find CH->item header table entry
           1.3.3. For each HL in header table links
                  a. If (HL->node != source) && (HL->predecessor == source->parent) then
                       i.  Move (source, HL->node)
                       ii. source = HL->node
                  b. End If
           1.3.4. Next HL
      1.4. Next CH
   2. Next Node
}
B. void Move (node source, node target) {
   1. target->item-count += source->item-count
   2. For each CH in source's children do:
      2.1. create a link from the target node to CH
      2.2. CH->parent = target
   3. Next CH
   4. delete source's link in the header table
   5. delete source
}
Figure 2: FP-tree based clustering algorithm

It is guaranteed that the courses in any one cluster level can be registered at the same time without clashing. However, we can still minimize the number of cluster levels (the depth of the cluster) by applying a color mapping algorithm or an appropriate search algorithm; in this paper we use a color mapping algorithm.

The depth of each cluster is not guaranteed to stay within the maximum levels count (depth), where the maximum depth should not exceed the week's slots; the depth of the clusters (levels) is minimized and controlled in phase II of this methodology. The number of different registration combinations, i.e. different paths in the FP-tree structure, can be obtained by the following rules:
Let M be the number of courses offered by the college; M is any positive integer.
Let Z be the number of students registered in the college; Z is any positive integer.
Let N be the maximum possible load; N <= M. Then the number of different student registration combinations (SRC) is

\[
\mathrm{SRC} =
\begin{cases}
\binom{M}{N} = \dfrac{M!}{(M-N)!\,N!} = \dfrac{M(M-1)(M-2)\cdots(M-(N-1))}{N!}, & \text{Case 1: } Z > \dfrac{M(M-1)(M-2)\cdots(M-(N-1))}{N!} \\[2ex]
Z, & \text{Case 2: } Z \le \dfrac{M(M-1)(M-2)\cdots(M-(N-1))}{N!}
\end{cases}
\]
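The two SRC cases reduce to taking the smaller of the student count and the binomial coefficient. The following is a minimal Python sketch of that rule, not code from the paper; the function name `registration_combinations` and its parameter names are hypothetical and chosen only for illustration.

```python
from math import comb


def registration_combinations(num_courses: int, max_load: int, num_students: int) -> int:
    """Bound on distinct registration combinations (SRC) under the two-case rule.

    num_courses  -> M, courses offered by the college
    max_load     -> N, maximum possible load per student (N <= M)
    num_students -> Z, students registered in the college
    """
    if not 0 < max_load <= num_courses:
        raise ValueError("expected 0 < N <= M")
    if num_students < 0:
        raise ValueError("expected Z >= 0")
    # C(M, N) = M! / ((M - N)! * N!)
    possible = comb(num_courses, max_load)
    # Case 1: Z exceeds C(M, N), so SRC is C(M, N).
    # Case 2: otherwise every student can hold a different combination, so SRC is Z.
    return min(possible, num_students)


if __name__ == "__main__":
    # Figures reported in the experimental section: 83 courses, load of 5, 1270 students.
    print(comb(83, 5))
    print(registration_combinations(83, 5, 1270))
```

With the figures reported in the experimental section (M = 83, N = 5, Z = 1270), C(83, 5) = 29,034,396 is far larger than the number of students, so Case 2 applies and the bound is simply the 1270 students.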
Maximum FP-tree depth = N
Maximum modified FP-tree level = M
Maximum levels in the week = time slots / maximum course slot
Optimal number of clusters = N

Figure 3: Obtained Clusters

2) Optimality in timetabling
The optimal solution for a timetabling problem is domain dependent, so before searching for the optimal solution we should first define optimality. In our study domain several parameters define optimality. The hard constraints are the basis for finding a solution, while the soft constraints represent the optimal or desired solution. The following are some parameters for defining optimality:
• Number of internal time slot fragmentations in the student schedule (minimize)
• Number of sections (classes) of the same level course in each time slot (minimize)
• Utilization of timeslots (maximize)
• Utilization of resources (maximize)
• Internal fragmentation in each resource (minimize)
• Total number of sections (minimize)

3) Phase II: Processing modified FP-Tree for timetable generating
Since the result of the modified FP-tree algorithm can be optimized further, we start by minimizing the resulting clusters. A color mapping algorithm is used to minimize the clusters and the nodes (sections) in each cluster. The task of this phase is to minimize the depth of the clusters obtained in phase I and to form the clusters to be processed by phase III. In this process, if we find multiple instances of a course in a level, we merge these instances into one instance. The formed clusters again have the properties described in Section 4.1.1. Figure 4 shows the color mapping algorithm used in this phase.

Color mapping pseudo code
   Courses = N
   Colors = 0
   For each Level (Vertex) Li
      Color i = Li
      Colors++; Courses--
      For all courses Ci ∈ C in Li: assign Color i to Ci
   Next level
   For each level
      Students = number of students in Ci
      Section size = M
      If Students > Section size then
         Split Ci into multiple sections
         For each generated section Si
            Find level Li such that Li ∩ Si = ∅, then
            assign Color Li to Si
            color course
   Next level
   Courses = 0
   Colors = N
Figure 4. Color mapping Algorithm

4) Phase III: Generating sections and assigning timeslots
We now have course clusters in which each cluster contains courses that can be registered in the same timeslots without causing clashes in the students' timetables. A heuristic function based on the optimality guidelines described in the section on optimality in timetabling is applied to generate an optimized timetable. Figure 6 shows a screen snapshot of the timetable form for a given program. In this phase we assign a room and a lab to each section while satisfying the no-conflict hard constraints and keeping the number of students below the room's size, which is an easy and direct process.

5. EXPERIMENTAL WORK AND ANALYSIS
To evaluate our approach, we tested it using real data obtained from the College of Applied Studies. The sample contains information about 1270 students distributed among 8 academic programs; each student may register for up to 5 courses selected from among 83 courses. Table 2 shows the statistics of the experimental work; the estimated required sections are obtained on the assumption that the section size is 25 students. First we obtained the courses to be registered next by all students. Then we prepared the transaction file, where each student's next-semester courses represent a transaction, and based on the number of students who will register for each course we built the header table described in Section 3.2. Next we applied the FP-tree algorithm to obtain the transactions tree. After that we minimized the number of clusters by applying our modified algorithm given in Figure 2; the number of clusters obtained was 25. To minimize the number of clusters further we applied the color mapping algorithm given in Figure 4; for our data we obtained 10 clusters. The final step is to
generate sections and then assign each level to a pre-fixed group of time slots. Figure 5 shows a snapshot of the class list for ACCA221 section 1. The time slots are grouped in such a way that there are no clashes (overlaps) between any two groups of time slots. Figure 6 shows a snapshot of one timetable form. All hard constraints were satisfied (no time clashes; room capacity; number of sections in a given time slot; ...) through the use of a constraint-based approach in which any section being generated must not violate any of the hard constraints.

Table 2. Experimental statistics
Item                                    Statistics   Generated
Number of students                      1270         1270
Number of courses                       83           83
Estimated required sections (optimal)   167          183
Student conflicts: 0

Figure 5. Course/Student Enrollment

Figure 6. Screen Snapshot of one timetable form.

6. CONCLUSION
A timetable based on the courses students intend to register for next semester is generated. Our approach goes beyond time conflict constraints alone, so as to enable students to register for all the courses they want. It is student oriented and aims to provide each student with an efficient timetable (based on the student's definition). Our algorithm has advantages over static clustering algorithms, in which the clusters are formed and fixed before the solution is found, whereas our approach allows dynamic formation of the clusters and minimizes the search (solution) space for the color mapping algorithm. Since the number of clusters obtained is less than the available weekly timeslots, the students can have a better timetable with minimum internal fragmentation of timeslots, which also makes the instructors' timetables efficient. The color mapping algorithm makes use of the generated clusters, which minimizes its search space.

ACKNOWLEDGEMENT
This project is supported by the Deanship of Academic Research at the University of Bahrain, under project number 23-2010.

References
[1] M. B., "University timetabling: Bridging the gap between research and practice (invited paper)," in The 6th Int. Conf. on the Practice and Theory of Automated Timetabling (PATAT-2006), Brno, 2006.
[2] L. R., "A Survey of Metaheuristic-based Techniques for University Timetabling Problems," in OR Spectrum, 2008.
[3] C. M. and P. M., "A Multiobjective Genetic Algorithm for The Class/Teacher Timetabling Problem," Practice and Theory of Automated Timetabling (PATAT) III, vol. 2079, pp. 3-17, 2001.
[4] D. G. L. and S. A., "Multi-neighbourhood Local Search with Application to Course Timetabling," Practice and Theory of Automated Timetabling (PATAT) IV, vol. 2740, pp. 262-275, 2003.
[5] C. S. and T. J., "Grasping The Examination Scheduling Problem," Practice and Theory of Automated Timetabling (PATAT) IV, vol. 2740, pp. 233-244, 2003.
[6] A. H. and L. A., "A Tabu Search Heuristic for a University Timetabling Problem," Metaheuristics: Progress as Real Problem Solvers, vol. 32, pp. 65-86, 2005.
[7] E. E., "A Grouping Genetic Algorithm for Graph Colouring and Exam Timetabling," Practice and Theory of Automated Timetabling (PATAT) III, vol. 2079, pp. 132-156, 2001.
[8] C. P., W. T. and S. R., "Application of a Hybrid Multi-Objective Evolutionary Algorithm to The Uncapacitated Exam Proximity Problem," Practice and Theory of Automated Timetabling (PATAT) V, vol. 3616, pp. 294-312, 2005.
[9] S. A., "A survey of Automated Timetabling," in Artificial Intelligence Review, 1999.
[10] K. P., "The University Course Timetabling Problem with a 3-Phase Approach," Practice and Theory of Automated Timetabling (PATAT) V, vol. 3616, pp. 109-125, 2005.
[11] B. E., M. B, M. A., P. S. and Q. R., "A Graph-based Hyper-Heuristic for Educational Timetabling Problems," European Journal of Operational Research, vol. 167, pp. 177-192, 2007.
[12] [Online]. Available: http://www.metaheuristics.net/. [Accessed 11 August 2009].
[13] T. Muller, Constraint-based timetabling. PhD thesis, Charles University in Prague, Faculty of Mathematics and Physics, 2005.
[14] [Online]. Available: http://www.cs.qub.ac.uk/itc2007/ . [Accessed 11 August 2009].
[15] Pang-Ning-Tan, M. Stienback and V. Kumar, Introduction to Data Mining, Addison Wesley, Pearson Education, 2005, pp. 363-370, 487-490.
[16] F. Albalooshi and S. Shatnawi, "Online academic advising support," in IEEE International Joint Conferences on Computer, Information, and Systems Sciences, and Engineering (EIAE 09), 2009.
[17] S. Safwan, A.-R. Khaleel and B.-I. Basel, "Applying a novel clustering technique based on FP-tree to university timetabling problem: A case study," in International Conference on Computer Engineering and Systems (ICCES), 2010.
[18] G. M. White and P. W. Chan., "Towards the Construction of Optimal Examination Timetables," in INFOR 17, 1979.