Visitation time scheduling

A case study showing how to approach a basic scheduling problem within the operations research field

Visitation time scheduling Presentation Transcript

  • 1. Visitation time scheduling Alfonso de la Fuente Ruiz 2013 Alfonso de la Fuente Ruiz – http://www.linkedin.com/in/alfonsofr/es - Licensed under Creative Commons BY-NC-SA (7/Sept/2013)
  • 2. Content index
      • Scenario
      • The O.R. Problem
      • Initial considerations
      • First approach: Microsoft Excel
          • Importing data from CSV into MS Excel
          • Exploring the dataset
          • Data order by client
          • Vouching for data validity
          • Alternatives and decision making
      • Coding software and choosing tools
          • Microsoft Excel Macros
          • Open Office Suite: Calc
          • Structured Query Language
          • Open Office Suite: Base
          • Visual Studio Express
          • Oracle and PL/SQL
          • Using Transact-SQL in Microsoft SQL Server 2k+
      • Cleaning the data
          • Pseudocode for data cleaning
          • Result after data cleaning
      • PERT and GANTT
      • Scheduling schemes
          • Scheduling scheme chosen
          • Coding the scheme
          • Reporting output
      • ACID Compliant DBMS
          • ACID Compliancy in MS SQL Server (I)
          • ACID Compliancy in MS SQL Server (II)
          • ACID Compliancy in MS SQL Server (III)
      • Database design: a bird's eye view
          • Database normalization
          • Database map, visually
          • Database map: Table definition
          • Database map: Procedures and functions
      • References
      • Conclusion
  • 3. Scenario
      • The small test project to be prepared is described in a PDF (Portable Document Format) file, and the required data is supplied in a CSV (Comma-Separated Values) file.
      • One calendar week was given to find a solution and to prepare a presentation to be delivered remotely to the UK.
  • 4. The O.R. problem
      • From the Operational Research perspective, the problem is a very simple case of "visitation time scheduling" with multiple clients and a single server that can attend only one request at a time.
      • Therefore, a number of solution schemes are readily available, such as First-Come First-Served, priority queues, Gantt techniques and others.
      • The difficulty of the problem seems to lie not in coding the algorithm, but in the data formatting stage (both for input and output) and in the database design stage.
      • The precise software tools to be used were left unspecified, so a large number of alternatives are possible. SQL Server and PostgreSQL were suggested.
      • In our approach, we will first use Microsoft Excel to study the data and to perform basic filtering, after which we will consider a number of solutions from the software market.
  • 5. Initial considerations
      • This problem is a typical Computer Science project for Business or Engineering students during their first years at university.
      • Students are usually asked to solve this kind of problem during one term, with a couple of months (up to a semester, depending on academic workload) to solve it and to prepare a written project, alone or in small teams, to be handed in at the end.
      • Preparing the project case involves careful design considerations, ranging from plagiarism avoidance to speeding up the marking process and exception control.
      • This kind of knowledge can also come in handy for real business applications at the SME (Small and Medium-sized Enterprise) level or larger.
      • In most scenarios, just a subset of the information contained in these pages will be documented and presented to students or staff, so as to avoid information overload and to enhance operational understanding.
  • 6. First approach: Microsoft Excel
      • Since SQL Server was the first option suggested, and Microsoft offers a very popular package under that name (MSSQLS), in our first approach we stay within the Microsoft toolset and load the CSV data file into Microsoft Excel (2013, Spanish version) to have a look at it.
      • To do so, we import the data from the file using the "Data/Import/From text file…" feature.
      • There we select the "simple.csv" file and follow the wizard.
      • In the wizard window we select the delimited data file type, with headers, Windows (ANSI) file origin, comma (,) as the separator character, and the "General" data type for every column so that Excel autodetects it.
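The same import can be reproduced outside a spreadsheet. The following is a minimal sketch, assuming the file has a header line with the column names used later in the deck (`id`, `client_id`, `datetime_from`, `datetime_to`, `name`); the exact header of "simple.csv" is not shown in the slides, so those names and the helper `load_rows` are hypothetical. Two sample rows are embedded so that the example runs without the file.

```python
import csv
import io

# Two rows copied from the dataset shown in the slides. The header line is an
# assumption: the real "simple.csv" may name its columns differently.
SAMPLE_CSV = """id,client_id,datetime_from,datetime_to,name
1,1,2013-01-01 09:00,2013-01-01 10:00,gary doades
11,2,2013-02-30 01:00,2013-02-30 02:00,richard ward
"""


def load_rows(text_stream):
    """Read comma-separated rows with a header line into a list of dicts."""
    reader = csv.DictReader(text_stream, delimiter=",")
    return [dict(row) for row in reader]


# With the real file this would be: open("simple.csv", newline="")
rows = load_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row["id"], row["datetime_from"], row["datetime_to"])
```

Every field is read as plain text at this point, so the impossible date "2013-02-30" passes through unnoticed; detecting it is left to a later validation step.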
  • 7. Importing data from CSV into MS Excel
      • As a result, we obtain a set of columns whose headers offer the "autofilter" option, which we often use to sort alphabetically or numerically.
      • Here we sorted the data by the "datetime_from" field, so that we can inspect the information and form some hypotheses about the contents.
      • We can easily spot several kinds of plausible anomalies in the data, which force us to make some decisions at design time.
  • 8. Exploring the dataset
      • At this point, we sketch the problem on a sheet of paper to gain further insight before moving on to the software tools.
      • There we draw some data schemes and timetables that will be commented upon later.
      • Among other things, we observe that the total time for all visitations does not exceed the total time available for service under any set of assumptions. This is a good sign, because it means that we will be able to provide the service without overbooking.
  • 9. Data order by client
      • We now apply a second ordering to the data, over the client_id field.
      • We name the rows c#t#, where the hashes stand for the client number and the task number for that particular client.
      • Therefore we obtain the following set: {c1t1,c1t2,c1t3,c1t4 ; c2t1,c2t2,c2t3,c2t4 ; c3t1 ; c4t1,c4t2,c4t3}

      id  client_id  datetime_from     datetime_to       Name                Rep?  Inv?  >24h?
       1  1          2013-01-01 09:00  2013-01-01 10:00  gary doades         0     0     0
       8  1          2013-01-01 09:01  2013-01-01 09:00  gary doades         0     1     0
       3  1          2013-01-01 09:45  2013-01-01 10:45  gary doades         0     0     0
       6  1          2013-01-01 12:00  2013-01-01 12:30  gary doades         0     0     0
       4  2          2013-01-01 23:00  2013-01-02 06:00  richard ward        0     0     1
       5  2          2013-01-02 04:00  2013-01-02 04:15  richard ward        0     0     0
      10  2          2013-01-02 05:00  2013-01-02 06:00  richard ward        0     0     0
      11  2          2013-02-30 01:00  2013-02-30 02:00  richard ward        0     0     #¡VALOR!
       7  3          2013-01-01 01:00  2013-01-01 02:00  natasha lunt        0     0     0
       2  4          2013-01-01 01:00  2013-01-01 01:01  olivia groom-smith  1     0     0
       9  4          2013-01-01 01:00  2013-01-01 01:01  olivia groom-smith  0     0     0
      12  4          2013-01-01 18:00  2013-01-02 19:00  olivia groom-smith  0     0     1
  • 10. Vouching for data validity
      • In order to detect anomalies, we sorted the data by "datetime_from" and then implemented a few quick tests in boolean logic:
      • REPETITION "Rep?": IF(AND(C2=C3;D2=D3);1;0)
        Checks whether two visitation frames are repeated in consecutive rows. Instances #2 and #9, for Olivia Groom-Smith, are. The test is not applicable to the last row.
      • INVERSION "Inv?": IF([@[datetime_from]]>=[@[datetime_to]];1;0)
        Flags rows whose end time does not fall strictly after the start time. Instance #8, for Gary Doades, is flagged.
      • MORE THAN ONE DAY ">24h?": =DAYS([@[datetime_to]];[@[datetime_from]])
        Checks whether a visitation begins and ends on different days. Instances #12 and #4 do: #12 lasts for more than 24 hours, whereas #4 does not (just 7 hours).
      • Instance #11 also returns an error code (#¡VALOR!, the Spanish-locale #VALUE! error) because the date is not valid: February does not have 30 days.

      id  client_id  datetime_from     datetime_to       Name                Rep?  Inv?  >24h?
       7  3          2013-01-01 01:00  2013-01-01 02:00  natasha lunt        0     0     0
       2  4          2013-01-01 01:00  2013-01-01 01:01  olivia groom-smith  1     0     0
       9  4          2013-01-01 01:00  2013-01-01 01:01  olivia groom-smith  0     0     0
       1  1          2013-01-01 09:00  2013-01-01 10:00  gary doades         0     0     0
       8  1          2013-01-01 09:01  2013-01-01 09:00  gary doades         0     1     0
       3  1          2013-01-01 09:45  2013-01-01 10:45  gary doades         0     0     0
       6  1          2013-01-01 12:00  2013-01-01 12:30  gary doades         0     0     0
      12  4          2013-01-01 18:00  2013-01-02 19:00  olivia groom-smith  0     0     1
       4  2          2013-01-01 23:00  2013-01-02 06:00  richard ward        0     0     1
       5  2          2013-01-02 04:00  2013-01-02 04:15  richard ward        0     0     0
      10  2          2013-01-02 05:00  2013-01-02 06:00  richard ward        0     0     0
      11  2          2013-02-30 01:00  2013-02-30 02:00  richard ward        0     0     #¡VALOR!
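The three spreadsheet tests can also be expressed in a few lines of code. This is only a sketch of the same checks, not the author's implementation: the helper names `parse` and `flags` are hypothetical, and, unlike the spreadsheet, a row with an impossible date gets `None` for both "Inv?" and ">24h?" instead of a 0 and an error code.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"

# (id, client_id, datetime_from, datetime_to), already ordered by datetime_from
ROWS = [
    (7, 3, "2013-01-01 01:00", "2013-01-01 02:00"),
    (2, 4, "2013-01-01 01:00", "2013-01-01 01:01"),
    (9, 4, "2013-01-01 01:00", "2013-01-01 01:01"),
    (1, 1, "2013-01-01 09:00", "2013-01-01 10:00"),
    (8, 1, "2013-01-01 09:01", "2013-01-01 09:00"),
    (3, 1, "2013-01-01 09:45", "2013-01-01 10:45"),
    (6, 1, "2013-01-01 12:00", "2013-01-01 12:30"),
    (12, 4, "2013-01-01 18:00", "2013-01-02 19:00"),
    (4, 2, "2013-01-01 23:00", "2013-01-02 06:00"),
    (5, 2, "2013-01-02 04:00", "2013-01-02 04:15"),
    (10, 2, "2013-01-02 05:00", "2013-01-02 06:00"),
    (11, 2, "2013-02-30 01:00", "2013-02-30 02:00"),
]


def parse(text):
    """Return a datetime, or None when the text is not a real calendar date."""
    try:
        return datetime.strptime(text, FMT)
    except ValueError:
        return None


def flags(rows):
    """Map each id to (Rep?, Inv?, >24h?), mirroring the three formulas."""
    result = {}
    for pos, (rid, _client, start_txt, end_txt) in enumerate(rows):
        nxt = rows[pos + 1] if pos + 1 < len(rows) else None
        # Rep?: same start and end as the row right below, as in the sheet
        rep = int(nxt is not None and (start_txt, end_txt) == (nxt[2], nxt[3]))
        start, end = parse(start_txt), parse(end_txt)
        if start is None or end is None:
            result[rid] = (rep, None, None)  # stands in for the error code
            continue
        inv = int(start >= end)                  # Inv?: end not after start
        days = (end.date() - start.date()).days  # calendar days, like DAYS()
        result[rid] = (rep, inv, days)
    return result


table = flags(ROWS)
for rid, values in table.items():
    print(rid, values)
```

As in the slide, the repetition test only compares neighbouring rows, so it relies on the data being sorted first.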
  • 11. Alternatives and decision making
      • Our first observation is that the data show some conflicts that require decisions:
      • There are 4 clients (customers) and 12 tasks a priori.
      • Task c1t2 defines a visitation that ends before it begins. This could only be understood as a reverse visitation (server visiting client) or as a quantum effect.
          • We assume that those two alternatives lie outside the scope of the problem. With those ruled out, the choice is either to swap the two times or to remove the reservation row.
      • Some tasks already overlap in the order given a priori, such as c1t1 and c1t3, so rearrangement is required.
      • Task c2t1 occurs overnight, so it begins and ends on different dates.
      • Task c2t4 occurs in a different month from all the others, so it may be an outlier or a data entry mistake. Furthermore, the date is not valid, since February cannot have 30 days.
          • The course of action here could be either to remove the whole row or to correct the month to January.
          • Since there is no certainty that this table must contain data from a single month, the whole row will be treated as invalid.
      • Client #3 is the only one with a single visitation task defined.
      • Tasks c4t1 and c4t2 are repeated, so one of them could be deleted, or else they could be arranged by their id number. Furthermore, they last for only one minute, so they may be outliers or mistakes.
      • Task c4t3 lasts for more than 24 hours, so it may be an outlier or a mistake. It also shares the trait of c2t1, because it occurs overnight.
      • The output will be a set of, at most, max_id (12) elements, from which conflicting rows are to be deleted.
  • 12. Coding software and choosing tools
      • A large number of alternatives are readily available in the market that provide the software framework needed to deal with this kind of problem.
      • To name but a few: Microsoft Excel Macros, MS SQL Server, MS Visual Studio Express, MS Access, MS Project, Open Office Base, MySQL, the SAS (Statistical Analysis System) GANTT module, Visual Basic, MicroGPSS, FORTRAN, Borland C++, Delphi, Java, PHP,…
      • From here on we show a brief selection of those tools. The decision is usually made out of convenience, with criteria such as availability (having the software package already installed and configured on the machine), but there are multiple choices, all of them valid solutions.
      • Whenever possible, specialised freeware 4GT (Fourth Generation Techniques) will be used. These are generally considered cheaper and more efficient, they optimize internal computations and they work at a higher abstraction level, thus greatly simplifying coding.
      • Finally, we will deal with the database design issue according to analogous principles.
  • 13. Open Office Suite: Calc
      • When no budget is allocated for software licensing, universities and other organizations often use the OpenOffice suite for teaching and operational applications.
      • OpenOffice offers a range of solutions, such as the "Calc" spreadsheet program and the "Base" database management program.
      • Here we can observe how, upon importing the data into OpenOffice Calc in the same way as we did in Excel, the wrong "February 30th" date is immediately detected.
      • Some other tools (such as OpenProj) will automatically detect the mistake in the data and assign the next available date to the field (+2 days, that is, March the 2nd).
  • 14. Microsoft Excel Macros
      • Another option is to record a macro in Microsoft Excel.
      • To do so, we need to activate the "Developer" tab.
      • Recording a macro is a straightforward process, but the syntax and structure of the generated source code are quite complex should we need to amend anything in it.
      • To keep the code as readable as possible, we can use some other means.
      • The logical course of action seems to be to use SQL code to reach the required scheduling solution.
  • 15. Structured Query Language
      • Structured Query Language (SQL) is the market standard for solving these kinds of problems.
      • Therefore, some SQL programming expertise is assumed in order to reach a solution.
  • 16. Open Office Suite: Base
      • OpenOffice Base can be used to process the data and to query the table for the requested output, in the same way that the Microsoft Access software package would.
      • In OO Base we can quickly create the table that we need, with the advantage that it is open source software and implements SQL.
      • To do so, we first need to specify the field names and types. Finally, we need to populate the table with actual data from OO Calc or MS Excel.
  • 17. Visual Studio Express
      • Another Microsoft tool that can be used is Visual Studio Express (VSE; demo available for free download).
      • Here we can observe how VSE also detects that one of the dates (February the 30th) is invalid.
      • Visual Studio Express can also be used to process the data and to query the table for the requested output.
      • It also implements SQL and is designed for seamless data exchange with Microsoft SQL Server.
  • 18. Oracle and PL/SQL
      • Oracle is a very powerful tool used by larger organizations, such as city councils or international corporations. It has its own language extension for database management: PL/SQL.
      • PL/SQL stands for "Procedural Language Extensions to SQL." PL/SQL extends SQL by adding programming structures and subroutines available in any high-level language.
      • The syntax and capabilities are very similar to those of T-SQL and other derivatives of standard SQL.
      • Many Oracle applications are built using a client-server architecture. The Oracle database resides on the server. The program that makes requests against this database resides on the client machine. This program can be written in C, Java, or PL/SQL.
      • Because PL/SQL is just like any other programming language, it has syntax and rules that determine how programming statements work together. It is important to realize that PL/SQL is not a stand-alone programming language. PL/SQL is a part of the Oracle RDBMS, and it can reside in two environments, the client and the server. As a result, it is very easy to move PL/SQL modules between server-side and client-side applications.
      • Oracle also supplies a command-line SQL tool called SQL*Plus.
  • 19. Using Transact-SQL in Microsoft SQL Server 2k+
      • Microsoft SQL Server 2000 (and above) is one of the most popular software tools used to solve this kind of problem at the business level, wherever large numbers of tables and instances are involved.
      • MSSQLS uses a powerful extension of standard SQL originally developed by Sybase, called Transact-SQL. T-SQL code can be bundled into a variety of software applications: web pages, Visual Basic, Visual C# and so on.
      • Newer MS SQL Server versions, such as 2005, work with CSV files and interoperate with the features and functionality of Visual Studio, MS Excel and MS Project.
      • MS SQL Server requires a moderate investment in licensing.
      • The original slide shows, to the right, an example (cf. the bibliography) of how to use the ORDER BY and GROUP BY clauses in T-SQL to aggregate data.
      • For our exercise it is a very useful tool for designing code that orders the preprocessed visitation list by starting date and returns results ordered by client, once a scheduling scheme has been agreed upon and implemented.
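The example image from the slide is not part of this transcript. As a stand-in, here is a minimal sketch of GROUP BY and ORDER BY on a preprocessed reservation list. It uses SQLite through Python's standard `sqlite3` module rather than SQL Server, so it shows plain SQL, not T-SQL specifics; the table layout and the eight sample rows (the reservations the deck ends up keeping) are assumptions made for illustration.

```python
import sqlite3

# Preprocessed reservations: (id, client_id, datetime_from, datetime_to)
CLEANED = [
    (1, 1, "2013-01-01 09:00", "2013-01-01 10:00"),
    (2, 4, "2013-01-01 01:00", "2013-01-01 01:01"),
    (3, 1, "2013-01-01 09:45", "2013-01-01 10:45"),
    (4, 2, "2013-01-01 23:00", "2013-01-02 06:00"),
    (5, 2, "2013-01-02 04:00", "2013-01-02 04:15"),
    (6, 1, "2013-01-01 12:00", "2013-01-01 12:30"),
    (7, 3, "2013-01-01 01:00", "2013-01-01 02:00"),
    (10, 2, "2013-01-02 05:00", "2013-01-02 06:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reservations ("
    "id INTEGER PRIMARY KEY, client_id INTEGER, "
    "datetime_from TEXT, datetime_to TEXT)"
)
conn.executemany("INSERT INTO reservations VALUES (?, ?, ?, ?)", CLEANED)

# One line per client: number of reservations and earliest start, listed
# from the client who starts first to the one who starts last.
summary = conn.execute(
    "SELECT client_id, COUNT(*) AS n, MIN(datetime_from) AS first_start "
    "FROM reservations GROUP BY client_id "
    "ORDER BY first_start, client_id"
).fetchall()

for client_id, n, first_start in summary:
    print(client_id, n, first_start)
```

Storing timestamps as "YYYY-MM-DD HH:MM" text keeps alphabetical and chronological order identical, which is why MIN and ORDER BY behave as expected here.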
  • 20. Cleaning the data
      • Having observed several irregularities in the input data, we need to clean them up by deleting all affected rows.
      • To do so, we can either use the built-in tools of the software package of our choice, or write some code to do it for us.
      • Given that the number of instances (rows) in our table is very small, we choose to clean it by hand (with the software package's built-in tools) in order to speed up the process.
      • If the number of instances were higher (say dozens, hundreds or even millions of records), we would have to code a clean-up routine for this task.
      • According to the validity analysis performed at a previous stage, and given the time available and the scope, we choose to simplify as much as possible by completely removing any instance that shows any of the following conflicts:
          • REPETITION: All reservations must be DISTINCT, so second and further identical reservations are deleted. Only the one with the lowest id is kept.
          • INVERSION: Reservations with null or negative time lapses are deleted.
          • MORE THAN 24 HOURS SERVICE TIME: Reservations that span more than one day are deleted only if the total service time is greater than 24 hours. Otherwise they are kept, assuming they occur over a night shift. We will also keep visitations lasting for just one minute, assuming they represent a quick status check.
          • WRONG DATA INPUT: Reservations with a wrong date, or any other wrong piece of data in any field, are deleted.
  • 21. Pseudocode for data cleaning
      • Since no tool was specified in the problem statement, and there is a wide range of options including several variants and extensions of SQL, we will use pseudocode to show how to program the main cleaning routine.
      • Once a tool has been chosen, we may easily translate this pseudocode into the grammar of the language of choice, without any loss of generality.
      • We assume that the language provides a few simple subroutines for ordering, deletion and so on.
      • We assume ROWS (for short) is a table that is to contain the RESERVATIONS.
      • ROWS := SELECT DISTINCT FROM RESERVATIONS  -- removes duplicates (comparing every field except 'rows.id', the master key)
      • ORDER ROWS BY DATETIME_FROM  -- orders all rows by starting time
      • FOR ID IN ROWS LOOP:  -- for every distinct row, repeat:
          • IF DATE(ROWS[ID].DATETIME_FROM) < 0  -- all invalid dates are assumed to return a negative value
            THEN DELETE(ROWS[ID])  -- cleans wrongly timestamped rows
          • ELSE IF ROWS[ID].DATETIME_FROM >= ROWS[ID].DATETIME_TO
            THEN DELETE(ROWS[ID])  -- cleans rows with non-positive visitation time spans
          • ELSE IF (ROWS[ID].DATETIME_TO - ROWS[ID].DATETIME_FROM) > 24 HOURS
            THEN DELETE(ROWS[ID])  -- cleans rows with visitations lasting for more than 24 hours
      • END LOOP  -- end of loop
      • COMMIT_WRITE(ROWS,RESERVATIONS)  -- replaces all initial rows with the result of this cleaning routine
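The cleaning rules can be tried out on the twelve rows of the dataset with a short script. This is a sketch under stated assumptions rather than the author's routine: duplicates are taken to mean "same client, same start, same end", the 24-hour rule is applied to the actual duration, and the helper names `parse` and `clean` are hypothetical.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M"

# (id, client_id, datetime_from, datetime_to) as listed in the dataset
ROWS = [
    (1, 1, "2013-01-01 09:00", "2013-01-01 10:00"),
    (8, 1, "2013-01-01 09:01", "2013-01-01 09:00"),
    (3, 1, "2013-01-01 09:45", "2013-01-01 10:45"),
    (6, 1, "2013-01-01 12:00", "2013-01-01 12:30"),
    (4, 2, "2013-01-01 23:00", "2013-01-02 06:00"),
    (5, 2, "2013-01-02 04:00", "2013-01-02 04:15"),
    (10, 2, "2013-01-02 05:00", "2013-01-02 06:00"),
    (11, 2, "2013-02-30 01:00", "2013-02-30 02:00"),
    (7, 3, "2013-01-01 01:00", "2013-01-01 02:00"),
    (2, 4, "2013-01-01 01:00", "2013-01-01 01:01"),
    (9, 4, "2013-01-01 01:00", "2013-01-01 01:01"),
    (12, 4, "2013-01-01 18:00", "2013-01-02 19:00"),
]


def parse(text):
    """Return a datetime, or None when the text is not a real calendar date."""
    try:
        return datetime.strptime(text, FMT)
    except ValueError:
        return None


def clean(rows):
    """Drop invalid, inverted, over-24-hour and repeated reservations."""
    kept, seen = [], set()
    for rid, client, start_txt, end_txt in sorted(rows):  # lowest id first
        start, end = parse(start_txt), parse(end_txt)
        if start is None or end is None:
            continue  # WRONG DATA INPUT
        if start >= end:
            continue  # INVERSION
        if end - start > timedelta(hours=24):
            continue  # MORE THAN 24 HOURS SERVICE TIME
        key = (client, start, end)
        if key in seen:
            continue  # REPETITION: only the lowest id is kept
        seen.add(key)
        kept.append((rid, client, start, end))
    kept.sort(key=lambda row: (row[2], row[0]))  # by start time, then id
    return kept


cleaned = clean(ROWS)
print([row[0] for row in cleaned])  # [2, 7, 1, 3, 6, 4, 5, 10]
```

Rows 8, 9, 11 and 12 are dropped, which leaves eight reservations.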
  • 22. Result after data cleaning
      • Subsets to be subtracted:
          • Repetition: candidate subset {c4t1,c4t2}; subset chosen for removal {c4t2}
          • Inversion: one negative time lapse {c1t2}
          • >24 hours: {c4t3}
          • Wrong input: date out of range (February 30th) {c2t4}
      • Subtraction set: {c1t2,c2t4,c4t2,c4t3}
      • We end up with 8 instances after cleaning: {c1t1,c1t2,c1t3,c1t4, c2t1,c2t2,c2t3,c2t4, c3t1, c4t1,c4t2,c4t3} – {c1t2,c2t4,c4t2,c4t3} = {c1t1,c1t3,c1t4, c2t1,c2t2,c2t3, c3t1, c4t1}
      • Or, according to the master key (reservation 'id'): {1,8,3,6,4,5,10,11,7,2,9,12} - {8,9,11,12} = {7,2,1,3,6,4,5,10}
  • 23. PERT and GANTT
      • The Program Evaluation and Review Technique (PERT) belongs to a set of project management tools that are commonly used in scheduling environments.
      • The most widely known of these tools is the Gantt bar chart, where we can define tasks to be executed in parallel, in series or with interdependencies.
      • There are, again, a number of tools that can read an input, generate a Gantt chart and apply scheduling schemes to the data, such as Microsoft Project, GanttProject, OpenProj and several others. Or we can just use a general-purpose RDBMS with SQL.
  • 24. Scheduling schemes
      • Once the data are clean, several issues come to mind that we should consider when scheduling the visitations. To name but a few of the most relevant:
          • We could want all of the visitations to be scheduled as soon as possible.
          • The first visitation occurs at 9:00 am, so we could schedule all of the reservations to be attended only during office hours.
          • We could also want to add breaks for meals, rest, service maintenance or other managerial reasons. We assume none.
          • Some visitations occur overnight, so we can decide to schedule visitations at any time of the day or night.
          • We could want to reschedule as few reservations as possible, or to have all visitations for the same client served together, one right after another, so that each client comes only once.
      • We could want to simplify:
          • Take the earliest reservation starting time as the beginning and then queue all the others right behind it, ordered first by their starting time and second (if more than one share it) by other criteria.
          • Other possible criteria are: visitation duration, client id, alphabetical order by name, or any other priority scheme. For the sake of simplicity we choose the plain vanilla reservation id (the table's master key).
  • 25. Scheduling scheme chosen
      • These and other criteria can be combined in a number of ways that result in very different scheduling schemes. The choice among them is usually made according to the meta-knowledge that we have of the problem's environment (be it a hospital, a supermarket, a computer's CPU…). This was also the case at the data clean-up stage.
      • Since the problem was submitted without context, we are somewhat free to choose here. Our scheduling scheme is defined as follows:
          • The earliest reservation with the lowest 'id' will be scheduled as the first one.
          • All others will follow without any time gaps, ordered by their starting time and, in case of conflict, by their reservation id.
  • 26. Coding the scheme
      • As before, we use pseudocode to show a simple scheduling routine:
      • We assume that, after the COMMIT in the cleaning routine, all ROWS can be addressed through consecutive index positions.
      • ROWS := SELECT ALL FROM RESERVATIONS  -- loads data from the Reservations table
      • ORDER ROWS BY DATETIME_FROM, ID  -- orders all rows by starting time and, in case of a tie, by the master key
      • FOR I FROM FIRST TO LAST-1 LOOP:  -- for every row but the last one, repeat with index 'i':
          • TIMESPAN := ROWS[I+1].DATETIME_TO - ROWS[I+1].DATETIME_FROM  -- calculates the duration of the next task
          • ROWS[I+1].DATETIME_FROM := ROWS[I].DATETIME_TO  -- sets the next task to start right after the previous one ends
          • ROWS[I+1].DATETIME_TO := ROWS[I].DATETIME_TO + TIMESPAN  -- sets the termination time of the next task
      • END LOOP  -- end of loop
      • COMMIT_WRITE(ROWS,VISITATIONS)  -- overwrites the VISITATIONS table with the result of this scheduling routine
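To see what this scheme produces, the routine can be run on the eight reservations kept after cleaning. The following is a minimal sketch of that back-to-back scheme; the function name `schedule` and the tuple layout are hypothetical, and ties on starting time are broken by id as chosen above.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"

# The eight reservations kept after cleaning:
# (id, client_id, datetime_from, datetime_to)
CLEANED = [
    (1, 1, "2013-01-01 09:00", "2013-01-01 10:00"),
    (3, 1, "2013-01-01 09:45", "2013-01-01 10:45"),
    (6, 1, "2013-01-01 12:00", "2013-01-01 12:30"),
    (4, 2, "2013-01-01 23:00", "2013-01-02 06:00"),
    (5, 2, "2013-01-02 04:00", "2013-01-02 04:15"),
    (10, 2, "2013-01-02 05:00", "2013-01-02 06:00"),
    (7, 3, "2013-01-01 01:00", "2013-01-01 02:00"),
    (2, 4, "2013-01-01 01:00", "2013-01-01 01:01"),
]


def schedule(rows):
    """Queue reservations back to back, keeping each one's duration."""
    parsed = [
        (rid, client, datetime.strptime(a, FMT), datetime.strptime(b, FMT))
        for rid, client, a, b in rows
    ]
    parsed.sort(key=lambda row: (row[2], row[0]))  # start time, then id
    result = []
    cursor = None  # end time of the visitation scheduled last
    for rid, client, start, end in parsed:
        duration = end - start
        new_start = start if cursor is None else cursor
        new_end = new_start + duration
        result.append((rid, client, new_start, new_end))
        cursor = new_end
    return result


visitations = schedule(CLEANED)
for rid, client, start, end in visitations:
    print(rid, client, start.strftime(FMT), end.strftime(FMT))
```

Under these assumptions the queue starts at 01:00 with reservation 2 and the last visitation (reservation 10) ends at 12:46 on the same day, since the eight durations add up to 11 hours and 46 minutes.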
  • 27. Reporting output
      • After scheduling, we code a reporting routine in the same fashion as before:
      • We assume VIS (for short) is to contain the final output from VISITATIONS.
      • ORDER VISITATIONS BY CLIENT_ID, DATETIME_FROM  -- orders all rows by client and, within each client, by starting time
      • VIS := SELECT FROM VISITATIONS:  -- loads several columns from the ordered Visitations table
          • VIS.ID
          • VIS.CLIENT_ID
          • VIS.NAME
          • VIS.DATETIME_FROM
          • VIS.DATETIME_TO
      • COMMIT WRITE(VIS,FILE(".Output.csv";#CSV))  -- writes the result of this query to a file in the comma-separated values format
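A report of this kind takes only a few lines with a CSV writer. The sketch below writes to an in-memory buffer so that it runs anywhere; the four sample rows are illustrative values obtained by queueing the first cleaned reservations back to back, and the function name `write_report` is hypothetical.

```python
import csv
import io

# (id, client_id, name, datetime_from, datetime_to): illustrative scheduled
# visitations, not output taken from the original slides.
VISITATIONS = [
    (2, 4, "olivia groom-smith", "2013-01-01 01:00", "2013-01-01 01:01"),
    (7, 3, "natasha lunt", "2013-01-01 01:01", "2013-01-01 02:01"),
    (1, 1, "gary doades", "2013-01-01 02:01", "2013-01-01 03:01"),
    (3, 1, "gary doades", "2013-01-01 03:01", "2013-01-01 04:01"),
]


def write_report(rows, stream):
    """Write the visitations as CSV, ordered by client and then start time."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["id", "client_id", "name", "datetime_from", "datetime_to"])
    # Timestamps in "YYYY-MM-DD HH:MM" form sort chronologically as text.
    for row in sorted(rows, key=lambda r: (r[1], r[3])):
        writer.writerow(row)


buffer = io.StringIO()  # a real run would pass an open file instead
write_report(VISITATIONS, buffer)
report = buffer.getvalue()
print(report)
```

Passing a file opened with `open(path, "w", newline="")` instead of the buffer would produce the CSV file on disk.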
  • 28. ACID Compliant DBMS
      • In computer science, ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that database transactions are processed reliably.
      • In the context of databases, a single logical operation on the data is called a transaction.
      • This approach has many advantages and only slight disadvantages, which show up when handling really huge databases (say terabytes of data) in real-time environments. In those rare environments, a NoSQL approach might be preferred.
      • As we will see in the following slides, Microsoft's SQL Server Express software solution ensures ACID compliance.
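Atomicity, the first of the four properties, is easy to demonstrate: either every statement in a transaction takes effect or none does. The sketch below uses SQLite through Python's standard `sqlite3` module as a stand-in for SQL Server, and the two-column table is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reservations ("
    "id INTEGER PRIMARY KEY, client_id INTEGER NOT NULL)"
)
conn.execute("INSERT INTO reservations VALUES (1, 1)")
conn.commit()

try:
    # Used as a context manager, the connection commits if the block
    # succeeds and rolls back if an exception escapes from it.
    with conn:
        conn.execute("INSERT INTO reservations VALUES (2, 1)")  # valid
        conn.execute("INSERT INTO reservations VALUES (1, 4)")  # duplicate key
except sqlite3.IntegrityError:
    pass  # the whole transaction, including the valid insert, is undone

count = conn.execute("SELECT COUNT(*) FROM reservations").fetchone()[0]
print(count)  # 1
```

The valid insert of reservation 2 disappears together with the failing one, so the table is left exactly as it was before the transaction began.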
  • 29. ACID Compliancy in MS SQL Server (I)
  • 30. ACID Compliancy in MS SQL Server (II)
  • 31. ACID Compliancy in MS SQL Server (III)
  • 32. Database design: a bird's eye view
      • At this point, we again sketch the problem on a sheet of paper to gain further insight before addressing database creation and management.
      • The database is conceived as part of a reservation system that receives online reservation requests, processes them by scheduling according to the scheme, and produces a visitation table. It also allows individual management of each of the visitators (just one instance in our example), clients, reservations and visitations.
      • We expanded the basic functionality of the software by allowing more than one agent to carry out visitations, dubbed "visitator".
      • It will contain four tables: Visitators, Clients, Reservations and Visitations.
      • It will implement one "Reschedule" function and three procedures: Edit_clients, Edit_visitators and Edit_Reservations.
  • 33. Database normalization
      • Database normalization is the process of organizing the fields and tables of a relational database to minimize redundancy and dependency.
      • Normalization usually involves dividing large tables into smaller (and less redundant) tables and defining relationships between them.
      • The objective is to isolate data so that additions, deletions and modifications of a field can be made in just one table and then propagated through the rest of the database using the defined relationships.
      • The Normal Forms (NF) of relational database theory provide criteria for determining a table's degree of immunity against logical inconsistencies and anomalies. The higher the normal form applicable to a table, the less vulnerable it is.
      • For OLAP (Online Analytical Processing) applications, such as data mining tools, a lower normal form might be preferred because they are primarily "read only" databases that tend to extract accumulated historical data, whereas transaction-intensive applications will usually opt for a higher normal form.
      • For small problems like this one, usually only 1NF, 2NF or 3NF are used.
  • 34. Database map: Table definition
      • The database will implement the following four tables: Visitators, Clients, Reservations and Visitations.
      • The tables contain the fields specified below. An asterisk (*) is added after the primary key identifier of each table.
          • VISITATORS: v_id (*), v_name
          • CLIENTS: client_id (*), name
          • RESERVATIONS: id (*), v_id, client_id, datetime_from, datetime_to
          • VISITATIONS: V_id (*), v_id, client_id, datetime_from, datetime_to, Rescheduled
      • NOTES:
          • The client name field has been moved out of the reservations table because, given the client_id, it is redundant. A table has been created to hold all of the clients' names associated with their client_id.
          • The client name field has been moved out of the visitations table for the same reason. If we need to print a report containing the visitations as scheduled, a query can access the Clients table to retrieve that piece of data.
          • The visitator's name has been moved out of reservations for analogous reasons. A visitators table has been created.
          • The visitator's identifier "v_id" has been added to the reservations and visitations tables so as to allow choosing among several visitators.
          • Rescheduled is a boolean field that has been added to keep track of rescheduling operations. Any visitation that undergoes a change in any other field for rescheduling purposes will be marked with a TRUE value, and with FALSE otherwise.
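The four tables can be written down as SQL data definitions. The following is a sketch under several assumptions, none of which come from the slides: it uses SQLite through Python's `sqlite3` module, the column types are guesses, and the primary key of VISITATIONS is renamed to the hypothetical `visitation_id` because SQL identifiers are case-insensitive and "V_id" would clash with the visitator's "v_id". The sample visitator name is also made up.

```python
import sqlite3

# Column types and the name "visitation_id" are assumptions for illustration.
SCHEMA = """
CREATE TABLE visitators (
    v_id   INTEGER PRIMARY KEY,
    v_name TEXT NOT NULL
);
CREATE TABLE clients (
    client_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE reservations (
    id            INTEGER PRIMARY KEY,
    v_id          INTEGER NOT NULL REFERENCES visitators(v_id),
    client_id     INTEGER NOT NULL REFERENCES clients(client_id),
    datetime_from TEXT NOT NULL,
    datetime_to   TEXT NOT NULL
);
CREATE TABLE visitations (
    visitation_id INTEGER PRIMARY KEY,
    v_id          INTEGER NOT NULL REFERENCES visitators(v_id),
    client_id     INTEGER NOT NULL REFERENCES clients(client_id),
    datetime_from TEXT NOT NULL,
    datetime_to   TEXT NOT NULL,
    rescheduled   INTEGER NOT NULL DEFAULT 0  -- boolean: 1 = TRUE, 0 = FALSE
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO visitators VALUES (1, 'visitator one')")
conn.execute("INSERT INTO clients VALUES (3, 'natasha lunt')")
conn.execute(
    "INSERT INTO visitations VALUES "
    "(1, 1, 3, '2013-01-01 01:01', '2013-01-01 02:01', 1)"
)

# The client name is no longer stored with the visitation, so a report
# retrieves it from the Clients table through a join.
report = conn.execute(
    "SELECT c.name, v.datetime_from, v.rescheduled "
    "FROM visitations AS v JOIN clients AS c ON c.client_id = v.client_id"
).fetchall()
print(report)
```

The join in the last query is the price of removing the redundant name columns: the name lives in one place only, and reports look it up when needed.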
  • 35. Database map: Procedures and functions
      • The database will implement three procedures and one function, which can be called from any of the procedures.
      • The function "RESCHEDULE" will read the Reservations table, and any other table it needs, and will write only to the Visitations table. Its purpose is to reschedule all rows according to the scheme previously defined.
      • There will be three procedures:
          • EDIT_CLIENTS: Reads and writes the Clients table. Writes the Reservations table. Finally calls the Reschedule function. It is used to modify any information concerning a particular client instance, such as the name field, in all of the records. It is also used to remove a client with all of its reservations (and therefore its visitations).
          • EDIT_VISITATORS: Reads and writes the Visitators table. Writes the Reservations table. Finally calls the Reschedule function. It is used to modify any information concerning a particular visitator instance, such as the name field, in all of the records. It is also used to remove a visitator with all of its reservations (and therefore its visitations).
          • EDIT_RESERVATIONS: Reads and writes the Reservations table. Finally calls the Reschedule function. It is used to edit any piece of data concerning a reservation, such as the visitator, the client or the dates and times arranged. It is also used to delete a reservation.
      • NOTES:
          • Only the RESCHEDULE function can access the Visitations table, which is considered the single most valuable source of output reports from the program's execution.
          • Upon the deletion of any or all of the Reservations, some garbage data may remain stored in the Clients and Visitators tables. That is why we need specific procedures to edit those.
  • 36. Database map, visually
      • [Diagram, labelled PERT] Blue boxes for tables, green disks for procedures, arrows for data/operation flows.
  • 37. References
      • "Microsoft SQL Server 2005 New Features" by Michael Oatley. McGraw-Hill/Osborne, 2005 (288 pages). ISBN: 0072227761
      • "SQL Server 2000: Stored Procedure Programming" by Dejan Sunderic and Tom Woodhead. Osborne Database Professional's Library.
      • "Microsoft Excel 2007 VBA (Macros)". Premier Training Limited (London).
      • "Macros Visual Basic para Excel" by José Pedro García Sabater. ROGLE – Universitat Politècnica de València.
      • "Microsoft SQL Server 2005 Express Edition for Dummies" by Robert Schneider. Wiley Publishing, Inc.
      • "Oracle PL/SQL by Example" by various authors. Pearson Education as Prentice Hall Professional Technical Reference.
  • 38. Any questions?
      • alfonsodelafuenteruiz@yahoo.es
      • http://creativecommons.org/licenses/by-nc-sa/3.0/legalcode
      • Please excuse any errata.
      • Thanks for your attention.