3/9/2010Figen Bilir ©1
Project Overview: AllWorks

This SSIS project was built for a fictitious construction company called AllWorks. The goal was to design and build a SQL Server 2005 database to track employee and customer information, timesheet and labor rate data, job order information, job materials, and customer invoices. In the client project scenario, AllWorks currently stores this information in Excel spreadsheets, XML files, and CSV files.

  • Extract data from diverse files (*.csv, *.xls, etc.)
  • Transform data as required by business and SQL database requirements
  • Load data into a local SQL AllWorksDBStudent database
Database Diagram

Source Data Review
Employee Master Package

This package loads the employee data from an Excel spreadsheet, Employees.XLS. Data conversion is applied so that the data is in a usable format. A log records the number of updates and inserts sent to the destination. The results are emailed to a recipient; the message reports whether the package succeeded, along with the package name, the user who ran it, and the record counts. If the package fails during execution, an error message is sent to a specific address to alert the recipient of the failure.
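
The insert and update counts described above can be illustrated outside SSIS. The following is a minimal Python sketch, not the package's actual data flow: the function name, the `EmployeeID` key, and the sample rows are hypothetical, and it assumes that unchanged rows are neither inserted nor updated.

```python
def classify_rows(incoming, existing, key="EmployeeID"):
    """Split incoming rows into inserts (new key) and updates (changed row)."""
    existing_by_key = {row[key]: row for row in existing}
    inserts, updates = [], []
    for row in incoming:
        current = existing_by_key.get(row[key])
        if current is None:
            inserts.append(row)      # key not in the destination yet
        elif current != row:
            updates.append(row)      # key exists but some column differs
    return inserts, updates


# Hypothetical sample data: one changed row, one new row, one unchanged row.
existing = [
    {"EmployeeID": 1, "Name": "Ann"},
    {"EmployeeID": 2, "Name": "Bob"},
]
incoming = [
    {"EmployeeID": 1, "Name": "Ann"},
    {"EmployeeID": 2, "Name": "Robert"},
    {"EmployeeID": 3, "Name": "Cy"},
]

inserts, updates = classify_rows(incoming, existing)
print(f"inserts={len(inserts)} updates={len(updates)}")
```

The two counts are the kind of figures the notification email would report.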
Sendmail Setup & Outcome
Employee Rate Package

This package loads the employee rate data from an Excel spreadsheet, Employees.XLS. Data conversion is applied so that the data is in a usable format. All employee records are validated with a Lookup; invalid ones are logged to a CSV file. A log records the number of updates, inserts, and invalid records sent to the destination. The results are emailed to a recipient; the message reports whether the package succeeded, along with the package name, the user who ran it, and the record counts. If the package fails during execution, an error message is sent to a specific address to alert the recipient of the failure.
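
The lookup-and-log pattern in this package can be sketched in a few lines of Python. This is only an illustration under assumptions: the column names (`EmployeeID`, `HourlyRate`), the function names, and the sample values are hypothetical, and an in-memory buffer stands in for the CSV log file.

```python
import csv
import io


def validate_rates(rate_rows, known_employee_ids):
    """Route each rate row to 'valid' or 'invalid' based on an employee lookup."""
    valid, invalid = [], []
    for row in rate_rows:
        if row["EmployeeID"] in known_employee_ids:
            valid.append(row)
        else:
            invalid.append(row)
    return valid, invalid


def write_invalid_log(invalid_rows, handle):
    """Write the rejected rows to a CSV log (any writable text handle)."""
    writer = csv.DictWriter(handle, fieldnames=["EmployeeID", "HourlyRate"])
    writer.writeheader()
    writer.writerows(invalid_rows)


# Hypothetical sample data: employee 99 does not exist in the employee table.
known_employee_ids = {1, 2}
rate_rows = [
    {"EmployeeID": 1, "HourlyRate": 25.0},
    {"EmployeeID": 99, "HourlyRate": 30.0},
    {"EmployeeID": 2, "HourlyRate": 27.5},
]

valid, invalid = validate_rates(rate_rows, known_employee_ids)
log = io.StringIO()  # stands in for the CSV log file
write_invalid_log(invalid, log)
print(f"valid={len(valid)} invalid={len(invalid)}")
```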
Client Master Package

Because of the foreign key constraint between the dbo.Client and dbo.County tables (CountyPK is a FK on the Clients table), the County table must be populated first. For that reason, the data flow for County data runs first, followed by the data flow for Client data. The County Definitions worksheet in this package's source file is used to populate the dbo.County table in the database.
Client Master Package Cont’d

Invalid CountyIDs are written to the log file shown above. They are identified by a check against the County table, which is already populated because its data flow runs before the Client data flow.
Client Groupings Master Package

The Clientgeographies.XLS spreadsheet is in normalized form. After data conversion, the incoming data is aggregated by groupingno and groupingname.
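
As a rough illustration of the aggregation step, the Python sketch below groups rows by `groupingno` and `groupingname` and counts the rows in each group. The sample rows, the extra `ClientID` column, and the function name are hypothetical; the actual Aggregate transformation may compute different outputs.

```python
def aggregate_groupings(rows):
    """Group rows by (groupingno, groupingname) and count rows per group."""
    groups = {}
    for row in rows:
        key = (row["groupingno"], row["groupingname"])
        groups[key] = groups.get(key, 0) + 1
    return groups


# Hypothetical sample rows: two clients in grouping 1, one in grouping 2.
rows = [
    {"groupingno": 1, "groupingname": "North", "ClientID": "C1"},
    {"groupingno": 1, "groupingname": "North", "ClientID": "C2"},
    {"groupingno": 2, "groupingname": "South", "ClientID": "C3"},
]

groups = aggregate_groupings(rows)
for (number, name), count in sorted(groups.items()):
    print(number, name, count)
```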
Client Groupings to Client Xref Table Package

Three lookups validate the incoming data. First, ClientID is checked to validate the AccountKey. Next, GroupingNo is validated against ClientsGrouping. Finally, GroupingID and ClientID are checked against the ClientGroupingsXClients table to determine which rows to insert in the next step.
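
The three-lookup chain can be sketched as follows. This is a simplified Python illustration, not the package itself: the dictionaries stand in for the lookup tables, and the function name, key values, and sample rows are hypothetical. It assumes that rows failing either of the first two lookups are rejected and that pairs already in the cross-reference table are skipped.

```python
def resolve_xref(rows, client_keys, grouping_keys, existing_pairs):
    """Apply three lookups and return (pairs to insert, rejected rows)."""
    seen = set(existing_pairs)
    to_insert, rejected = [], []
    for row in rows:
        client_pk = client_keys.get(row["ClientID"])        # lookup 1: client
        grouping_pk = grouping_keys.get(row["GroupingNo"])  # lookup 2: grouping
        if client_pk is None or grouping_pk is None:
            rejected.append(row)
            continue
        pair = (grouping_pk, client_pk)
        if pair not in seen:                                 # lookup 3: xref table
            seen.add(pair)
            to_insert.append(pair)
    return to_insert, rejected


# Hypothetical lookup tables and incoming rows.
client_keys = {"C1": 10, "C2": 20}
grouping_keys = {1: 100, 2: 200}
existing_pairs = {(100, 10)}
rows = [
    {"ClientID": "C1", "GroupingNo": 1},  # already in the xref table
    {"ClientID": "C2", "GroupingNo": 1},  # new pair, will be inserted
    {"ClientID": "C9", "GroupingNo": 2},  # unknown client
    {"ClientID": "C2", "GroupingNo": 7},  # unknown grouping
]

to_insert, rejected = resolve_xref(rows, client_keys, grouping_keys, existing_pairs)
print(to_insert, len(rejected))
```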
Project Job Master Package

Because of the FK relationship between the Clients and JobMaster tables, ClientPK has to be validated with a lookup. Any invalid clients are written to the CSV log file specified in the Flat File Connection Manager.
Project Job Time Sheets (Labor) Package

This package loads data from several CSV JobTimeSheet files into the SQL Server 2005 database. The content of its Data Flow task is shown on the next two slides.
Project Job Time Sheets (Labor) Package Cont’d

In the data flow, data is read from each CSV file in turn. After data conversion takes place, EmployeeID and JobMasterID are validated with Lookups against the Employee and JobMaster tables, respectively.
Project Job Time Sheets (Labor) Package Cont’d

The package reads the time sheet data from several CSV files and inserts new rows or updates existing rows when the data differs. The Load Job Time Sheet control flow has a Foreach Loop Container that loops through the job time sheet files and processes each one. A script accumulates the total row counts and the file count across iterations.
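
The accumulation across files can be sketched in Python as below. This is an illustration only: the real package uses a Foreach Loop Container and a Script Task, whereas here a dictionary of in-memory CSV texts stands in for the files, and the file names, column names, and function name are hypothetical.

```python
import csv
import io


def count_time_sheet_rows(files):
    """Loop over each file and accumulate the file count and total row count."""
    file_count = 0
    total_rows = 0
    for name, text in files.items():
        reader = csv.DictReader(io.StringIO(text))
        rows_in_file = sum(1 for _ in reader)  # data rows, header excluded
        total_rows += rows_in_file
        file_count += 1
    return file_count, total_rows


# Hypothetical in-memory stand-ins for the CSV job time sheet files.
files = {
    "sheet_a.csv": "EmployeeID,JobMasterID,Hours\n1,500,8\n2,500,6\n",
    "sheet_b.csv": "EmployeeID,JobMasterID,Hours\n1,501,4\n",
}

file_count, total_rows = count_time_sheet_rows(files)
print(f"files={file_count} rows={total_rows}")
```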
Project Job Time Sheets (Labor) Package Cont’d

The script accumulates the total counts into the variables listed under ReadOnly and ReadWrite Variables. These variables are used in the definition of the mail message.
Master Package

This is the main package; it launches the execution of all ETL packages from SQL Server 2005. After all ETL packages have run successfully, it launches the database maintenance tasks: database shrink, index build, statistics update, and database backup. When the whole package completes, a success notification email is sent. If any maintenance task fails, an “Unsuccessful Email” is sent from that task.
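
The control flow of the master package can be approximated with a short Python sketch. All names here are hypothetical, and the steps are plain callables rather than SSIS tasks. The sketch assumes that a failure in any step stops the run and sends a failure notice; the source describes that behavior only for the maintenance tasks.

```python
def run_master(etl_packages, maintenance_tasks, notify):
    """Run ETL steps, then maintenance steps; notify on failure or success."""
    # Maintenance runs only after every ETL step has completed without error.
    for name, run in list(etl_packages) + list(maintenance_tasks):
        try:
            run()
        except Exception as exc:
            notify(f"Unsuccessful: {name} ({exc})")
            return False
    notify("Successful: all packages and maintenance tasks completed")
    return True


def failing_backup():
    """Hypothetical maintenance task that simulates a failure."""
    raise RuntimeError("disk full")


messages = []
ok = run_master(
    etl_packages=[("EmployeeMaster", lambda: None), ("ClientMaster", lambda: None)],
    maintenance_tasks=[("ShrinkDatabase", lambda: None), ("BackupDatabase", failing_backup)],
    notify=messages.append,
)
print(ok, messages)
```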
SQL Server Agent Job

All packages were deployed to the (local) SQL Server, and a SQL Server Agent job, Execute SSIS Student Project, was set up to run this Master Package nightly at 12:00 AM.

Bilir's Business Intelligence Portfolio SSIS Project
