DMDW 8. Student Presentation - Groovy to MongoDB


8. ETL Project by Maximilian Butterer

Published in: Technology
  1. DMDW - ETL Project
     - the Groovy way
     - by Maximilian Butterer
  2. What was the job?
     - Extract the data from the room-plan file (Excel)
     - Transform it into a new structure (data types)
     - Load it into a new target (e.g. a database system)
     - Create some kind of documentation (how-to), or
     - present it to you
  3. #1: Export the data
     - Exporting data from Excel is pretty easy
     - So I exported the whole file as a CSV file
     - 1st problem: the field separator is a semicolon, not a comma
     - 2nd problem: the encoding is not UTF-8
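Both export problems can be handled when reading the file. The following is a minimal sketch, not the script from the talk. The slides only say the encoding is "not UTF-8", so ISO-8859-1 is an assumption here, and the sample header and row are made up for illustration.

```groovy
import java.nio.charset.StandardCharsets

// Assumption: the export is ISO-8859-1; the slides only say "not UTF-8".
def sourceCharset = StandardCharsets.ISO_8859_1

// Stand-in for the exported file (hypothetical content): a header and one row
// containing umlauts, stored as raw bytes in the assumed source encoding.
byte[] exported = 'Raum;Geb\u00e4ude;Pl\u00e4tze\nA101;H\u00f6rsaal;40\n'.getBytes(sourceCharset)

// Decode with the matching charset, then split every line on the semicolon.
// The limit of -1 keeps trailing empty fields instead of dropping them.
def rows = new String(exported, sourceCharset)
        .readLines()
        .collect { it.split(';', -1) as List }

rows.each { println it }
```

For a real file, the same decoding can be requested directly with `new File(path).getText('ISO-8859-1')` or `eachLine('ISO-8859-1') { ... }`, where `path` is whatever the export was saved as.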
  4. #2: A helper
     - Because CSV is plain text, it's easy to parse
     - I created a Groovy script for the conversion
     - My solution is not the cleanest way, but probably the easiest
     - 3rd problem: the data is not consistent; there are corrupted data sets
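One simple way to deal with inconsistent or corrupted rows is to check each line before converting it and to set the bad ones aside. This sketch assumes a hypothetical three-column layout (room; date; seats), since the slides do not show the real columns or which rows were corrupted.

```groovy
// Hypothetical sample lines; the real column layout is not given in the slides.
def lines = [
    'A101;12.04.2011;40',
    'A102;;35',              // missing date
    'B201;13.04.2011',       // too few columns
    'B202;14.04.2011;many',  // seat count is not a number
]

def good = []
def rejected = []

lines.each { line ->
    // Split on the semicolon and trim surrounding whitespace from each field.
    def fields = line.split(';', -1)*.trim()

    // A row is accepted only if it has exactly three non-empty fields
    // and the last one is an integer.
    def ok = fields.size() == 3 && fields.every { it } && fields[2].isInteger()

    if (ok) {
        good << line
    } else {
        rejected << line
    }
}

println "accepted: ${good.size()}, rejected: ${rejected.size()}"
```

Keeping the rejected lines, rather than silently dropping them, makes it possible to fix them by hand in the source file later.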
  5. #3: Put it into
     - With all the objects parsed and converted,
     - just put everything into the database
     - 4th problem: how?
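The slides do not say how the load step was actually done. One possible approach, sketched here under that caveat, is to turn each parsed record into a map and write it out as one JSON document per line, a shape that MongoDB's `mongoimport` tool can read. The field names `room` and `seats` are hypothetical.

```groovy
import groovy.json.JsonOutput

// Hypothetical parsed records; the field names are illustrative only.
def records = [
    [room: 'A101', seats: 40],
    [room: 'A102', seats: 35],
]

// Serialize each record as a single-line JSON document.
def ndjson = records.collect { JsonOutput.toJson(it) }.join('\n')

println ndjson
```

Using a MongoDB driver from the script would be another option. The file-based route is shown only because it needs nothing beyond the Groovy standard library.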
  6. How I solved the problems
     - Parsing a file with Groovy is as easy as 1, 2, 3
     - I had to split each line at the semicolons
     - Converting the date string into a real date takes a small trick
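The split and the date conversion could look roughly like the sketch below. It is not the author's code: the date pattern `dd.MM.yyyy` and the three-field layout are assumptions, because the slides do not show the actual data or the "trick" that was used.

```groovy
import java.text.SimpleDateFormat

// Assumption: dates look like 12.04.2011 (day.month.year).
def fmt = new SimpleDateFormat('dd.MM.yyyy')
// Reject impossible dates such as 31.02.2011 instead of rolling them over.
fmt.lenient = false

// Hypothetical input line: room;date;seats
def line = 'A101;12.04.2011;40'

// Split on the semicolon and bind the three fields to names.
def (room, dateText, seats) = line.split(';', -1).toList()

// Convert the text fields into typed values.
Date date = fmt.parse(dateText)
def record = [room: room, date: date, seats: seats.toInteger()]

println "${record.room} on ${fmt.format(record.date)} with ${record.seats} seats"
```

With `lenient` switched off, `fmt.parse` throws a `ParseException` for a value that is not a valid date, which is one way to catch corrupted rows during conversion.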
  7. Let's have a look behind the scenes
     - Look into the code
     - How does the code work?
     - Questions