DMDW 8. Student Presentation - Groovy to MongoDB


8. ETL Project by Maximilian Butterer

Published in: Technology


  1. DMDW - ETL Project
      • The Groovy way
      • By Maximilian Butterer
  2. What was the job?
      • Extract the data from the room-plan file (Excel)
      • Transform it into a new structure (data types)
      • Load it into a new target (e.g. a database system)
      • Create some documentation (a how-to), or
      • Present it to you
  3. #1: Export the data
      • Exporting data from Excel is pretty easy
      • So I exported the whole file as a CSV file
      • 1st problem: the delimiter is a semicolon, not a comma
      • 2nd problem: the encoding is not UTF-8
  4. #2: A helper
      • Because CSV is plain text, it's easy to parse
      • I created a Groovy script for the conversion
      • My solution is not the cleanest way, but it is probably the easiest
      • 3rd problem: the data is not consistent; there are corrupted data sets
  5. #3: Put it into
      • With all the objects parsed and converted
      • Just put everything into the database
      • 4th problem: how?
  6. How I solved the problems
      • Parsing a file with Groovy is as easy as 1, 2, 3
      • I had to split each line at the semicolons
      • Converting the date string into a real date takes a trick
  7. Let's have a look behind the scenes
      • A look into the code
      • How does the code work?
      • Questions
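
The slides do not include the script itself, so the following is only a minimal sketch of the steps they describe: reading the exported CSV with an explicit non-UTF-8 encoding, splitting each line at the semicolons, converting the date string, and setting corrupted rows aside. The column layout (`room;date;lecture`), the sample rows, the `ISO-8859-1` encoding and the `dd.MM.yyyy` date pattern are all assumptions made for illustration. Strict parsing with `SimpleDateFormat` stands in for the date "trick", which the slides do not spell out.

```groovy
import java.text.ParseException
import java.text.SimpleDateFormat

// Hypothetical sample rows; the real room-plan columns are not shown in the slides.
def sample = '''room;date;lecture
H\u00f6rsaal 1;12.04.2011;DMDW
B202;not-a-date;ETL
C303;13.04.2011
D404;14.04.2011;Groovy
'''

// Write the sample in a non-UTF-8 encoding to mimic the Excel export (assumed ISO-8859-1).
def csv = File.createTempFile('roomplan', '.csv')
csv.deleteOnExit()
csv.setText(sample, 'ISO-8859-1')

// Assumed date pattern; strict parsing rejects impossible dates instead of rolling them over.
def dateFormat = new SimpleDateFormat('dd.MM.yyyy')
dateFormat.lenient = false

def records = []    // successfully converted rows, one map per row
def rejected = []   // line numbers of corrupted rows

// Read with the same encoding the file was written in; lineNo starts at 1.
csv.eachLine('ISO-8859-1') { line, lineNo ->
    if (lineNo == 1 || !line.trim()) return   // skip the header and blank lines

    // The delimiter is a semicolon; -1 keeps trailing empty fields.
    def fields = line.split(';', -1).collect { it.trim() }
    if (fields.size() != 3) {
        rejected << lineNo                    // wrong number of columns
        return
    }

    try {
        def date = dateFormat.parse(fields[1])
        records << [room: fields[0], date: date, lecture: fields[2]]
    } catch (ParseException e) {
        rejected << lineNo                    // date string could not be converted
    }
}

println "converted ${records.size()} rows, rejected lines ${rejected}"
```

Each entry in `records` is a plain map, which is already close to the shape of a database document, so it could be handed to whichever MongoDB driver is in use. That loading step is left out here because the slides do not show how it was done.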