InfoZoom Tips & Tricks – Automated data preparation via command line

Data preparation and cleansing is a tiresome topic!
It takes a lot of time, but it is necessary for a meaningful analysis and for correct analysis reports. To automate this data preparation process for your own raw data, InfoZoom can be run from the command line.

Published in Business, Technology

Transcript

  • 1. InfoZoom Tips & Tricks – Part 3: Automated data preparation via command line
  • 2. About corma GmbH
      o Stops suspects by: analytical investigations, operative investigations
      o Saves time by: online research, online monitoring
      o Increases efficiency and saves money by: data analytics, global intelligence solutions
  • 3. Sample scenario: automation of the following steps
      1. Import raw data into a predefined InfoZoom template
      2. Cleanse raw data via predefined queries
      3. Create and save several data extracts as CSV files from the cleansed file for import into a database
    Note: All steps need to be carried out manually once before the command line is created!
      o Create an InfoZoom template for the raw data, incl. attribute groups, formulas, analysis cubes, queries
      o Report output can be built into the queries (optional). Possible formats: Excel table, CSV file, TXT file, InfoZoom file
  • 4. Step 1: Create an InfoZoom template for the raw data (in this example: CSV file)
  • 5. Step 2: Create queries to cleanse the raw data
      o e.g. delete all characters except digits from phone numbers
      o or delete all leading and trailing spaces and convert the text to upper case
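The two cleansing rules named on this slide can be sketched outside InfoZoom to make their effect concrete. The Python below is only an illustration of the transformations; it is not InfoZoom's query or formula syntax, and the function names are hypothetical.

```python
import re


def digits_only(phone: str) -> str:
    """Delete every character that is not an ASCII digit 0-9."""
    return re.sub(r"[^0-9]", "", phone)


def trim_upper(value: str) -> str:
    """Delete spaces at the beginning and end, then convert to upper case."""
    return value.strip(" ").upper()


if __name__ == "__main__":
    # Sample values are illustrative only.
    print(digits_only("+49 (2161) 277850"))
    print(trim_upper("  germany "))
```

Note that `digits_only` also drops a leading `+`, so an international prefix is kept only as digits; whether that is desired depends on the target database.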
  • 6. Step 3: Save data extracts as CSV files via queries
      o Perform selection: exclude blank data entries for the CSV data extracts
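As a rough picture of what such an extract query does, the following Python sketch keeps only records whose chosen field is non-blank and writes them as CSV. It is an assumption-laden stand-in for the InfoZoom selection, not its implementation; the column names `ORGA` and `URL` are hypothetical, borrowed from the CSV file names used later in the slides.

```python
import csv
import io
from typing import Dict, Iterable, TextIO


def extract_non_blank(rows: Iterable[Dict[str, str]], column: str,
                      out: TextIO, delimiter: str = ",") -> int:
    """Write rows whose `column` is non-blank to `out`; return how many were kept."""
    rows = list(rows)
    if not rows:
        return 0
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()),
                            delimiter=delimiter, lineterminator="\n")
    writer.writeheader()
    kept = 0
    for row in rows:
        # A value counts as blank when it is missing, empty, or whitespace only.
        if (row.get(column) or "").strip():
            writer.writerow(row)
            kept += 1
    return kept


if __name__ == "__main__":
    sample = [
        {"ORGA": "corma GmbH", "URL": "www.corma.de"},
        {"ORGA": "", "URL": "www.blog.corma.de"},
    ]
    buffer = io.StringIO()
    extract_non_blank(sample, "ORGA", buffer)
    print(buffer.getvalue())
```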
  • 7. Step 4: Command line parameters
      o Open a text editor and save the file as *.cmd
      o In the first line, copy the path of InfoZoom.exe on the "C" drive
      o Run InfoZoom in the background. Command: -invisible
      o Open the predefined template. Parameter: template name (if the name contains spaces, it needs to be enclosed in quotation marks)
      o Import raw data into the previously opened template. Command: -insert -d ";" (-d = delimiter, e.g. semicolon or circumflex; it needs to be enclosed in quotation marks)
      o Execute predefined queries. Command: -query "query name" (if the name contains spaces, it needs to be enclosed in quotation marks)
      o Save the selection as a CSV file. Command: -saveObjectsAscsv , "ORGA URL.csv" (delimiter and name of the CSV file)
      o Close InfoZoom in the background. Command: -exit
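The parameters listed on this slide can be put together programmatically. The sketch below builds an argument list using only the switches shown in the slides (-invisible, -insert, -d, -query, -saveObjectsAscsv, -exit) and follows the order of the slides' sample command. Whether InfoZoom accepts any other ordering is not stated in the source, and the function and parameter names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def build_infozoom_args(
    exe: str,
    template: str,
    raw_data: str,
    import_delimiter: str,
    steps: Sequence[Tuple[str, Optional[str]]],
    csv_delimiter: str = ",",
) -> List[str]:
    """Return the argument list: open template, import raw data, run queries, exit.

    Each step is (query_name, csv_file_name); use None as csv_file_name for a
    query that only cleanses data and saves no extract.
    """
    args = [exe, "-invisible", template, "-insert", "-d", import_delimiter, raw_data]
    for query, csv_name in steps:
        args += ["-query", query]
        if csv_name is not None:
            args += ["-saveObjectsAscsv", csv_delimiter, csv_name]
    args.append("-exit")
    return args


if __name__ == "__main__":
    print(build_infozoom_args(
        "InfoZoom.exe", "Sample_Data.fot", "Sample_Data.csv", "^",
        [("Country_cleansing", None), ("Exclude_blank_Orga", "ORGA.csv")],
    ))
```

Keeping the arguments as a list postpones all quoting decisions until the command is written into the *.cmd file.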
  • 8. Step 5: Assemble the command line
    InfoZoom.exe -invisible Sample_Data.fot -insert -d "^" Sample_Data.csv -query Country_cleansing -query Exclude_blank_Orga -saveObjectsAscsv , ORGA.csv -query Exclude_Blank_URL -saveObjectsAscsv , "ORGA URL.csv" -query Exclude_blank_Address -saveObjectsAscsv , "ORGA ADDRESS.csv" -query Exclude_blank_Contact -saveObjectsAscsv , "ORGA CONTACT.csv" -exit
    Legend: commands; template and query names; raw data; delimiter for newly created CSV files; names of newly created CSV files
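To avoid quoting slips and stray line breaks when assembling this command by hand, the sample from this slide can be rendered from a list of arguments. This is a minimal sketch under stated assumptions: the quoting rule (wrap an argument in straight double quotes if it contains a space or one of `^ ; & | < >`) is a simplification chosen to reproduce the slide's sample, not a complete cmd.exe quoting routine, and it does not handle arguments that themselves contain double quotes.

```python
from typing import Iterable

# Characters that trigger quoting in this sketch (an assumption, see lead-in).
CMD_SPECIAL = set(" ^;&|<>")

SAMPLE_ARGS = [
    "InfoZoom.exe", "-invisible", "Sample_Data.fot",
    "-insert", "-d", "^", "Sample_Data.csv",
    "-query", "Country_cleansing",
    "-query", "Exclude_blank_Orga", "-saveObjectsAscsv", ",", "ORGA.csv",
    "-query", "Exclude_Blank_URL", "-saveObjectsAscsv", ",", "ORGA URL.csv",
    "-query", "Exclude_blank_Address", "-saveObjectsAscsv", ",", "ORGA ADDRESS.csv",
    "-query", "Exclude_blank_Contact", "-saveObjectsAscsv", ",", "ORGA CONTACT.csv",
    "-exit",
]


def quote_arg(arg: str) -> str:
    """Wrap the argument in straight double quotes when it needs them."""
    if any(ch in CMD_SPECIAL for ch in arg):
        return '"' + arg + '"'
    return arg


def to_cmd_line(args: Iterable[str]) -> str:
    """Join the arguments into one line; refuse anything containing a line break."""
    line = " ".join(quote_arg(a) for a in args)
    if "\n" in line or "\r" in line:
        raise ValueError("the command must stay on a single line")
    return line


if __name__ == "__main__":
    print(to_cmd_line(SAMPLE_ARGS))
```

The resulting string can be saved as the single line of a *.cmd file. Typographic quotation marks („ “) and long dashes (–), as they appear in the slide text, should be replaced with plain `"` and `-` before the file is run.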
  • 9. Result
      o Command line in the text editor: all commands need to be on one line without any line breaks!
      o The command line is executed by double-clicking the *.cmd file
      o The DOS window shows the executed commands
      o Result: created CSV files
      o Preparation time without command line: approx. 4 hours
      o Preparation time with command line: approx. 30 minutes
  • 10. InfoZoom Trainings: InfoZoom online trainings
      o IZ50 InfoZoom Web-Starter-Seminar
      o IZ51 InfoZoom Web-Expert-Seminar
      o An overview of all training dates can be found here: http://infozoom-online-training.de/content/infozoomonline-training-trainings.html
  • 11. Thank You! infozoom@corma.de · +49 (2161) 277850 · corma GmbH · Heinz-Nixdorf-Straße 22 · D-41179 Mönchengladbach · Tel: +49 2161 277 85 - 0 · E-Mail: mail@corma.de · Web: www.corma.de, www.blog.corma.de