InfoZoom Tips & Tricks – Automated data preparation via command line

Data preparation and cleansing are tedious.
They take up a lot of time, but they are necessary for a meaningful analysis and for correct analysis reports. To automate this data preparation process for your own raw data, InfoZoom can be run from the command line.


  • InfoZoom Tips & Tricks – Part 3: Automated data preparation via command line
  • About corma GmbH
    o Stops suspects by: analytical investigations, operative investigations
    o Saves time by: online research, online monitoring
    o Increases efficiency and saves money by: data analytics, global intelligence solutions
  • Sample scenario: automation of the following steps
    1. Import raw data into a predefined InfoZoom template
    2. Cleanse the raw data via predefined queries
    3. Create and save several data extracts as CSV files from the cleansed file for import into a database
    Note: All steps must be carried out manually once before the command line is created!
    o Create an InfoZoom template for the raw data, incl. attribute groups, formulas, analysis cubes, queries
    o Report output can be built into the queries (optional). Possible formats: Excel table, CSV file, TXT file, InfoZoom file
  • Step 1: Create an InfoZoom template for the raw data (in this example: CSV file)
  • Step 2: Create queries to cleanse the raw data
    o e.g. delete all characters except digits from phone numbers
    o or delete all leading and trailing spaces and convert the text to upper case
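The two cleansing rules named on this slide can be illustrated outside InfoZoom. The following is a minimal Python sketch of the rules themselves, not InfoZoom query syntax; the function names `digits_only` and `trim_upper` are hypothetical and exist only for this illustration.

```python
import re


def digits_only(phone: str) -> str:
    """Remove every character that is not a digit (hypothetical helper)."""
    return re.sub(r"\D", "", phone)


def trim_upper(value: str) -> str:
    """Strip leading/trailing whitespace, then convert to upper case (hypothetical helper)."""
    return value.strip().upper()


print(digits_only("+49 (2161) 277850"))  # 492161277850
print(trim_upper("  germany "))          # GERMANY
```

Note that stripping everything except digits also drops a leading "+", so an international prefix is no longer distinguishable from a local number afterwards.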
  • Step 3: Save data extracts as CSV files via queries
    o Perform selection: exclude blank data entries for the CSV data extracts
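What "exclude blank data entries" means for an extract can be sketched in Python as follows. This is only an illustration of the selection rule, not how InfoZoom implements it; the helper `write_non_blank`, the column names, and the sample rows are invented for the example.

```python
import csv
import io


def write_non_blank(rows, column, fieldnames, out, delimiter=","):
    """Write only rows whose `column` value is non-blank; return how many were written."""
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    written = 0
    for row in rows:
        # Treat missing values and whitespace-only values as blank.
        if (row.get(column) or "").strip():
            writer.writerow(row)
            written += 1
    return written


# Invented sample data, for illustration only.
rows = [
    {"ORGA": "corma GmbH", "URL": "www.corma.de"},
    {"ORGA": "Example Org", "URL": ""},
    {"ORGA": "", "URL": "www.example.org"},
]

buffer = io.StringIO()
count = write_non_blank(rows, "URL", ["ORGA", "URL"], buffer)
print(count)  # 2
print(buffer.getvalue())
```

Only the column being extracted is checked, so a row with a blank organisation but a filled URL is still kept in the URL extract.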
  • Step 4: Command line parameters
    o Open a text editor and save the file as *.cmd
    o In the first line, copy the path of InfoZoom.exe on the "C" drive
    o Run InfoZoom in the background. Command: -invisible
    o Open the predefined template: template name (if the name contains spaces, it must be enclosed in quotation marks)
    o Import the raw data into the previously opened template. Command: -insert -d ";" (-d = delimiter, e.g. semicolon or circumflex; it must be enclosed in quotation marks)
    o Execute predefined queries. Command: -query "query name" (if the name contains spaces, it must be enclosed in quotation marks)
    o Save the selection as a CSV file. Command: -saveObjectsAscsv , "ORGA URL.csv" (delimiter and name of the CSV file)
    o Close InfoZoom in the background. Command: -exit
  • Step 5: Assemble the command line
    InfoZoom.exe -invisible Sample_Data.fot -insert -d "^" Sample_Data.csv -query Country_cleansing -query Exclude_blank_Orga -saveObjectsAscsv , ORGA.csv -query Exclude_Blank_URL -saveObjectsAscsv , "ORGA URL.csv" -query Exclude_blank_Address -saveObjectsAscsv , "ORGA ADDRESS.csv" -query Exclude_blank_Contact -saveObjectsAscsv , "ORGA CONTACT.csv" -exit
    Legend: commands; template and query names; raw data; delimiter for newly created CSV files; names of newly created CSV files
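Because the whole command has to sit on a single line and names with spaces need quotation marks, it can help to generate the line rather than type it. The sketch below is one possible way to do that in Python. It uses only the switches and the argument order shown on the slide; the helpers `quote` and `build_command` are hypothetical, and nothing beyond the slide is assumed about InfoZoom's command line interface.

```python
def quote(arg: str) -> str:
    """Enclose an argument in straight double quotes if it contains a space."""
    return f'"{arg}"' if " " in arg else arg


def build_command(exe, template, raw_data, in_delim, cleansing, extracts, out_delim=","):
    """Assemble the single-line command in the order shown on the slide (hypothetical helper)."""
    # The import delimiter is always quoted, as the slide requires.
    parts = [quote(exe), "-invisible", quote(template),
             "-insert", "-d", f'"{in_delim}"', quote(raw_data)]
    for query in cleansing:
        parts += ["-query", quote(query)]
    for query, csv_name in extracts:
        parts += ["-query", quote(query), "-saveObjectsAscsv", out_delim, quote(csv_name)]
    parts.append("-exit")
    return " ".join(parts)


command = build_command(
    "InfoZoom.exe", "Sample_Data.fot", "Sample_Data.csv", "^",
    ["Country_cleansing"],
    [("Exclude_blank_Orga", "ORGA.csv"),
     ("Exclude_Blank_URL", "ORGA URL.csv"),
     ("Exclude_blank_Address", "ORGA ADDRESS.csv"),
     ("Exclude_blank_Contact", "ORGA CONTACT.csv")],
)

assert "\n" not in command  # the command must stay on one line
print(command)
```

The printed string can then be pasted into the *.cmd file. Note that the quotation marks are plain ASCII double quotes; typographic quotes copied from a slide would not be recognised by the Windows command interpreter.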
  • Result
    o Command line in the text editor: all commands need to be on one line without any line breaks!
    o The command line is executed by double-clicking the *.cmd file
    o The DOS window shows the executed commands
    o Result: the created CSV files
    o Preparation time without command line: approx. 4 hours
    o Preparation time with command line: approx. 30 minutes
  • InfoZoom trainings
    o InfoZoom online trainings: IZ50 InfoZoom Web-Starter-Seminar, IZ51 InfoZoom Web-Expert-Seminar
    o An overview of all training dates can be found here: http://infozoom-online-training.de/content/infozoomonline-training-trainings.html
  • Thank you!
    o infozoom@corma.de, +49 (2161) 277850
    o corma GmbH · Heinz-Nixdorf-Straße 22 · D-41179 Mönchengladbach · Tel: +49 2161 277 85 - 0 · E-Mail: mail@corma.de · Web: www.corma.de, www.blog.corma.de