Java Data Migration with Data Pipeline

Presentation Transcript

  • North Concepts Inc.
    Toronto Java Users Group - May 30, 2013
    Java Data Migration with Data Pipeline
    Dele Taylor @ North Concepts
  • North Concepts Inc.What is Data PipelineHow Does it WorkData FormatsData TransformationsData ConversionsCustomizationCode Generator
  • North Concepts Inc.Java library / frameworkConvert / transform / transfer dataMostly streamingMulti-threading supportExpression languageFreemiumhttp://northconcepts.com/downloads/http://northconcepts.com/tjug/ View slide
  • [Diagram labels: XML, Convert, Filter, Jdbc, Rename]
    Pipes & filters / decorator pattern
    Chain of readers, writers, & operations
    Like java.io or command line
    Records instead of bytes/characters
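
The pipes-and-filters / decorator idea on this slide can be sketched with plain JDK types. This is not the Data Pipeline API: `RecordSource`, `ListSource`, `FilterStage`, and `RenameStage` are hypothetical names invented for illustration, and a record is modelled here as a `Map<String, Object>`.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Illustrative only: hypothetical types, not the Data Pipeline API. */
final class PipesAndFiltersSketch {

    /** A source of records; read() returns null at end of data. */
    interface RecordSource {
        Map<String, Object> read();
    }

    /** Innermost source: serves records from an in-memory list. */
    static final class ListSource implements RecordSource {
        private final Iterator<Map<String, Object>> it;

        ListSource(List<Map<String, Object>> records) {
            this.it = records.iterator();
        }

        public Map<String, Object> read() {
            return it.hasNext() ? it.next() : null;
        }
    }

    /** Decorator: passes through only the records accepted by the rule. */
    static final class FilterStage implements RecordSource {
        private final RecordSource nested;
        private final Predicate<Map<String, Object>> rule;

        FilterStage(RecordSource nested, Predicate<Map<String, Object>> rule) {
            this.nested = nested;
            this.rule = rule;
        }

        public Map<String, Object> read() {
            Map<String, Object> record;
            while ((record = nested.read()) != null) {
                if (rule.test(record)) {
                    return record;
                }
            }
            return null;
        }
    }

    /** Decorator: renames one field while preserving field order. */
    static final class RenameStage implements RecordSource {
        private final RecordSource nested;
        private final String from;
        private final String to;

        RenameStage(RecordSource nested, String from, String to) {
            this.nested = nested;
            this.from = from;
            this.to = to;
        }

        public Map<String, Object> read() {
            Map<String, Object> record = nested.read();
            if (record == null) {
                return null;
            }
            Map<String, Object> renamed = new LinkedHashMap<>();
            for (Map.Entry<String, Object> e : record.entrySet()) {
                renamed.put(e.getKey().equals(from) ? to : e.getKey(), e.getValue());
            }
            return renamed;
        }
    }

    private static Map<String, Object> record(String fname, String email) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("fname", fname);
        r.put("email", email);
        return r;
    }

    /** Builds source -> filter -> rename and drains the chain. */
    static List<Map<String, Object>> run() {
        List<Map<String, Object>> input = new ArrayList<>();
        input.add(record("Ann", "ann@example.com"));
        input.add(record("Bob", "bob@example.org"));

        RecordSource source = new ListSource(input);
        source = new FilterStage(source, r -> r.get("email").toString().endsWith(".com"));
        source = new RenameStage(source, "fname", "first_name");

        List<Map<String, Object>> out = new ArrayList<>();
        Map<String, Object> r;
        while ((r = source.read()) != null) {
            out.add(r);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Each stage wraps the previous one and exposes the same `read()` contract, which is what lets stages be stacked in any order, much like wrapping one `java.io` stream in another.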
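  • [Class diagram]
    DataReader: +open(), +close(), +read():Record
    DataWriter: +open(), +close(), +write(Record)
    Record: holds 0..* Field
    Field: -name, -type, -value
    DataReader produces Record; DataWriter consumes Record
    Examples: XmlReader (a DataReader), JdbcWriter (a DataWriter)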
  • North Concepts Inc.INTLONGBYTESHORTDOUBLEFLOATBOOLEANSTRINGCHARDATETIMEDATETIMEBLOBUNDEFINED
  • DataReader reader = new CSVReader(new File("customers.csv"));
    DataWriter writer = new JdbcWriter(connection, "CUSTOMER");
    JobTemplate.DEFAULT.transfer(reader, writer);
  • [Table columns: Format, Streaming, Read, Write; the per-format check marks did not survive in the transcript]
    CSV
    Excel
    Fixed Width / Fixed Length
    In-Memory
    Java Beans
    JDBC
    Native
    PDF
    RTF (Word)
    Template (FreeMarker)
    Web Server Logs
    XML (XPath)
  • DataReader reader = new XmlReader(file)
            .addField("id", "//transactions/txn/@id")
            .addField("name", "//transactions/txn/name/text()")
            .addField("price", "//transactions/txn/price/text()")
            .addRecordBreak("//transactions/txn");

    reader.open();
    try {
        Record record;
        while ((record = reader.read()) != null) {
            System.out.println(record);
        }
    } finally {
        reader.close();
    }
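
To make the XPath mapping on this slide concrete, here is a rough equivalent using only the JDK's `javax.xml.xpath` package. It is a sketch under assumptions: the sample XML is invented, the class name `XPathRecordsSketch` is hypothetical, and unlike a streaming reader it loads the whole document into a DOM first.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Illustrative only: JDK DOM + XPath, not Data Pipeline's XmlReader. */
final class XPathRecordsSketch {

    /** Hypothetical sample input shaped like the paths used on the slide. */
    static final String SAMPLE =
            "<transactions>"
            + "<txn id=\"1\"><name>Widget</name><price>9.99</price></txn>"
            + "<txn id=\"2\"><name>Gadget</name><price>19.50</price></txn>"
            + "</transactions>";

    /** Returns one "id,name,price" string per txn element. */
    static List<String> read(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Each node matched here plays the role of a record break.
            NodeList txns = (NodeList) xpath.evaluate(
                    "//transactions/txn", doc, XPathConstants.NODESET);

            List<String> records = new ArrayList<>();
            for (int i = 0; i < txns.getLength(); i++) {
                Node txn = txns.item(i);
                // Field paths are evaluated relative to the current txn node.
                String id = xpath.evaluate("@id", txn);
                String name = xpath.evaluate("name/text()", txn);
                String price = xpath.evaluate("price/text()", txn);
                records.add(id + "," + name + "," + price);
            }
            return records;
        } catch (Exception e) {
            throw new IllegalStateException("could not read XML", e);
        }
    }

    public static void main(String[] args) {
        for (String record : read(SAMPLE)) {
            System.out.println(record);
        }
    }
}
```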
  • Filter, Validate, Transform, Sort, ExcludeFields, IncludeFields, Lookup, RenameField, Copy Field, AssignField, CalculatedField, DeMux, Meter, Throttle, RemoveDuplicates, Sequence, Aggregate, Async, JDBC Multi-Writer, Multi-Writer
  • [Class diagram]
    ProxyReader (extends DataReader): -nestedDataReader, #interceptRecord(Record)
    ProxyWriter (extends DataWriter): -nestedDataWriter, #interceptRecord(Record)
    Examples: FilteringReader (a ProxyReader), AsyncWriter (a ProxyWriter)
  • DataReader reader = new CSVReader(new File(sourceFile))
            .setFieldSeparator('|')
            .setFieldNamesInFirstRow(true);

    // Keep only records whose email ends in a literal ".com"
    reader = new FilteringReader(reader)
            .add(new FieldFilter("email")
                    .addRule(new PatternMatch(".*\\.com")));

    reader = new TransformingReader(reader)
            .add(new IncludeFields("email", "fname", "lname"));

    reader = new TransformingReader(reader)
            .add(new RenameField("fname", "first_name"))
            .add(new RenameField("lname", "last_name"));

    DataWriter writer = new FixedWidthWriter(new File(targetFile))
            .addFields(64)
            .addFields(20)
            .addFields(20)
            .setFieldNamesInFirstRow(true);

    JobTemplate.DEFAULT.transfer(reader, writer);
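
The email filter on this slide takes a regular expression, and in Java regex syntax an unescaped `.` matches any character. The sketch below shows the difference with the JDK's `String.matches` only; it says nothing about how Data Pipeline's `PatternMatch` applies its pattern, which the slides do not show. The class name and sample values are hypothetical.

```java
/** Illustrative only: plain JDK regex behaviour, not Data Pipeline's PatternMatch. */
final class DotComPatternSketch {

    /** Unescaped dot: "any character, then com". */
    static final String LOOSE = ".*.com";

    /** Escaped dot: a literal ".com" suffix. */
    static final String STRICT = ".*\\.com";

    /** True if the whole value matches the pattern. */
    static boolean matches(String pattern, String value) {
        return value.matches(pattern);
    }

    public static void main(String[] args) {
        String[] samples = {"ann@example.com", "ann@telecom", "ann@example.org"};
        for (String s : samples) {
            System.out.println(s
                    + " loose=" + matches(LOOSE, s)
                    + " strict=" + matches(STRICT, s));
        }
    }
}
```

With these inputs, only the loose pattern accepts `ann@telecom`, because its second-to-last dot is free to match the letter `e`.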
  • Type Conversions
      • xxx-to-date
      • number-to-xxx
      • string-to-xxx
      • xxx-to-string
      • null-to-xxx
      • round
    String Manipulation
      • insert/append/prepend/delete
      • substring/left/right
      • trim/trim-left/trim-right
      • pad-left/pad-right
      • replace-range/replace-string
      • uppercase/lowercase
    BasicFieldTransformer
    Fluent interface: chain conversions
  • reader = new TransformingReader(reader)
            .add(new BasicFieldTransformer("id")
                    .stringToLong())
            .add(new BasicFieldTransformer("price")
                    .nullToValue("0")
                    .stringToDouble())
            // lowercase "yyyy" is the calendar year; uppercase "YYYY" means week year
            .add(new BasicFieldTransformer("date")
                    .stringToDate("yyyy-MM-dd")
                    .nullToValue(new Date()));
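
Date patterns deserve care in conversions like the one on this slide. Assuming the pattern string follows `java.text.SimpleDateFormat` conventions (an assumption; the slides do not say), `yyyy` is the calendar year while `YYYY` is the week year, and the two disagree for some days around New Year. The sketch below uses only the JDK; the class and method names are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Illustrative only: JDK date patterns, assuming SimpleDateFormat conventions. */
final class DatePatternSketch {

    /** Formats the instant with the given pattern, in UTC with US week rules. */
    static String format(String pattern, Date date) {
        SimpleDateFormat f = new SimpleDateFormat(pattern, Locale.US);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f.format(date);
    }

    /** Parses a calendar date written as yyyy-MM-dd, interpreted in UTC. */
    static Date parseDay(String text) {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        f.setLenient(false);
        try {
            return f.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("not a yyyy-MM-dd date: " + text, e);
        }
    }

    public static void main(String[] args) {
        // Monday, December 30, 2013 falls in the week that contains January 1, 2014.
        Date day = parseDay("2013-12-30");
        System.out.println(format("yyyy-MM-dd", day)); // calendar year
        System.out.println(format("YYYY-MM-dd", day)); // week year
    }
}
```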
  • North Concepts Inc.FormatsTransformationsData ConversionsFilters / ValidationsLookups
  • Subclass: DataReader / DataWriter
    CustomReader (extends DataReader): +open(), +close(), #readImpl():Record
    CustomWriter (extends DataWriter): +open(), +close(), #writeImpl(Record)
  • public class PropertiesReader extends DataReader {

        private final Properties properties;
        private Enumeration<Object> keys;

        public PropertiesReader(Properties properties) {
            this.properties = properties;
        }

        @Override
        public void open() throws DataException {
            super.open();
            keys = properties.keys();
        }

        @Override
        public void close() throws DataException {
            keys = null;
            super.close();
        }

        @Override
        protected Record readImpl() throws Throwable {
            if (!keys.hasMoreElements()) {
                return null;
            }
            Object key = keys.nextElement();
            Object value = properties.get(key);
            Record record = new Record();
            record.addField().setName("name").setValue(key);
            record.addField().setName("value").setValue(value);
            return record;
        }
    }

    tx=Transaction ID
    symbol=Symbol
    dob=Birth Date

    String sourceFile = "data/input/custom-reader-08.properties";
    Properties properties = new Properties();
    properties.load(new FileReader(sourceFile));
    DataReader reader = new PropertiesReader(properties);
    DataWriter writer = new StreamWriter(System.out);
    JobTemplate.DEFAULT.transfer(reader, writer);

    -----------------------------------------------
    0 - Record (MODIFIED) {
        0:[name]:STRING=[tx]:String
        1:[value]:STRING=[Transaction ID]:String
    }
    -----------------------------------------------
    1 - Record (MODIFIED) {
        0:[name]:STRING=[symbol]:String
        1:[value]:STRING=[Symbol]:String
    }
    -----------------------------------------------
    2 - Record (MODIFIED) {
        0:[name]:STRING=[dob]:String
        1:[value]:STRING=[Birth Date]:String
    }
  • Option 1: Subclass Proxy Reader/Writer
    CustomReader (extends ProxyReader): #interceptRecord(Record)
    CustomWriter (extends ProxyWriter): #interceptRecord(Record)
    ProxyReader (extends DataReader): -nestedDataReader, #interceptRecord(Record)
    ProxyWriter (extends DataWriter): -nestedDataWriter, #interceptRecord(Record)
  • public class I18NReader extends ProxyReader {

        private final Locale[] locales;

        public I18NReader(DataReader nestedDataReader, Locale... locales) {
            super(nestedDataReader);
            this.locales = locales;
        }

        @Override
        protected Record interceptRecord(Record record) throws Throwable {
            for (Locale locale : locales) {
                Record copy = (Record) record.clone();
                copy.getField("locale", true).setValue(locale.toString());
                push(copy);
            }
            return null;
        }
    }

    reader = new I18NReader(reader, Locale.ENGLISH, Locale.FRENCH);

    ----------------------------------------------
    0 - Record (NEW) {
        0:[name]:STRING=[tx]:String
        1:[value]:STRING=[Transaction ID]:String
        2:[locale]:STRING=[en]:String
    }
    ----------------------------------------------
    1 - Record (NEW) {
        0:[name]:STRING=[tx]:String
        1:[value]:STRING=[Transaction ID]:String
        2:[locale]:STRING=[fr]:String
    }
    ----------------------------------------------
    2 - Record (NEW) {
        0:[name]:STRING=[symbol]:String
        1:[value]:STRING=[Symbol]:String
        2:[locale]:STRING=[en]:String
    }
    ----------------------------------------------
    3 - Record (NEW) {
        0:[name]:STRING=[symbol]:String
        1:[value]:STRING=[Symbol]:String
        2:[locale]:STRING=[fr]:String
    }
    ----------------------------------------------
    4 - Record (NEW) {
        0:[name]:STRING=[dob]:String
        1:[value]:STRING=[Birth Date]:String
        2:[locale]:STRING=[en]:String
    }
    ...
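
The one-record-in, many-records-out behaviour on this slide can be approximated with a small buffer in plain JDK code. This is a guess at the general mechanism, not a description of how `ProxyReader.push` is implemented: `LocaleFanOutSketch` and its `pushed` queue are hypothetical, and records are modelled as maps.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Illustrative only: hypothetical class, not Data Pipeline's ProxyReader. */
final class LocaleFanOutSketch {

    private final Iterator<Map<String, Object>> nested;
    private final Locale[] locales;
    /** Records produced by intercept() and not yet handed to the caller. */
    private final Deque<Map<String, Object>> pushed = new ArrayDeque<>();

    LocaleFanOutSketch(Iterator<Map<String, Object>> nested, Locale... locales) {
        this.nested = nested;
        this.locales = locales;
    }

    /** Returns the next record, or null once input and buffer are both empty. */
    Map<String, Object> read() {
        while (pushed.isEmpty() && nested.hasNext()) {
            intercept(nested.next());
        }
        return pushed.poll();
    }

    /** Replaces the incoming record with one copy per locale. */
    private void intercept(Map<String, Object> record) {
        for (Locale locale : locales) {
            Map<String, Object> copy = new LinkedHashMap<>(record);
            copy.put("locale", locale.toString());
            pushed.add(copy);
        }
    }

    private static Map<String, Object> record(String name, String value) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("name", name);
        r.put("value", value);
        return r;
    }

    /** Drains the reader and summarises each record as "name/locale". */
    static List<String> run() {
        List<Map<String, Object>> input = new ArrayList<>();
        input.add(record("tx", "Transaction ID"));
        input.add(record("dob", "Birth Date"));

        LocaleFanOutSketch reader =
                new LocaleFanOutSketch(input.iterator(), Locale.ENGLISH, Locale.FRENCH);

        List<String> out = new ArrayList<>();
        Map<String, Object> r;
        while ((r = reader.read()) != null) {
            out.add(r.get("name") + "/" + r.get("locale"));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```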
  • Option 2: Subclass Transformer
    CustomTransformer (extends Transformer): +transform(Record):boolean
    Transformer: +transform(Record):boolean
    TransformingReader (a ProxyReader / DataReader): -filter, -transformers
  • public class I18NTransformer extends Transformer {

        private final Locale[] locales;

        public I18NTransformer(Locale ... locales) {
            this.locales = locales;
        }

        @Override
        public boolean transform(Record record) throws Throwable {
            for (Locale locale : locales) {
                Record copy = (Record) record.clone();
                copy.getField("locale", true).setValue(locale.toString());
                getReader().push(copy);
                record.delete();
            }
            return true;
        }
    }

    reader = new TransformingReader(reader)
            .add(new I18NTransformer(Locale.ENGLISH, Locale.FRENCH));
  • Subclass FieldTransformer
    CustomFieldTransformer (extends FieldTransformer): #transformField(Field), +toString()
    FieldTransformer (a Transformer, used by TransformingReader): +FieldTransformer(String), #transformField(Field)
  • public class BytesToStringTransformer extends FieldTransformer {

        public BytesToStringTransformer(String name) {
            super(name);
        }

        @Override
        protected void transformField(Field field) throws Throwable {
            Object value = field.getValue();
            if (value == null) {
                field.setNull(FieldType.STRING);
                return;
            }
            if (value instanceof byte[]) {
                byte[] bytes = (byte[]) value;
                String s = new BASE64Encoder().encode(bytes);
                field.setValue(s);
            } else {
                throw new DataException("field is not a byte array").set("field000", field);
            }
        }

        @Override
        public String toString() {
            return "converting field \"" + getName() + "\" from bytes to string";
        }
    }

    reader = new TransformingReader(reader)
            .add(new I18NTransformer(Locale.ENGLISH, Locale.FRENCH))
            .add(new BytesToStringTransformer("value"));
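
The transformer on this slide relies on `BASE64Encoder`, which is an internal `sun.misc` class rather than part of the supported Java API. On Java 8 and later the same bytes-to-string step can be written with `java.util.Base64`. The sketch below isolates just that conversion; `Base64FieldSketch` and `bytesToString` are hypothetical names, and it does not touch the Data Pipeline `Field` type.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Illustrative only: the conversion step alone, using the public JDK Base64 API. */
final class Base64FieldSketch {

    /**
     * Mirrors the slide's branching: null stays null, a byte array becomes
     * its Base64 text, and anything else is rejected.
     */
    static String bytesToString(Object value) {
        if (value == null) {
            return null;
        }
        if (value instanceof byte[]) {
            return Base64.getEncoder().encodeToString((byte[]) value);
        }
        throw new IllegalArgumentException(
                "value is not a byte array: " + value.getClass().getName());
    }

    public static void main(String[] args) {
        byte[] bytes = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(bytesToString(bytes)); // prints aGVsbG8=
    }
}
```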
  • -----------------------------------------------
    0 records
    Exception in thread "main" com.northconcepts.datapipeline.core.DataException: transformation [converting field "value" from bytes to string] failed on record 0; field is not a byte array
    -------------------------------
    DataEndpoint.description=[null]
    DataEndpoint.state=[OPENED]
    DataEndpoint.thread=[main]
    DataEndpoint.timestamp=[2013.05.26-22:21:31.908]
    DataReader.bufferSize=[0]
    DataReader.recordCount=[1]
    FieldTransformer.field=[[value]:STRING=[Transaction ID]:String]
    TransformingReader.filter=[null]
    TransformingReader.transformer=[converting field "value" from bytes to string]
    TransformingReader.transformerClass=[class com.northconcepts.tjug20130530.BytesToStringTransformer]
    TransformingReader.transformerIndex=[1]
    field000=[[value]:STRING=[Transaction ID]:String]
    fieldName=[value]
    -------------------------------
        at com.northconcepts.tjug20130530.BytesToStringTransformer.transformField(BytesToStringTransformer.java:30)
        at com.northconcepts.datapipeline.transform.FieldTransformer.transform(FieldTransformer.java:31)
        at com.northconcepts.datapipeline.transform.TransformingReader.transformRecord(TransformingReader.java:122)
        at com.northconcepts.datapipeline.transform.TransformingReader.interceptRecord(TransformingReader.java:110)
        at com.northconcepts.datapipeline.core.ProxyReader.readImpl(ProxyReader.java:86)
        at com.northconcepts.datapipeline.core.DataReader.read(DataReader.java:141)
        at com.northconcepts.datapipeline.job.JobTemplateImpl.doTransfer(JobTemplateImpl.java:67)
        at com.northconcepts.datapipeline.job.JobTemplateImpl.transfer(JobTemplateImpl.java:27)
        at com.northconcepts.tjug20130530.Main11_BytesToStringTransformer.main(Main11_BytesToStringTransformer.java:28)
  • Subclass Filter
    CustomFilter (extends Filter): +allow(Record):boolean
    Filter: +allow(Record):boolean
    FilteringReader (a ProxyReader / DataReader): -filters
    ValidatingReader: -exceptionOnFailure, #discard(Record, Filter)
  • public class I18NFilter extends Filter {

        private final Locale[] locales;

        public I18NFilter(Locale ... locales) {
            this.locales = locales;
        }

        @Override
        public boolean allow(Record record) {
            for (Locale locale : locales) {
                Field field = record.getField("locale", true);
                if (locale.toString().equals(field.getValueAsString())) {
                    return true;
                }
            }
            return false;
        }
    }

    reader = new FilteringReader(reader)
            .add(new I18NFilter(Locale.FRENCH));

    ---------------------------------------------
    0 - Record (NEW) {
        0:[name]:STRING=[tx]:String
        1:[value]:STRING=[Transaction ID]:String
        2:[locale]:STRING=[fr]:String
    }
    ---------------------------------------------
    1 - Record (NEW) {
        0:[name]:STRING=[symbol]:String
        1:[value]:STRING=[Symbol]:String
        2:[locale]:STRING=[fr]:String
    }
    ---------------------------------------------
    2 - Record (NEW) {
        0:[name]:STRING=[dob]:String
        1:[value]:STRING=[Birth Date]:String
        2:[locale]:STRING=[fr]:String
    }
    ---------------------------------------------
    3 records
  • Subclass Lookup
    CustomLookup (extends Lookup): +get(Object...)
    Lookup: +get(Object...), +get(List<?>)
    LookupTransformer (a Transformer, used by TransformingReader): -fields:FieldList, -lookup:Lookup, -overwriteFields:Boolean, #join(Record, RecordList, List<?>), #noResults(Record, List<?>), #tooManyResults(Record, List<?>, RecordList)
  • public class TranslationLookup extends Lookup {

        private final String sourceLanguage;

        public TranslationLookup(String sourceLanguage) {
            this.sourceLanguage = sourceLanguage;
        }

        public RecordList get(Object ... keys) {
            if (keys == null || keys.length != 2) {
                throw new DataException("invalid arguments")
                        .set("arguments.expected", 2)
                        .set("arguments.found", keys);
            }
            String targetLanguage = keys[0].toString();
            String sourcePhrase = keys[1].toString();
            String targetPhrase = getTranslation(targetLanguage, sourcePhrase);
            Record record = new Record();
            record.addField().setName("targetPhrase").setValue(targetPhrase);
            return new RecordList(record);
        }

        private String getTranslation(String targetLanguage, String sourcePhrase) {
            if (sourceLanguage.equals(targetLanguage)) {
                return sourcePhrase;
            }
            // TODO: use translation service or DB
            return sourcePhrase + "--" + targetLanguage;
        }
    }

    -----------------------------------------------
    0 - Record (NEW) {
        0:[name]:STRING=[tx]:String
        1:[value]:STRING=[Transaction ID]:String
        2:[locale]:STRING=[en]:String
        3:[targetPhrase]:STRING=[Transaction ID]:String
    }
    -----------------------------------------------
    1 - Record (NEW) {
        0:[name]:STRING=[tx]:String
        1:[value]:STRING=[Transaction ID]:String
        2:[locale]:STRING=[fr]:String
        3:[targetPhrase]:STRING=[Transaction ID--fr]:String
    }
    -----------------------------------------------
    2 - Record (NEW) {
        0:[name]:STRING=[symbol]:String
        1:[value]:STRING=[Symbol]:String
        2:[locale]:STRING=[en]:String
        3:[targetPhrase]:STRING=[Symbol]:String
    }
    -----------------------------------------------
    3 - Record (NEW) {
        0:[name]:STRING=[symbol]:String
        1:[value]:STRING=[Symbol]:String
        2:[locale]:STRING=[fr]:String
        3:[targetPhrase]:STRING=[Symbol--fr]:String
    }
    -----------------------------------------------
    4 - Record (NEW) {
        0:[name]:STRING=[dob]:String
        1:[value]:STRING=[Birth Date]:String
        2:[locale]:STRING=[en]:String
        3:[targetPhrase]:STRING=[Birth Date]:String
    }
    -----------------------------------------------
    5 - Record (NEW) {
        0:[name]:STRING=[dob]:String
        1:[value]:STRING=[Birth Date]:String
        2:[locale]:STRING=[fr]:String
    ...

    reader = new TransformingReader(reader)
            .add(new LookupTransformer(
                    new FieldList("locale", "value"),
                    new TranslationLookup(Locale.ENGLISH.toString())));
  • http://northconcepts.com/data-pipeline/builder/
  • Download at http://NorthConcepts.com/tjug/