Java Data Migration with Data Pipeline


  1. Java Data Migration with Data Pipeline
     North Concepts Inc.
     Toronto Java Users Group - May 30, 2013
     Dele Taylor @ North Concepts
  2. 2. North Concepts Inc.What is Data PipelineHow Does it WorkData FormatsData TransformationsData ConversionsCustomizationCode Generator
  3. 3. North Concepts Inc.Java library / frameworkConvert / transform / transfer dataMostly streamingMulti-threading supportExpression languageFreemiumhttp://northconcepts.com/downloads/http://northconcepts.com/tjug/
  4. [Diagram labels: XML, Convert, Filter, Jdbc, Rename]
     • Pipes & filters / decorator pattern
     • Chain of readers, writers, & operations
     • Like java.io or command line
     • Records instead of bytes/characters
  5. [Class diagram]
     • DataReader: +open(), +close(), +read():Record
     • DataWriter: +open(), +close(), +write(Record)
     • Record: holds 0..* Field
     • Field: -name, -type, -value
     • DataReader produces Record; DataWriter consumes Record
     • Examples shown: XmlReader, JdbcWriter
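
The reader/writer model on slides 4 and 5 can be illustrated with plain JDK types. The sketch below is not Data Pipeline code: `RecordSource`, `RecordSink`, `ListSource`, `FilteringSource`, and `transfer` are hypothetical stand-ins, and a record is reduced to a `Map`. It only shows the decorator idea, where each reader wraps another reader and a transfer loop pulls records until `read()` returns null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-ins for illustration; these are NOT Data Pipeline classes.
class PipelineSketch {

    /** A source of records; read() returns null when the stream is exhausted. */
    interface RecordSource {
        Map<String, Object> read();
    }

    /** A destination for records. */
    interface RecordSink {
        void write(Map<String, Object> record);
    }

    /** Innermost source: serves records from an in-memory list. */
    static class ListSource implements RecordSource {
        private final Iterator<Map<String, Object>> iterator;

        ListSource(List<Map<String, Object>> records) {
            this.iterator = records.iterator();
        }

        @Override
        public Map<String, Object> read() {
            return iterator.hasNext() ? iterator.next() : null;
        }
    }

    /** Decorator: wraps another source and skips records that fail the test. */
    static class FilteringSource implements RecordSource {
        private final RecordSource nested;
        private final Predicate<Map<String, Object>> test;

        FilteringSource(RecordSource nested, Predicate<Map<String, Object>> test) {
            this.nested = nested;
            this.test = test;
        }

        @Override
        public Map<String, Object> read() {
            Map<String, Object> record;
            while ((record = nested.read()) != null) {
                if (test.test(record)) {
                    return record;
                }
            }
            return null;
        }
    }

    /** Pulls every record from the source and hands it to the sink. */
    static int transfer(RecordSource source, RecordSink sink) {
        int count = 0;
        Map<String, Object> record;
        while ((record = source.read()) != null) {
            sink.write(record);
            count++;
        }
        return count;
    }

    /** Builds a two-field sample record. */
    static Map<String, Object> record(String name, double price) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("name", name);
        r.put("price", price);
        return r;
    }

    public static void main(String[] args) {
        RecordSource source = new ListSource(Arrays.asList(
                record("pen", 1.50), record("desk", 120.00), record("lamp", 35.00)));
        source = new FilteringSource(source, r -> (Double) r.get("price") > 10.0);

        List<Map<String, Object>> out = new ArrayList<>();
        int count = transfer(source, out::add);
        System.out.println(count + " records: " + out);
    }
}
```

Because every stage exposes the same `read()` contract, stages can be stacked in any order, which is the property the "like java.io" comparison on slide 4 points at.
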
  6. 6. North Concepts Inc.INTLONGBYTESHORTDOUBLEFLOATBOOLEANSTRINGCHARDATETIMEDATETIMEBLOBUNDEFINED
  7. DataReader reader = new CSVReader(new File("customers.csv"));
     DataWriter writer = new JdbcWriter(connection, "CUSTOMER");
     JobTemplate.DEFAULT.transfer(reader, writer);
  8. [Table: Format vs. Streaming / Read / Write; the per-format check marks did not survive in this transcript]
     • CSV
     • Excel
     • Fixed Width / Fixed Length
     • In-Memory
     • Java Beans
     • JDBC
     • Native
     • PDF
     • RTF (Word)
     • Template (FreeMarker)
     • Web Server Logs
     • XML (XPath)
  9. DataReader reader = new XmlReader(file)
         .addField("id", "//transactions/txn/@id")
         .addField("name", "//transactions/txn/name/text()")
         .addField("price", "//transactions/txn/price/text()")
         .addRecordBreak("//transactions/txn");

     reader.open();
     try {
         Record record;
         while ((record = reader.read()) != null) {
             System.out.println(record);
         }
     } finally {
         reader.close();
     }
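
To see what the XPath expressions on slide 9 select, the following sketch applies comparable expressions with the JDK's own `javax.xml.xpath` API. It is not how `XmlReader` works internally (the slides describe the library as mostly streaming, while this loads the whole document), and the sample XML and the `XPathRecordsSketch` class are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustration with JDK classes only; the input document is invented.
class XPathRecordsSketch {

    static List<Map<String, String>> readTransactions(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // One node per record, comparable to the record-break expression on the slide.
            NodeList txns = (NodeList) xpath.evaluate(
                    "//transactions/txn",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);

            List<Map<String, String>> records = new ArrayList<>();
            for (int i = 0; i < txns.getLength(); i++) {
                Node txn = txns.item(i);
                Map<String, String> record = new LinkedHashMap<>();
                // Field expressions are evaluated relative to the current txn node.
                record.put("id", xpath.evaluate("@id", txn));
                record.put("name", xpath.evaluate("name/text()", txn));
                record.put("price", xpath.evaluate("price/text()", txn));
                records.add(record);
            }
            return records;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<transactions>"
                + "<txn id=\"1\"><name>pen</name><price>1.50</price></txn>"
                + "<txn id=\"2\"><name>desk</name><price>120.00</price></txn>"
                + "</transactions>";
        for (Map<String, String> record : readTransactions(xml)) {
            System.out.println(record);
        }
    }
}
```
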
  10. Operations: Filter, Validate, Transform, Sort, ExcludeFields, IncludeFields, Lookup, RenameField, Copy Field, AssignField, CalculatedField, DeMux, Meter, Throttle, RemoveDuplicates, Sequence, Aggregate, Async, JDBC Multi-Writer, Multi-Writer
  11. [Class diagram]
      • ProxyReader (a DataReader): -nestedDataReader, #interceptRecord(Record)
      • ProxyWriter (a DataWriter): -nestedDataWriter, #interceptRecord(Record)
      • Examples shown: FilteringReader, AsyncWriter
  12. DataReader reader = new CSVReader(new File(sourceFile))
          .setFieldSeparator('|')
          .setFieldNamesInFirstRow(true);

      reader = new FilteringReader(reader)
          .add(new FieldFilter("email")
              .addRule(new PatternMatch(".*.com")));

      reader = new TransformingReader(reader)
          .add(new IncludeFields("email", "fname", "lname"));

      reader = new TransformingReader(reader)
          .add(new RenameField("fname", "first_name"))
          .add(new RenameField("lname", "last_name"));

      DataWriter writer = new FixedWidthWriter(new File(targetFile))
          .addFields(64)
          .addFields(20)
          .addFields(20)
          .setFieldNamesInFirstRow(true);

      JobTemplate.DEFAULT.transfer(reader, writer);
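
A note on the `".*.com"` pattern in the filter on slide 12: in Java regular-expression syntax an unescaped dot matches any character, so the pattern accepts more than addresses ending in a literal ".com". The sketch below uses `java.util.regex` directly; it assumes (the slides do not say) that `PatternMatch` applies the expression as a full match in the same syntax. `EmailPatternSketch` and the sample values are invented for illustration.

```java
import java.util.regex.Pattern;

// JDK-only illustration; assumes full-match semantics, which the slides do not confirm.
class EmailPatternSketch {

    // Pattern as written on the slide: the second dot matches any character.
    static final Pattern LOOSE = Pattern.compile(".*.com");

    // Escaped dot: the value must end in a literal ".com".
    static final Pattern STRICT = Pattern.compile(".*\\.com");

    static boolean matches(Pattern pattern, String value) {
        return pattern.matcher(value).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"jane@example.com", "jane@telecom", "jane@example.org"};
        for (String s : samples) {
            System.out.println(s + " loose=" + matches(LOOSE, s) + " strict=" + matches(STRICT, s));
        }
    }
}
```

Under that assumption, `jane@telecom` passes the loose pattern but not the strict one.
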
  13. Type Conversions
      • xxx-to-date
      • number-to-xxx
      • string-to-xxx
      • xxx-to-string
      • null-to-xxx
      • round
      String Manipulation
      • insert/append/prepend/delete
      • substring/left/right
      • trim/trim-left/trim-right
      • pad-left/pad-right
      • replace-range/replace-string
      • uppercase/lowercase
      BasicFieldTransformer
      • Fluent interface: chain conversions
  14. reader = new TransformingReader(reader)
          .add(new BasicFieldTransformer("id").stringToLong())
          .add(new BasicFieldTransformer("price").nullToValue("0").stringToDouble())
          .add(new BasicFieldTransformer("date").stringToDate("YYYY-MM-dd").nullToValue(new Date()));
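
One thing worth checking in the date conversion on slide 14 is the `"YYYY-MM-dd"` pattern. If `stringToDate` follows `java.text.SimpleDateFormat` pattern letters (an assumption; the slides do not state which pattern syntax is used), then `Y` means week year and `y` means calendar year, and the two differ for dates near a year boundary. The sketch below shows the difference using only the JDK; `DatePatternSketch` is a made-up name.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

// JDK-only illustration of 'y' (calendar year) versus 'Y' (week year).
class DatePatternSketch {

    static String format(String pattern, Date date) {
        return new SimpleDateFormat(pattern, Locale.US).format(date);
    }

    public static void main(String[] args) {
        // Monday, December 30, 2013 belongs to the first week of 2014 under US week rules.
        Date date = new GregorianCalendar(2013, Calendar.DECEMBER, 30).getTime();

        System.out.println(format("yyyy-MM-dd", date)); // calendar year
        System.out.println(format("YYYY-MM-dd", date)); // week year
    }
}
```

With `Locale.US` the first line shows the year 2013 and the second shows 2014, so `yyyy` is usually the safer choice when the calendar year is intended.
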
  15. 15. North Concepts Inc.FormatsTransformationsData ConversionsFilters / ValidationsLookups
  16. Subclass – DataReader / DataWriter
      • CustomReader (extends DataReader): +open(), +close(), #readImpl():Record
      • CustomWriter (extends DataWriter): +open(), +close(), #writeImpl(Record)
  17. public class PropertiesReader extends DataReader {

          private final Properties properties;
          private Enumeration<Object> keys;

          public PropertiesReader(Properties properties) {
              this.properties = properties;
          }

          @Override
          public void open() throws DataException {
              super.open();
              keys = properties.keys();
          }

          @Override
          public void close() throws DataException {
              keys = null;
              super.close();
          }

          @Override
          protected Record readImpl() throws Throwable {
              if (!keys.hasMoreElements()) {
                  return null;
              }
              Object key = keys.nextElement();
              Object value = properties.get(key);
              Record record = new Record();
              record.addField().setName("name").setValue(key);
              record.addField().setName("value").setValue(value);
              return record;
          }
      }

      Input:
          tx=Transaction ID
          symbol=Symbol
          dob=Birth Date

      Usage:
          String sourceFile = "data/input/custom-reader-08.properties";
          Properties properties = new Properties();
          properties.load(new FileReader(sourceFile));
          DataReader reader = new PropertiesReader(properties);
          DataWriter writer = new StreamWriter(System.out);
          JobTemplate.DEFAULT.transfer(reader, writer);

      Output:
          -----------------------------------------------
          0 - Record (MODIFIED) {
              0:[name]:STRING=[tx]:String
              1:[value]:STRING=[Transaction ID]:String
          }
          -----------------------------------------------
          1 - Record (MODIFIED) {
              0:[name]:STRING=[symbol]:String
              1:[value]:STRING=[Symbol]:String
          }
          -----------------------------------------------
          2 - Record (MODIFIED) {
              0:[name]:STRING=[dob]:String
              1:[value]:STRING=[Birth Date]:String
          }
  18. Option 1 – Subclass Proxy Reader/Writer
      • ProxyReader (a DataReader): -nestedDataReader, #interceptRecord(Record)
      • ProxyWriter (a DataWriter): -nestedDataWriter, #interceptRecord(Record)
      • CustomReader (extends ProxyReader): #interceptRecord(Record)
      • CustomWriter (extends ProxyWriter): #interceptRecord(Record)
  19. public class I18NReader extends ProxyReader {

          private final Locale[] locales;

          public I18NReader(DataReader nestedDataReader, Locale... locales) {
              super(nestedDataReader);
              this.locales = locales;
          }

          @Override
          protected Record interceptRecord(Record record) throws Throwable {
              for (Locale locale : locales) {
                  Record copy = (Record) record.clone();
                  copy.getField("locale", true).setValue(locale.toString());
                  push(copy);
              }
              return null;
          }
      }

      Usage:
          reader = new I18NReader(reader, Locale.ENGLISH, Locale.FRENCH);

      Output:
          ----------------------------------------------
          0 - Record (NEW) {
              0:[name]:STRING=[tx]:String
              1:[value]:STRING=[Transaction ID]:String
              2:[locale]:STRING=[en]:String
          }
          ----------------------------------------------
          1 - Record (NEW) {
              0:[name]:STRING=[tx]:String
              1:[value]:STRING=[Transaction ID]:String
              2:[locale]:STRING=[fr]:String
          }
          ----------------------------------------------
          2 - Record (NEW) {
              0:[name]:STRING=[symbol]:String
              1:[value]:STRING=[Symbol]:String
              2:[locale]:STRING=[en]:String
          }
          ----------------------------------------------
          3 - Record (NEW) {
              0:[name]:STRING=[symbol]:String
              1:[value]:STRING=[Symbol]:String
              2:[locale]:STRING=[fr]:String
          }
          ----------------------------------------------
          4 - Record (NEW) {
              0:[name]:STRING=[dob]:String
              1:[value]:STRING=[Birth Date]:String
              2:[locale]:STRING=[en]:String
          }
          ...
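
The `I18NReader` on slide 19 turns each incoming record into one record per locale by pushing copies and returning null. How `push` is implemented inside the library is not shown in the slides; the sketch below is one plausible way to get the same one-to-many behaviour with a small buffer, using only JDK types. `ExpandingReaderSketch` and its `Map`-based records are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of one-to-many expansion; not Data Pipeline code.
class ExpandingReaderSketch {

    private final Iterator<Map<String, String>> nested;
    private final Locale[] locales;
    private final Deque<Map<String, String>> pushed = new ArrayDeque<>();

    ExpandingReaderSketch(List<Map<String, String>> source, Locale... locales) {
        this.nested = source.iterator();
        this.locales = locales;
    }

    /** Returns the next expanded record, or null when the input is exhausted. */
    Map<String, String> read() {
        // Refill the buffer from the nested source only when it has run dry.
        while (pushed.isEmpty() && nested.hasNext()) {
            Map<String, String> record = nested.next();
            for (Locale locale : locales) {
                Map<String, String> copy = new LinkedHashMap<>(record);
                copy.put("locale", locale.toString());
                pushed.addLast(copy);
            }
        }
        return pushed.pollFirst();
    }

    public static void main(String[] args) {
        Map<String, String> tx = new LinkedHashMap<>();
        tx.put("name", "tx");
        tx.put("value", "Transaction ID");

        ExpandingReaderSketch reader =
                new ExpandingReaderSketch(Arrays.asList(tx), Locale.ENGLISH, Locale.FRENCH);
        Map<String, String> record;
        while ((record = reader.read()) != null) {
            System.out.println(record);
        }
    }
}
```
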
  20. Option 2 – Subclass Transformer
      • TransformingReader (a ProxyReader / DataReader): -filter, -transformers
      • Transformer: +transform(Record):boolean
      • CustomTransformer (extends Transformer): +transform(Record):boolean
  21. public class I18NTransformer extends Transformer {

          private final Locale[] locales;

          public I18NTransformer(Locale ... locales) {
              this.locales = locales;
          }

          @Override
          public boolean transform(Record record) throws Throwable {
              for (Locale locale : locales) {
                  Record copy = (Record) record.clone();
                  copy.getField("locale", true).setValue(locale.toString());
                  getReader().push(copy);
                  record.delete();
              }
              return true;
          }
      }

      Usage:
          reader = new TransformingReader(reader)
              .add(new I18NTransformer(Locale.ENGLISH, Locale.FRENCH));
  22. Subclass FieldTransformer
      • TransformingReader, Transformer
      • FieldTransformer (a Transformer): +FieldTransformer(String), #transformField(Field)
      • CustomFieldTransformer (extends FieldTransformer): #transformField(Field), +toString()
  23. public class BytesToStringTransformer extends FieldTransformer {

          public BytesToStringTransformer(String name) {
              super(name);
          }

          @Override
          protected void transformField(Field field) throws Throwable {
              Object value = field.getValue();
              if (value == null) {
                  field.setNull(FieldType.STRING);
                  return;
              }
              if (value instanceof byte[]) {
                  byte[] bytes = (byte[]) value;
                  String s = new BASE64Encoder().encode(bytes);
                  field.setValue(s);
              } else {
                  throw new DataException("field is not a byte array").set("field000", field);
              }
          }

          @Override
          public String toString() {
              return "converting field \"" + getName() + "\" from bytes to string";
          }
      }

      Usage:
          reader = new TransformingReader(reader)
              .add(new I18NTransformer(Locale.ENGLISH, Locale.FRENCH))
              .add(new BytesToStringTransformer("value"));
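
The `BASE64Encoder` used on slide 23 is presumably `sun.misc.BASE64Encoder` (the import is not shown), an internal JDK class that is no longer available on recent JDKs. Since Java 8 the supported alternative is `java.util.Base64`. The sketch below shows only the encoding step with the JDK; `Base64Sketch` and its helper names are made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// JDK-only illustration of the byte[] to String step using java.util.Base64.
class Base64Sketch {

    static String bytesToString(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    static byte[] stringToBytes(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        byte[] bytes = "hi".getBytes(StandardCharsets.UTF_8);
        String encoded = bytesToString(bytes);
        System.out.println(encoded); // prints "aGk="
        System.out.println(new String(stringToBytes(encoded), StandardCharsets.UTF_8));
    }
}
```
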
  24. -----------------------------------------------
      0 records
      Exception in thread "main" com.northconcepts.datapipeline.core.DataException: transformation [converting field "value" from bytes to string] failed on record 0; field is not a byte array
      -------------------------------
      DataEndpoint.description=[null]
      DataEndpoint.state=[OPENED]
      DataEndpoint.thread=[main]
      DataEndpoint.timestamp=[2013.05.26-22:21:31.908]
      DataReader.bufferSize=[0]
      DataReader.recordCount=[1]
      FieldTransformer.field=[[value]:STRING=[Transaction ID]:String]
      TransformingReader.filter=[null]
      TransformingReader.transformer=[converting field "value" from bytes to string]
      TransformingReader.transformerClass=[class com.northconcepts.tjug20130530.BytesToStringTransformer]
      TransformingReader.transformerIndex=[1]
      field000=[[value]:STRING=[Transaction ID]:String]
      fieldName=[value]
      -------------------------------
          at com.northconcepts.tjug20130530.BytesToStringTransformer.transformField(BytesToStringTransformer.java:30)
          at com.northconcepts.datapipeline.transform.FieldTransformer.transform(FieldTransformer.java:31)
          at com.northconcepts.datapipeline.transform.TransformingReader.transformRecord(TransformingReader.java:122)
          at com.northconcepts.datapipeline.transform.TransformingReader.interceptRecord(TransformingReader.java:110)
          at com.northconcepts.datapipeline.core.ProxyReader.readImpl(ProxyReader.java:86)
          at com.northconcepts.datapipeline.core.DataReader.read(DataReader.java:141)
          at com.northconcepts.datapipeline.job.JobTemplateImpl.doTransfer(JobTemplateImpl.java:67)
          at com.northconcepts.datapipeline.job.JobTemplateImpl.transfer(JobTemplateImpl.java:27)
          at com.northconcepts.tjug20130530.Main11_BytesToStringTransformer.main(Main11_BytesToStringTransformer.java:28)
  25. Subclass Filter
      • FilteringReader (a ProxyReader / DataReader): -filters
      • Filter: +allow(Record):boolean
      • CustomFilter (extends Filter): +allow(Record):boolean
      • ValidatingReader: -exceptionOnFailure, #discard(Record, Filter)
  26. public class I18NFilter extends Filter {

          private final Locale[] locales;

          public I18NFilter(Locale ... locales) {
              this.locales = locales;
          }

          @Override
          public boolean allow(Record record) {
              for (Locale locale : locales) {
                  Field field = record.getField("locale", true);
                  if (locale.toString().equals(field.getValueAsString())) {
                      return true;
                  }
              }
              return false;
          }
      }

      Usage:
          reader = new FilteringReader(reader)
              .add(new I18NFilter(Locale.FRENCH));

      Output:
          ---------------------------------------------
          0 - Record (NEW) {
              0:[name]:STRING=[tx]:String
              1:[value]:STRING=[Transaction ID]:String
              2:[locale]:STRING=[fr]:String
          }
          ---------------------------------------------
          1 - Record (NEW) {
              0:[name]:STRING=[symbol]:String
              1:[value]:STRING=[Symbol]:String
              2:[locale]:STRING=[fr]:String
          }
          ---------------------------------------------
          2 - Record (NEW) {
              0:[name]:STRING=[dob]:String
              1:[value]:STRING=[Birth Date]:String
              2:[locale]:STRING=[fr]:String
          }
          ---------------------------------------------
          3 records
  27. Subclass Lookup
      • TransformingReader, Transformer
      • LookupTransformer (a Transformer): -fields:FieldList, -lookup:Lookup, -overwriteFields:Boolean, #join(Record, RecordList, List<?>), #noResults(Record, List<?>), #tooManyResults(Record, List<?>, RecordList)
      • Lookup: +get(Object...), +get(List<?>)
      • CustomLookup (extends Lookup): +get(Object...)
  28. public class TranslationLookup extends Lookup {

          private final String sourceLanguage;

          public TranslationLookup(String sourceLanguage) {
              this.sourceLanguage = sourceLanguage;
          }

          public RecordList get(Object ... keys) {
              if (keys == null || keys.length != 2) {
                  throw new DataException("invalid arguments")
                      .set("arguments.expected", 2)
                      .set("arguments.found", keys);
              }
              String targetLanguage = keys[0].toString();
              String sourcePhrase = keys[1].toString();
              String targetPhrase = getTranslation(targetLanguage, sourcePhrase);
              Record record = new Record();
              record.addField().setName("targetPhrase").setValue(targetPhrase);
              return new RecordList(record);
          }

          private String getTranslation(String targetLanguage, String sourcePhrase) {
              if (sourceLanguage.equals(targetLanguage)) {
                  return sourcePhrase;
              }
              // TODO: use translation service or DB
              return sourcePhrase + "--" + targetLanguage;
          }
      }

      Usage:
          reader = new TransformingReader(reader)
              .add(new LookupTransformer(
                  new FieldList("locale", "value"),
                  new TranslationLookup(Locale.ENGLISH.toString())));

      Output:
          -----------------------------------------------
          0 - Record (NEW) {
              0:[name]:STRING=[tx]:String
              1:[value]:STRING=[Transaction ID]:String
              2:[locale]:STRING=[en]:String
              3:[targetPhrase]:STRING=[Transaction ID]:String
          }
          -----------------------------------------------
          1 - Record (NEW) {
              0:[name]:STRING=[tx]:String
              1:[value]:STRING=[Transaction ID]:String
              2:[locale]:STRING=[fr]:String
              3:[targetPhrase]:STRING=[Transaction ID--fr]:String
          }
          -----------------------------------------------
          2 - Record (NEW) {
              0:[name]:STRING=[symbol]:String
              1:[value]:STRING=[Symbol]:String
              2:[locale]:STRING=[en]:String
              3:[targetPhrase]:STRING=[Symbol]:String
          }
          -----------------------------------------------
          3 - Record (NEW) {
              0:[name]:STRING=[symbol]:String
              1:[value]:STRING=[Symbol]:String
              2:[locale]:STRING=[fr]:String
              3:[targetPhrase]:STRING=[Symbol--fr]:String
          }
          -----------------------------------------------
          4 - Record (NEW) {
              0:[name]:STRING=[dob]:String
              1:[value]:STRING=[Birth Date]:String
              2:[locale]:STRING=[en]:String
              3:[targetPhrase]:STRING=[Birth Date]:String
          }
          -----------------------------------------------
          5 - Record (NEW) {
              0:[name]:STRING=[dob]:String
              1:[value]:STRING=[Birth Date]:String
              2:[locale]:STRING=[fr]:String
          ...
  29. http://northconcepts.com/data-pipeline/builder/
  30. Download at http://NorthConcepts.com/tjug/
