Data Integration with server side Mashups


    Data Integration with server side Mashups - Presentation Transcript

    1. Data Integration with Server Side Mashups
       Juergen Brendel, Principal Software Engineer
       OSDC 2007, Brisbane
    2. Agenda
       • The SnapLogic project
       • Client-side mashups
       • Problems and solutions
       • Data integration with SnapLogic
    3. The SnapLogic project
       • Founded 2005, data integration background
       • Vision:
         – Reusable data integration resources
         – REST
         – Web-based GUI
         – Programmatic interface
         – Open Source
       • Python... Why not?
       • www.snaplogic.com
    4. What's a mashup?
       • A 'Web 2.0 kind of thing'
       • Combine, aggregate, visualise
         – Multiple sources
         – Multiple dimensions
       • Typically on the client side
         – Browser
         – Ajax
    5. Self-made mashups
       • Hand coded
       • Mashup editors
         – GUI mashup-logic editor
         – Wiki-style
         – Hosted
    6. Benefits for the enterprise?
       • Situational applications!
       • Enable knowledge workers!!!
       • Avoid the IT bottleneck!!
       • Yeah, right...
    7. Problems with client-side mashups
       • Skill
       • Internal data often not web-friendly
       • Maintenance
       • Security
       • Performance
    8. Solution: Server-side mashups
       • Flexible access
       • Security
       • Performance
    9. SnapLogic data integration philosophy
       • Clearly defined REST resources
       • Data reuse and integration
       • Pipelines
       • Framework for resource-specific scripting
       • Open source and community
    10. Example: Resources
        [Diagram: a client exchanges an HTTP request and response with the SnapLogic Server at HTTP://server1.example.com/customer_list; inside the server, a component is paired with a resource definition. Other diagram labels: Databases, Files, Applications, Atom / RSS, JSON.]
        Resource definition:
        • Resource Name
        • HTTP://server1.example.com/customer_list
        • SQL Query or filename
        • Credentials
        • Parameters
    11. Example: Pipelines
        [Diagram: a client exchanges an HTTP request and response with the SnapLogic Server at HTTP://server1.example.com/processed_customer_list; inside the server, three components, each paired with its own resource definition, are chained: Read, Geocode, Sort. Other diagram labels: Databases, Files, Applications, Atom / RSS, JSON.]
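The Read, Geocode, Sort chain on the pipeline slide can be pictured as a series of record-processing stages. The following is only a minimal sketch in plain Python generators, not SnapLogic code: the function names, the record layout, and the `CITY_COORDS` lookup table (standing in for a real geocoding service) are all assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, Iterator

Record = Dict[str, Any]

# Hypothetical stand-in for a real geocoding service.
CITY_COORDS = {
    "Brisbane": (-27.47, 153.03),
    "Sydney": (-33.87, 151.21),
}

def read(rows: Iterable[Record]) -> Iterator[Record]:
    """Read stage: emit a copy of each source record."""
    for row in rows:
        yield dict(row)

def geocode(records: Iterable[Record]) -> Iterator[Record]:
    """Geocode stage: attach coordinates looked up by city (None if unknown)."""
    for rec in records:
        rec["lat"], rec["lon"] = CITY_COORDS.get(rec["city"], (None, None))
        yield rec

def sort_by(records: Iterable[Record], field: str) -> Iterator[Record]:
    """Sort stage: has to consume all of its input before emitting anything."""
    return iter(sorted(records, key=lambda rec: rec[field]))

customers = [
    {"name": "Smith", "city": "Sydney"},
    {"name": "Jones", "city": "Brisbane"},
]

# Chain the three stages, mirroring Read -> Geocode -> Sort.
processed = list(sort_by(geocode(read(customers)), "name"))
```

Each stage only sees the records handed to it by the previous one, which is what makes a stage reusable in other chains.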
    12. A simple pipeline: Filtering leads
    13. Linking fields in a pipeline
    14. Reusing a pipeline as a resource
    15. Reusing a pipeline as a resource
    16. Reusing a pipeline as a resource
    17. Adding new components
        • For access logic
        • For data transformations
        • Independent of data format
        • Currently written in Python
    18. A simple processing component

        class IncreaseSalary(DataComponent):

            def init(self):
                '''Called when the component is started.'''
                self.increase = float(self.moduleProperties['percent_increase'])

            def processRecord(self, record):
                '''Called for every record.'''
                record.fields['salary'] *= (1 + self.increase/100)
                self.writeRecord(record)
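To see what the component on this slide does to a record without a running server, it can be exercised against stand-in classes. This is a sketch under stated assumptions: the `Record` and `DataComponent` classes below are hypothetical minimal substitutes written for this example, not the framework's real base classes, and they mimic only the attributes the slide uses (`moduleProperties`, `fields`, `init`, `writeRecord`).

```python
class Record:
    """Hypothetical stand-in for the framework's record type."""
    def __init__(self, fields):
        self.fields = dict(fields)

class DataComponent:
    """Hypothetical stand-in for the framework's component base class."""
    def __init__(self, module_properties):
        self.moduleProperties = dict(module_properties)
        self.output = []  # collects every record the component writes
        self.init()

    def init(self):
        pass

    def writeRecord(self, record):
        self.output.append(record)

class IncreaseSalary(DataComponent):
    """Same logic as on the slide."""
    def init(self):
        self.increase = float(self.moduleProperties['percent_increase'])

    def processRecord(self, record):
        record.fields['salary'] *= (1 + self.increase / 100)
        self.writeRecord(record)

# Feed two records through a component configured for a 10 percent raise.
component = IncreaseSalary({'percent_increase': '10'})
for salary in (1000.0, 2500.0):
    component.processRecord(Record({'salary': salary}))
raised = [rec.fields['salary'] for rec in component.output]
```

The property arrives as a string, which is why `init` converts it with `float` once instead of on every record.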
    19. An Apache log file reader

        class LogReader(DataComponent):

            def startReading(self):
                '''Called when component does not have input stream.'''
                logfile = open(self._filename, 'rbU')
                format = self.moduleProperties['log_format']

                if format == 'COMMON':
                    p = apachelog.parser(apachelog.formats['common'])
                elif ...

                # Read all lines in the logfile
                for line in logfile:
                    out_rec = Record(self.getSingleOutputView())
                    raw_rec = p.parse(line)
                    out_rec.fields['remote_host'] = raw_rec['%h']
                    out_rec.fields['client_id'] = raw_rec['%l']
                    out_rec.fields['user'] = raw_rec['%u']
                    out_rec.fields['server_status'] = int(raw_rec['%>s'])
                    out_rec.fields['bytes'] = int(raw_rec['%b'])
                    ...

                    self.writeRecord(out_rec)
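The parsing step in the log reader relies on the `apachelog` module. As a rough illustration of what that step produces, here is a standard-library-only sketch that extracts the same fields from a Common Log Format line with a regular expression. It is a swapped-in technique, not the slide's method; the function name `parse_common_log_line` and the choice to map a `-` byte count to 0 are assumptions of this example.

```python
import re

# One named group per Common Log Format field.
COMMON_LOG = re.compile(
    r'(?P<remote_host>\S+) (?P<client_id>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<server_status>\d{3}) (?P<bytes>\S+)'
)

def parse_common_log_line(line):
    """Return the fields of one Common Log Format line as a dict."""
    match = COMMON_LOG.match(line)
    if match is None:
        raise ValueError('not a Common Log Format line: %r' % line)
    fields = match.groupdict()
    fields['server_status'] = int(fields['server_status'])
    # Apache writes '-' instead of a number when no bytes were sent.
    fields['bytes'] = 0 if fields['bytes'] == '-' else int(fields['bytes'])
    return fields

sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326')
parsed = parse_common_log_line(sample)
```

The `-` case is worth handling explicitly, since a bare `int()` conversion of the byte count would raise on such lines.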
    20. Programmatic access
        • GUI is nice, but still limiting
        • SnapScript: An API library
        • Python, PHP, more to come
    21. Creating a resource

        # Create a new resource
        staff_res_def = Resource(component='SnapLogic.Components.CsvRead')
        staff_res_def.props.URI = '/SnapLogic/Resources/Staff'
        staff_res_def.props.description = 'Read from the employee file'
        staff_res_def.props.title = 'Staff'
        staff_res_def.props.delimiter = '$?{DELIMITER}'
        staff_res_def.props.filename = '$?{INPUTFILE}'
        staff_res_def.props.parameters = (
            ('INPUTFILE', Param.Required, ''),
            ('DELIMITER', Param.Optional, ',')
        )

        # Define the output view of the resource
        staff_res_def.props.outputview.output1 = (
            ('Last_Name', 'string', 'Employee last name'),
            ('First_Name', 'string', 'Employee first name'),
            ('Salary', 'number', 'Annual income')
        )
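The resource definition uses `$?{NAME}` placeholders together with a list of required and optional parameters. The slide does not say how the server fills these in, so the following is only a guess at the general idea: a sketch that substitutes supplied values, falls back to the declared default for optional parameters, and rejects a missing required one. `REQUIRED`, `OPTIONAL`, and `resolve` are hypothetical names invented for this example.

```python
import re

# Stand-ins for Param.Required / Param.Optional (assumed for this sketch).
REQUIRED, OPTIONAL = 'required', 'optional'

# Matches placeholders of the form $?{NAME}.
PLACEHOLDER = re.compile(r'\$\?\{(\w+)\}')

def resolve(template, declared, supplied):
    """Replace each $?{NAME} in template with its supplied or default value."""
    values = {}
    for name, kind, default in declared:
        if name in supplied:
            values[name] = supplied[name]
        elif kind == OPTIONAL:
            values[name] = default
        else:
            raise ValueError('missing required parameter: ' + name)
    return PLACEHOLDER.sub(lambda m: values[m.group(1)], template)

declared = (
    ('INPUTFILE', REQUIRED, ''),
    ('DELIMITER', OPTIONAL, ','),
)
filename = resolve('$?{INPUTFILE}', declared, {'INPUTFILE': 'staff.csv'})
delimiter = resolve('$?{DELIMITER}', declared, {'INPUTFILE': 'staff.csv'})
```

Here `DELIMITER` was not supplied, so the declared default is used.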
    22. Creating a pipeline

        # Create a new pipeline
        p = Pipeline()
        p.props.URI = '/SnapLogic/Pipelines/empl_salary_inc'
        p.props.title = 'Employee_Salary_Increase'

        # Select the resources in the pipeline
        # (increase_salary_res_def is not shown on the slides)
        p.resources.Staff = staff_res_def.instance()
        p.resources.PayRaise = increase_salary_res_def.instance()

        # Link the resources in the pipeline
        link = (
            ('Last_Name', 'last'),
            ('First_Name', 'first'),
            ('Salary', 'salary')
        )
        p.linkViews('Staff', 'output1', 'PayRaise', 'input1', link)
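The `link` tuple pairs each output field of the upstream resource with an input field name of the downstream one. What such a mapping amounts to can be sketched in a few lines of plain Python; `apply_link` and the sample row are hypothetical and say nothing about how `linkViews` is implemented.

```python
def apply_link(record, link):
    """Return a new dict holding only the linked fields, under their target names."""
    return {target: record[source] for source, target in link}

link = (
    ('Last_Name', 'last'),
    ('First_Name', 'first'),
    ('Salary', 'salary'),
)

# A row as the upstream output view might deliver it (made-up sample data).
staff_row = {'Last_Name': 'Doe', 'First_Name': 'Jane', 'Salary': 50000}
linked = apply_link(staff_row, link)
```

Fields that are not named in the link are simply not passed on, which keeps the downstream input view independent of the upstream field names.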
    23. Pipeline parameters

        # Define the user-visible parameters of the pipeline
        p.props.parameters = (
            ('INCREASE', Param.Required, ''),
        )

        # Map values to the parameters of the pipeline's resources
        p.props.parammap = (
            (Param.Parameter, 'INCREASE', 'PayRaise', 'PERC_INCREASE'),
            (Param.Constant, 'file://foo/staff.csv', 'Staff', 'INPUTFILE')
        )

        # Confirm correctness and publish as a new resource
        p.check()
        p.saveToServer(connection)
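The parameter map routes either a pipeline parameter or a constant to a named parameter of one resource in the pipeline. A minimal sketch of that routing follows, assuming each entry means (kind, source, resource, target parameter) as the slide suggests; `PARAMETER`, `CONSTANT`, and `resolve_parammap` are hypothetical names for this example only.

```python
# Stand-ins for Param.Parameter / Param.Constant (assumed for this sketch).
PARAMETER, CONSTANT = 'parameter', 'constant'

def resolve_parammap(parammap, pipeline_args):
    """Work out which value each resource parameter receives."""
    per_resource = {}
    for kind, source, resource, target in parammap:
        # A PARAMETER entry looks the value up in the caller's arguments;
        # a CONSTANT entry carries the value itself.
        value = pipeline_args[source] if kind == PARAMETER else source
        per_resource.setdefault(resource, {})[target] = value
    return per_resource

parammap = (
    (PARAMETER, 'INCREASE', 'PayRaise', 'PERC_INCREASE'),
    (CONSTANT, 'file://foo/staff.csv', 'Staff', 'INPUTFILE'),
)
resolved = resolve_parammap(parammap, {'INCREASE': '5'})
```

The caller of the pipeline only ever sees `INCREASE`; the input file stays fixed inside the pipeline definition.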
    24. The end
        Any questions?
        jbrendel@snaplogic.org

