
2005 07 19 IVT Integration Techniques


July 2005 Presentation to the IVT ELN conference in Dublin covering Integration techniques

Published in: Technology, Business

  1. 1. Integration Techniques for ELNs Simon Coles Co-founder & CTO
  2. 2. Integration Techniques for ELNs • My background • Why do we need to integrate ELNs? • What kinds of integration do we need to do? • What prerequisites are there? • Some examples of technologies and techniques • Summary • You can download copies of this presentation from our web site 2
  3. 3. My background • MEng in Information Systems Engineering • First “ELN” was a consulting project for Kodak • Started in 1996 • Completely electronic, fully integrated • Thousands of users, worldwide • This grew into Amphora • Merged with PatentPad in 2003 • Paper or electronic records according to legal preference • Scientists still get an “Electronic” system • Partner with a wide variety of “ELN” vendors • Member of CENSA, working on long term records, serving on Steering Team 3
  4. 4. Experience • Primarily in ELNs for discovery • Where patents are a major concern • I am sure some of this is relevant to regulated areas, but that’s not my focus • Work a lot with other “ELN” vendors • Seldom do you buy one system • Which means we end up seeing a lot of integration! • In a variety of industries, all sizes of deployment • Pharma • Biotech • Chemicals • Customers around the world, offices in the US & the UK 4
  5. 5. What’s an ELN? • The term “ELN” is now used to describe a wide variety of systems • Science specific • Reaction planning tools, Cheminformatics databases, structure drawing tools • Analysis packages, LIMS • Workflow tools • General • Knowledge/Document Management • Scientific data management • Laptop/Tablet computers 5
  6. 6. Observations • The term “ELN” • Is so ambiguous it can mean almost anything (especially to a marketing person) • Doesn’t help us much from a systems architecture perspective • A company is unlikely to have just one system that could be called an “ELN” • Those ELNs will need to integrate with your existing & future systems • Your needs will change with time, so you need to be able to protect your investment • In data • In tools • In processes 6
  7. 7. Deconstructing “ELN” • At first sight, making an ELN project a success can look very complex • ELN functionality can be split into two dimensions • Some aspects are common to everyone • Other requirements are specific to a particular group of scientists • Splitting out the functionality into these dimensions really helps to keep you sane [Diagram labels: “Broad” aspects (Security, Collaboration, Patent Protection etc.); A, B, C, D] 7
  8. 8. Benefits • The corporate functions (Legal, Records, etc.) can buy/provide a system that provides a service to the niche-specific systems • Meet corporate requirements for records etc. • Provide a cross-discipline collaboration • The individual niches can buy/find systems to support their specific needs • Leverage existing investments • Justified according to the benefits they bring • Removes any need to balance competing requirements • Reduce the need • Systems can be acquired/purchased in a phased approach tailored to the needs & requirements of the business • Life is a lot less stressful 8
  9. 9. Different levels of abstraction • The “Experiment” is generally the boundary between Broad and Deep systems [Diagram labels: “Broad” aspects; Projects, Experiments, Reports, Raw Data; A, B, C, D] 9
  10. 10. Types of integration • The Broad/Deep boundary is often exposed as network-level services which are relatively standardized • Integrations between different niche systems are generally custom [Diagram labels: “Broad” aspects; A, B, C, D] 10
  11. 11. What prerequisites are there? • From your ELN product(s) • Open Interfaces • Open Data • Plumbing • Various technologies, some simple, some more complex • Expertise - often in-house, sometimes consultants • Good news - the Open Source movement is really helpful • Tools & techniques • Drive for openness • Remember: you need to ask your vendor for all of the “Open” stuff before you sign the order 11
  12. 12. Open Interfaces • What’s an “Interface”? • Where one system “prods” another to do something • Or get some information out • Or put some information in • Generally some data is passed back & forth • What’s “open”? • Something you can use without undue burden or barrier • This covers both commercial and technical aspects • Concerns are very similar to those involved with Open Data 12
  13. 13. Open Data • This is currently a bit of a blind spot for purchasers of IT systems • Unfortunately, Open Data is absolutely critical • For long term records • For your ability to build up an integrated system • To protect your IP (partly from a patent perspective, but mainly from a re-use aspect) • To maintain a balanced relationship with your vendors • This absolutely needs to be part of the ELN purchasing process 13
  14. 14. “Good” (open) file formats • Publicly documented • Legally unencumbered • No patents, copyright concerns etc. • Any patents or copyright must be in the public domain • Ideally, self documenting (XML is a good start) • Degrade gracefully • If you can’t read the data, at least you can see a picture • Based on more open, primitive formats where possible • At least two implementations of readers, one of which is Open Source • Widely used (W3C or IETF standards are good signs) 14
  15. 15. Data formats for the long term • Good • For text: Plain ASCII, Unicode, HTML, possibly RTF • For graphics: PNG, SVG • For structured data: XML • To preserve appearance: PDF • Worry about • Storing files in databases • The database file format is probably undocumented • Store objects on the file system and use the database to point to them • Anything that is proprietary - there’s no excuse for it, and it dramatically increases your risk • Binary files generally • Mixing content in files (e.g. embedding XML in PDF) • Proprietary digital signatures 15
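The advice on this slide to keep objects on the file system and let the database only point to them can be sketched as follows. This is a minimal illustration, not a description of any particular product: the `records` table, the function names and the use of SQLite and SHA-256 are all assumptions made for the example.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def open_index(db_path=":memory:"):
    """Open (or create) the database that only indexes the stored files."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "id INTEGER PRIMARY KEY, name TEXT, path TEXT, sha256 TEXT)"
    )
    return db


def store_record(db, store_dir, name, content):
    """Write the bytes to the file system and record a pointer in the database."""
    digest = hashlib.sha256(content).hexdigest()
    store_dir = Path(store_dir)
    store_dir.mkdir(parents=True, exist_ok=True)
    path = store_dir / f"{digest}_{name}"
    path.write_bytes(content)
    db.execute(
        "INSERT INTO records (name, path, sha256) VALUES (?, ?, ?)",
        (name, str(path), digest),
    )
    db.commit()
    return path


def load_record(db, name):
    """Follow the database pointer and check the file has not changed."""
    row = db.execute(
        "SELECT path, sha256 FROM records WHERE name = ? ORDER BY id DESC LIMIT 1",
        (name,),
    ).fetchone()
    if row is None:
        raise KeyError(name)
    path, expected = row
    content = Path(path).read_bytes()
    if hashlib.sha256(content).hexdigest() != expected:
        raise ValueError(f"{path} no longer matches its recorded checksum")
    return content


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as store:
        index = open_index()
        store_record(index, store, "exp-001.xml", b"<experiment/>")
        print(load_record(index, "exp-001.xml"))
        index.close()
```

The stored file stays readable with ordinary tools even if the database is lost, and the checksum gives a simple way to notice a file that has been altered.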
  16. 16. IP concerns & data formats • Companies have always used Proprietary Data Formats as a competitive weapon • Companies are waking up to the use of IP tools (licenses, patents, copyrights) to reinforce their control over data formats • Just because a format is published doesn’t mean it is open • The Microsoft Office XML formats are a particularly bad example • Right now it looks positively radioactive • They’re being very careful what they say which indicates to me they’re planning something • story=20050330133833843 • (see section: 4. Dissecting Microsoft’s “Patent License”) 16
  17. 17. Standards • There are so many to choose from! • Two key ways of generating “Standards” • De Facto - dominant supplier/format • De Jure - committee based • Who gets to “bless” a standard? • What makes a “good standard” • De Jure process has difficulty keeping up with the real world • De Facto process has risk of lock-in • Pragmatic approach • Expect your suppliers to use open file formats • If there is an acceptable standard, use it • Make sure you are using the right kind of format for each purpose 17
  18. 18. Technologies and techniques • There are a wide variety of tools you can use to integrate IT systems • Tight Vs Loose coupling • Synchronous Vs Asynchronous • Text Vs Binary • Proprietary Vs Open • Simple Vs Complex • As a rule • Loose is cheaper than Tight coupling • Asynchronous is easier to manage than Synchronous • Text is easier to work with, and more flexible than Binary • Open interfaces are always better than Proprietary • Simple approaches are better than Complex ones 18
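One way to picture the loose, asynchronous, text-based end of these trade-offs is a shared "outbox" directory: one system drops small XML files into it and another picks them up whenever it next runs. The sketch below assumes a made-up `<experiment>` document and hypothetical function names; it is an illustration of the style, not a recommended schema.

```python
import os
import xml.etree.ElementTree as ET
from pathlib import Path


def publish(outbox, experiment_id, title):
    """Producer side: write a small XML file into the shared outbox."""
    outbox = Path(outbox)
    root = ET.Element("experiment", id=experiment_id)
    ET.SubElement(root, "title").text = title
    # Write under a temporary name first, then rename, so the consumer
    # never picks up a half-written file.
    tmp = outbox / f"{experiment_id}.xml.tmp"
    ET.ElementTree(root).write(str(tmp), encoding="utf-8", xml_declaration=True)
    final = outbox / f"{experiment_id}.xml"
    os.replace(tmp, final)
    return final


def consume(outbox):
    """Consumer side: process and remove every complete file in the outbox."""
    results = []
    for path in sorted(Path(outbox).glob("*.xml")):
        root = ET.parse(str(path)).getroot()
        results.append((root.get("id"), root.findtext("title")))
        path.unlink()
    return results
```

Neither side needs the other to be running, and a problem can usually be diagnosed by opening the files in a text editor.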
  19. 19. Considerations when picking tools • Use stable interfaces • Get a commitment from the vendor about what they’ll keep stable across version upgrades • Use public, documented interfaces • Sample code is really really useful • Pick language-neutral interfaces where possible • Platform-neutrality • Don’t worry (too much) about locking yourself into Windows on the client • But if you lock yourself to Windows on the server, it is going to hurt 19
  20. 20. Glue Languages • There are a number of really useful “Glue” languages around • Python (and Jython, and other relatives) • Perl (although I have some concerns about maintainability) • Groovy, Beanshell, etc. • All of them • Play well with XML, http, SOAP etc. • Play well with OLE • Are cross platform • My personal preference is Python • You can learn it in a matter of hours • You can read other people’s code • It does everything I need it to do 20
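As a small illustration of the kind of glue work meant here, the sketch below turns a CSV export into XML using only the Python standard library. The column names and the `results`/`sample` tags are invented for the example, and it assumes every CSV header is already a valid XML element name.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="results", row_tag="sample"):
    """Convert CSV text into XML: one element per row, one child per column."""
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        sample = ET.SubElement(root, row_tag)
        for column, value in row.items():
            # Assumes the column header is usable as an XML tag name.
            ET.SubElement(sample, column).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(csv_to_xml("id,yield\nS1,82\nS2,77\n"))
```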
  21. 21. Cool stuff • SOAP/Web Services • Valuable in many areas • But don’t treat it as a religion • There are lighter alternatives which bring most of the benefits for much less effort • The whole WS-* effort seems to have got out of control • REST (XML over http) - a lighter alternative to SOAP • File swapping (generally, in XML) • HTTP GET/POST • Wonderfully easy to debug! • Very flexible 21
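The "XML over http" style needs very little code. The sketch below builds a plain HTTP POST carrying an XML document with Python's standard `urllib.request`. The URL is a placeholder and the `<experiment>` payload is invented; a real system would publish its own endpoint and document format.

```python
import urllib.request

# Hypothetical endpoint, used only for illustration.
EXPERIMENTS_URL = "http://eln.example.com/experiments"


def build_xml_post(url, xml_bytes):
    """Build (but do not send) an HTTP POST carrying an XML document."""
    return urllib.request.Request(
        url,
        data=xml_bytes,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and return (status code, response body)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read()


if __name__ == "__main__":
    req = build_xml_post(EXPERIMENTS_URL, b"<experiment id='EXP-1'/>")
    # Nothing is sent here; call send(req) against a real server.
    print(req.get_method(), req.full_url)
```

Because the request is just text over http, the same exchange can be reproduced from a browser or a command-line client when debugging.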
  22. 22. Nice things to see • Integration points exposed as stable URLs • For example, in our PatentSafe product, we have committed to stable URL formats to • Submit a record via http (content & metadata) • Get a record for display to the user • These can be used by other systems • And also embedded in Word documents... • Lack of wheel re-invention • e.g. LDAP is The One True place for user information • e.g. RSS/Atom is The One True alerting mechanism • Example code • In multiple languages 22
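To show why a standard alerting format saves work, here is a minimal sketch of a consumer that reads an Atom feed and reports entries it has not seen before. The feed content, the `urn:example:` identifiers and the record URLs are invented for the example and do not describe PatentSafe or any other product.

```python
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Invented sample feed, standing in for whatever a real system would publish.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>New records</title>
  <entry>
    <id>urn:example:record:1</id>
    <title>Experiment 1 signed</title>
    <link href="http://eln.example.com/records/1"/>
  </entry>
  <entry>
    <id>urn:example:record:2</id>
    <title>Experiment 2 witnessed</title>
    <link href="http://eln.example.com/records/2"/>
  </entry>
</feed>"""


def new_entries(feed_xml, seen_ids):
    """Return the feed entries whose id is not already in seen_ids."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for entry in root.findall("atom:entry", ATOM):
        entry_id = entry.findtext("atom:id", namespaces=ATOM)
        if entry_id in seen_ids:
            continue
        link = entry.find("atom:link", ATOM)
        fresh.append({
            "id": entry_id,
            "title": entry.findtext("atom:title", namespaces=ATOM),
            "url": link.get("href") if link is not None else None,
        })
    return fresh


if __name__ == "__main__":
    for item in new_entries(SAMPLE_FEED, {"urn:example:record:1"}):
        print(item["title"], item["url"])
```

Any off-the-shelf feed reader could consume the same feed, which is the point of not re-inventing the wheel.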
  23. 23. Here be dragons • OLE - sometimes it is unavoidable (e.g. UI stuff), but avoid it when you can • Tight coupling • Buggy • Proprietary • Reduces your platform options • File format issues are awful • Version-to-version compatibility is “interesting” • Direct database access • Tight coupling • Difficult to guarantee system integrity • If you wrote both systems you might want to do this 23
  24. 24. Open Source • Definitely one to watch • Not the “Free” lunch you might think, but a pragmatic business tool • Examples • Linux • Postgres • JBoss, Tomcat etc. • Ghostscript • Open Source is part of everyone’s infrastructure • Make sure you can run your systems on a variety of platforms 24
  25. 25. Why? • Good for records • Gives you top-to-bottom control • Good for TCO • We’re finding the Open Source infrastructure easier to set up and more reliable than proprietary alternatives • Enables a better solution • Transparent systems mean you can do things the original designers didn't think of • This is especially important for ELNs 25
  26. 26. Other stuff to watch • XML generally (what did we ever do without it?) • Jabber (as computer messaging and IM framework) • Portals & Portlets • Especially JSR168, WSRP • Remember you may well want to portalize any useful application • AJAX • Google is my hero • You can build usable, functional Web Applications • If you haven’t seen GMail I can send you an “invite” • VMWare - virtualize your world • Wow • Great for server consolidation, great for testing, great for development • Wikis • Beginning to turn into a lightweight application environment 26
  27. 27. Trends to watch • File format nasties • Closed/Private interfaces • Unlikely to be stable • DMCA and other copyright legislation 27
  28. 28. Summary • You’ll be assembling an “ELN System” from a series of components • Some you have, some you’ll build, some you’ll buy • Get the open stuff before you sign the deal • Open, documented, stable interfaces • Open file formats • Use open, loosely coupled approaches where possible • If you can, keep the capability to own the integration issues in-house 28
  29. 29. Contact information • Web site: • EMail: • Phone (US): (513) 697 4764 • Phone (UK): +44 (0)845 2300160 x2001 • AIM: • Skype: sjcoles 29