2005 07 19 IVT Integration Techniques



July 2005 Presentation to the IVT ELN conference in Dublin covering Integration techniques


  1. 1. Integration Techniques for ELNs Simon Coles Co-founder & CTO
  2. 2. Integration Techniques for ELNs • My background • Why do we need to integrate ELNs? • What kinds of integration do we need to do? • What prerequisites are there? • Some examples of technologies and techniques • Summary • You can download copies of this presentation from our web site http://www.amphora-research.com/
  3. 3. My background • MEng in Information Systems Engineering • First “ELN” was a consulting project for Kodak • Started in 1996 • Completely electronic, fully integrated • Thousands of users, worldwide • This grew into Amphora • Merged with PatentPad in 2003 • Paper or electronic records according to legal preference • Scientists still get an “Electronic” system • Partner with a wide variety of “ELN” vendors • Member of CENSA, working on long term records, serving on Steering Team
  4. 4. Experience • Primarily in ELNs for discovery • Where patents are a major concern • I am sure some of this is relevant to regulated areas, but that’s not my focus • Work a lot with other “ELN” vendors • Seldom do you buy one system • Which means we end up seeing a lot of integration! • In a variety of industries, all sizes of deployment • Pharma • Biotech • Chemicals • Customers around the world, offices in the US & the UK
  5. 5. What’s an ELN? • The term “ELN” is now used to describe a wide variety of systems • Science specific • Reaction planning tools, Cheminformatics databases, structure drawing tools • Analysis packages, LIMS • Workflow tools • General • Knowledge/Document Management • Scientific data management • Laptop/Tablet computers
  6. 6. Observations • The term “ELN” • Is so ambiguous it can mean almost anything (especially to a marketing person) • Doesn’t help us much from a systems architecture perspective • A company is unlikely to have just one system that could be called an “ELN” • Those ELNs will need to integrate with your existing & future systems • Your needs will change with time, so you need to be able to protect your investment • In data • In tools • In processes
  7. 7. Deconstructing “ELN” • At first sight, making an ELN project succeed can look very complex • ELN functionality can be split into two dimensions • Some aspects are common to everyone • Other requirements are specific to a particular group of scientists • Splitting out the functionality into these dimensions really helps to keep you sane • [Diagram labels: “Broad” aspects: Security, Collaboration, Patent Protection etc.; A, B, C, D]
  8. 8. Benefits • The corporate functions (Legal, Records, etc.) can buy/provide a system that provides a service to the niche-specific systems • Meet corporate requirements for records etc. • Provide cross-discipline collaboration • The individual niches can buy/find systems to support their specific needs • Leverage existing investments • Justified according to the benefits they bring • Removes any need to balance competing requirements • Reduce the need • Systems can be acquired/purchased in a phased approach tailored to the needs & requirements of the business • Life is a lot less stressful
  9. 9. Different levels of abstraction • The “Experiment” is generally the boundary between Broad Vs Deep systems • [Diagram labels: “Broad” aspects; Projects, Experiments, Reports, Raw Data; A, B, C, D]
  10. 10. Types of integration • The Broad/Deep boundary is often exposed as network-level services which are relatively standardized • Integration between different niche systems is generally custom • [Diagram labels: “Broad” aspects; A, B, C, D]
  11. 11. What prerequisites are there? • From your ELN product(s) • Open Interfaces • Open Data • Plumbing • Various technologies, some simple, some more complex • Expertise - often in-house, sometimes consultants • Good news - the Open Source movement is really helpful • Tools & techniques • Drive for openness • Remember: you need to ask your vendor for all of the “Open” stuff before you sign the order
  12. 12. Open Interfaces • What’s an “Interface”? • Where one system “prods” another to do something • Or get some information out • Or put some information in • Generally some data is passed back & forth • What’s “open”? • Something you can use without undue burden or barrier • This covers both commercial and technical aspects • Concerns are very similar to those involved with Open Data
  13. 13. Open Data • This is currently a bit of a blind spot for purchasers of IT systems • Unfortunately, Open Data is absolutely critical • For long term records • For your ability to build up an integrated system • To protect your IP (partly from a patent perspective, but mainly from a re-use aspect) • To maintain a balanced relationship with your vendors • This absolutely needs to be part of the ELN purchasing process
  14. 14. “Good” (open) file formats • Publicly documented • Legally unencumbered • No patents, copyright concerns etc. • Any patents or copyright must be in the public domain • Ideally, self documenting (XML is a good start) • Degrade gracefully • If you can’t read the data, at least you can see a picture • Based on more open, primitive formats where possible • At least two implementations of readers, one of which is Open Source • Widely used (W3C or IETF standards are good signs)
  15. 15. Data formats for the long term • Good • For text: Plain ASCII, Unicode, HTML, possibly RTF • For graphics: PNG, SVG • For structured data: XML • To preserve appearance: PDF • Worry about • Storing files in databases • The database file format is probably undocumented • Store objects on the file system and use the database to point to them • Anything that is proprietary - there’s no excuse for it, and it dramatically increases your risk • Binary files generally • Mixing content in files (e.g. embedding XML in PDF) • Proprietary digital signatures
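
The "store objects on the file system and use the database to point to them" advice can be sketched roughly as below. This is only an illustration under assumptions that are not in the slides: the `records` table, its column names, the SHA-256 based directory layout and the function names are all hypothetical, and SQLite stands in for whatever database is actually in use.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def open_index(db_path=":memory:"):
    """Open the index database; it holds pointers to files, never the content."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS records "
        "(name TEXT PRIMARY KEY, sha256 TEXT, path TEXT)"
    )
    return db


def store_record(db, root, name, data):
    """Write the bytes to the file system and record only a pointer in the database."""
    digest = hashlib.sha256(data).hexdigest()
    # Hypothetical layout: <root>/<first two hex digits>/<digest><original suffix>
    path = Path(root) / digest[:2] / f"{digest}{Path(name).suffix}"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    db.execute(
        "INSERT INTO records (name, sha256, path) VALUES (?, ?, ?)",
        (name, digest, str(path)),
    )
    db.commit()
    return path


def load_record(db, name):
    """Follow the pointer, read the file, and check it against the stored digest."""
    row = db.execute(
        "SELECT sha256, path FROM records WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    sha256, path = row
    data = Path(path).read_bytes()
    if hashlib.sha256(data).hexdigest() != sha256:
        raise ValueError(f"{name}: file does not match stored digest")
    return data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        db = open_index()
        store_record(db, root, "experiment-001.xml", b"<experiment id='001'/>")
        print(load_record(db, "experiment-001.xml"))
```

The point of the split is that the record itself stays a plain file in an open format that any tool can read, even if the database is lost or replaced.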
  16. 16. IP concerns & data formats • Companies have always used Proprietary Data Formats as a competitive weapon • Companies are waking up to the use of IP tools (licenses, patents, copyrights) to reinforce their control over data formats • Just because a format is published doesn’t mean it is open • The Microsoft Office XML formats are a particularly bad example • Right now it looks positively radioactive • They’re being very careful what they say, which indicates to me they’re planning something • http://www.groklaw.net/article.php?story=20050330133833843 • (see section: 4. Dissecting Microsoft’s “Patent License”)
  17. 17. Standards • There are so many to choose from! • Two key ways of generating “Standards” • De Facto - dominant supplier/format • De Jure - committee based • Who gets to “bless” a standard? • What makes a “good standard”? • De Jure process has difficulty keeping up with the real world • De Facto process has risk of lock-in • Pragmatic approach • Expect your suppliers to use open file formats • If there is an acceptable standard, use it • Make sure you are using the right kind of format for each purpose
  18. 18. Technologies and techniques • There are a wide variety of tools you can use to integrate IT systems • Tight Vs Loose coupling • Synchronous Vs Asynchronous • Text Vs Binary • Proprietary Vs Open • Simple Vs Complex • As a rule • Loose coupling is cheaper than Tight coupling • Asynchronous is easier to manage than Synchronous • Text is easier to work with, and more flexible, than Binary • Open interfaces are always better than Proprietary • Simple approaches are better than Complex ones
  19. 19. Considerations when picking tools • Use stable interfaces • Get a commitment from the vendor about what they’ll keep stable across version upgrades • Use public, documented interfaces • Sample code is really really useful • Pick language-neutral interfaces where possible • Platform-neutrality • Don’t worry (too much) about locking yourself into Windows on the client • But if you lock yourself to Windows on the server, it is going to hurt
  20. 20. Glue Languages • There are a number of really useful “Glue” languages around • Python (and Jython, and other relatives) • Perl (although I have some concerns about maintainability) • Groovy, Beanshell, etc. • All of them • Play well with XML, http, SOAP etc. • Play well with OLE • Are cross platform • My personal preference is Python • You can learn it in a matter of hours • You can read other people’s code • It does everything I need it to do
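
As a rough illustration of what "glue" work looks like in Python, the sketch below turns a CSV export from one system into XML for another, using only the standard library. The column names, element names and sample values are invented for the example and do not come from the presentation.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="results", row_tag="result"):
    """Convert CSV text with a header row into an XML string."""
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        item = ET.SubElement(root, row_tag)
        for column, value in row.items():
            # Column names become element names, so they must be valid XML names.
            ET.SubElement(item, column).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical instrument export: one header row, then one line per sample.
    sample = "sample,yield_pct\nA-17,82.5\nA-18,79.1\n"
    print(csv_to_xml(sample))
```

Both ends of the conversion are text, so the intermediate result can be inspected in any editor while debugging.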
  21. 21. Cool stuff • SOAP/Web Services • Valuable in many areas • But don’t treat it as a religion • There are lighter alternatives which bring most of the benefits for much less effort • The whole WS-* effort seems to have got out of control • REST (XML over http) - a lighter alternative to SOAP • File swapping (generally, in XML) • HTTP GET/POST • Wonderfully easy to debug! • Very flexible
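
A minimal sketch of the "XML over http" style, assuming a made-up endpoint and message format: the `/experiments` path, the `experiment`, `title`, `reply` and `record` element names and the `status` attribute are all hypothetical. The request is built but not sent, so the example runs without a server.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_submit_request(base_url, experiment_id, title):
    """Build (but do not send) an HTTP POST carrying a small XML document."""
    doc = ET.Element("experiment", id=experiment_id)
    ET.SubElement(doc, "title").text = title
    body = ET.tostring(doc, encoding="utf-8")
    return urllib.request.Request(
        url=f"{base_url}/experiments",  # hypothetical endpoint
        data=body,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


def parse_reply(xml_bytes):
    """Pull the status and record id out of a hypothetical XML reply."""
    reply = ET.fromstring(xml_bytes)
    return reply.get("status"), reply.findtext("record")


if __name__ == "__main__":
    request = build_submit_request("http://localhost:8080", "E-42", "Suzuki coupling")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # Sending it would be urllib.request.urlopen(request), given a real server.
    print(parse_reply(b"<reply status='ok'><record>R-1</record></reply>"))
```

Because the whole exchange is plain text over HTTP, the same request can be reproduced by hand with any HTTP client, which is much of what makes this style easy to debug.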
  22. 22. Nice things to see • Integration points exposed as stable URLs • For example, in our PatentSafe product, we have committed to stable URL formats to • Submit a record via http (content & metadata) • Get a record for display to the user • These can be used by other systems • And also embedded in Word documents... • Lack of wheel re-invention • e.g. LDAP is The One True place for user information • e.g. RSS/Atom is The One True alerting mechanism • Example code • In multiple languages
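
To show why a standard alerting format saves effort, here is a small sketch that reads an Atom feed with the standard library and reports entries not seen before. The feed content, the `urn:example:` ids and the `eln.example.com` links are invented; only the Atom namespace is taken from the Atom specification.

```python
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical feed that an ELN might publish when records change.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Notebook alerts</title>
  <entry>
    <id>urn:example:record:1</id>
    <title>Record 1 witnessed</title>
    <link href="http://eln.example.com/records/1"/>
  </entry>
  <entry>
    <id>urn:example:record:2</id>
    <title>Record 2 submitted</title>
    <link href="http://eln.example.com/records/2"/>
  </entry>
</feed>"""


def new_entries(feed_xml, seen_ids):
    """Return (id, title, link) for every Atom entry whose id is not in seen_ids."""
    feed = ET.fromstring(feed_xml)
    alerts = []
    for entry in feed.findall("atom:entry", ATOM):
        entry_id = entry.findtext("atom:id", default="", namespaces=ATOM)
        if entry_id in seen_ids:
            continue
        link = entry.find("atom:link", ATOM)
        alerts.append((
            entry_id,
            entry.findtext("atom:title", default="", namespaces=ATOM),
            link.get("href") if link is not None else None,
        ))
    return alerts


if __name__ == "__main__":
    for alert in new_entries(SAMPLE_FEED, seen_ids={"urn:example:record:1"}):
        print(alert)
```

Any existing feed reader could consume the same feed without custom code, which is the benefit of not re-inventing the wheel.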
  23. 23. Here be dragons • OLE - sometimes it is unavoidable (e.g. UI stuff), but avoid it when you can • Tight coupling • Buggy • Proprietary • Reduces your platform options • File format issues are awful • Version-to-version compatibility is “interesting” • Direct database access • Tight coupling • Difficult to guarantee system integrity • If you wrote both systems you might want to do this
  24. 24. Open Source • Definitely one to watch • Not the “Free” lunch you might think, but a pragmatic business tool • Examples • Linux • Postgres • JBoss, Tomcat etc. • Ghostscript • Open Source is part of everyone’s infrastructure • Make sure you can run your systems on a variety of platforms
  25. 25. Why? • Good for records • Gives you top-to-bottom control • Good for TCO • We’re finding the Open Source infrastructure easier to set up and more reliable than proprietary alternatives • Enables a better solution • Transparent systems mean you can do things the original designers didn't think of • This is especially important for ELNs
  26. 26. Other stuff to watch • XML generally (what did we ever do without it?) • Jabber (as a computer messaging and IM framework) • Portals & Portlets • Especially JSR 168, WSRP • Remember you may well want to portalize any useful application • AJAX • Google is my hero • You can build usable, functional Web Applications • If you haven’t seen GMail I can send you an “invite” • VMWare - virtualize your world • Wow • Great for server consolidation, great for testing, great for development • Wikis • Beginning to turn into a lightweight application environment
  27. 27. Trends to watch • File format nasties • Closed/Private interfaces • Unlikely to be stable • DMCA and other copyright legislation
  28. 28. Summary • You’ll be assembling an “ELN System” from a series of components • Some you have, some you’ll build, some you’ll buy • Get the open stuff before you sign the deal • Open, documented, stable interfaces • Open file formats • Use open, loosely coupled approaches where possible • If you can, keep the capability to own the integration issues in-house
  29. 29. Contact information • Web site: http://www.amphora-research.com • EMail: simonc@amphora-research.com • Phone (US): (513) 697 4764 • Phone (UK): +44 (0)845 2300160 x2001 • AIM: simoncoles@mac.com • Skype: sjcoles