ELN Architecture


                        Simon Coles
            President & CTO, Amphora Research Systems

2005 04 05 SRI ELN Architecture

General presentation on ELN architecture delivered to SRI ELN conference in April 2005. Covers a lot of generic stuff.

Published in: Technology, Education

  1. ELN Architecture
       Simon Coles
       President & CTO, Amphora Research Systems
  2. So...
       •   You’re on holiday one day
       •   Doing your normal thing
       •   And then you get the call...
       •   they want an ELN!
  3. http://www.amphora-research.com/
  4. ELN architecture
       •   Hopefully
            •   I am not going to self-destruct
            •   Your project won’t be as exciting
       •   Your task is to
            •   Deliver a state-of-the-art ELN system
            •   In tight timescales
            •   With limited budget
            •   In the real world
            •   That the users like
            •   And will serve you for many years
  5. Introduction
       •   About me
            •   Started working with ELNs in ‘96
            •   President & Co-founder of Amphora
            •   IT background
       •   First ELN was an enterprise-scale ELN for Kodak
            •   Worldwide, 1,000s of users, diverse user base
            •   Completely Electronic Records (no paper)
       •   After a long & windy road
            •   New products, lots more deployments, many industries
            •   Certain amount of realism about ELN implementation
            •   Provide Patent Evidence Creation & Preservation Systems
            •   Work with a wide variety of “ELN” systems etc.
            •   Now based in the US & UK
  6. This presentation
       •   You can download a copy of this presentation from our web site:
           http://www.amphora-research.com/
  7. Why does architecture matter?
       •   A good architecture can help
            •   Integrate “Best of breed” tools with existing investments
            •   Allow you to split the project into manageable pieces
            •   Ensure you don’t get “captured” by the vendor
            •   Help your system withstand the ravages of time
            •   Keep your TCO down
       •   A bad architecture will hurt
            •   Reliability, Scalability problems
            •   Reduce your options going forward
            •   Force you into a “Big bang” project
       •   Some random thoughts on architecture
  8. ELN architecture
       •   Major issues
            •   Diversity & Flexibility
            •   Project size/Justification/ROI
            •   Creating & Preserving Evidence for Patents
            •   Need for long term access to ELN contents
            •   Scalability
            •   Web-based systems
            •   How your network can help you
       •   Trends
            •   Integration methods
            •   Open Source
            •   In the lab
            •   Ones to watch
  9. Diversity & Flexibility
       •   “Science” covers a wide variety of activity
       •   Each of these is served by its own industry
       •   Improvements in each area need to happen at their own pace
       •   Things change
            •   Different techniques
            •   New data types
            •   Another R&D centre
            •   New devices for use in the lab
       •   The very essence of “Research” is to change the way you work
       •   How do we design an ELN which can accommodate these changes?
  10. Dealing with change
       •   Build on other projects & integrate
            •   If it can be done within another project, then do so
            •   Keeps your life simpler and more focused, with clear aims
            •   Those other projects can proceed according to the rhythm and needs of the specific area
       •   Where possible employ loose coupling between systems
            •   Message passing reduces implementation complexity
            •   SOAP/OLE/XML etc.
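
A minimal sketch of the message-passing idea, assuming a plain XML message handed from one system to another. The message type `experiment-recorded`, its fields and the sample values are all hypothetical, not taken from any real ELN. The point is that sender and receiver share only the message format, never each other's code.

```python
import xml.etree.ElementTree as ET


def build_message(experiment_id: str, author: str, summary: str) -> bytes:
    """Sender side: serialise an event as a small, self-describing XML message."""
    root = ET.Element("experiment-recorded")  # hypothetical message type
    ET.SubElement(root, "id").text = experiment_id
    ET.SubElement(root, "author").text = author
    ET.SubElement(root, "summary").text = summary
    return ET.tostring(root, encoding="utf-8")


def handle_message(payload: bytes) -> dict:
    """Receiver side: depends only on the message format, not on the sender."""
    root = ET.fromstring(payload)
    # Read only the fields this system cares about; extra fields are ignored,
    # so the sender can add new ones without breaking the receiver.
    return {
        "type": root.tag,
        "id": root.findtext("id"),
        "author": root.findtext("author"),
    }


# The bytes could travel by file drop, email, HTTP or a queue; the code is the same.
message = build_message("EXP-0001", "jsmith", "Recrystallisation, batch 3")
received = handle_message(message)
```

Because the receiver ignores fields it does not know, either side can change at its own pace, which is the property the slide is after.
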
  11. Loosely-Coupled Systems Keep You Sane
  12. Project size/Justification/ROI
       •   Two approaches
       •   Either attempt to justify the whole ELN in one go (“Big bang”)
       •   Or Phased
            •   Divide the project into phases
            •   Each involves a smaller investment (risk)
            •   With a corresponding payoff
            •   Move forward at a pace that’s comfortable for the business
  13. Phased ELNs
       •   Historically this was very difficult to do with ELNs
            •   Record keeping
            •   Integration with other systems
       •   Needs to be designed into the project (& product) from the start
            •   Patent evidence creation/preservation system
            •   Generic science-neutral platform (can often be your existing IT infrastructure)
            •   Integrate/collaborate with discipline-specific software
       •   When you can do it, it makes a huge difference
            •   Can start at a departmental level if needed
            •   Asking the business to take a small risk each time
  14. Creating & Preserving Evidence for Patents
       •   Specialized area with very specific (and unique) considerations
       •   Best done separately from science-specific ELN tools
            •   Hard to reconcile requirements of science and records in one system
            •   You’ll often have a number of science-focused systems, yet want only one Patent evidence system
            •   Run by a small group of people who know they’ll end up in court
            •   Reduce risks & discovery costs
       •   You can have an “Electronic” notebook for the scientist and still create a paper record
  15. Paper or Electronic?
       •   The choice often comes down to
            •   Comfort
            •   Practicality
            •   Cost
       [Chart: cost of a paper system compared with an electronic system; axis marked 10, 100, 500, 1000]
  16. Long term access to ELN content
       •   Partly this is a records management issue
       •   But there’s a heavy technical component
            •   What format you store your data in
            •   How you store your data
            •   Metadata
       •   You need to make Open Data formats part of your purchasing requirements
  17. “Good” (open) file formats
       •   Publicly documented
       •   Legally unencumbered
            •   No patents, copyright concerns etc.
            •   Any patents or copyright must be in the public domain
       •   Ideally, self documenting (XML is a good start)
       •   Degrade gracefully
            •   If you can’t read the data, at least you can see a picture
       •   Based on more open, primitive formats where possible
       •   At least two implementations of readers, one of which is Open Source
       •   Widely used (W3C or IETF standards are good signs)
  18. Data formats for the long term
       •   Good
            •   For text: Plain ASCII, Unicode, HTML, possibly RTF
            •   For graphics: PNG, SVG
            •   For structured data: XML
            •   To preserve appearance: PDF
       •   Worry about
            •   Storing files in databases
                 •   The database file format is probably undocumented
                 •   Store objects on the file system and use the database to point to them
            •   Anything that is proprietary - there’s no excuse for it, and it dramatically increases your risk
            •   Binary files generally
            •   Mixing content in files (e.g. embedding XML in PDF)
            •   Proprietary digital signatures
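
One way to read “store objects on the file system and use the database to point to them” is sketched below, using SQLite purely as a stand-in for whatever database is in use. The table layout, the file name and the SHA-256 checksum column are assumptions added for illustration; the slide only asks for a pointer.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def store_object(db: sqlite3.Connection, store_dir: Path, name: str, data: bytes) -> str:
    """Write the bytes to an ordinary file; record only path and checksum in the database."""
    digest = hashlib.sha256(data).hexdigest()
    path = store_dir / name
    path.write_bytes(data)
    db.execute(
        "INSERT INTO objects (name, path, sha256) VALUES (?, ?, ?)",
        (name, str(path), digest),
    )
    db.commit()
    return digest


def load_object(db: sqlite3.Connection, name: str) -> bytes:
    """Follow the database pointer to the file and check it is unchanged."""
    row = db.execute(
        "SELECT path, sha256 FROM objects WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    path, expected = row
    data = Path(path).read_bytes()
    if hashlib.sha256(data).hexdigest() != expected:
        raise ValueError(f"checksum mismatch for {name}")
    return data


store_dir = Path(tempfile.mkdtemp())  # stands in for a managed file store
db = sqlite3.connect(":memory:")      # stands in for the production database
db.execute("CREATE TABLE objects (name TEXT PRIMARY KEY, path TEXT, sha256 TEXT)")
store_object(db, store_dir, "spectrum-001.xml", b"<spectrum/>")
restored = load_object(db, "spectrum-001.xml")
```

The record itself stays readable with any file tool even if the database is unavailable; the database only makes it easier to find.
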
  19. IP concerns & data formats
       •   Companies have always used Proprietary Data Formats as a competitive weapon
       •   Companies are waking up to the use of IP tools (licenses, patents, copyrights) to reinforce their control over data formats
       •   Just because a format is published doesn’t mean it is open
       •   The Microsoft Office XML formats are a particularly bad example
            •   Right now it looks positively radioactive
            •   They’re being very careful what they say which indicates to me they’re planning something
            •   http://www.groklaw.net/article.php?story=20050330133833843
            •   (see section: 4. Dissecting Microsoft’s “Patent License”)
  20. Standards
       •   There are so many to choose from!
       •   Two key ways of generating “Standards”
            •   De Facto - dominant supplier/format
            •   De Jure - committee based
       •   Who gets to “bless” a standard?
       •   What makes a “good standard”?
            •   De Jure process has difficulty keeping up with the real world
            •   De Facto process has risk of lock-in
       •   Pragmatic approach
            •   Expect your suppliers to use open file formats
            •   If there is an acceptable standard, use it
            •   Make sure you are using the right kind of format for each purpose
  21. Records considerations
       •   Not all the “Stuff” that’s generated during the research process is the same
            •   Some of it needs to be kept for a long time
            •   Some is only useful for the moment
            •   Some will benefit anyone
            •   Some is only really useful for the person who created it (using specialized tools)
            •   Some material is suitable for long term preservation, some isn’t
       •   You can go crazy getting into this in too much detail
       •   But you also need to make sure your tools and processes do allow you to manage the data/records you’re creating
  22. Scalability
       •   Geographical space
            •   In wide area networks, latency becomes the most noticeable issue
            •   Over multiple timezones, acceptable “Maintenance Windows” disappear
       •   More data
            •   Number of data items
            •   Size of individual data items
       •   Number of users
            •   Larger populations generally mean more disparate requirements
            •   How many people will get upset if the system goes down
  23. Latency
       •   The science-specific “Deep” systems
            •   Often highly interactive
                 •   Lots of round trips to the server for data etc.
                 •   This is what makes them cool
            •   You can’t beat the speed of light (and network hardware adds significant latency)
            •   Therefore need to have a server close to the end user
            •   Federation will give you a single overview
       •   “Broad” systems have different usage characteristics
            •   Very much like a normal web site, latency is much less of a problem
            •   Very easy to have one system for worldwide use, even for large companies
            •   Building large systems is quite easy
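
A back-of-envelope sketch of why round trips hurt over distance. It assumes signals travel at roughly 200,000 km/s in optical fibre (about two thirds of the speed of light in vacuum) and ignores the extra delay from network hardware, so real figures will be worse. The distances and the count of 100 round trips are made-up illustrations.

```python
SPEED_IN_FIBRE_KM_PER_S = 200_000  # approximate; about 2/3 of c in vacuum


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on one client-server round trip, propagation delay only."""
    return 2 * distance_km / SPEED_IN_FIBRE_KM_PER_S * 1000


def min_wait_s(distance_km: float, round_trips: int) -> float:
    """Lower bound on total waiting time when the round trips happen one after another."""
    return round_trips * min_round_trip_ms(distance_km) / 1000


# An interactive "deep" screen that needs 100 sequential round trips:
nearby = min_wait_s(distance_km=50, round_trips=100)     # server on the same site
remote = min_wait_s(distance_km=8_000, round_trips=100)  # server on another continent
```

With these assumptions the nearby server costs about 0.05 seconds of pure propagation delay and the remote one about 8 seconds, which is the case for putting a server close to the end user.
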
  24. Web-based systems
       •   “Web based” has become a bit of a marketing tool
            •   Generally thin clients offer a lower TCO
            •   And hence IT like them
       •   In practice, most science-supporting ELN front ends will be delivered as a “thick” client
            •   There’s a reason it’s called a browser
            •   Wrapping an OLE object in IE is still “thick”
       •   However, “Ajax” systems like GMail and Google Maps show just what you can do with a web-based system
       •   Web based systems should expose a sensible URL interface
  25. How your network can help you
       •   There’s a whole load of useful network services and Interfaces that large companies have
       •   Useful ones
            •   Single Sign On
            •   LDAP
            •   Printer/Fileserver etc.
            •   Security/Status monitoring etc.
       •   Beware of Central Digital Signature Infrastructure
            •   Mixing vulnerabilities - leaves you open to accidents
            •   Often not designed for long term use
  26. ELN architecture
       •   Major issues
            •   Diversity & Flexibility
            •   Project size/Justification/ROI
            •   Creating & Preserving Evidence for Patents
            •   Need for long term access to ELN contents
            •   Scale
            •   Web-based systems
       •   Trends
            •   Integration methods
            •   Open Source
            •   In the lab
            •   Ones to watch
  27. Integration methods
       •   RPC-like mechanisms
            •   Service Oriented Architecture
            •   SOAP
            •   REST
       •   Text file passing (files, email, etc.)
       •   URL launching
            •   Often overlooked, but very powerful
       •   What’s important
            •   Loose-coupling
            •   Open, lightweight systems
            •   Consistent, stable keys
            •   Stable URL (& domain) space
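
URL launching with stable keys might look something like the sketch below. The base URL `https://eln.example.com/records/` and the key `NB-2005/0042` are hypothetical; the idea is only that any other system can link to a record by pasting its key into a fixed, documented URL pattern.

```python
from urllib.parse import quote, unquote, urlsplit

BASE_URL = "https://eln.example.com/records/"  # hypothetical stable URL space


def record_url(record_key: str) -> str:
    """Build the link for a record from its stable key alone."""
    # Percent-encode everything, including "/", so the key stays one path segment.
    return BASE_URL + quote(record_key, safe="")


def key_from_url(url: str) -> str:
    """Recover the key from a link, e.g. one found in an email or a LIMS field."""
    last_segment = urlsplit(url).path.rsplit("/", 1)[-1]
    return unquote(last_segment)


url = record_url("NB-2005/0042")
key = key_from_url(url)
```

A calling system needs nothing more than the key and the pattern; on a desktop, the standard library's `webbrowser.open(url)` is enough to launch the record. The scheme only keeps working if both the keys and the domain stay put, which is why the slide lists them together.
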
  28. Open Source
       •   Definitely one to watch
       •   Not the “Free” lunch you might think, but a pragmatic business tool
       •   Examples
            •   Linux
            •   Postgres
            •   JBoss, Tomcat etc.
            •   Ghostscript
       •   Open Source is part of everyone’s infrastructure
       •   Make sure you can run your systems on a variety of platforms
  29. Why?
       •   Good for records
            •   Gives you top-to-bottom control
       •   Good for TCO
            •   We’re finding the Open Source infrastructure easier to set up and more reliable than proprietary alternatives
       •   Enables a better solution
            •   Transparent systems mean you can do things the original designers didn't think of
            •   This is especially important for ELNs
  30. Data point
       •   This is just our experience offering people alternatives for the server portion
       •   2000 - “What's Open Source? What’s Linux?”
       •   2001 - No way!
       •   2002 - some pilots underway, some acceptance
       •   2003 - majority of installations are Open Source infrastructure
       •   2005 - we’re wondering where Windows is
       •   We’re not abandoning proprietary infrastructure
       •   But it is clear that Open Source is getting serious consideration
       •   Seeing a migration away from proprietary infrastructure to Open Source
  31. In the lab
       •   ELN use in the lab is a hard problem
       •   Tablets, Laptops, Palmtops etc. don’t seem to be working
       •   What does seem to work
            •   Small form-factor PCs on the bench
            •   Remote Desktop & Citrix
  32. Ones to watch
       •   Technology
            •   XML generally
            •   Web Services
            •   Bluetooth and WiFi
            •   RSS
            •   OpenOffice
            •   Jabber (as computer messaging and IM framework)
       •   Trends
            •   File format nasties
            •   DMCA and other copyright legislation