CrossRef 2010 Annual Member Meeting - London
CrossRef Annual Meeting – London
Workshops
15 November 2010

System Update 2010 CrossRef Workshops Chuck Koscher

Transcript of "System Update 2010 CrossRef Workshops Chuck Koscher"

  1. CrossRef Annual Meeting – London: Workshops, 15 November 2010
  2. Workshops Agenda
       9:30-10:00   Coffee & Tea
       10:00-11:30  System Update: Andrew Gilmartin, Senior Software Developer; Chuck Koscher, Director of Technology
       11:30-12:00  CrossMark: Geoff Bilder, Director of Strategic Initiatives
       12:00-12:30  CrossCheck: Kirsty Meddings, Product Manager
       12:30-1:15   Lunch
       1:15-2:15    Metadata Quality: Patricia Feeney, Product Support Manager
       2:15-2:45    Cited-by Linking: Carol Anne Meyer, Business Development and Marketing Manager; Chuck Koscher
       2:45-3:00    Break
       3:00-4:00    DOI Workflow Issues, Working with Vendors: Carol Anne Meyer
       4:00-4:45    Boot Camp: Carol Anne Meyer; Tim Pickard, System Support Analyst/Administrator
       4:45-5:15    Books: Carol Anne Meyer,
  3. System Update: System status; Rewrite review; Rewrite implementation; Discussion
  4. System status
  5. System status
  8. Old system; New Q system; The switch
  9. System status
      • Deposit processing
          - Suspended for 2+ weekends for Oracle DB upgrade (to 11g)
          - Processing times remain the same (50% under 5 min, 30% more under 1 hour)
          - Large re-deposits (Elsevier plans for 2011)
          - Schema relatively unchanged in 2+ years (keep adding MIME types)
      • Deposit focus areas for 2011 (other than the re-write)
          - Investigating a PDF upload option (for depositing a DOI and the article’s references)
          - Modify WebDeposit to allow users to edit an existing DOI’s metadata
          - Maintenance on NLM DTD deposit tool
  12. System rewrite
      • The Query System (QS), where are we?
          - It's taking longer than we thought.
          - QS is 99% ready, periodically in service since starting mid Sept.
          - Last vexing problem solved (database connection deadlock)?
          - Performance improvement is very encouraging.
          - Metrics and measurement capability greatly improved.
      • The Deposit System (DS), where are we?
          - Initial design discussions have been held, documentation is under way.
          - Implementation to start in January
          - Development will take until mid year, then lots of testing
          - Data clean up will be part of the migration process (mainly titles)
  13. System rewrite: Rewrite 2 Working Group – Final report November 2008
      ⋅ Modularity of design
      ⋅ Utility of APIs where possible
      ⋅ Data stores that enable XML capabilities
      ⋅ Minimize dependency on proprietary systems
      • That CrossRef should ultimately own the intellectual property in the software at the heart of its operations
      • That CrossRef should not risk or jeopardize the reliability and throughput offered by the existing system
      • That CrossRef should remain free to develop further applications for other purposes which need to interface to the reference-linking systems and/or its data
  14. System rewrite: Rewrite 2 Working Group – Final report November 2008
      Item markers on slide: A, O, F. Legend on slide: Now, Soon, Later.
      A Solve NFS issue
      A Federate architecture
      A Database redesign
      A Redesign event notification model (replace email)
      O Improved title management and control
      O Better publisher/member management model
      O Daily testing/monitoring (data integrity)
      O Built in health and status monitoring
      O Performance improvements and queue management
      O Unit testing (regression testing)
      O Scriptable data ingestion work flow
      F Richer metadata querying capability
      F Integrated data harvesting capabilities
      F Dealing with references using other character sets
      F Crawling of content to ingest it vs. making deposits
      F Depositing of non-journal content
      F Matching unstructured references using full text of equiv
      F Querying of non-journal content
      F Real time, cited-by queries - with data-driven APIs
      F More content types, including language variants
      F More granular typing of journal articles
      F Improved reporting facilities
      F More useful user interface for members
  15. System rewrite
      • Technical Objectives
          - Rework a 9-year-old system
          - Address a declining performance situation
          - Improve administrative aspects (better control and reporting)
          - Facilitate extensibility
          - Staff better able to respond thanks to operational insight
      • Business Objectives
          - Develop internal capabilities ($ for every change Atypon makes)
          - Secure an independent path (continuity)
          - Benefit of being on a ‘shared’ platform nearing zero
          - Maintain access to technical expertise
  16. System rewrite: late 2010 through mid 2011 (diagram labels)
      • HTTP Traffic; HAProxy
      • FrontEnd QS (Spring) (Tomcat)
      • MySQL; Lucene; BerkeleyDB
      • Deposit System (old Atypon EDS)
      • BackEnd Services; ActiveMQ (messaging)
      • Oracle Group: Oracle (prime); Oracle (active-standby); Constant Replication
      • New System
      • External messaging (email, etc.)
  17. System rewrite: Q3 2011 (diagram labels)
      • HTTP Traffic; HAProxy
      • FrontEnd QS (Spring) (Tomcat)
      • FrontEnd DS (Spring) (Tomcat): File Upload; Deposit reports
      • Deposit Processing
      • MySQL; Lucene; BerkeleyDB
      • BackEnd Services; ActiveMQ (messaging)
      • Oracle Group: Oracle (prime); Oracle (active-standby); Constant Replication
      • New System
      • External messaging (email, etc.)
  18. System rewrite (diagram labels, in transcript order): Deposit DB (prime); Oracle Group; Deposit DB (standby); Oracle Replication; Query DB (prime); Query DB (secondary); Oracle Replication; New Deposit System; Database Updater; Primary Datacenter; Deposit DB (prime); Query DB (prime); Recovery Datacenter
  19. System rewrite
      • Query system feature changes
          - Tweaks to the matching logic (discoveries made porting the code)
          - Fixed some nagging characteristics
          - Aggregate email notices for alerts
          - Implement HTTP free-text matching (still needs work, ‘alpha’)
          - Process free-text references for cited-by (done, stable, uses refXpress)
          - Establish better user model:
              1. Username & passwords for members (Query and deposit)
              2. Registered email address of non members (Query only)
      • Diagram labels: Use Registration Form; Receive Email; Use Validation Form
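The "aggregate email notices for alerts" item on slide 19 could look roughly like the sketch below. It is an illustration only, not the QS implementation: the function name `aggregate_notices`, the `(recipient, message)` alert shape, and the digest wording are all assumptions.

```python
from collections import defaultdict


def aggregate_notices(alerts: list) -> dict:
    """Group (recipient, message) alerts into one digest body per recipient."""
    grouped = defaultdict(list)
    for recipient, message in alerts:
        grouped[recipient].append(message)
    # One notice per recipient instead of one email per alert.
    return {
        recipient: f"{len(messages)} alert(s):\n" + "\n".join(f"- {m}" for m in messages)
        for recipient, messages in grouped.items()
    }
```

Sending each digest is left out on purpose; the point is only that batching happens before any mail is generated.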
  21. System rewrite: Simple Text Query
  22. Uses refXpress to break free-text into XML suitable for running a metadata query
  23. Uses QS Formatted Citation Parse to break free-text into XML suitable for running a metadata query; if that fails, uses QS Formatted Citation Search (with high threshold) to search the Lucene index for a DOI.
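The parse-then-search fallback on slide 23 can be sketched as follows. This is a minimal illustration under stated assumptions, not CrossRef's code: `parse_citation` and `search_index` are hypothetical stand-ins for QS Formatted Citation Parse and QS Formatted Citation Search, and the 0.9 default is an invented placeholder for the slide's "high threshold".

```python
from typing import Callable, Optional

# Hypothetical interfaces: a parser returns structured fields, or None on failure;
# a searcher returns a list of (doi, score) candidates from a full-text index.
Parser = Callable[[str], Optional[dict]]
Searcher = Callable[[str], list]


def resolve_free_text(citation: str, parse_citation: Parser,
                      search_index: Searcher, threshold: float = 0.9) -> dict:
    """Try structured parsing first; fall back to a thresholded index search."""
    fields = parse_citation(citation)
    if fields is not None:
        # Parsing succeeded: hand the fields to an ordinary metadata query.
        return {"route": "metadata_query", "fields": fields}
    # Parsing failed: accept only the best hit, and only above the threshold.
    hits = sorted(search_index(citation), key=lambda hit: hit[1], reverse=True)
    if hits and hits[0][1] >= threshold:
        return {"route": "index_search", "doi": hits[0][0]}
    return {"route": "unresolved"}
```

Keeping the threshold high on the fallback path trades recall for precision: an unresolved citation is cheaper than a wrong DOI.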
  24. But be careful!

      Citation:

        <citation key="b53_366">
          <unstructured_citation>
            53. O.S. Gudmundsson, S.D.S. Jois, D.G. Vander Velde, T.J. Siahaan, B. Wang, and R.T. Borchardt (1999 ) The effect of conformation on the membrane permeability of coumarinic acid- and phenylpropionic acid-based cyclic prodrugs of opioid peptides.J. Pept. Res.53 , 383 -392 .
          </unstructured_citation>
        </citation>

      Record 1:

        <doi type="journal_article">10.1034/j.1399-3011.1999.00076.x</doi>
        <issn type="print">1397-002X</issn>
        <issn type="electronic">1399-3011</issn>
        <journal_title>Journal of Peptide Research</journal_title>
        <contributors>
          <contributor sequence="first" contributor_role="author">
            <given_name>O.S.</given_name>
            <surname>Gudmundsson</surname>
          </contributor>
        </contributors>
        <volume>53</volume>
        <issue>4</issue>
        <first_page>383</first_page>
        <last_page>392</last_page>
        <year media_type="print">1999</year>
        <publication_type>full_text</publication_type>
        <article_title>The effect of conformation on the membrane permeation of coumarinic acid- and phenylpropionic acid-based cyclic prodrugs of opioid peptides</article_title>

      Record 2:

        <doi type="journal_article">10.1034/j.1399-3011.1999.00077.x</doi>
        <issn type="print">1397-002X</issn>
        <issn type="electronic">1399-3011</issn>
        <journal_title>Journal of Peptide Research</journal_title>
        <contributors>
          <contributor sequence="first" contributor_role="author">
            <given_name>O.S.</given_name>
            <surname>Gudmundsson</surname>
          </contributor>
        </contributors>
        <volume>53</volume>
        <issue>4</issue>
        <first_page>403</first_page>
        <last_page>413</last_page>
        <year media_type="print">1999</year>
        <publication_type>full_text</publication_type>
        <article_title>The effect of conformation of the acyloxyalkoxy-based cyclic prodrugs of opioid peptides on their membrane permeability</article_title>

      Slide callouts: "Still yields this" and "But the correct answer is this".
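The two records on slide 24 share first author, journal, volume, issue, and year, so those fields cannot separate them; only the page range and the title wording differ. The sketch below shows one way to break such a tie, using the citation's first page and a simple token-overlap (Jaccard) title score. Both the scoring method and the helper names are assumptions for illustration, not the matching logic QS actually uses.

```python
import re


def tokens(text: str) -> set:
    """Lowercase word tokens; hyphens and punctuation act as separators."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two strings, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def pick_candidate(cited_title: str, cited_first_page: str, candidates: list) -> dict:
    """Prefer a first-page match; use title overlap as the secondary key."""
    return max(
        candidates,
        key=lambda c: (c["first_page"] == cited_first_page,
                       jaccard(cited_title, c["title"])),
    )


# Title and first page as they appear in the unstructured citation on the slide.
cited_title = ("The effect of conformation on the membrane permeability of "
               "coumarinic acid- and phenylpropionic acid-based cyclic prodrugs "
               "of opioid peptides")
candidates = [
    {"doi": "10.1034/j.1399-3011.1999.00076.x", "first_page": "383",
     "title": ("The effect of conformation on the membrane permeation of "
               "coumarinic acid- and phenylpropionic acid-based cyclic prodrugs "
               "of opioid peptides")},
    {"doi": "10.1034/j.1399-3011.1999.00077.x", "first_page": "403",
     "title": ("The effect of conformation of the acyloxyalkoxy-based cyclic "
               "prodrugs of opioid peptides on their membrane permeability")},
]
best = pick_candidate(cited_title, "383", candidates)
```

With the slide's data, `best` is the record whose first page agrees with the citation (383). Both titles share most of their words with the cited title, which is why a title score alone is a weak discriminator here.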
  25. System rewrite
      • Deposit system feature changes
          - Parse the XML prior to accepting the upload
          - Process XML, register DOIs regardless of metadata ingestion problems
          - Provide aggregated deposit reports (daily?)
          - Integrate Schematron checks into deposit process
          - Robust title ownership model, not based on prefix, with shared ownership options
          - Separate deposit metadata organization from query metadata organization (ex. Allow title substitution
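"Parse the XML prior to accepting the upload" can be illustrated with Python's standard library. This is a minimal sketch that assumes the pre-check is a well-formedness test only; schema and Schematron validation would be separate, later steps. The function name `accept_upload` and the message wording are hypothetical.

```python
import xml.etree.ElementTree as ET


def accept_upload(payload: bytes) -> tuple:
    """Return (accepted, message); reject anything that is not well-formed XML."""
    try:
        root = ET.fromstring(payload)
    except ET.ParseError as err:
        # Report the parser's position so the depositor can fix the file.
        line, column = err.position
        return False, f"rejected: not well-formed at line {line}, column {column}"
    return True, f"accepted: root element <{root.tag}>"
```

Rejecting at upload time gives the depositor immediate feedback instead of a failure report after the file has been queued.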
  26. Andrew