NISO Webinar:
Cloud and Web Services for Librarians
Wednesday, October 14, 2015
Presenters:
John “JG” Chirapurath,
Senior Vice President and General Manager, ProQuest Workflow Solutions
Kurt Ewoldsen,
Manager, Infrastructure and Applications Support, California Digital Library,
University of California
Heather Lea Moulaison,
Assistant Professor, The iSchool (School of Information Science & Learning
Technologies), University of Missouri
http://www.niso.org/news/events/2015/webinars/cloud_services/
Utilizing the Cloud to
Empower Research Efforts
John "JG" Chirapurath, Senior Vice President and
General Manager, ProQuest Workflow Solutions
Growing Content Volume & Diversity
Print + Electronic + Digital
Derived from US Department of Education, NCES Academic Libraries Survey, 1998-2008.
Electronic Resources are the
Majority of Content in Collections
[Chart: Academic Library Expenditures on Purchased and Licensed Content. Horizontal axis: 1998 to 2020; vertical axis: 0% to 90%. Series shown: Print Books and Journals. Annotations: "You are here" and "Projected change".]
Cloud-based Systems Benefit
Libraries
• Cloud technologies help libraries avoid spending
limited budgets on IT expenses and system
management
– Reduction in hardware costs, decrease in IT support
time and expense, more easily and rapidly updated
systems
– Provides a full disaster recovery system that libraries
don’t have to administer
• Reallocate budget to additional premium content
that benefits researchers
• Use librarians’ time to share experience and
insights with researchers, which brings better value
Cloud-based Systems Support
Scalability & Researchers’ Access Needs
• Cloud-based systems’ scalability inherently
manages electronic resources better - key with
expanding volumes of electronic and digital
content
• Better enable collaboration between librarians
and researchers in multiple
locations
• Support research anytime,
anywhere, on any device
Research & Publishing Has Changed
• Emphasis on more frequent publishing in
academia
– Researchers want to share near real-time
insights gained from open access information
resources
• Focus on research as a competitive
advantage
• Trend of interdisciplinary research and
collaboration
Cloud-based Systems Improve
Discoverability & Collaboration
• Easier to find and share information and
resources – content discoverable more
rapidly
– Tools like the Summon® service make new
content discoverable more rapidly
• 2.5 billion+ records covering more than 90 different
content types and more than 10,000 providers
• Better supports multi-location collaboration &
access on any device
– RefWorks and other reference management
solutions make it easier for researchers to share
and collaborate
Next Generation LSPs Offer
Comprehensive Resource Management
• Supporting management of print & electronic
resources across the entire collection
– Unified workflows
• Assess, manage and track resources across
multiple locations and geographic boundaries
• Ability to share e-resources and information for
databases and holdings across different
locations and types of libraries
• Better supports evolving role of librarians
And in the Future: Linked Data
• Cloud-based technologies are what make
Linked Data possible
• Allows libraries to expose relationships among
entities that users can easily follow
• Improves user navigation between
related resources, concepts, and
entities – more serendipitous
discovery
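As a rough illustration of what "relationships among entities that users can easily follow" might look like in data, the sketch below stores a few subject-predicate-object statements as plain Python tuples and walks outward from one entity. The identifiers (`ex:book1`, `ex:author1`, and so on) and the predicate names are invented for this example; a real deployment would more likely use full URIs and an RDF toolkit than this simplified stand-in.

```python
from collections import deque

# Hypothetical statements (subject, predicate, object); all identifiers are made up.
TRIPLES = [
    ("ex:book1", "creator", "ex:author1"),
    ("ex:book2", "creator", "ex:author1"),
    ("ex:book1", "subject", "ex:topic_cloud"),
    ("ex:article1", "subject", "ex:topic_cloud"),
]


def neighbors(entity):
    """Return entities directly linked to `entity`, in either direction."""
    linked = set()
    for subj, _pred, obj in TRIPLES:
        if subj == entity:
            linked.add(obj)
        elif obj == entity:
            linked.add(subj)
    return linked


def related(entity, max_hops=2):
    """Breadth-first walk: everything reachable within `max_hops` links."""
    seen = {entity}
    queue = deque([(entity, 0)])
    while queue:
        current, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbors(current):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    seen.discard(entity)
    return seen


if __name__ == "__main__":
    print(sorted(related("ex:book1")))
```

Starting from one book, two hops reach another book by the same author and an article on the same topic, which is the kind of serendipitous navigation the slide describes.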
Library => Virtual Knowledge Center
• More diverse content types and growing body of
knowledge accessible through the library
• Researchers accessing information more fluidly
– More seamless collaboration across locations
– Faster insights leading to interdisciplinary
breakthroughs
• Librarians as value-added information
professionals collaborating with researchers at
later stages
Thank you!
Migrating CDL IT Infrastructure
to the AWS Cloud
Presented by:
Kurt Ewoldsen
Director of Infrastructure & Application Support
Kurt.Ewoldsen@ucop.edu
www.cdlib.org
The California Digital Library (CDL)
• Mission: The California Digital Library exists to support the
University of California community’s pursuit of scholarship and to
extend the University’s public service mission.
• Vision: The California Digital Library’s vision is to elevate the digital
library for UC so that it becomes "expansively global and deeply
local". CDL will advance the digital transition of scholarly
information in three spheres:
– Access: Scholars will have access to the highest quality research
collections worldwide through services that support and enable new
scholarship and make it as open as possible.
– Formats: CDL will support all digital formats throughout their life cycle
with a full range of services, especially to surface UC’s unique digital
assets and collections.
– Scale: Through partnerships and alliances, CDL will elevate services to
the network level for maximum impact.
10/1/2015 Migrating CDL Infrastructure to AWS - NISO 14
CDL Infrastructure Timeline
• Past
– Environments across 2 data centers
– Migration to VMware environment underway
– ~90 VMs & ~45 physical systems (Sun/Solaris)
– ~200TB of SAN storage
– 2 small AWS accounts for grant-related work
• Present
– Environments across 3 data centers
– Final Sun/Solaris systems retired; migration to AWS underway
– ~30 VMs & ~200TB of SAN storage
– AWS account with ~100 EC2 instances & ~50 RDS instances
• Future
– Environments in a single platform (AWS)
– VMware environment retired; no more physical infrastructure, all
equipment decommissioned
– AWS account with ~150 EC2 instances & ~60 RDS instances
CDL Services in the Cloud
• Customer Services
– UC Libraries web site
– CDL web site
– Calisphere
– eScholarship
– EZID
– Online Archive of California
– Request (ILL)
– UC Library Reprints
– HathiTrust Zephir
• Infrastructure Services
– Nagios
– Tripwire
– LDAP
– Puppet Enterprise
– FTPS/SFTP
– NFS Home Server
– Bastion Servers
Why use the Cloud?
Cost
Scalability
Agility
Think Best Value, Not Lowest Cost
• Business case for AWS migration not based on
decreased annual cost (although savings are
expected)
• AWS platform provides significant benefits for the
same spend: HA, DR, managed services (RDS),
and more
• Biggest cost savings is actually cost avoidance
– CDL was due to spend ~$750,000 to refresh
infrastructure equipment over the next 2 years; this is
now unnecessary
Vendor Management Cost Savings
• To support our physical infrastructure, we had
annual maintenance contracts with 10+
different vendors, totaling ~$60,000/year
• To support our AWS infrastructure, we have a
monthly maintenance fee with a single
vendor, with an annual cost of ~$30,000
• In addition to these direct cost savings, we
regain a significant amount of time formerly
spent on vendor/contract management
Cost Reporting
• In the typical shared or virtual environment, it is
difficult to determine the actual infrastructure
costs for any given service
• AWS has cost reporting down to a science, and I
now know in detail the infrastructure costs of
every service we support, and can share that
information with CDL managers and staff
• Good cost information leads to good decisions
regarding service development & deployment
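To make the idea of per-service cost reporting concrete, here is a minimal sketch that rolls up billing line items by a `service` tag. The records, the tag key, and the dollar amounts are invented for illustration; actual AWS billing data has a different and much richer layout, and nothing here is an AWS API.

```python
from collections import defaultdict

# Hypothetical billing line items; service names come from the talk, amounts are invented.
LINE_ITEMS = [
    {"resource": "ec2-1", "tags": {"service": "eScholarship"}, "cost": 120},
    {"resource": "rds-1", "tags": {"service": "eScholarship"}, "cost": 80},
    {"resource": "ec2-2", "tags": {"service": "EZID"}, "cost": 45},
    {"resource": "ec2-3", "tags": {}, "cost": 10},  # untagged resource
]


def cost_by_tag(items, tag_key):
    """Sum cost per value of `tag_key`; untagged items are grouped separately."""
    totals = defaultdict(int)
    for item in items:
        label = item["tags"].get(tag_key, "(untagged)")
        totals[label] += item["cost"]
    return dict(totals)


if __name__ == "__main__":
    for service, total in sorted(cost_by_tag(LINE_ITEMS, "service").items()):
        print(f"{service}: ${total}")
```

Keeping an explicit "(untagged)" bucket is a deliberate choice in this sketch: spend that cannot be attributed to a service stays visible instead of silently disappearing from the report.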
Current AWS Costs
Cost Reporting By Tags - Monthly
Cost Reporting By Tags - Daily
Agility & Scalability
• Our annual budget cycle requires that we forecast
infrastructure needs 18 months in advance, sometimes
more
• However, we frequently get last minute requests for
extraordinary service: significant amounts of data to
archive or processing to complete
• In our legacy environment, these requests were almost
impossible to accept
• In AWS, we have virtually unlimited capacity (provided
someone is willing to pay for it!)
• Conversely, since we have no long-term investment in
infrastructure, we can also scale down to ensure our costs
decrease if service volume or use decreases
Cost > Innovation > Lower Cost
• Application Profile: An existing service that gets an occasional data upload that
must be ingested and indexed for discovery.
• Legacy Configuration: Single, large shared system running multiple applications at
the same time.
• 1st AWS Iteration: Application running on a large instance, waiting for data to be
uploaded and then processing the data. The instance mostly
sits idle, so not a very cost-effective solution.
• 2nd AWS Iteration: Watchdog process running a single, small instance waiting for
data to be uploaded. When data is detected, multiple large
instances are started to process the data, then shut down.
This decreases the processing duration and the cost at the
same time.
• 3rd AWS Iteration: Watchdog process running a single, small reserved instance
waiting for data to be uploaded. When data is detected,
multiple large instances are purchased on the “spot” market
and started to process the data, then shut down. Further
reduces costs.
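The cost effect of the three AWS iterations can be sketched with simple arithmetic. Every number below is an assumption made for illustration: the hourly prices are not actual AWS rates, and the workload (4 uploads a month, each processed by 4 large instances for 2 hours) is invented.

```python
HOURS_PER_MONTH = 730  # approximate hours in a month

# Hypothetical hourly prices in dollars; not actual AWS rates.
LARGE_ON_DEMAND = 0.40
SMALL_ON_DEMAND = 0.02
SMALL_RESERVED = 0.012
LARGE_SPOT = 0.10


def always_on(rate):
    """1st iteration: one large instance running all month."""
    return rate * HOURS_PER_MONTH


def watchdog_plus_burst(watchdog_rate, worker_rate, workers, hours_per_job, jobs):
    """2nd/3rd iteration: small watchdog all month, workers only during jobs."""
    watchdog = watchdog_rate * HOURS_PER_MONTH
    processing = worker_rate * workers * hours_per_job * jobs
    return watchdog + processing


def monthly_costs(workers=4, hours_per_job=2, jobs=4):
    """Estimated monthly cost of each iteration under the assumed workload."""
    return {
        "1st": always_on(LARGE_ON_DEMAND),
        "2nd": watchdog_plus_burst(
            SMALL_ON_DEMAND, LARGE_ON_DEMAND, workers, hours_per_job, jobs
        ),
        "3rd": watchdog_plus_burst(
            SMALL_RESERVED, LARGE_SPOT, workers, hours_per_job, jobs
        ),
    }


if __name__ == "__main__":
    for name, cost in monthly_costs().items():
        print(f"{name} iteration: ${cost:.2f}/month")
```

Under these assumptions each iteration is cheaper than the one before it, because the large instances are billed only while data is being processed. The sketch ignores storage and data transfer charges, and it does not model the possibility that spot capacity is reclaimed while a job is running.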
Grant-funded Efforts
• AWS accounts are a good way to handle grant-
funded efforts that have to be delivered upon
completion (e.g. www.archivesspace.org)
• Simply create a separate AWS account, then
configure the environment and develop the
application
• Collaboration is easy, as you control access to the
environment
• At the end of the grant, simply sign the AWS
account over to the appropriate party (provide
the access keys and update the billing entity)
Challenges of the Cloud Migration
• Migrations are extra work
• Re-architecting applications for the new
platform is even more work
• The cloud provides more variable
performance
– between local and cloud environments
– between processing cycles in the cloud
More Challenges
• The rapid rate of change in AWS can make
staying current a challenge; in fact just
understanding all of the AWS services and
potential benefits is a difficult task
• Systems still hang, crash and reboot in the
cloud and because you don’t manage the
hardware directly, you have even less visibility
into the conditions behind these events
What’s Next
• Complete the migration (by July 2016)
• Move from On-demand to Reserved instances
• Move all appropriate services to a H/A architecture
• Create DR capability in an alternate region
• Evaluate OpsWorks to replace Puppet
• Provide performance & cost visibility to application
development teams
• CloudTrail notifications from CloudWatch
• Look at new databases (Aurora, Redshift, DynamoDB)
• Elastic Beanstalk pilot underway (IaaS > PaaS)
Thanks
• Special thanks to my system administration
team and CDL application development
teams, who accepted my vision for a better
future, adopted it as their own, and worked
tirelessly to turn the plan into reality. The
success we have achieved to date is due solely
to their dedication and professionalism.
Surveying the Horizon:
Preservation and the Cloud
October 14, 2015 NISO Webinar:
Cloud and Web Services for Librarians
Heather Lea Moulaison
iSchool at the University of Missouri
Rationale
• Preserving digital information is becoming an ever
more important role of libraries and archives in our
digital society
• Cloud computing and, more specifically, cloud-based
data storage offers some potential to help address the
issues information professionals face
• Yet, not everyone agrees that cloud computing is the best
solution
• By examining the problem, reviewing the literature, and
assessing best practices, it is possible to understand
some of the issues that need to be considered
Agenda
• What is digital preservation and why is it
important?
• Definitions
• Digital preservation challenges and opportunities
• What is cloud computing?
• Definitions
• Cloud computing challenges and opportunities
• Digital preservation and cloud computing
• Examples and case studies
• Benefits and risks
• Discussion: best practices and strategies
What is Digital Preservation?
• “Digital preservation combines policies, strategies and
actions to ensure access to reformatted and born
digital content regardless of the challenges of media
failure and technological change. The goal of digital
preservation is the accurate rendering of authenticated
content over time” (ALCTS, 2007).
• A number of other definitions exist
• All basically require information professionals to ensure
digital content is accessible/usable over a long period of
time.
• To do this requires the right people with the right
technology who have the support and vision they need.
Digital Preservation: Challenges
and Opportunities
• Challenges
• The nature of digital files
• DP is more complex than simply backing up a file
• DP requires additional technical maintenance
along with safeguarding context
• Threats include
• Technological
• Meaning-related (if metadata is lost, the context is
not clear)
• Budgetary support
• Opportunities
• Continued relevancy in digital age
• Ensure access to digital content for future
generations
• Fulfill legal requirements
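One common form of the "additional technical maintenance" listed under the challenges is a periodic fixity check: recomputing a file's checksum and comparing it with a value recorded earlier. The slides do not name this technique, so treat the sketch below as an illustrative assumption; the manifest layout (a mapping from file path to recorded SHA-256 digest) is hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest):
    """Return the paths whose current checksum differs from the recorded one.

    `manifest` maps file path -> checksum recorded at ingest (hypothetical layout).
    Files that no longer exist are reported as failures too.
    """
    failures = []
    for path, recorded in manifest.items():
        if not Path(path).is_file() or sha256_of(path) != recorded:
            failures.append(path)
    return failures
```

A backup alone would copy a silently corrupted file just as faithfully as an intact one; a check like this is what detects the corruption in the first place.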
What is Cloud Computing?
• “A model for enabling convenient, on-demand network access
to a shared pool of configurable computing resources (e.g.
networks, servers, storage, applications, and services) that can be
rapidly provisioned and released with minimal management
effort or service provider interaction” (Mell & Grance, 2011 p. 1).
• As with digital preservation, a number of definitions exist
• All basically assume that information is being stored remotely through
the internet.
• Example of a cloud service provider and its services: Amazon
(http://www.amazon.com/)
• Amazon Elastic Compute Cloud (EC2)
• Amazon Simple Storage Service (S3)
• Amazon Glacier
The NIST Cloud Computing Service
Models
http://www.servercloudcanada.com/2013/10/defining-the-cloud/
Cloud Computing: Challenges and
Opportunities
• Challenges
• Security in the cloud is a concern for many and can be perceived as
both a positive and a negative aspect of cloud computing
• Placing trust in a third party to store content can also be a positive
or a negative
• The uncertain future: it is possible, but not likely, that a cloud
computing company will go out of business, taking a library’s data
with it
• When data is stored in the cloud, the cloud provider has access to it
• Opportunities
• Perceived lowering of technical and financial barriers
• Increased flexibility to quickly meet increased demands, disaster
recovery, decreased software application and server maintenance,
decreased capital expenses, increased security, and a more
environmentally friendly operation (Salesforce, 2015)
• Geographic diversity of server locations helps!
Digital Preservation and Cloud
Computing: Turnkey Solutions
• Examples:
• Preservica Cloud Edition: SaaS model.
• DuraSpace’s DuraCloud: cloud storage from either Amazon or the
San Diego Supercomputer Center (Schumacher et al., 2014).
• Cloud computing is widely utilized in industry, but Srivastava
and Verma (2015) observe that the “Application of cloud
computing in libraries is a relatively new area as compared
to its applications in business and corporate sector” (p. 33).
• Some information professional points of view:
• There are at least some benefits to using cloud computing for
digital preservation
• Questions: Is cloud computing truly the best approach for digital
preservation? If so, what might be the caveats?
Research into Digital Preservation
and Cloud Computing
• National Digital Stewardship Alliance’s Infrastructure Working
Group survey (Bailey, 2012):
• 74% of the members had a strong preference for controlling their
own preservation storage systems because of cost concerns,
trustworthiness, legal issues and security and risk management.
• Researchers from the University of Cape Town (Poulo, Phiri, &
Suleman, 2014):
• showed that digital library applications in the cloud can provide
adequate response time and that the response time is not
significantly affected by complexity or collection sizes.
• But… is the elasticity the cloud provides worth the cost?
• “Cloud storage really is cheaper if your demand is spiky, but digital
preservation is the canonical base-load application” (Rosenthal,
2014)
• Miller (2014) points out, “some of the most compelling attributes of
the public cloud are best suited to ephemeral or (relatively!) short-
term use cases” (¶ 2).
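Rosenthal's contrast between spiky and base-load demand can be illustrated with a toy comparison. The sketch assumes that owned capacity must be provisioned for peak demand in every period, while cloud capacity is billed only for what is used but at a higher unit price. Both unit prices and both demand series are made up for the example.

```python
def owned_cost(demand, unit_price):
    """Owned hardware: pay for peak capacity in every period."""
    return max(demand) * unit_price * len(demand)


def cloud_cost(demand, unit_price):
    """Pay-per-use: pay only for the capacity consumed in each period."""
    return sum(demand) * unit_price


# Hypothetical unit prices per capacity unit per period.
OWNED_PRICE = 1
CLOUD_PRICE = 3

spiky = [1, 1, 1, 10, 1, 1]             # occasional burst
base_load = [10, 10, 10, 10, 10, 10]    # steady, preservation-style demand

if __name__ == "__main__":
    for name, demand in (("spiky", spiky), ("base load", base_load)):
        print(
            name,
            "owned:", owned_cost(demand, OWNED_PRICE),
            "cloud:", cloud_cost(demand, CLOUD_PRICE),
        )
```

With these invented numbers the cloud is cheaper for the spiky series and considerably more expensive for the steady one, which is the shape of the argument quoted above. It says nothing about where the real break-even point lies.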
Research on Costs
• Rosenthal and Vargas (2013) performed a study in
which they experimented with running a Lots of Copies
Keep Stuff Safe (LOCKSS) (http://www.lockss.org/) box
on Amazon’s EC2 cloud backed with Amazon S3 cloud
storage.
• Their study concluded “that current cloud storage services
are not cost-competitive with local hardware for long term
storage, including for LOCKSS boxes” (Rosenthal & Vargas,
2013, p. 107).
• Han (2015), after reviewing this and other studies, “believes
that the combination of big price drops and free data transfer
within the same data zone makes the cloud storage a very
attractive solution for long-term digital preservation,
especially using [Amazon] Glacier” (p. 266).
• Benefits of cloud computing (e.g. geographically-
diverse storage) mean it is not a one-to-one
comparison.
• As Fryer and Brown (2014) write, more study in this area is
Best Practices/Case Studies
• Central Connecticut State University, USA (Iglesias,
2011; Iglesias & Meesangnil, 2010) decided to use
Amazon S3 for storing their long-term archival
masters for digital preservation purposes.
• In evaluating their decision, they determined that the choice of
Amazon S3 for storage was a very good one for them:
costs were low and there was no downtime in the first
year of use (Iglesias & Meesangnil, 2010).
• The Parliamentary Archives, Houses of Parliament,
London, UK, adopted third-party cloud storage to
use with their Preservica Enterprise digital
preservation system (Fryer & Brown, 2014). In order
to minimize risk, the Parliamentary Archives decided
to:
• Keep sensitive data local
A Graceful Exit
• An exit strategy is important to consider when
choosing to use a cloud computing service.
• Exit strategies are “often neglected because few want to
consider the demise of what is, at the moment, a
seemingly wonderful solution, being adopted and
implemented with great effort and expectation”
(Schaffer, 2014, p. 4).
• But… how does the library get its data back (Robinson,
2015)?
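Part of answering the "how does the library get its data back" question is simply estimating how long a full retrieval would take. The sketch below is a back-of-the-envelope calculation; the collection size, link speed, and efficiency factor are hypothetical inputs, and it leaves out any retrieval fees or rate limits a provider might impose.

```python
def transfer_days(terabytes, gigabits_per_second, efficiency=1.0):
    """Days needed to move `terabytes` (decimal TB) over a network link.

    `efficiency` is the assumed fraction of nominal bandwidth actually achieved.
    """
    bits = terabytes * 1e12 * 8
    seconds = bits / (gigabits_per_second * 1e9 * efficiency)
    return seconds / 86_400


if __name__ == "__main__":
    # Hypothetical: 200 TB over a fully used 1 Gbit/s link.
    print(f"{transfer_days(200, 1):.1f} days")
```

Under those assumptions the transfer takes about 18.5 days, and longer at any realistic efficiency, so an exit plan needs to budget for time as well as money.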
Conclusion
• Digital preservation is not easy. Although there are many
technical threats to digital preservation, ultimately digital
preservation is not merely a technological challenge.
• Policies and procedures need to be in place to ensure ongoing
digital preservation.
• Cloud computing can be used successfully and
economically for digital preservation in the appropriate
situations.
• Platforms used for digital preservation should be routinely reviewed
and reevaluated.
• Digital preservation, in the cloud or locally, is not something
that can be done once and then forgotten about.
• Information professionals need to review both their digital
preservation strategy and the applicability of cloud computing as
part of that strategy on an ongoing basis.
Questions?
Thank you!
References
• ALCTS. (2007, 24 June). Definitions of Digital Preservation. Retrieved from http://www.ala.org/alcts/resources/preserv/defdigpres0408
• Anderson, M. (2011, 7 September). B is for Bit Preservation. The Signal. Retrieved from http://blogs.loc.gov/digitalpreservation/2011/09/b-is-for-bit-preservation/
• Corrado, E. M, & Moulaison, H. L. (2015, August 12). Digital preservation and the cloud: Challenges and opportunities. IFLA 2015 Pre-Conference Satellite Meeting Preservation &
Conservation Section, Durban, South Africa, 12-13 August, 2015.
• Fryer, C. & Brown, A. (2014). Case study: Archives in the cloud: Challenges and opportunities, In B. Endicott-Popovsky (Ed.), International Conference on Cloud Security Management
ICCSM-2014: ICCSM2014. London, UK.
• Han, Y. (2015). Cloud storage for digital preservation: Optimal uses of Amazon S3 and Glacier. Library Hi Tech, 33(2), 261-271.
• Iglesias, E. (2011). Using Windows Home Server and Amazon S3 to back up high-resolution digital objects to the cloud. In E. M. Corrado & H. L. Moulaison (Eds.), Getting started with
cloud computing (pp. 143-151). New York: Neal-Schuman.
• Iglesias, E., & Meesangnil, W. (2010). “Amazon S3 in digital preservation in a mid-sized academic library: a case study of CCSU ERIS digital archive system”, The Code4Lib Journal, 12.
Retrieved from: http://journal.code4lib.org/articles/4468
• Mell, P., & Grance, T. (2011, September). The NIST definition of cloud computing, Special Publication 800-145, National Institute of Standards and Technology. Retrieved from:
http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
• Poulo, L., Phiri, L., & Suleman, H. (2014). Fine-grained scalability of digital library services in the cloud. In J. P. van Deventer, M. C. Matthee, H. Gelderblom, and A. Gerber
(Eds.), Proceedings of the Southern African Institute for Computer Scientists and Information Technologists Annual Conference 2014 on SAICSIT 2014 Empowered by
Technology (SAICSIT '14) (pp. 157-165), ACM, New York, NY, USA. http://doi.acm.org/10.1145/2664591.2664611
• Rackspace Support (2013, October 22). Understanding the Cloud Computing Stack: SaaS, PaaS, IaaS. Retrieved from
https://www.rackspace.com/knowledge_center/whitepaper/understanding-the-cloud-computing-stack-saas-paas-iaas
• Robinson, J. D. (2015). "The dogs bark and the circus moves on", The Bottom Line: Managing library finances, 28, (1/2), 7-18. http://dx.doi.org/10.1108/BL-01-2015-0002
• Rosenthal, D. H. S. & Vargas, D. L. (2013). Distributed digital preservation in the cloud, The International Journal of Digital Curation, 8(1). http://dx.doi.org/10.2218/ijdc.v8i1.248
• Rosenthal, D. H. S. (2014). Talk "Costs: Why do we care?" [blog post]. Retrieved from http://blog.dshr.org/2014/11/talk-costs-why-do-we-care.html
• Ross, S. (2007). Digital preservation, archival science and methodological foundations for digital libraries: Keynote address at the 11th European Conference on Digital Libraries
(ECDL), Budapest (17 September 2007). Retrieved from http://glasgowsciencefestival.org.uk/media/media_113621_en.pdf
• Schaffer, H. (2014, March/April). Will you ever need an exit strategy? IT Pro. Retrieved from: http://www.computer.org/csdl/mags/it/2014/02/mit2014020004.pdf
• Stark, L., & Tierney, M. (2014). Lockbox: Mobility, privacy and values in cloud storage. Ethics and Information Technology, 16(1), pp 1-13.
NISO Webinar • October 14, 2015
Questions?
All questions will be posted with presenter answers on
the NISO website following the webinar:
http://www.niso.org/news/events/2015/webinars/cloud_services/
October 14 NISO Webinar
Cloud and Web Services for Librarians
Thank you for joining us today.
Please take a moment to fill out the brief online survey.
We look forward to hearing from you!
THANK YOU

Rapple "Scholarly Communications and the Sustainable Development Goals"
 
Compton "NISO Webinar, Sustainability in Publishing"
Compton "NISO Webinar, Sustainability in Publishing"Compton "NISO Webinar, Sustainability in Publishing"
Compton "NISO Webinar, Sustainability in Publishing"
 
Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
 
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
 
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
 
Mattingly "Text and Data Mining: Building Data Driven Applications"
Mattingly "Text and Data Mining: Building Data Driven Applications"Mattingly "Text and Data Mining: Building Data Driven Applications"
Mattingly "Text and Data Mining: Building Data Driven Applications"
 
Mattingly "Text and Data Mining: Searching Vectors"
Mattingly "Text and Data Mining: Searching Vectors"Mattingly "Text and Data Mining: Searching Vectors"
Mattingly "Text and Data Mining: Searching Vectors"
 
Mattingly "Text Mining Techniques"
Mattingly "Text Mining Techniques"Mattingly "Text Mining Techniques"
Mattingly "Text Mining Techniques"
 
Mattingly "Text Processing for Library Data: Representing Text as Data"
Mattingly "Text Processing for Library Data: Representing Text as Data"Mattingly "Text Processing for Library Data: Representing Text as Data"
Mattingly "Text Processing for Library Data: Representing Text as Data"
 
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
 
Ross and Clark "Strategic Planning"
Ross and Clark "Strategic Planning"Ross and Clark "Strategic Planning"
Ross and Clark "Strategic Planning"
 
Mattingly "Data Mining Techniques: Classification and Clustering"
Mattingly "Data Mining Techniques: Classification and Clustering"Mattingly "Data Mining Techniques: Classification and Clustering"
Mattingly "Data Mining Techniques: Classification and Clustering"
 
Straza "Global collaboration towards equitable and open science: UNESCO Recom...
Straza "Global collaboration towards equitable and open science: UNESCO Recom...Straza "Global collaboration towards equitable and open science: UNESCO Recom...
Straza "Global collaboration towards equitable and open science: UNESCO Recom...
 
Lippincott "Beyond access: Accelerating discovery and increasing trust throug...
Lippincott "Beyond access: Accelerating discovery and increasing trust throug...Lippincott "Beyond access: Accelerating discovery and increasing trust throug...
Lippincott "Beyond access: Accelerating discovery and increasing trust throug...
 
Kriegsman "Integrating Open and Equitable Research into Open Science"
Kriegsman "Integrating Open and Equitable Research into Open Science"Kriegsman "Integrating Open and Equitable Research into Open Science"
Kriegsman "Integrating Open and Equitable Research into Open Science"
 
Mattingly "Ethics and Cleaning Data"
Mattingly "Ethics and Cleaning Data"Mattingly "Ethics and Cleaning Data"
Mattingly "Ethics and Cleaning Data"
 
Mercado-Lara "Open & Equitable Program"
Mercado-Lara "Open & Equitable Program"Mercado-Lara "Open & Equitable Program"
Mercado-Lara "Open & Equitable Program"
 

Recently uploaded

The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
heathfieldcps1
 

Recently uploaded (20)

Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptx
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 

Oct 14 NISO Webinar: Cloud and Web Services for Libraries

  • 1. NISO Webinar: Cloud and Web Services for Librarians Wednesday, October 14, 2015 Presenters: John “JG” Chirapurath, Senior Vice President and General Manager, ProQuest Workflow Solutions Kurt Ewoldsen, Manager, Infrastructure and Applications Support, California Digital Library, University of California Heather Lea Moulaison, Assistant Professor, The iSchool (School of Information Science & Learning Technologies), University of Missouri http://www.niso.org/news/events/2015/webinars/cloud_services/
  • 2. Utilizing the Cloud to Empower Research Efforts John "JG" Chirapurath, Senior Vice President and General Manager, ProQuest Workflow Solutions
  • 3. Growing Content Volume & Diversity Print Electronic Digital + +
  • 4. Electronic Resources are the Majority of Content in Collections. [Chart: Academic Library Expenditures on Purchased and Licensed Content, 0% to 90%, 1998 to 2020; series shown: Print Books and Journals, with a "You are here" marker and a projected-change segment.] Derived from US Department of Education, NCES Academic Libraries Survey, 1998-2008.
  • 5. Cloud-based Systems Benefit Libraries • Cloud technologies help libraries avoid spending limited budgets on IT expenses and system management – Reduction in hardware costs, decrease in IT support time and expense, more easily and rapidly updated systems – Provides a full disaster recovery system that libraries don’t have to administer • Reallocate budget to additional premium content that benefits researchers • Use librarians’ time to share experience and insights with researchers, which brings better value
  • 6. Cloud-based Systems Support Scalability & Researchers’ Access Needs • Cloud-based systems’ scalability inherently manages electronic resources better - key with expanding volumes of electronic and digital content • Better enable collaboration between librarians and researchers in multiple locations • Support research anytime, anywhere, on any device
  • 7. Research & Publishing Has Changed • Emphasis on more frequent publishing in academia – Researchers want to share near real-time insights gained from open access information resources • Focus on research as a competitive advantage • Trend of interdisciplinary research and collaboration
  • 8. Cloud-based Systems Improve Discoverability & Collaboration • Easier to find and share information and resources – content discoverable more rapidly – Tools like the Summon® service make new content discoverable more rapidly • 2.5 billion+ records covering more than 90 different content types and more than 10,000 providers • Better supports multi-location collaboration & access on any device – RefWorks and other reference management solutions make it easier for researchers to share and collaborate
  • 9. Next Generation LSPs Offer Comprehensive Resource Management • Supporting management of print & electronic resources across the entire collection – Unified workflows • Assess, manage and track resources across multiple location/geographic boundaries • Ability to share e-resources and information for databases and holdings across different locations and types of libraries • Better supports evolving role of librarians
  • 10. And in the Future: Linked Data • Cloud-based technologies are what make Linked Data possible • Allows libraries to expose relationships among entities that users can easily follow • Improves user navigation between related resources, concepts, and entities – more serendipitous discovery
  • 11. Library => Virtual Knowledge Center • More diverse content types and growing body of knowledge accessible through the library • Researchers accessing information more fluidly – More seamless collaboration across locations – Faster insights leading to interdisciplinary breakthroughs • Librarians as value-added information professionals collaborating with researchers at later stages
  • 13. Migrating CDL IT Infrastructure to the AWS Cloud Presented by: Kurt Ewoldsen Director of Infrastructure & Application Support Kurt.Ewoldsen@ucop.edu www.cdlib.org
  • 14. The California Digital Library (CDL) • Mission: The California Digital Library exists to support the University of California community’s pursuit of scholarship and to extend the University’s public service mission. • Vision: The California Digital Library’s vision is to elevate the digital library for UC so that it becomes "expansively global and deeply local". CDL will advance the digital transition of scholarly information in three spheres: – Access: Scholars will have access to the highest quality research collections worldwide through services that support and enable new scholarship and make it as open as possible. – Formats: CDL will support all digital formats throughout their life cycle with a full range of services, especially to surface UC’s unique digital assets and collections. – Scale: Through partnerships and alliances, CDL will elevate services to the network level for maximum impact. 10/1/2015 Migrating CDL Infrastructure to AWS - NISO 14
  • 15. CDL Infrastructure Timeline. Past: Environments across 2 data centers; migration to VMware environment underway; ~90 VMs & ~45 physical systems (Sun/Solaris); ~200TB of SAN storage; 2 small AWS accounts for grant-related work. Present: Environments across 3 data centers; final Sun/Solaris systems retired; migration to AWS underway; ~30 VMs & ~200TB of SAN storage; AWS account with ~100 EC2 instances & ~50 RDS instances. Future: Environments in a single platform (AWS); VMware environment retired; no more physical infrastructure, all equipment decommissioned; AWS account with ~150 EC2 instances & ~60 RDS instances
  • 16. CDL Services in the Cloud. Customer Services: UC Libraries web site, CDL web site, Calisphere, eScholarship, EZID, Online Archive of California, Request (ILL), UC Library Reprints, HathiTrust Zephir. Infrastructure Services: Nagios, Tripwire, LDAP, Puppet Enterprise, FTPS/SFTP, NFS Home Server, Bastion Servers
  • 17. Why use the Cloud? Cost, Scalability, Agility
  • 18. Think Best Value, Not Lowest Cost • Business case for AWS migration not based on decreased annual cost (although savings are expected) • AWS platform provides significant benefits for the same spend: HA, DR, managed services (RDS), and more • Biggest cost savings is actually cost avoidance – CDL was due to spend ~$750,000 to refresh infrastructure equipment over the next 2 years; this is now unnecessary
  • 19. Vendor Management Cost Savings • To support our physical infrastructure, we had annual maintenance contracts with 10+ different vendors, totaling ~$60,000/year • To support our AWS infrastructure, we have a monthly maintenance fee with a single vendor, with an annual cost of ~$30,000 • In addition to these direct cost savings, we regain a significant amount of time formerly spent on vendor/contract management
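The vendor-management figures on slide 19 can be worked through as simple arithmetic. The sketch below uses only the two approximate amounts quoted on the slide; the constant and function names are illustrative and do not come from the presentation.

```python
# Approximate figures quoted on the slide (both are "~" estimates).
LEGACY_ANNUAL_MAINTENANCE = 60_000  # 10+ vendor contracts, USD per year
AWS_ANNUAL_SUPPORT = 30_000         # single vendor, USD per year


def annual_savings(legacy: int, cloud: int) -> int:
    """Direct yearly saving from consolidating maintenance contracts."""
    return legacy - cloud


def savings_fraction(legacy: int, cloud: int) -> float:
    """Saving expressed as a share of the legacy spend."""
    return (legacy - cloud) / legacy


savings = annual_savings(LEGACY_ANNUAL_MAINTENANCE, AWS_ANNUAL_SUPPORT)
fraction = savings_fraction(LEGACY_ANNUAL_MAINTENANCE, AWS_ANNUAL_SUPPORT)
print(f"Direct saving: ~${savings:,}/year ({fraction:.0%} of legacy spend)")
```

This covers only the direct contract costs; the staff time regained from contract management, which the slide also mentions, is not quantified in the source and is left out.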
  • 20. Cost Reporting • In the typical shared or virtual environment, it is difficult to determine the actual infrastructure costs for any given service • AWS has cost reporting down to a science, and I now know in detail the infrastructure costs of every service we support, and can share that information with CDL managers and staff • Good cost information leads to good decisions regarding service development & deployment
  • 21. Current AWS Costs
  • 22. Cost Reporting By Tags - Monthly
  • 23. Cost Reporting By Tags - Daily
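The idea behind tag-based cost reporting is that every billed resource carries a label naming the service it supports, so costs can be summed per service. The following is a minimal sketch of that grouping step, assuming billing line items are available as plain dictionaries. The record layout, the `Service` tag key, and the dollar amounts are all made up for illustration; they are not CDL's data or the AWS billing format.

```python
from collections import defaultdict


def cost_by_tag(line_items, tag_key):
    """Sum the cost of each line item under the value of one tag."""
    totals = defaultdict(float)
    for item in line_items:
        # Resources without the tag are grouped so they stay visible.
        label = item.get("tags", {}).get(tag_key, "(untagged)")
        totals[label] += item["cost"]
    return dict(totals)


# Hypothetical line items; the costs are invented for the example.
sample_items = [
    {"resource": "ec2-a", "cost": 12.50, "tags": {"Service": "eScholarship"}},
    {"resource": "rds-a", "cost": 7.25, "tags": {"Service": "eScholarship"}},
    {"resource": "ec2-b", "cost": 4.00, "tags": {"Service": "EZID"}},
    {"resource": "ec2-c", "cost": 1.75, "tags": {}},
]

for service, total in cost_by_tag(sample_items, "Service").items():
    print(f"{service}: ${total:.2f}")
```

Keeping an explicit "(untagged)" bucket is a design choice in this sketch: it makes untagged spend show up in the report rather than silently disappear.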
  • 24. Agility & Scalability • Our annual budget cycle requires that we forecast infrastructure needs 18 months in advance, sometimes more • However, we frequently get last minute requests for extraordinary service: significant amounts of data to archive or processing to complete • In our legacy environment, these requests were almost impossible to accept • In AWS, we have virtually unlimited capacity (provided someone is willing to pay for it!) • Conversely, since we have no long-term investment in infrastructure, we can also scale down to ensure our costs decrease if service volume or use decreases
  • 25. Cost > Innovation > Lower Cost. Application Profile: An existing service that gets an occasional data upload that must be ingested and indexed for discovery. Legacy Configuration: Single, large shared system running multiple applications at the same time. 1st AWS Iteration: Application running on a large instance, waiting for data to be uploaded and then processing the data. The instance mostly sits idle, so not a very cost-effective solution. 2nd AWS Iteration: Watchdog process running a single, small instance waiting for data to be uploaded. When data is detected, multiple large instances are started to process the data, then shut down. This decreases the processing duration and the cost at the same time. 3rd AWS Iteration: Watchdog process running a single, small reserved instance waiting for data to be uploaded. When data is detected, multiple large instances are purchased on the “spot” market and started to process the data, then shut down. Further reduces costs.
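The watchdog pattern from the 2nd and 3rd iterations can be sketched as one polling cycle: check for uploads, start workers only if there is something to do, process, then shut the workers down. This is an illustration of the control flow only, not CDL's implementation. The function `watchdog_cycle` and its callback parameters are hypothetical, and the demo uses in-memory stand-ins instead of real cloud API calls.

```python
def watchdog_cycle(list_pending, start_workers, process, stop_workers,
                   max_workers=4):
    """Run one polling cycle and return the number of uploads handled."""
    pending = list_pending()
    if not pending:
        return 0  # nothing uploaded: no large instances are started
    worker_count = min(len(pending), max_workers)
    workers = start_workers(worker_count)
    try:
        # Spread uploads across the workers in round-robin order.
        for index, upload in enumerate(pending):
            process(workers[index % worker_count], upload)
    finally:
        # Always shut workers down so they do not keep accruing cost.
        stop_workers(workers)
    return len(pending)


# In-memory stand-ins (hypothetical) that record what happened.
events = []


def fake_start(count):
    events.append(f"start {count}")
    return [f"worker-{i}" for i in range(count)]


def fake_process(worker, upload):
    events.append(f"{worker} processed {upload}")


def fake_stop(workers):
    events.append(f"stop {len(workers)}")


handled = watchdog_cycle(lambda: ["a.xml", "b.xml", "c.xml"],
                         fake_start, fake_process, fake_stop, max_workers=2)
print(handled, events)
```

In a real deployment the start and stop callbacks would wrap the provider's instance-management calls, and the small always-on instance would run this cycle on a schedule. Whether the workers are on-demand or spot instances only changes what the start callback requests.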
  • 26. Grant-funded Efforts • AWS accounts are a good way to handle grant-funded efforts that have to be delivered upon completion (e.g. www.archivesspace.org) • Simply create a separate AWS account, then configure the environment and develop the application • Collaboration is easy, as you control access to the environment • At the end of the grant, simply sign the AWS account over to the appropriate party (provide the access keys and update the billing entity)
  • 27. Challenges of the Cloud Migration • Migrations are extra work • Re-architecting applications for the new platform is even more work • The cloud provides more variable performance – between local and cloud environments – between processing cycles in the cloud
  • 28. More Challenges • The rapid rate of change in AWS can make staying current a challenge; in fact just understanding all of the AWS services and potential benefits is a difficult task • Systems still hang, crash and reboot in the cloud and because you don’t manage the hardware directly, you have even less visibility into the conditions behind these events
  • 29. What’s Next • Complete the migration (by July 2016) • Move from On-demand to Reserved instances • Move all appropriate services to a H/A architecture • Create DR capability in an alternate region • Evaluate OpsWorks to replace Puppet • Provide performance & cost visibility to application development teams • CloudTrail notifications from CloudWatch • Look at new databases (Aurora, Redshift, DynamoDB) • Elastic Beanstalk pilot underway (IaaS > PaaS)
  • 30. Thanks • Special thanks to my system administration team and CDL application development teams, who accepted my vision for a better future, adopted it as their own, and worked tirelessly to turn the plan into reality. The success we have achieved to date is due solely to their dedication and professionalism.
  • 31. Surveying the Horizon: Preservation and the Cloud October 14, 2015 NISO Webinar: Cloud and Web Services for Librarians Heather Lea Moulaison iSchool at the University of Missouri
  • 32. Rationale • Preserving digital information is becoming an ever more important role of libraries and archives in our digital society • Cloud computing and, more specifically, cloud-based data storage offers some potential to help address the issues information professionals face • Yet, not everyone agrees that cloud computing is the best solution • By examining the problem, reviewing the literature, and assessing best practices, it is possible to understand some of the issues that need to be considered
  • 33. Agenda • What is digital preservation and why is it important? • Definitions • Digital preservation challenges and opportunities • What is cloud computing? • Definitions • Cloud computing challenges and opportunities • Digital preservation and cloud computing • Examples and case studies • Benefits and risks • Discussion: best practices and strategies
  • 34. What is Digital Preservation? • “Digital preservation combines policies, strategies and actions to ensure access to reformatted and born digital content regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time” (ALCTS, 2007). • A number of other definitions exist • All basically require information professionals to ensure digital content is accessible/usable over a long period of time. • To do this requires the right people with the right technology who have the support and vision they need.
  • 35. Digital Preservation: Challenges and Opportunities • Challenges • The nature of digital files • DP is more complex than simply backing up a file • DP requires additional technical maintenance along with safeguarding context • Threats include • Technological • Meaning-related (if metadata is lost, the context is not clear) • Budgetary support • Opportunities • Continued relevancy in digital age • Ensure access to digital content for future generations • Fulfill legal requirements
  • 36. What is Cloud Computing? • “A model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g. networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction” (Mell & Grance, 2011 p. 1). • As with digital preservation, a number of definitions exist • All basically assume that information is being stored remotely through the internet. • Examples of cloud service provider and services: Amazon (http://www.amazon.com/) • Amazon Elastic Compute Cloud (EC2) • Amazon Simple Storage Service (S3) • Amazon Glacier
  • 37. The NIST Cloud Computing Service Models http://www.servercloudcanada.com/2013/10/defining-the-cloud/
  • 38. Cloud Computing: Challenges and Opportunities • Challenges • Security in the cloud is a concern for many and can be perceived as both a positive and a negative aspect of cloud computing • Placing trust in a third party to store content can also be a positive or a negative • The uncertain future: it is possible, but not likely, that a cloud computing company will go out of business, taking a library’s data with it • When data is stored in the cloud, the cloud provider has access to it • Opportunities • Perceived lowering of technical and financial barriers • Increased flexibility to quickly meet increased demands, disaster recovery, decreased software application and server maintenance, decreased capital expenses, increased security, and cloud computing is more environmentally friendly (Salesforce, 2015) • Geographic diversity of server locations helps!
  • 39. Digital Preservation and Cloud Computing: Turnkey Solutions • Examples: • Preservica Cloud Edition: SaaS model. • DuraSpace’s DuraCloud: cloud storage from either Amazon or the San Diego Supercomputing Center (Schumacher et al., 2014). • Cloud computing is widely utilized in industry, but Srivastava and Verma (2015) observe that the “Application of cloud computing in libraries is a relatively new area as compared to its applications in business and corporate sector” (p. 33). • Some information professional points of view: • There are at least some benefits to using cloud computing for digital preservation • Questions: Is cloud computing truly the best approach for digital preservation? If so, what might be the caveats?
  • 40. Research into Digital Preservation and Cloud Computing • National Digital Stewardship Alliance’s Infrastructure Working Group survey (Bailey, 2012): • 74% of the members had a strong preference for controlling their own preservation storage systems because of cost concerns, trustworthiness, legal issues and security and risk management. • Researchers from the University of Cape Town (Poulo, Phiri, & Suleman, 2014): • showed that digital library applications in the cloud can provide adequate response time and that the response time is not significantly affected by complexity or collection sizes. • But… is the elasticity the cloud provides worth the cost? • “Cloud storage really is cheaper if your demand is spiky, but digital preservation is the canonical base-load application” (Rosenthal, 2014) • Miller (2014) points out, “some of the most compelling attributes of the public cloud are best suited to ephemeral or (relatively!) short-term use cases” (¶ 2).
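Rosenthal's distinction between spiky and base-load demand can be illustrated with a toy cost model. Everything here is an assumption made for the example: rented capacity is billed per unit-hour actually used at a higher rate, owned capacity must be sized for the peak and is paid for every hour at a lower rate, and the prices and demand curves are invented. It is not a reproduction of any cited study.

```python
def rented_cost(hourly_demand, price_per_unit_hour):
    """Pay only for the capacity actually used in each hour."""
    return sum(hourly_demand) * price_per_unit_hour


def owned_cost(hourly_demand, price_per_unit_hour):
    """Pay for peak capacity in every hour, whether or not it is used."""
    return max(hourly_demand) * len(hourly_demand) * price_per_unit_hour


# Hypothetical prices: renting costs three times as much per unit-hour.
RENT_PRICE = 3
OWN_PRICE = 1

spiky = [0, 0, 0, 10, 0, 0, 0, 0]  # one short burst of demand
base_load = [10] * 8               # constant demand, as with preservation

for name, demand in (("spiky", spiky), ("base-load", base_load)):
    print(name,
          "rented:", rented_cost(demand, RENT_PRICE),
          "owned:", owned_cost(demand, OWN_PRICE))
```

Under these made-up numbers renting wins for the spiky workload and owning wins for the constant one, which is the shape of the argument in the quotation. Real comparisons also have to weigh factors the model ignores, such as staff time, disaster recovery, and geographic diversity.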
  • 41. Research on Costs • Rosenthal and Vargas (2013) performed a study in which they experimented with running a Lots of Copies Keep Stuff Safe (LOCKSS) (http://www.lockss.org/) box on Amazon’s EC2 cloud backed with Amazon S3 cloud storage. • Their study concluded “that current cloud storage services are not cost-competitive with local hardware for long term storage, including for LOCKSS boxes” (Rosenthal & Vargas, 2013, p. 107). • Han (2015), after reviewing this and other studies, “believes that the combination of big price drops and free data transfer within the same data zone makes the cloud storage a very attractive solution for long-term digital preservation, especially using [Amazon] Glacier” (p. 266). • Benefits of cloud computing (e.g. geographically-diverse storage) mean it is not a one-to-one comparison. • As Fryer and Brown (2014) write, more study in this area is
  • 42. Best Practices/Case Studies • Central Connecticut State University, USA (Iglesias, 2011; Iglesias & Meesangnil, 2010) decided to use Amazon S3 for storing their long-term archival masters for digital preservation purposes. • In evaluating their decision, they determined that using Amazon S3 for storage was a very good one for them: costs were low and there was no downtime in the first year of use (Iglesias & Meesangnil, 2010). • The Parliamentary Archives, Houses of Parliament, London, UK, adopted third-party cloud storage to use with their Preservica Enterprise digital preservation system (Fryer & Brown, 2014). In order to minimize risk, the Parliamentary Archives decided to: • Keep sensitive data local
  • 43. A Graceful Exit • An exit strategy is important to consider when choosing to use a cloud computing service. • Exit strategies are “often neglected because few want to consider the demise of what is, at the moment, a seemingly wonderful solution, being adopted and implemented with great effort and expectation” (Schaffer, 2014, p. 4). • But… how does the library get its data back (Robinson, 2015)?
  • 44. Conclusion • Digital preservation is not easy. Although there are many technical threats to digital preservation, ultimately digital preservation is not merely a technological challenge. • Policies and procedures need to be in place to ensure ongoing digital preservation. • Cloud computing can be used successfully and economically for digital preservation in the appropriate situations. • Platforms used for digital preservation should be routinely reviewed and reevaluated. • Digital preservation, in the cloud or locally, is not something that can be done once and then forgotten about. • Information professionals need to review both their digital preservation strategy and the applicability of cloud computing as part of that strategy on an ongoing basis.
  • 46. References • ALCTS. (2007, 24 June). Definitions of Digital Preservation. Retrieved from http://www.ala.org/alcts/resources/preserv/defdigpres0408 • Anderson, M. (2011, 7 September). B is for Bit Preservation. The Signal. Retrieved from http://blogs.loc.gov/digitalpreservation/2011/09/b-is-for-bit-preservation/ • Corrado, E. M., & Moulaison, H. L. (2015, August 12). Digital preservation and the cloud: Challenges and opportunities. IFLA 2015 Pre-Conference Satellite Meeting Preservation & Conservation Section, Durban, South Africa, 12-13 August, 2015. • Fryer, C., & Brown, A. (2014). Case study: Archives in the cloud: Challenges and opportunities. In B. Endicott-Popovsky (Ed.), International Conference on Cloud Security Management ICCSM-2014: ICCSM2014. London, UK. • Han, Y. (2015). Cloud storage for digital preservation: Optimal uses of Amazon S3 and Glacier. Library Hi Tech, 33(2), 261-271. • Iglesias, E. (2011). Using Windows Home Server and Amazon S3 to back up high-resolution digital objects to the cloud. In E. M. Corrado & H. L. Moulaison (Eds.), Getting started with cloud computing (pp. 143-151). New York: Neal-Schuman. • Iglesias, E., & Meesangnil, W. (2010). Amazon S3 in digital preservation in a mid-sized academic library: A case study of CCSU ERIS digital archive system. The Code4Lib Journal, 12. Retrieved from http://journal.code4lib.org/articles/4468 • Mell, P., & Grance, T. (2011, September). The NIST definition of cloud computing, Special Publication 800-145, National Institute of Standards and Technology. Retrieved from http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf • Poulo, L., Phiri, L., & Suleman, H. (2014). Fine-grained scalability of digital library services in the cloud. In J. P. van Deventer, M. C. Matthee, H. Gelderblom, & A. Gerber (Eds.), Proceedings of the Southern African Institute for Computer Scientist and Information Technologists Annual Conference 2014 on SAICSIT 2014 Empowered by Technology (SAICSIT '14) (pp. 157-165), ACM, New York, NY, USA. http://doi.acm.org/10.1145/2664591.2664611 • Rackspace Support. (2013, October 22). Understanding the Cloud Computing Stack: SaaS, PaaS, IaaS. Retrieved from https://www.rackspace.com/knowledge_center/whitepaper/understanding-the-cloud-computing-stack-saas-paas-iaas • Robinson, J. D. (2015). The dogs bark and the circus moves on. The Bottom Line: Managing library finances, 28(1/2), 7-18. http://dx.doi.org/10.1108/BL-01-2015-0002 • Rosenthal, D. H. S., & Vargas, D. L. (2013). Distributed digital preservation in the cloud. The International Journal of Digital Curation, 8(1). http://dx.doi.org/10.2218/ijdc.v8i1.248 • Rosenthal, D. H. S. (2014). Talk "Costs: Why do we care?" [blog post]. Retrieved from http://blog.dshr.org/2014/11/talk-costs-why-do-we-care.html • Ross, S. (2007). Digital preservation, archival science and methodological foundations for digital libraries: Keynote address at the 11th European Conference on Digital Libraries (ECDL), Budapest (17 September 2007). Retrieved from http://glasgowsciencefestival.org.uk/media/media_113621_en.pdf • Schaffer, H. (2014, March/April). Will you ever need an exit strategy? IT Pro. Retrieved from http://www.computer.org/csdl/mags/it/2014/02/mit2014020004.pdf • Stark, L., & Tierney, M. (2014). Lockbox: Mobility, privacy and values in cloud storage. Ethics and Information Technology, 16(1), 1-13.
  • 47. NISO Webinar • October 14, 2015 Questions? All questions will be posted with presenter answers on the NISO website following the webinar: http://www.niso.org/news/events/2015/webinars/cloud_services/ October 14 NISO Webinar Cloud and Web Services for Librarians
  • 48. Thank you for joining us today. Please take a moment to fill out the brief online survey. We look forward to hearing from you! THANK YOU

Editor's Notes

  1. Content volume and the variety of content types are growing, as is the number of locations where shared content might reside. This dramatically increases storage requirements and accessibility needs.
  2. Modern users are demanding access to content in different ways, and this has resulted in a shift in the collections libraries are building. This data from the National Center for Education Statistics illustrates the shift from print to electronic in collections. You can see that we have reached the “tipping point” where a small percent of collections budgets are spent on print. Your print collection is still important, especially as a research collection, but electronic outweighs print in terms of time and priorities for support. You need to focus on your future and transform how you manage your collections.
  3. Cloud-based systems offer expense and time savings. Cloud-based systems are designed to be more scalable for improved use, management and storage of data.
  4. Content is more diverse, the volume managed by librarians is growing, and the role and pace of research are changing. The research process itself has evolved too.
  5. Discovery services: large content volume to be searched; need for continual updates and updated indexing (AWS). Improved processing for more frequent content indexing and metadata management. Summon is the largest unified index in the industry. Open access information resources being ingested into cloud-based discovery systems help corporate researchers gain access to more current research.
  6. Older technological systems are based on inventory and management of print resources and aren’t built to support the volume of electronic and digital content. This is an opportunity for a unified workflow: cloud systems better manage the rapid growth and complexity of content.
  7. Facilitates exposure of library data on the Web. Simplifies the processes associated with describing resources…and people, concepts, places, etc. Simplifies management through new models for authority control. Reduces level of effort associated with traditional catalog management. The Linked Data effort re-envisions a new bibliographic environment for libraries that makes the “network” central and interconnectedness commonplace. Makes library information accessible in the places where users are working = everywhere. Lowers metadata creation and maintenance costs. Librarians bring their unique expertise to the work of identifying and establishing more relationships between and among resources.
  8. Cloud-based library systems better support the model of the library as a virtual knowledge center.
  9. Hello, my name is Kurt Ewoldsen and I am the Director of Infrastructure and Application Support for the CDL (which is just a fancy way of saying I am the IT manager here) and I am going to share a little about how we are moving the entire infrastructure supporting CDL services into the AWS cloud.
  10. The California Digital Library supplies system-wide services to the University of California. Interesting fact: the Department of Library Automation, which was the predecessor to the CDL, brought the first system-wide data network to UC, to support the Melvyl online library catalog, in 1982.
  11. Over the past several years, we have migrated our infrastructure from physical Sun/Solaris systems to VMware VMs to AWS instances. That is an evolution through three different technology paradigms in a very short period of time.
  12. We are a little more than 75% complete with our migration to AWS and have a number of customer-facing services in production on that platform, as well as all of the infrastructure services required to support our application development and deployment environments.
  13. There were a number of factors that led us to move to the AWS cloud, but today I will focus on these three.
  14. There are certainly less expensive hosting providers than AWS, but none that have the level of innovation and breadth of services that Amazon provides. The significant benefits come once you start fully leveraging the AWS ecosystem and all of the services that are available.
  15. Every year at budget time, I would have to contact all of our support vendors and request a quote for the annual maintenance renewal. Depending on how organized the vendor was, this could turn into a drawn-out process: my contact from last year may have changed roles or left the company, they may have changed the way they assign contracts, etc. I would use those estimates as part of my budget forecast. Later in the year, when the actual renewal date approached, I would have to contact them all again: first to renew the quotes, which are generally only good for 30 days, and then to work through the actual renewal process of generating the purchase request, ensuring the PO was created and sent to the vendor, receiving the invoice, making sure the invoice was paid, and finally verifying that the vendor support portal or customer management system reflected the new expiration date for our account. This is a lot of effort that does not provide direct benefit to users of our services, so I am happy to reduce the amount of my time dedicated to this activity.
  16. When you have services sharing physical servers or VMs sharing physical hosts, it can be difficult to calculate the actual cost of the infrastructure dedicated to a particular service. There are different types of costs to consider (capital and operational) and putting a price tag on virtual CPUs or memory assigned to a VM is challenging. Because it is at the core of their business, AWS has cost reporting down to a science, and it is easy for this cost information to be shared within the organization.
  17. We can track our spend on a daily basis, if we so desire, and break the costs down in a number of ways. This shows a breakdown of costs by AWS service type.
  18. These are the monthly costs for our eScholarship service.
  19. These are the daily costs for our eScholarship service.
  20. This is one example of how an understanding of service costs led to several cycles of application architecture modifications that resulted in performance improvements and lowered costs.
  21. AWS accounts are a good way to handle grant-funded efforts that have to be turned over at completion. Instead of worrying about how to re-create the environment at another location when the grant ends, simply create a separate AWS account and perform the work in it. At the end of the grant, simply turn over the account credentials and change the billing information to the new owner. We have done this with ArchivesSpace.
  22. Infrastructure migrations take time away from application development or improvement. While the long-term benefits are compelling, the short-term impact is real and can lead to delayed release of new services or features. Modifying your applications to take advantage of the AWS platform and see the benefits of their H/A capabilities and other services takes even more effort, which takes even more time away from application development or improvement. Performance in AWS is variable on all levels. Processing cycles will vary between legacy performance and performance in AWS. Even cycles in AWS can vary significantly between runs; a job that usually completes in 6 hours can occasionally take twice as long, for no apparent reason.
  23. The rate of innovation at AWS is both a blessing and a curse. It is difficult to keep up with the rate of change within the AWS environment, and with the new products and services that are continually introduced. Just last week at their annual user conference, AWS introduced more than 20 new services or significant enhancements to existing services. We are using only 11 of their 50+ services at this time. Using the cloud just means that you are using someone else’s equipment. Technology will always have problems, no matter who is managing it, so you will still experience system problems while using the cloud, and you will need to plan your cloud deployment to accommodate these issues.
  24. This list was created in August, and is already out of date (remember what I said about the rate of change in the AWS environment). One of the things we are interested in pursuing is use of Elastic Beanstalk or the AWS Container Service, to move us from IaaS to PaaS.
  25. I’m using a photo from my summer vacation here -- as the leaves begin to change, I’m keen on preserving the memory of summer travels! The title: Surveying the horizon – not a completely nuts-and-bolts approach, but enough to get you thinking, I hope. Before I go too far, I want to acknowledge my co-author Edward M. Corrado, Associate Dean for Technology at the University of Alabama Libraries -- without whom this presentation wouldn’t have been possible. [Preserving digital information is becoming an ever more important role of libraries and archives in our digital society. Cloud computing and, more specifically, cloud-based data storage offers some potential to help address the issues information professionals face. Yet, not everyone agrees that cloud computing is the best solution. By examining the problem, reviewing the literature, and assessing best practices, it is possible to understand some of the issues that need to be considered.]
  26. Getting into this – in terms of the rationale… Preserving digital information is becoming an ever more important part of what info pros do. Digital preservation is an economic, managerial, and technological challenge. So is cloud computing, in many ways. Cloud computing and, more specifically, cloud-based data storage, offers some potential to help address issues faced – yet there is little in the way of out-of-the-box guidance and help for preserving digital content in the cloud. Additionally, it’s not entirely clear that cloud computing is digital preservation’s best bet. [next] [Again, this is not meant to be a workshop-like presentation, but more of an overview of points to consider and possible approaches when thinking about digital preservation and the cloud.] [next]
  27. In terms of what we’ll cover in the next 15 or 20 minutes – We’ll begin by talking about digital preservation – there are a lot of definitions out there, so I’ll explain what I mean when I say digital preservation and I’ll mention some of the tricky aspects of digital preservation to get us started. Next, we’ll do the same thing for cloud computing. I’d like to then explore the two topics together, and finish by talking about some implications. [next]
  28. There are a number of definitions of digital preservation but they all point to the need to ensure that digital content is accessible over some period of time. One definition of digital preservation prepared by the American Library Association’s Association for Library Collections and Technical Services (ALCTS) states that “Digital preservation combines policies, strategies and actions to ensure access to reformatted and born digital content regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time” (ALCTS, 2007). Sometimes we hear information professionals and researchers talk about “digital curation” – their definition probably overlaps quite well with the spirit of these definitions of “digital preservation.” [I mentioned a few slides ago that digital preservation is an economic, managerial, and technological challenge – this is because planning and oversight are a huge part of making digital content available into the future. What do I mean?] [next]
  29. What do I mean? In terms of the challenges of digital preservation: it is important to remember that “backup, alone, does not serve as an appropriate solution to digital archiving” (Payette, 2008). Digital preservation is not just safeguarding the bits. When something goes wrong and meaning or content is lost, we cannot say a digital object has been preserved. Part of the problem is the nature of digital objects. Digital objects, unlike many physical objects archived in libraries and archives, are relatively fragile. Technological threats can include physical deterioration of the storage medium used to store the digital object, being unable to read the file because the hardware and/or software is no longer supported or accessible, and loss of the software programs that can interpret the digital object (Waugh, Wilkinson, Hills, & Dell’oro, 2000). *Nothing illustrates a point quite like an embarrassing story from personal experience – I will share with you that in the late 1980s/early 1990s I bought a word processor that was essentially an electric typewriter with a stand-alone monitor and the ability to save files to a diskette. Frankly, I couldn’t believe I actually found a photo of it online – so here it is; this is what it looked like. In any event, I held on to all those diskettes for a long, long time after the machine was gone – I finally threw them away about 10 or 15 years ago after giving up on ever getting the contents formatted in a way that could be accessed – right about the time that I acknowledged that the stuff I was saving on them as a freshman in college wasn’t really probably worth saving anyway.* There are other threats to long-term digital preservation besides technical failure, including budgetary ones. Digital preservation can be expensive, and the benefits of digital preservation may not be immediately apparent to administrators since the benefits are mostly in the future.
As for the opportunities: the benefits are amazing when these challenges can be overcome! Society can use content into the future that otherwise would have been lost. This gives information professionals yet another way to be relevant in the information age. Depending on the kind of institution, saving necessary digital content can help fulfil legal requirements, or having explicit metadata explain future use scenarios has the potential to respect the wishes of donors or to better honor intellectual property rights for content, depending on the situation. [next]
  30. What is cloud computing? The United States’ National Institute of Standards and Technology (NIST) defines cloud computing as “a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g. networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction” (Mell & Grance, 2011, p. 1). There is more to it than this, though. The NIST report from which this definition is drawn goes on to explain that there are three service models, five essential characteristics including elasticity, and four deployment models involved with cloud computing. Although this quite specific NIST understanding of cloud computing is useful for technologists, it might be more useful for non-IT information professionals to think of library “cloud computing as library data and services hosted beyond the library’s walls and accessible via the web” (Corrado & Moulaison, 2012). The online sales giant Amazon (http://www.amazon.com/) is one example of a cloud computing services provider, offering services such as the Amazon Elastic Compute Cloud (EC2), Amazon Simple Storage Service (S3), and Amazon Glacier.
  31. This is a pretty straightforward graphic showing the kinds of products involved with each service model. In the last slide, I mentioned three service models as defined by NIST. One of the most common cloud computing service models used by libraries is Software as a Service, or SaaS (pronounced “sass”), where software is used remotely through the cloud. In libraries, Springshare’s offerings, including LibGuides, LibAnswers, and LibCal, are popular SaaS offerings. Platform as a Service (PaaS) is another of the service models set forth in the NIST definition. PaaS “is the set of tools and services designed to make coding and deploying [cloud-based] applications quick and efficient” (Rackspace Support, 2013). Libraries with their own software developers may make use of PaaS offerings such as Google App Engine, Heroku, and Microsoft Azure Services. In these cases, the library’s IT staff will develop software and then load content onto an already-robust technology platform that is running in a specific computing environment. The third cloud computing service model is Infrastructure as a Service (IaaS) – “servers, storage and networking— on demand, in a pay-as-you-go model” (IBM, 2015). With IaaS, it is not necessary to make a significant up-front investment in computing hardware.
  32. Cloud challenges: As far as challenges go -- security in the cloud is a concern for many and can be perceived as both a positive and a negative aspect of cloud computing. Since systems administrators at the local institution don’t have as much control of security as they would with locally hosted applications, this concern is a valid one. However, since cloud services are typically provided by large organizations that have the resources and incentives to invest in security professionals, the cloud may provide better security than locally hosted options, especially when the library does not have a full complement of IT professionals (Srivastava & Verma, 2015). Additionally, the issue of placing trust in a third party to store content can also be a positive or a negative. Although it is possible that a large company’s servers might go off-line, it is probably less likely that a large company’s servers will go off-line than a single library’s. It is also not likely that a cloud computing company will go out of business, taking a library’s data with it, but again, it is possible. Other potentially negative aspects of cloud computing that libraries need to consider are privacy of personal information and the protection of content. When data is stored in the cloud, the cloud provider has access to it. As a result, there may be various legal and organizational policy implications. Dark archives, where content cannot be shared with users without permission, must be sensitive to potential issues of privacy and security of content. Additionally, some governments do not permit certain data to be stored outside of their country or in specific countries due to the legislation under which the cloud computing providers are required to operate. Opportunities: Cloud computing has been touted as a solution for many technological challenges because of its primary perceived benefit: lowering technical and financial barriers.
Other benefits of cloud computing include increased flexibility to quickly meet increased demands, disaster recovery, decreased software application and server maintenance, decreased capital expenses, increased security, and … cloud computing is more environmentally friendly (Salesforce, 2015). These benefits can also apply when hosting digital library content in the cloud. In particular, cloud-based storage permits on-demand provisioning and, because its servers are located in different geographical regions, a natural disaster such as an earthquake or flood might affect one set of servers where the information is housed, but not the other sets of servers located on the other side of the world. Keeping copies of data in multiple geographic areas, something cloud-based storage is designed to do, aligns well with digital preservation best practices of maintaining between two and six copies of digital content (Anderson, 2011).
  33. Some turnkey solutions already exist for carrying out digital preservation in the cloud. Services such as Preservica Cloud Edition provide cloud-based digital preservation services using the SaaS model. DuraSpace also provides preservation services via their DuraCloud offering that utilizes cloud storage from either Amazon or the San Diego Supercomputer Center (Schumacher et al., 2014). Cloud computing is widely utilized in business, industry, and in the corporate sector, but is somewhat new in libraries. Cloud computing is even newer when considering its use in digital preservation in libraries. Many libraries see that there are benefits to using cloud computing for digital preservation, but there is not universal agreement that cloud computing is the best approach to use for digital preservation -- and the research at present is divided. [next]
  34. Although cloud-based preservation has the many benefits of cloud computing mentioned earlier, there are some potential drawbacks. For instance, a survey of the National Digital Stewardship Alliance’s Infrastructure Working Group showed that 74% of the members had a strong preference for controlling their own preservation storage systems because of cost concerns, trustworthiness, legal issues and security and risk management (Bailey, 2012). Researchers from the University of Cape Town conducted a set of experiments to investigate “the scalability of typical digital library services that use cloud computing facilities for core processing and storage” (Poulo, Phiri, & Suleman, 2014, p. 157). Their experiments showed that digital library applications in the cloud can provide adequate response time and that the response time is not significantly affected by complexity or collection sizes. Yet Rosenthal reminds us that while it is great for libraries to be able to scale up quickly and recruit additional computing power from the cloud, libraries tend to have a very predictable need for storage that does not necessitate this kind of elasticity. On a somewhat related note, Miller (2014) points out, essentially, that the purported benefits of cloud computing, especially the financial savings, are achieved because of the elasticity that allows users to only pay for the computing resources they need, so if lots of computing power is needed quickly, cloud computing can support that need instantly. This benefit, then, isn’t really a benefit when it comes to long-term storage of digital content. [next]
  35. Cost can be a major issue in digital preservation and also in cloud computing. To investigate the costs of using cloud storage for long-term storage using the Lots of Copies Keep Stuff Safe (LOCKSS) program from Stanford University (http://www.lockss.org/), Rosenthal and Vargas (2013) performed a study in which they experimented with running a LOCKSS box on Amazon’s EC2 cloud backed with Amazon S3 cloud storage. Their study concluded “that current cloud storage services are not cost-competitive with local hardware for long term storage, including for LOCKSS boxes” (Rosenthal & Vargas, 2013, p. 107). One of the major advantages of using cloud-based storage versus local storage for digital preservation is that it typically includes redundant and geographically diverse storage, which is considered a best practice for digital preservation. Therefore, it is not a one-to-one price comparison to the costs of storage on a local machine. As Fryer and Brown (2014) write, more study in this area is needed to determine the economic reality of using cloud storage for digital preservation. One thing that is relatively clear, however, is that it is important to understand the costs involved with cloud computing and cloud storage. In most cases the costs are not just based on the amount of data stored, but also on the bandwidth used. Therefore, information professionals need to factor in the potential data-transfer fees, which can be a large part of the overall costs of cloud storage (Han, 2015).
  36. Of course, when in doubt, a great option is always to see what others have done! There have been a few case studies published that involve preserving digital objects using the cloud. Two are presented here. Edward Iglesias – who just recently left Central Connecticut State University -- decided a few years ago to use Amazon S3 for storing their long-term archival masters for digital preservation purposes. After reviewing various options, they decided on Amazon S3 because the overall costs were lower than other options and it offered immediate access to archival files and provided redundant, geographically diverse storage. In evaluating their decision, they determined that choosing Amazon S3 for storage was a very good decision for them: the costs were low and there was no downtime in the first year of use (Iglesias & Meesangnil, 2010). The Parliamentary Archives, Houses of Parliament, London, United Kingdom, adopted third-party cloud storage to use with their Preservica Enterprise digital preservation system. Although the Preservica Enterprise software is installed on a local server, most of the digital objects are stored in the cloud. Fryer and Brown’s case study explored some of the opportunities and challenges of using the cloud for digital preservation. One of the biggest challenges was how to deal with sensitive data. The Parliamentary Archives decided to keep sensitive data local while storing the bulk of their data, which is publicly accessible, using cloud storage. As mentioned, one of the risks associated with cloud computing is the cloud provider going out of business or otherwise doing something to make the digital objects in the cloud inaccessible. In order to minimize this risk, the Parliamentary Archives decided to store their data with two different cloud providers.
  37. And, finally, an exit strategy is important to consider when choosing to use a cloud computing service. Exit strategies are often neglected -- however, no solution lasts forever and the perfect solution today may not be perfect in the future! It is important to ask: if a library stores data in the cloud and the cloud provider goes out of business, or raises its prices, or for whatever reason the library simply does not like its service any more, how does the library get its data back (Robinson, 2015)? This is especially an issue when dealing with large amounts of data that might be stored for digital preservation. These questions should be addressed and answered when signing a contract if at all possible. If the library can’t negotiate with the provider (a small, single library will not likely have much leverage negotiating with Amazon or Google), it is important for the library to understand the terms of service and what they mean for the library’s ability to migrate to a different solution in the future. Librarians need to determine who owns the data that they enter into a cloud system, how they will store and manage their data in the cloud, and how they get their data back out. These issues can be even more pronounced in a PaaS or SaaS solution since the data may be intrinsically tied to the cloud application or platform.
  38. In conclusion… *** Digital preservation isn’t easy. Although there are many technical threats to digital preservation, ultimately digital preservation is not merely a technological challenge. Policies and procedures need to be in place to ensure ongoing digital preservation. Cloud computing can be successfully and economically used for digital preservation in the appropriate situations. However, cloud computing platforms used for digital preservation should be routinely reviewed and reevaluated. Digital preservation is not something that can be done once and then forgotten about. Likewise, cloud computing is a quickly evolving technical field. Information professionals need to review both their digital preservation strategy and the applicability of cloud computing as part of that strategy on an ongoing basis.
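
The notes above observe that cloud storage costs depend on bandwidth as well as on the amount of data stored, and that an exit strategy means getting all of the data back out. A minimal Python sketch of that arithmetic follows. It is an illustration only: the class name, function names, and the prices in the usage lines are hypothetical and are not taken from the presentation or from any provider's rate card, and real pricing typically adds tiers, request fees, and retrieval fees that are ignored here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePricing:
    """Hypothetical flat per-unit prices; substitute a provider's published rates."""
    storage_per_gb_month: float  # USD per GB stored per month
    egress_per_gb: float         # USD per GB transferred out of the cloud


def annual_cost(pricing: StoragePricing, stored_gb: float,
                egress_gb_per_year: float, copies: int = 1) -> float:
    """Yearly cost: storage for every copy kept, plus outbound data transfer."""
    storage = pricing.storage_per_gb_month * stored_gb * 12 * copies
    transfer = pricing.egress_per_gb * egress_gb_per_year
    return storage + transfer


def exit_cost(pricing: StoragePricing, stored_gb: float) -> float:
    """One-time transfer fee for pulling a single full copy back out."""
    return pricing.egress_per_gb * stored_gb


if __name__ == "__main__":
    # Made-up prices and volumes, purely for illustration.
    pricing = StoragePricing(storage_per_gb_month=0.02, egress_per_gb=0.10)
    print(annual_cost(pricing, stored_gb=1000, egress_gb_per_year=500, copies=2))
    print(exit_cost(pricing, stored_gb=1000))
```

Keeping the exit cost as a separate function makes it easy to compare providers on how expensive it would be to leave, not only on how cheap it is to stay.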