exploring perspectives on the evaluation of
digital libraries
Giannis Tsakonas & Christos Papatheodorou
{gtsak, papatheodor}@ionio.gr
Database and Information Systems Group, Laboratory on Digital Libraries and Electronic Publishing, Department of Archives and Library Sciences, Ionian University, Corfu, Greece.
Digital Curation Unit, Institute for the Management of Information Systems, ‘Athena’ Research Centre, Athens, Greece.
introduction

a few things about this tutorial
what is it about?
• It is about evaluation!
• Evaluation is not only about finding what is better for our case, but
  also how far we are from it.
• Therefore this tutorial is about locating us in a research space and
  helping us find the way.




                                                                         3
what are the benefits?
• Viewing things in a coherent way.
• Familiarizing with the interdisciplinarity of the domain and the need
  for collaboration with other agents to perform better evaluation
  activities.
• Forming the basis to judge potential avenues in your evaluation
  planning.




                                                                         4
how is it structured?
• Part one: fundamentals of DL evaluation (1.5 hours)
  – reasons to evaluate
  – what to evaluate
  – agents of the evaluation
  – methods to evaluate
  – outcomes to expect
• Break (0.5 hour)
• Part two: formal description and hands on session
  – a formal description of DL evaluation (0.5 hour)
  – hands on session (0.5 hour)
• Discussion and closing (0.5 hour)

                                                       5
part one

fundamentals of digital library
evaluation
evaluation tutorials in DL conferences
(tutorials given at ECDL, JCDL and ICADL)
2001: Evaluating, using, and publishing eBooks
2002: Evaluating Digital Libraries for Usability
2003: Usability Evaluation of Digital Libraries
2004: Usability Evaluation of Digital Libraries; Evaluating Digital Libraries
2005: Qualitative User-Centered Design in Digital Library Evaluation; Evaluating Digital Libraries
2006: Evaluation of Digital Libraries
2007–2009: —
2010: Lightweight User Studies for Digital Libraries
                                                                                7
what is evaluation?
• Evaluation is the process of assessing the value of a specific product
  or operation for the benefit of an organization within a given
  timeframe.




                                                                          8
how to start evaluating?

• From objects?
  – digital libraries are complex systems
  – depending on the application field, their composition grows more complex
• ...or from processes?
  – evaluation processes vary because of the many agents around
      them
  – many disciplines, each one with many backgrounds
• This dualism will accompany us all the way.




                                                                  9
metaphor: a usual Rubik’s cube
• Roles: content managers (librarians, archivists, etc.),
  computer scientists, managers, etc.
• Research fields: information science, social
  sciences, human-computer interaction, information
  retrieval, visualization, data-mining, user modeling,
  knowledge management, cognitive psychology,
  information architecture and interaction design,
  etc.




                                                           10
metaphor continues: several unusual cubes
• Different stakeholders have different views
  – two different agents see two different Rubik’s cubes to solve

               The developer’s cube

                                       The funder’s cube

                                                            11
evaluation framework
•   Why: the first, but also the most difficult, question.
•   What: the second question, with a rather obvious answer.
•   Who: the third question, with a more obvious answer.
•   How: the fourth question; a quite puzzling one.




                                                               12
questions

why
reasons to evaluate [a]
• We evaluate in order to improve our systems; a generic, beneficial
  aim.
• We evaluate in order to increase our knowledge capital on the value
  of our system:
  – to describe our current state (recording our actions)
  – to justify our actions (auditing our actions)
  – to redesign our system (revising our actions)
• We evaluate to make others understand the value of our
  system:
  – to describe how others use our system
  – to justify why they are using it this way (or why they don’t use it)
  – to redesign the system and its services to be better and more used
                                                                      14
reasons to evaluate [b]
• But, is it clear to us why we are evaluating?
  – the answer shows commitment
  – is it for internal reasons (posed by the organization, e.g.
     monitoring), or external reasons (posed by the environment of
     the organization, e.g. accountability)?
• Strongly related to our working context.




                                                                     15
scope of evaluation

• Input-output evaluation
  – about effective production
• Performance measurement
  – about efficient operation
• Service quality
  – about meeting the goals of the served audience
• Outcomes assessment
  – about meeting the goals of the hosting environment
• Technical excellence
  – about building better systems


                                                         16
defining the scope
[Figure: the scope of evaluation plotted against levels. Levels (vertical axis): social, institutional, personal, interface, engineering, processing, content. Scope (horizontal axis): effectiveness, performance measurement, service quality, outcomes assessment, technical excellence.]
                                                                                 17
find a reason
• Understanding the reasons and the range of our evaluation will help
  us formulate better research statements.
• Therefore, before initiating, we need:
  – to identify the scope of our research,
  – to clearly express our research statements,
  – to imagine what kind of results we will have,
  – to link our research statements with anticipated findings (either
      positive, or negative).




                                                                    18
questions

what
what to evaluate?

• Objects
  – parts or the whole of a DL.
  – system and/or data.
• Operations
  – the purposive use of specific parts of DLs by human or machine
    agents.
  – usage of system and/or usage of data.




                                                                    20
objects
• Interfaces
  – retrieval interfaces, aesthetics, information architecture
• Functionalities
  – search, annotations, storage, sharing, recommendations,
     organization
• Technologies
  – algorithms, items provision, protocols compliance, preservation
     modules, hardware
• Collections
  – size, type of objects, growth rate


                                                                      21
operations
• Retrieving information
  – precision/recall, user performance (see the sketch after this slide)
• Integrating information
• Usage of information objects
  – patterns, preferences, types of interaction
• Collaborating
  – sharing, recommending, annotating information objects
• Harvesting
• Crawling/indexing
• Preservation procedures


                                                            22
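A minimal sketch (not part of the original tutorial material) of how precision and recall can be computed for a single query; the retrieved and relevant sets below are hypothetical:

```python
# Hypothetical result and relevance sets for one query in a DL retrieval test.
retrieved = {"doc1", "doc3", "doc5", "doc8"}   # items the system returned
relevant = {"doc1", "doc2", "doc5"}            # items judged relevant

true_positives = len(retrieved & relevant)
precision = true_positives / len(retrieved) if retrieved else 0.0
recall = true_positives / len(relevant) if relevant else 0.0

print(f"precision = {precision:.2f}, recall = {recall:.2f}")
# -> precision = 0.50, recall = 0.67
```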
questions

who
agents of evaluation
• Who evaluates our digital library?
• Our digital library is evaluated by our funders, our users, our peers,
  but most important by us.
   – we plan, we collect, we analyze, we report and (‘unfortunately’)
     we redesign.




                                                                           24
we against the universe
• We evaluate and are evaluated against someone (or something) else.
  – this can be a standard, a best practice, a protocol, a verification
    service, a benchmark
• Oen, we are compared against ourselves
  – we in the past (our previous achievements)
  – we in the future (our future expectations)
              Funders     Peers         Users       Societies -
                                                   Communities




external                          DLx                        DLy
internal

                                                                         25
we and the universe
• We collaborate with third parties.
  – for instance, vendors providing usage data, IT specialists
    supporting our hardware, HCI researchers enhancing our
    interface design, associations conducting comparative surveys,
    librarians specifying metadata schemas, etc.
• We need to think in an interdisciplinary way
  – to be able to contribute to the planning, to check the reliability
    of data and to control the experiments, to ensure the collection of
    comparable data, etc.




                                                                     26
questions

how
methods to evaluate
• Many methods to select and to combine.
• No single method can yield the best results.
• Methods are classified in two main classes:
  – qualitative methods
  – quantitative methods
• More important, though, is to select ‘methodologies’.




                                                     28
methods, methods, methods...
•   interviews                 •   laboratory studies
•   focus groups               •   expert studies
•   surveys                    •   comparison studies
•   traffic/usage analysis       •   observations
•   logs/keystrokes analysis   •   ethnography/field studies




                                                              29
metaphor: the pendulum
• Oen is hard to decide if our research will be qualitative or
  quantitative.
• Sometimes it is easy; predetermined by the context and the scope of
  evaluation.
• uantitative try to verify a phenomenon, usually a recorded
  behavior, while qualitative approaches try to explain it, identifying
  the motives and the perceptions behind it.




                                                                      30
metaphor continues: the pendulum
• However, it is not only about the data. It is about having an
  approach during the collection, the analysis and the interpretation of
  our data.
  – for instance, microscopic log analysis in deep log analysis
     methodology can provide qualitative insights.
• There are inherent limitations, such as resource constraints.
  – for instance, interviews are hard to quantify due to time required
     to record, transcribe and analyze.




                                                                      31
selecting methods
• Various ways to rank these methods:
  – resources (to be able to ‘realize’ each method)
  – expertise (to support the various stages)
  – infrastructure (to perform the various stages)
  – data collection (to represent reality and be free from bias);
      compare census data with sample data.
• Triangulating methods is essential, but not easy.
  – “...using MMR allows researchers to address issues more widely and
     more completely than one method could, which in turn amplifies the
     richness and complexity of the research findings” Fidel [2008].



                                                                    32
criteria
• Criteria are ‘topics’ or perspectives of measurement or judgement.
• Criteria oen come grouped.
  – for instance, the categories of usability, collection quality, service
     quality, system performance efficiency, user opinion in the study
     of Xie [2008]
• Criteria oen have varied semantics between domains.




                                                    Figure taken by Zhang [2007]
                                                                             33
word cloud: criteria




Criteria cloud derived from the studies of Zhang [2010] and Xie [2008]

                                                                         34
metrics
• Metrics are the measurement units that we need to establish a
  distance between (at least) two states:
   – one ideal (target metric)
   – one actual (reality metric)
• An example: LibQUAL’s scale (see the sketch after this slide)




                                                                  35
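A minimal sketch, with made-up numbers, of a metric expressed as the distance between a target (ideal) state and a measured (actual) state, in the spirit of LibQUAL-style gap scores:

```python
# Hypothetical values on a 1-9 rating scale.
target_score = 8.0     # the ideal (target metric), e.g. desired service level
measured_score = 6.5   # the actual (reality metric), e.g. mean perceived level

gap = measured_score - target_score   # negative: we fall short of the target
print(f"gap to target: {gap:+.1f}")   # -> gap to target: -1.5
```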
examples of metrics
[2×2 figure. Vertical axis: Qualitative (insights) vs. Quantitative (validation). Horizontal axis: Goals & Attitudes (what people say) vs. Behaviors (what people do).
– Qualitative / what people say: unstructured measurement of opinions and perceptions.
– Qualitative / what people do: observed aspects of performance, e.g. selections, patterns of interactions, etc.
– Quantitative / what people say: scaled measurement of opinions and perceptions.
– Quantitative / what people do: recorded aspects of activity, e.g. time or errors, via logs or other methods.]

                                                                                        36
the loneliness of the long distance runner




object → aim → criteria → metric → instrument

Example taken from Saracevic [2004]
                                                                37
answers

plan
evaluation planning from above
• Questions
  – like those we have asked
• Buy-In
  – what is invested by each agent
• Budget
  – how much is available
• Methods
  – like those already mentioned

Figure taken from Giersch and Khoo [2009]
                                          39
practical questions
• Having stated our research statements and selected our methods, we
  need:
  – to make an ‘inventory’ of our resources
  – to define what personnel we have, and how skilled and competent they are
  – to define what instruments and tools we have
  – to define how much time is available
• All depend on the funding, but some also depend on the timing of the
  evaluation.




                                                                   40
planning an evaluation
• Upper level planning tools
  – Logic models: useful to have an overview of the whole process
     and how it is linked with the DL development project.
  – Zachman’s framework: useful to answer practical questions when
     setting up our evaluation process.




                                                                    41
logic models
• A graphical representation of the processes inside a project that
  reflects the links between investments and achievements (a toy sketch
  follows below).
  – inputs: project funding and resources
  – activities: the productive phases of the project
  – outputs: short-term products/achievements
  – outcomes: long-term products/achievements




                                                                      42
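A toy sketch of a logic model captured as a plain data structure; the entries are illustrative, not taken from an actual project:

```python
# A logic model as a simple mapping from stages to items,
# linking investments (inputs, activities) to achievements (outputs, outcomes).
logic_model = {
    "inputs": ["project funding", "staff", "server infrastructure"],
    "activities": ["digitization", "metadata creation", "interface development"],
    "outputs": ["digitized collection online", "public beta release"],
    "outcomes": ["increased use by the target community", "improved research support"],
}

for stage, items in logic_model.items():
    print(f"{stage}: {', '.join(items)}")
```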
logic models: an instance




Figure taken from Giersch & Khoo [2009]

                                        43
Zachman Framework
• The Zachman Framework is a framework for enterprise architecture.
• It depicts a high-level, formal and structured view of an organization:
  a taxonomy that organizes its structural elements under the lens of
  different views.
• It classifies and organizes, in a two-dimensional space, all the concepts
  needed to express the different planning views in a homogeneous way.
  – according to participants (alternative perspectives)
  – according to abstractions (questions)




                                                                    44
Zachman’s matrix
Columns: What (Data) | How (Process) | Where (Location) | Who (Work) | When (Timing) | Why (Motivation)

Scope [Planner]: Core Business Concepts | Major Business Transformations | Business Locations | Principal Actors | Business Events | Mission & Goals
Business Model [Owner]: Fact Model | Tasks | Business Connectivity Map | Workflow Models | Business Milestones | Policy Charter
System Model [Evaluator]: Data Model | Behavior Allocation | Platform & Communications Map | BRScripts | State Transition Diagrams | Rule Book
Technology Model [Evaluator]: Relational Database Design | Program Specifications | Technical Platform & Communications Design | Procedure & Interface Specifications | Work Queue & Scheduling Designs | Rule Specifications
Detail Representation [Evaluator]: Database Schema | Source Code | Network | Procedures & Interfaces | Work Queues & Schedules | Rule Base
Functioning Business [Evaluator]: Operational Database | Operational Object Code | Operational Network | Operational Procedures & Interfaces | Operational Work Queues | Operational Rules

                                                                                                                  45
guiding our steps
• Assuming that we decided what is best for us, can we find out how
  far we are from it?
• Is there a roadmap?




                                                                     46
a roadmap




Figure taken from Nicholson [2004]
                                   47
when do you evaluate?


[Timeline: Project start → Prototype → DL release, followed by use; development precedes the prototype and the release. <user requirements> are gathered at project start. Formative evaluation spans the development phases; summative evaluation spans use. Annotations: methods inventory; from product to process.]
                                                                                            48
how much is... many?
• Funding is crucial and must be prescribed in the proposal.
• Anecdotal evidence speaks of 5-10% of overall budget.
• Budget allocation usually stays inside project gulfs (also anecdotal
  evidence).
• It also depends on the methods: e.g. log analysis is considered a
  low-cost method, and heuristic evaluation techniques are labeled as
  ‘discount’.




                                                                     49
answers

outputs
outcomes to expect
• Usually evaluation in research-based digital libraries is summative
  and has the role of ‘deliverable’.
• What should we expect in return?
  – some positive and some negative results

• Positive
  – a set of meaningful findings to transform into recommendations
     (actions to be taken).
  – dependent on the scope of evaluation.
• Negative
  – a set of non-meaningful findings that cannot be exploited
  – highly inconsistent and scarce
  – non-applicable and biased
                                                                        51
careful distinctions
• Demographics and user behavior data are not evaluation per se.
• They assist our analysis and interpretation of data, thereby describing
  their status, but they are not evaluation in themselves.




                                                                      52
part two

a formal description of the digital
library evaluation domain
by now
• You have started viewing things in a coherent way,
• Familiarizing with the idea of collaborating with other agents to
  perform better evaluation activities and
• Forming the basis to judge potential avenues in your evaluation
  planning.
• But can you do all these things better?
• And how?




                                                                      54
ontologies as a means to an end
• Formal models that help us
  – to understand a knowledge domain and thus the DL evaluation
     field
  – to build knowledge bases to compare evaluation instances
  – to assist evaluation initiatives planning
• Ontologies use primitives such as:
  – classes (representing concepts, entities, etc.)
  – relationships (linking the concepts together)
  – functions (constraining the relationships in particular ways)
  – axioms (stating true facts)
  – instances (reflecting examples of reality); see the sketch after this slide

                                                                    55
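A minimal sketch of how such primitives can be encoded, using the rdflib Python library; it borrows a few class and property names from the ontology presented in the next slides, but it is not the tutorial's actual OWL implementation:

```python
# Encoding classes, one relationship and a few instances with rdflib.
from rdflib import Graph, Namespace
from rdflib.namespace import RDF, RDFS, OWL

DLE = Namespace("http://example.org/dl-eval#")   # hypothetical namespace
g = Graph()
g.bind("dle", DLE)

# classes: concepts of the evaluation domain
for cls in ("Dimension", "Object", "Activity", "Means"):
    g.add((DLE[cls], RDF.type, OWL.Class))

# a relationship (object property) linking two concepts together
g.add((DLE.isFocusingOn, RDF.type, OWL.ObjectProperty))
g.add((DLE.isFocusingOn, RDFS.domain, DLE.Dimension))
g.add((DLE.isFocusingOn, RDFS.range, DLE.Object))

# instances: examples of reality
g.add((DLE.effectiveness, RDF.type, DLE.Dimension))
g.add((DLE.usageOfContent, RDF.type, DLE.Object))
g.add((DLE.effectiveness, DLE.isFocusingOn, DLE.usageOfContent))

print(g.serialize(format="turtle"))
```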
a formal description of DL evaluation
• An ontology to
  – perform useful comparisons
  – assist effective evaluation planning
• Implemented in OWL with Protégé Ontology Editor




                                                    56
the upper levels

[Diagram: the upper levels of the ontology. Classes and their instances: Levels (content level, processing level, engineering level, interface level, individual level, institutional level, social level); Dimensions (effectiveness, performance measurement, service quality, technical excellence, outcomes assessment); Dimensions Type (formative, summative, iterative); Goals (describe, document, design); Research Questions; Subjects; Objects; Characteristics. Relationships shown: Levels isAffecting/isAffectedBy Dimensions; Goals isAimingAt Dimensions; Dimensions hasDimensionsType Dimensions Type; Dimensions isFocusingOn Objects; Objects isOperatedBy/isOperating Subjects; Subjects and Objects isCharacterizing/isCharacterizedBy Characteristics; Research Questions isDecomposedTo …]
                                                                                                                      57
the low levels
[Diagram: the lower levels of the ontology. Classes and their instances: Instruments (devices, scales, software, statistics, narrative items, research artifacts); Activity (record, measure, analyze, compare, interpret, report, recommend); Means (comparison studies, expert studies, laboratory studies, field studies, logging studies, surveys); Means Types (qualitative, quantitative); Findings; Criteria Categories; Criteria (specific aims, standards, toolkits); Metrics (content initiated, system initiated, user initiated); Factors (cost, infrastructure, personnel, time). Relationships shown: Instruments isUsedIn/isUsing Means; Instruments isSupporting/isSupportedBy Activity; Findings isReportedIn/isReporting Activity; Activity hasPerformed/isPerformedIn Means; Means hasMeansType Means Types; Activity hasSelected/isSelectedIn Criteria; Criteria Categories isGrouping/isGrouped Criteria; Criteria isMeasuredBy/isMeasuring Metrics; isSubjectTo and isDependingOn link to Factors.]

58
connections between levels

[Diagram: connections between the upper and lower levels of the ontology. Levels and Dimensions (upper level) connect to Subjects, Objects, Activity, Means, Metrics, Research Questions and Findings (lower level). Relationships shown: Dimensions hasConstituent/isConstituting Activity; Means isAppliedTo Objects; Research Questions and Findings are linked by isAddressing; Metrics hasInitiatedFrom …]

59
use of the ontology [a]
• We use ontology paths to express a process or a requirement
  explicitly (a small runnable sketch follows this slide).

 Activity/analyze – isPerformedIn – Means/logging studies – hasMeansType – Means Types




[Diagram excerpt: Activity (record, measure, analyze, compare, interpret, report, recommend) –isPerformedIn→ Means (comparison studies, expert studies, laboratory studies, field studies, logging studies, surveys) –hasMeansType→ Means Types (qualitative, quantitative)]
                                                                                               60
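A minimal, self-contained sketch of encoding and then walking such a path with rdflib; the namespace and the exact triples are illustrative, not the ontology's actual OWL file:

```python
# Path: Activity/analyze -isPerformedIn-> Means/logging studies -hasMeansType-> Means Type
from rdflib import Graph, Namespace

DLE = Namespace("http://example.org/dl-eval#")   # hypothetical namespace
g = Graph()
g.add((DLE.analyze, DLE.isPerformedIn, DLE.loggingStudies))
g.add((DLE.loggingStudies, DLE.hasMeansType, DLE.quantitative))

# follow the path from the activity to the means type
for means in g.objects(DLE.analyze, DLE.isPerformedIn):
    for means_type in g.objects(means, DLE.hasMeansType):
        print(f"analyze is performed in {means}, which has means type {means_type}")
```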
use of the ontology [b]
  Level/content level – isAffectedBy – Dimensions/effectiveness – isFocusingOn – Objects/usage of content: usage of data – isOperatedBy – Subjects/human agents – isCharacterizedBy – Characteristics/human agents: age, count, discipline, experience

[Diagram excerpt: Levels (content level, processing level, engineering level, interface level, individual level, institutional level, social level) –isAffectedBy– Dimensions (effectiveness, performance measurement, service quality, technical excellence, outcomes assessment) –isFocusingOn→ Objects (usage of content: usage of data, usage of metadata) –isOperatedBy→ Subjects (system agents, human agents) –isCharacterizedBy→ Characteristics (age, count, discipline, experience, profession, …)]

61
hands on session

things to do
exercise
• Based on your experience and/or your evaluation planning needs,
  use the ontology schema to map your own project and describe it.
  – If you do not have such experience, please use the case study
     outlined in Hand Out 5.
  – Use Hand Outs 3a and 3b as an example and Hand Out 2 to fill in
     the fields that you think are important to describe your
     evaluation.
  – Furthermore, report, if applicable, using Hand Out 4:
      • what is missing from your description
      • what is not expressed by the ontology



                                                                 63
ending part

conclusions & discussion
summary
• Evaluation must have a target
  – “...evaluation of a digital library need not involve the whole – it can
     concentrate on given components or functions and their specific
     objectives”. Saracevic, 2009
• Evaluation must have a plan and a roadmap
  – “An evaluation plan is essentially a contract between you ... and the
     other ‘stakeholders’ in the evaluation...”. Reeves et al., 2003
• Evaluation depends on the context
  – “…is a research process that aims to understand the meaning of some
    phenomenon situated in a context and the changes that take place as
    the phenomenon and the context interact”. Marchionini, 2000


                                                                         65
but why is it so difficult to evaluate?
• Saracevic mentions:
  – DLs are complex
  – Evaluation is still premature
  – There is no strong interest
  – There is lack of funding
  – Cultural differences
  – Quite cynical: who wants to be judged?




                                             66
can it be easier for you?
• We hope this tutorial made it easier for you to address these
  challenges:
  – DLs are complex (you know it, but now you also know that DL
     evaluation is equally complex)
  – Evaluation is still premature (you got an idea of the field)
  – There is no strong interest (maybe you are strongly motivated to
     evaluate your DL)
  – There is a lack of funding (ok, there is no funding in this room for
     your initiative)
  – Cultural differences (maybe you are eager to communicate with
     other agents)
  – Who wants to be judged? (hopefully, you)
                                                                    67
addendum

resources
essential reading [a]
•   Bertot, J. C., & McClure, C. R. (2003). Outcomes assessment in the networked
    environment: research questions, issues, considerations, and moving forward.
    Library Trends, 51(4), 590-613.
•   Blandford, A., Adams, A., Attfield, S., Buchanan, G., Gow, J., Makri, S., et al.
    (2008). The PRET A Rapporter framework: evaluating digital libraries from the
    perspective of information work. Information Processing & Management, 44(1),
    4-21.
•   Fuhr, N., Tsakonas, G., Aalberg, T., Agosti, M., Hansen, P., Kapidakis, S., et al.
    (2007). Evaluation of digital libraries. International Journal on Digital Libraries, 8
    (1), 21-38.
•   Gonçalves, M. A., Moreira, B. L., Fox, E. A., & Watson, L. T. (2007). “What is a
    good digital library?”: a quality model for digital libraries. Information Processing
    & Management, 43(5), 1416-1437.



                                                                                        69
essential reading [b]
•   Hill, L. L., Carver, L., Larsgaard, M., Dolin, R., Smith, T. R., Frew, J., et al.
    (2000). Alexandria Digital Library: user evaluation studies and system design.
    Journal of the American Society for Information Science and Technology, 51(3),
    246-259.
•   Marchionini, G. (2000). Evaluating digital libraries: a longitudinal and
    multifaceted view. Library Trends, 49(2), 4-33.
•   Nicholas, D., Huntington, P., & Watkinson, A. (2005). Scholarly journal usage:
    the results of deep log analysis. Journal of Documentation, 61(2), 248-289.
•   Powell, R. R. (2006). Evaluation research: an overview. Library Trends, 55(1),
    102-120.
•   Reeves, T. C., Apedoe, X., & Woo, Y. H. (2003). Evaluating digital libraries: a
    user-iendly guide. University Corporation for Atmospheric Research.
•   Saracevic, T. (2000). Digital library evaluation: towards an evolution of concepts.
    Library Trends, 49(3), 350-369.

                                                                                      70
essential reading [c]
•   Wilson, M. L., Schraefel, M., & White, R. W. (2009). Evaluating advanced search
    interfaces using established information-seeking models. Journal of the American
    Society for Information Science and Technology, 60(7), 1407-1422.
•   Xie, H. I. (2008). Users' evaluation of digital libraries (DLs): their uses, their
    criteria, and their assessment. Information Processing & Management, 44(3), 27.
•   Zhang, Y. (2010). Developing a holistic model for digital library evaluation.
    Journal of the American Society for Information Science, 61(1), 88-110.
•   Tsakonas, G., & Papatheodorou, C. (Eds.). (2009). Evaluation of digital libraries:
    an insight into useful applications and methods. Oxford: Chandos Publishing.




                                                                                     71
tutorial material
• Slides, literature and rest of the training material are available at:
   – http://dlib.ionio.gr/~gtsak/ecdl2010tutorial,
   – http://bit.ly/9iUScR,
     link to the papers’ public collection in Mendeley




                                                                           72

More Related Content

Similar to Exploring Perspectives on Digital Library Evaluation

Best practices in library services
Best practices in library servicesBest practices in library services
Best practices in library servicesFe Angela Verzosa
 
Gcsv2011 using career portfolios-anna graf williams and emily sellers
Gcsv2011 using career portfolios-anna graf williams and emily sellersGcsv2011 using career portfolios-anna graf williams and emily sellers
Gcsv2011 using career portfolios-anna graf williams and emily sellersServe Indiana
 
Implimenting and Mitigating Change with all of this Newfangled Technology
Implimenting and Mitigating Change with all of this Newfangled TechnologyImplimenting and Mitigating Change with all of this Newfangled Technology
Implimenting and Mitigating Change with all of this Newfangled TechnologyIndiana Online Users Group
 
Best Practices in Recommender System Challenges
Best Practices in Recommender System ChallengesBest Practices in Recommender System Challenges
Best Practices in Recommender System ChallengesAlan Said
 
Modern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyModern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyKris Jack
 
Crushing, Blending, and Stretching Data
Crushing, Blending, and Stretching DataCrushing, Blending, and Stretching Data
Crushing, Blending, and Stretching DataRay Schwartz
 
Designing User-Centered Discovery-and-Access Services for Enhanced Virtual Us...
Designing User-Centered Discovery-and-Access Services for Enhanced Virtual Us...Designing User-Centered Discovery-and-Access Services for Enhanced Virtual Us...
Designing User-Centered Discovery-and-Access Services for Enhanced Virtual Us...NASIG
 
Session1 methods research_question
Session1 methods research_questionSession1 methods research_question
Session1 methods research_questionmilolostinspace
 
Improving Library Resource Discovery
Improving Library Resource DiscoveryImproving Library Resource Discovery
Improving Library Resource DiscoveryDanya Leebaw
 
Crushing, Blending, and Stretching Data
Crushing, Blending, and Stretching DataCrushing, Blending, and Stretching Data
Crushing, Blending, and Stretching DataRay Schwartz
 
Katherine Kott Slides for DLF PM Group 2011
Katherine Kott Slides for DLF PM Group 2011Katherine Kott Slides for DLF PM Group 2011
Katherine Kott Slides for DLF PM Group 2011DLFCLIR
 
Web-Based Self- and Peer-Assessment of Teachers’ Educational Technology Compe...
Web-Based Self- and Peer-Assessment of Teachers’ Educational Technology Compe...Web-Based Self- and Peer-Assessment of Teachers’ Educational Technology Compe...
Web-Based Self- and Peer-Assessment of Teachers’ Educational Technology Compe...Hans Põldoja
 
Lightweight Documentation
Lightweight DocumentationLightweight Documentation
Lightweight DocumentationStephen Ritchie
 
Quality in qualitative research the role of the software’s in quality assur...
Quality in qualitative research   the role of the software’s in quality assur...Quality in qualitative research   the role of the software’s in quality assur...
Quality in qualitative research the role of the software’s in quality assur...Merlien Institute
 
Assessment presentation
Assessment presentationAssessment presentation
Assessment presentationjsutclif
 
Crushing, Blending, and Stretching Transactional Data
Crushing, Blending, and Stretching Transactional DataCrushing, Blending, and Stretching Transactional Data
Crushing, Blending, and Stretching Transactional DataRay Schwartz
 
COSC 426 Lect. 7: Evaluating AR Applications
COSC 426 Lect. 7: Evaluating AR ApplicationsCOSC 426 Lect. 7: Evaluating AR Applications
COSC 426 Lect. 7: Evaluating AR ApplicationsMark Billinghurst
 
Researching Researchers - Designing the User Experience at ProQuest
Researching Researchers - Designing the User Experience at ProQuestResearching Researchers - Designing the User Experience at ProQuest
Researching Researchers - Designing the User Experience at ProQuestMathias Burton
 

Similar to Exploring Perspectives on Digital Library Evaluation (20)

Best practices in library services
Best practices in library servicesBest practices in library services
Best practices in library services
 
Gcsv2011 using career portfolios-anna graf williams and emily sellers
Gcsv2011 using career portfolios-anna graf williams and emily sellersGcsv2011 using career portfolios-anna graf williams and emily sellers
Gcsv2011 using career portfolios-anna graf williams and emily sellers
 
Implimenting and Mitigating Change with all of this Newfangled Technology
Implimenting and Mitigating Change with all of this Newfangled TechnologyImplimenting and Mitigating Change with all of this Newfangled Technology
Implimenting and Mitigating Change with all of this Newfangled Technology
 
Best Practices in Recommender System Challenges
Best Practices in Recommender System ChallengesBest Practices in Recommender System Challenges
Best Practices in Recommender System Challenges
 
Dlf 2012
Dlf 2012Dlf 2012
Dlf 2012
 
Modern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyModern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in Mendeley
 
Crushing, Blending, and Stretching Data
Crushing, Blending, and Stretching DataCrushing, Blending, and Stretching Data
Crushing, Blending, and Stretching Data
 
Designing User-Centered Discovery-and-Access Services for Enhanced Virtual Us...
Designing User-Centered Discovery-and-Access Services for Enhanced Virtual Us...Designing User-Centered Discovery-and-Access Services for Enhanced Virtual Us...
Designing User-Centered Discovery-and-Access Services for Enhanced Virtual Us...
 
Session1 methods research_question
Session1 methods research_questionSession1 methods research_question
Session1 methods research_question
 
Improving Library Resource Discovery
Improving Library Resource DiscoveryImproving Library Resource Discovery
Improving Library Resource Discovery
 
Crushing, Blending, and Stretching Data
Crushing, Blending, and Stretching DataCrushing, Blending, and Stretching Data
Crushing, Blending, and Stretching Data
 
Sciences Po presentation eng
Sciences Po presentation engSciences Po presentation eng
Sciences Po presentation eng
 
Katherine Kott Slides for DLF PM Group 2011
Katherine Kott Slides for DLF PM Group 2011Katherine Kott Slides for DLF PM Group 2011
Katherine Kott Slides for DLF PM Group 2011
 
Web-Based Self- and Peer-Assessment of Teachers’ Educational Technology Compe...
Web-Based Self- and Peer-Assessment of Teachers’ Educational Technology Compe...Web-Based Self- and Peer-Assessment of Teachers’ Educational Technology Compe...
Web-Based Self- and Peer-Assessment of Teachers’ Educational Technology Compe...
 
Lightweight Documentation
Lightweight DocumentationLightweight Documentation
Lightweight Documentation
 
Quality in qualitative research the role of the software’s in quality assur...
Quality in qualitative research   the role of the software’s in quality assur...Quality in qualitative research   the role of the software’s in quality assur...
Quality in qualitative research the role of the software’s in quality assur...
 
Assessment presentation
Assessment presentationAssessment presentation
Assessment presentation
 
Crushing, Blending, and Stretching Transactional Data
Crushing, Blending, and Stretching Transactional DataCrushing, Blending, and Stretching Transactional Data
Crushing, Blending, and Stretching Transactional Data
 
COSC 426 Lect. 7: Evaluating AR Applications
COSC 426 Lect. 7: Evaluating AR ApplicationsCOSC 426 Lect. 7: Evaluating AR Applications
COSC 426 Lect. 7: Evaluating AR Applications
 
Researching Researchers - Designing the User Experience at ProQuest
Researching Researchers - Designing the User Experience at ProQuestResearching Researchers - Designing the User Experience at ProQuest
Researching Researchers - Designing the User Experience at ProQuest
 

More from Giannis Tsakonas

Αρχειακά Μεταδεδομένα: Πρότυπα και Διαχείριση στον Παγκόσμιο Ιστό
Αρχειακά Μεταδεδομένα: Πρότυπα και Διαχείριση στον Παγκόσμιο ΙστόΑρχειακά Μεταδεδομένα: Πρότυπα και Διαχείριση στον Παγκόσμιο Ιστό
Αρχειακά Μεταδεδομένα: Πρότυπα και Διαχείριση στον Παγκόσμιο ΙστόGiannis Tsakonas
 
The “Nomenclature of Multidimensionality” in the Digital Libraries Evaluation...
The “Nomenclature of Multidimensionality” in the Digital Libraries Evaluation...The “Nomenclature of Multidimensionality” in the Digital Libraries Evaluation...
The “Nomenclature of Multidimensionality” in the Digital Libraries Evaluation...Giannis Tsakonas
 
Increasing traceability of physical library items through Koha: the case of S...
Increasing traceability of physical library items through Koha: the case of S...Increasing traceability of physical library items through Koha: the case of S...
Increasing traceability of physical library items through Koha: the case of S...Giannis Tsakonas
 
Ακαδημαϊκές Βιβλιοθήκες στην Πάτρα: Προηγμένες υπηρεσίες, δικτύωση & προοπτικές
Ακαδημαϊκές Βιβλιοθήκες στην Πάτρα: Προηγμένες υπηρεσίες, δικτύωση & προοπτικέςΑκαδημαϊκές Βιβλιοθήκες στην Πάτρα: Προηγμένες υπηρεσίες, δικτύωση & προοπτικές
Ακαδημαϊκές Βιβλιοθήκες στην Πάτρα: Προηγμένες υπηρεσίες, δικτύωση & προοπτικέςGiannis Tsakonas
 
We were group no 2: notes for the MLAS2015 workshop
We were group no 2: notes for the MLAS2015 workshopWe were group no 2: notes for the MLAS2015 workshop
We were group no 2: notes for the MLAS2015 workshopGiannis Tsakonas
 
Βιβλιοθήκες & Πολιτισμός: τα προφανή και τα ευνόητα
Βιβλιοθήκες & Πολιτισμός: τα προφανή και τα ευνόηταΒιβλιοθήκες & Πολιτισμός: τα προφανή και τα ευνόητα
Βιβλιοθήκες & Πολιτισμός: τα προφανή και τα ευνόηταGiannis Tsakonas
 
{Tech}changes: the technological state of Greek Libraries.
{Tech}changes: the technological state of Greek Libraries.{Tech}changes: the technological state of Greek Libraries.
{Tech}changes: the technological state of Greek Libraries.Giannis Tsakonas
 
Affective relationships between users & libraries in times of economic stress
Affective relationships between users & libraries in times of economic stressAffective relationships between users & libraries in times of economic stress
Affective relationships between users & libraries in times of economic stressGiannis Tsakonas
 
Charting the Digital Library Evaluation Domain with a Semantically Enhanced M...
Charting the Digital Library Evaluation Domain with a Semantically Enhanced M...Charting the Digital Library Evaluation Domain with a Semantically Enhanced M...
Charting the Digital Library Evaluation Domain with a Semantically Enhanced M...Giannis Tsakonas
 
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - Σεμινάριο Κύπρου
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - Σεμινάριο ΚύπρουΔεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - Σεμινάριο Κύπρου
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - Σεμινάριο ΚύπρουGiannis Tsakonas
 
FRBR και Linked Data - Σεμινάριο Αθήνας
FRBR και Linked Data -  Σεμινάριο ΑθήναςFRBR και Linked Data -  Σεμινάριο Αθήνας
FRBR και Linked Data - Σεμινάριο ΑθήναςGiannis Tsakonas
 
Automatic Medical Document Generation via Spatial SNOMED Elements in Hysteros...
Automatic Medical Document Generation via Spatial SNOMED Elements in Hysteros...Automatic Medical Document Generation via Spatial SNOMED Elements in Hysteros...
Automatic Medical Document Generation via Spatial SNOMED Elements in Hysteros...Giannis Tsakonas
 
Policies for geospatial collections: a research in US and Canadian academic l...
Policies for geospatial collections: a research in US and Canadian academic l...Policies for geospatial collections: a research in US and Canadian academic l...
Policies for geospatial collections: a research in US and Canadian academic l...Giannis Tsakonas
 
Developing a Metadata Model for Historic Buildings: Describing and Linking Ar...
Developing a Metadata Model for Historic Buildings: Describing and Linking Ar...Developing a Metadata Model for Historic Buildings: Describing and Linking Ar...
Developing a Metadata Model for Historic Buildings: Describing and Linking Ar...Giannis Tsakonas
 
Query Expansion and Context: Thoughts on Language, Meaning and Knowledge Orga...
Query Expansion and Context: Thoughts on Language, Meaning and Knowledge Orga...Query Expansion and Context: Thoughts on Language, Meaning and Knowledge Orga...
Query Expansion and Context: Thoughts on Language, Meaning and Knowledge Orga...Giannis Tsakonas
 
Path-based MXML Storage and Querying
Path-based MXML Storage and QueryingPath-based MXML Storage and Querying
Path-based MXML Storage and QueryingGiannis Tsakonas
 
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - FRBR και Linked Data
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - FRBR και Linked DataΔεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - FRBR και Linked Data
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - FRBR και Linked DataGiannis Tsakonas
 
Open Bibliographic Data and E-LIS
Open Bibliographic Data and E-LISOpen Bibliographic Data and E-LIS
Open Bibliographic Data and E-LISGiannis Tsakonas
 
Dileo Presentation (in English)
Dileo Presentation (in English)Dileo Presentation (in English)
Dileo Presentation (in English)Giannis Tsakonas
 
Evaluation Insights to Key Processes of Digital Repositories
Evaluation Insights to Key Processes of Digital RepositoriesEvaluation Insights to Key Processes of Digital Repositories
Evaluation Insights to Key Processes of Digital RepositoriesGiannis Tsakonas
 

More from Giannis Tsakonas (20)

Αρχειακά Μεταδεδομένα: Πρότυπα και Διαχείριση στον Παγκόσμιο Ιστό
Αρχειακά Μεταδεδομένα: Πρότυπα και Διαχείριση στον Παγκόσμιο ΙστόΑρχειακά Μεταδεδομένα: Πρότυπα και Διαχείριση στον Παγκόσμιο Ιστό
Αρχειακά Μεταδεδομένα: Πρότυπα και Διαχείριση στον Παγκόσμιο Ιστό
 
The “Nomenclature of Multidimensionality” in the Digital Libraries Evaluation...
The “Nomenclature of Multidimensionality” in the Digital Libraries Evaluation...The “Nomenclature of Multidimensionality” in the Digital Libraries Evaluation...
The “Nomenclature of Multidimensionality” in the Digital Libraries Evaluation...
 
Increasing traceability of physical library items through Koha: the case of S...
Increasing traceability of physical library items through Koha: the case of S...Increasing traceability of physical library items through Koha: the case of S...
Increasing traceability of physical library items through Koha: the case of S...
 
Ακαδημαϊκές Βιβλιοθήκες στην Πάτρα: Προηγμένες υπηρεσίες, δικτύωση & προοπτικές
Ακαδημαϊκές Βιβλιοθήκες στην Πάτρα: Προηγμένες υπηρεσίες, δικτύωση & προοπτικέςΑκαδημαϊκές Βιβλιοθήκες στην Πάτρα: Προηγμένες υπηρεσίες, δικτύωση & προοπτικές
Ακαδημαϊκές Βιβλιοθήκες στην Πάτρα: Προηγμένες υπηρεσίες, δικτύωση & προοπτικές
 
We were group no 2: notes for the MLAS2015 workshop
We were group no 2: notes for the MLAS2015 workshopWe were group no 2: notes for the MLAS2015 workshop
We were group no 2: notes for the MLAS2015 workshop
 
Βιβλιοθήκες & Πολιτισμός: τα προφανή και τα ευνόητα
Βιβλιοθήκες & Πολιτισμός: τα προφανή και τα ευνόηταΒιβλιοθήκες & Πολιτισμός: τα προφανή και τα ευνόητα
Βιβλιοθήκες & Πολιτισμός: τα προφανή και τα ευνόητα
 
{Tech}changes: the technological state of Greek Libraries.
{Tech}changes: the technological state of Greek Libraries.{Tech}changes: the technological state of Greek Libraries.
{Tech}changes: the technological state of Greek Libraries.
 
Affective relationships between users & libraries in times of economic stress
Affective relationships between users & libraries in times of economic stressAffective relationships between users & libraries in times of economic stress
Affective relationships between users & libraries in times of economic stress
 
Charting the Digital Library Evaluation Domain with a Semantically Enhanced M...
Charting the Digital Library Evaluation Domain with a Semantically Enhanced M...Charting the Digital Library Evaluation Domain with a Semantically Enhanced M...
Charting the Digital Library Evaluation Domain with a Semantically Enhanced M...
 
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - Σεμινάριο Κύπρου
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - Σεμινάριο ΚύπρουΔεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - Σεμινάριο Κύπρου
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - Σεμινάριο Κύπρου
 
FRBR και Linked Data - Σεμινάριο Αθήνας
FRBR και Linked Data -  Σεμινάριο ΑθήναςFRBR και Linked Data -  Σεμινάριο Αθήνας
FRBR και Linked Data - Σεμινάριο Αθήνας
 
Automatic Medical Document Generation via Spatial SNOMED Elements in Hysteros...
Automatic Medical Document Generation via Spatial SNOMED Elements in Hysteros...Automatic Medical Document Generation via Spatial SNOMED Elements in Hysteros...
Automatic Medical Document Generation via Spatial SNOMED Elements in Hysteros...
 
Policies for geospatial collections: a research in US and Canadian academic l...
Policies for geospatial collections: a research in US and Canadian academic l...Policies for geospatial collections: a research in US and Canadian academic l...
Policies for geospatial collections: a research in US and Canadian academic l...
 
Developing a Metadata Model for Historic Buildings: Describing and Linking Ar...
Developing a Metadata Model for Historic Buildings: Describing and Linking Ar...Developing a Metadata Model for Historic Buildings: Describing and Linking Ar...
Developing a Metadata Model for Historic Buildings: Describing and Linking Ar...
 
Query Expansion and Context: Thoughts on Language, Meaning and Knowledge Orga...
Query Expansion and Context: Thoughts on Language, Meaning and Knowledge Orga...Query Expansion and Context: Thoughts on Language, Meaning and Knowledge Orga...
Query Expansion and Context: Thoughts on Language, Meaning and Knowledge Orga...
 
Path-based MXML Storage and Querying
Path-based MXML Storage and QueryingPath-based MXML Storage and Querying
Path-based MXML Storage and Querying
 
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - FRBR και Linked Data
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - FRBR και Linked DataΔεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - FRBR και Linked Data
Δεδομένα Βιβλιοθηκών στο μελλοντικό ψηφιακό περιβάλλον - FRBR και Linked Data
 
Open Bibliographic Data and E-LIS
Open Bibliographic Data and E-LISOpen Bibliographic Data and E-LIS
Open Bibliographic Data and E-LIS
 
Dileo Presentation (in English)
Dileo Presentation (in English)Dileo Presentation (in English)
Dileo Presentation (in English)
 
Evaluation Insights to Key Processes of Digital Repositories
Evaluation Insights to Key Processes of Digital RepositoriesEvaluation Insights to Key Processes of Digital Repositories
Evaluation Insights to Key Processes of Digital Repositories
 

Exploring Perspectives on Digital Library Evaluation

  • 9. how to start evaluating? • From objects? – digital libraries are complex systems – depending on the application field their synthesis increases • ...or from processes? – evaluation processes vary because of the many agents around them – many disciplines, each one having many backgrounds • This dualism will accompany us all the way. 9
  • 10. metaphor: a usual Rubik’s cube • Roles: content managers (librarians, archivists, etc.), computer scientists, managers, etc. • Research fields: information science, social sciences, human-computer interaction, information retrieval, visualization, data mining, user modeling, knowledge management, cognitive psychology, information architecture and interaction design, etc. 10
  • 11. metaphor continues: several unusual cubes • Different stakeholders have different views – two different agents see different Rubik’s cubes to solve (the developer’s cube vs. the funder’s cube) 11
  • 12. evaluation framework • Why: the first, but also the most difficult, question. • What: the second question, with a rather obvious answer. • Who: the third question, with a more obvious answer. • How: the fourth question; a quite puzzling one. 12
  • 14. reasons to evaluate [a] • We evaluate in order to improve our systems; a generic, beneficial aim. • We evaluate in order to increase our knowledge capital on the value of our system: – to describe our current state (recording our actions) – to justify our actions (auditing our actions) – to redesign our system (revising our actions) • We evaluate to make others understand the value of our system: – to describe how others use our system – to justify why they are using it this way (or why they don’t use it) – to redesign the system and its services so that they are better and more widely used 14
  • 15. reasons to evaluate [b] • But, is it clear to us why we are evaluating? – the answer shows commitment – is it for internal reasons (posed by the organization, e.g. monitoring), or external reasons (posed by the environment of the organization, e.g. accountability)? • Strongly related to the context in which we work. 15
  • 16. scope of evaluation • Input-output evaluation – about effective production • Performance measurement – about efficient operation • Service quality – about meeting the goals of the served audience • Outcomes assessment – about meeting the goals of the hosting environment • Technical excellence – about building better systems 16
  • 17. defining the scope • A figure maps the levels of a DL (content, processing, engineering, interface, personal, institutional, social) against the scope of evaluation (effectiveness, technical excellence, service quality, performance measurement, outcomes assessment). 17
  • 18. find a reason • Understanding the reasons and the range of our evaluation will help us formulate better research statements. • Therefore, before initiating, we need: – to identify the scope of our research, – to clearly express our research statements, – to imagine what kind of results we will have, – to link our research statements with anticipated findings (either positive, or negative). 18
  • 20. what to evaluate? • Objects – parts or the whole of a DL. – system and/or data. • Operations – the purposive use of specific parts of DLs by human or machine agents. – usage of system and/or usage of data. 20
  • 21. objects • Interfaces – retrieval interfaces, aesthetics, information architecture • Functionalities – search, annotations, storage, sharing, recommendations, organization • Technologies – algorithms, items provision, protocols compliance, preservation modules, hardware • Collections – size, type of objects, growth rate 21
  • 22. operations • Retrieving information – precision/recall, user performance • Integrating information • Usage of information objects – patterns, preferences, types of interaction • Collaborating – sharing, recommending, annotating information objects • Harvesting • Crawling/indexing • Preservation procedures 22
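Where precision and recall serve as metrics for the retrieval operation, the computation is simple set arithmetic. Below is a minimal, illustrative sketch; the item identifiers and relevance judgements are invented and not tied to any particular DL:

```python
# Minimal sketch: precision and recall for a single query, given a set of
# relevance judgements. All identifiers are illustrative.

def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# The system returned five items, three of which the assessors judged relevant.
p, r = precision_recall(["d1", "d2", "d3", "d4", "d5"], ["d2", "d3", "d5", "d9"])
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.60 recall=0.75
```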
  • 24. agents of evaluation • Who evaluates our digital library? • Our digital library is evaluated by our funders, our users, our peers, but most importantly by us. – we plan, we collect, we analyze, we report and (‘unfortunately’) we redesign. 24
  • 25. we against the universe • We evaluate and are evaluated against someone (or something) else. – this can be a standard, a best practice, a protocol, a verification service, a benchmark • Often, we are compared against ourselves – we in the past (our previous achievements) – we in the future (our future expectations) [figure: internal and external agents – funders, peers, users, societies/communities, other DLs] 25
  • 26. we and the universe • We collaborate with third parties. – for instance, vendors providing usage data, IT specialists supporting our hardware, HCI researchers enhancing our interface design, associations conducting comparative surveys, librarians specifying metadata schemas, etc. • We need to think in an interdisciplinary way – to be able to contribute to the planning, to check the reliability of data and to control the experiments, to ensure the collection of comparable data, etc. 26
  • 28. methods to evaluate • Many methods to select and to combine. • No single method can yield the best results. • Methods are classified in two main classes: – qualitative methods – quantitative methods • But it is more important to select ‘methodologies’. 28
  • 29. methods, methods, methods... • interviews • laboratory studies • focus groups • expert studies • surveys • comparison studies • traffic/usage analysis • observations • logs/keystrokes analysis • ethnography/field studies 29
  • 30. metaphor: the pendulum • Often it is hard to decide if our research will be qualitative or quantitative. • Sometimes it is easy; predetermined by the context and the scope of evaluation. • Quantitative approaches try to verify a phenomenon, usually a recorded behavior, while qualitative approaches try to explain it, identifying the motives and the perceptions behind it. 30
  • 31. metaphor continues: the pendulum • However, it is not only about the data. It is about having an approach during the collection, the analysis and the interpretation of our data. – for instance, microscopic log analysis in deep log analysis methodology can provide qualitative insights. • There are inherent limitations, such as resources. – for instance, interviews are hard to quantify due to the time required to record, transcribe and analyze. 31
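As a rough illustration of how log analysis can serve both sides of the pendulum, the sketch below counts actions (a quantitative view) and reconstructs one session's path (a more qualitative view). The log format shown here — timestamp, session id, action, target — is an assumption made for the example; real DL logs will differ:

```python
# Hedged sketch of a very small usage-log analysis; the log format is invented.
from collections import Counter, defaultdict

raw_log = """\
2010-09-06T10:01:02 s01 search "digital preservation"
2010-09-06T10:01:40 s01 view item:1234
2010-09-06T10:02:05 s01 download item:1234
2010-09-06T10:03:11 s02 browse collection:maps
"""

actions = Counter()           # quantitative: how often each action occurs
sessions = defaultdict(list)  # qualitative: the path each session followed

for line in raw_log.splitlines():
    timestamp, session, action, target = line.split(maxsplit=3)
    actions[action] += 1
    sessions[session].append((action, target))

print(actions)          # e.g. Counter({'search': 1, 'view': 1, 'download': 1, 'browse': 1})
print(sessions["s01"])  # the step-by-step path of one session
```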
  • 32. selecting methods • Various ways to rank these methods: – resources (to be able to ‘realize’ each method) – expertise (to support the various stages) – infrastructure (to perform the various stages) – data collection (to represent reality and remain free of bias), see census data or sample data. • Triangulating methods is essential, but not easy. – “...using MMR allows researchers to address issues more widely and more completely than one method could, which in turn amplifies the richness and complexity of the research findings” Fidel [2008]. 32
  • 33. criteria • Criteria are ‘topics’ or perspectives of measurement or judgement. • Criteria often come grouped. – for instance, the categories of usability, collection quality, service quality, system performance efficiency, user opinion in the study of Xie [2008] • Criteria often have varied semantics between domains. Figure taken from Zhang [2007] 33
  • 34. word cloud: criteria Criteria cloud derived from the studies of Zhang [2010] and Xie [2008] 34
  • 35. metrics • Metrics are the measurement units that we need to establish a distance between - at least - two states. – one ideal (target metric) – one actual (reality metric) • An example: LibQUAL+’s scale 35
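A hedged sketch of the target-versus-actual idea, loosely in the spirit of LibQUAL+-style gap scores; the items, scale and ratings below are invented for illustration only:

```python
# Sketch of target-vs-actual gaps; items and ratings are invented.

ratings = {
    # item: (target/desired level, actual/perceived level), on a 1-9 scale
    "ease of navigation": (8, 6),
    "collection coverage": (7, 7),
    "response time": (8, 5),
}

for item, (target, actual) in ratings.items():
    gap = actual - target  # a negative gap means we fall short of the target state
    print(f"{item:22s} target={target} actual={actual} gap={gap:+d}")
```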
  • 36. examples of metrics • A 2×2 map of example metrics: one axis runs from qualitative (insights) to quantitative (validation), the other from goals & attitudes (what people say) to behaviors (what people do). – qualitative × attitudes: unstructured measurement of opinions and perceptions – qualitative × behaviors: observed aspects of performance, e.g. selections, patterns of interactions, etc. – quantitative × attitudes: scaled measurement of opinions and perceptions – quantitative × behaviors: recorded aspects of activity, e.g. time or errors, via logs or other methods 36
  • 37. the loneliness of the long distance runner • object → aim → criteria → metric → instrument. Example taken from Saracevic [2004] 37
  • 39. evaluation planning from above • Questions – like those we have already asked • Buy-in – what is invested by each agent • Budget – how much is available • Methods – like those already mentioned. Figure taken from Giersch and Khoo [2009] 39
  • 40. practical questions • Having stated our research statements and selected our methods, we need: – to make an ‘inventory’ of our resources – to define what personnel we have and how skilled and competent it is – to define what instruments and tools we have – to define how much time is available • All of these depend on the funding, but some also depend on the timing of the evaluation. 40
  • 41. planning an evaluation • Upper-level planning tools – Logic models: useful for an overview of the whole process and of how it is linked with the DL development project. – Zachman’s framework: useful to answer practical questions when setting up our evaluation process. 41
  • 42. logic models • A graphical representation of the processes inside a project that reflects the links between investments and achievements. – inputs: project funding and resources – activities: the productive phases of the project – outputs: short term products/achievements – outcomes: long term products/achievements 42
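A minimal sketch of a logic model as a plain data structure, so the links between investments and achievements can be written down explicitly. The field names mirror the four slots above; the example values are invented:

```python
# Hedged sketch: a logic model as a dataclass; all example values are invented.
from dataclasses import dataclass, field

@dataclass
class LogicModel:
    inputs: list = field(default_factory=list)      # project funding and resources
    activities: list = field(default_factory=list)  # productive phases of the project
    outputs: list = field(default_factory=list)     # short-term products/achievements
    outcomes: list = field(default_factory=list)    # long-term products/achievements

dl_evaluation = LogicModel(
    inputs=["evaluation budget", "two evaluators", "usage logs"],
    activities=["user survey", "deep log analysis"],
    outputs=["usability report", "usage statistics"],
    outcomes=["redesigned search interface", "higher uptake by the community"],
)
print(dl_evaluation.outputs)
```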
  • 43. logic models: an instance Figure taken from Giersch & Khoo [2009] 43
  • 44. Zachman Framework • The Zachman Framework is a framework for enterprise architecture. • It depicts a high-level, formal and structured view of an organization: a taxonomy that organizes an organization’s structural elements under the lens of different views. • It classifies and organizes in a two-dimensional space, in a homogeneous way, all the concepts needed to express the different planning views: – according to participants (alternative perspectives) – according to abstractions (questions) 44
  • 45. Zachman’s matrix • Columns: What (Data), How (Process), Where (Location), Who (Work), When (Timing), Why (Motivation). • Rows: – Scope [Planner]: Core Business Concepts; Major Business Transformations; Business Locations; Principal Business Actors; Business Events; Mission & Goals – Business Model [Owner]: Fact Model; Workflow Models; Business Connectivity Map; Business Tasks; Business Milestones; Policy Charter – System Model [Evaluator]: Data Model; Behavior Allocation; Platform & Communications Map; BRScripts; State Transition Diagrams; Rule Book – Technology Model [Evaluator]: Relational Database Design; Program Specifications; Technical Platform & Communications Design; Procedure & Interface Designs; Work Queue & Scheduling Specifications; Rule Specifications – Detail Representation [Evaluator]: Database Schema; Source Code; Network; Procedures & Interfaces; Work Queues & Schedules; Rule Base – Functioning Business [Operational]: Operational Database; Operational Object Code; Operational Network; Operational Procedures & Interfaces; Operational Work Queues & Schedules; Operational Rules 45
  • 46. guiding our steps • Assuming that we decided what is best for us, can we find out how far we are from it? • Is there a roadmap? 46
  • 47. a roadmap Figure taken from Nicholson [2004] 47
  • 48. when do you evaluate? • A timeline runs from project start, through prototype development, to DL release, further development and use. User requirements are gathered at the start, formative evaluation accompanies development, and summative evaluation accompanies release and use; the methods inventory shifts from product to process. 48
  • 49. how much is... many? • Funding is crucial and must be prescribed in the proposal. • Anecdotal evidence speaks of 5-10% of the overall budget. • Budget allocation usually stays within the project itself (also anecdotal evidence). • Costs depend on the methods: e.g. log analysis is considered a low-cost method, and heuristic evaluation techniques are labeled as ‘discount’. 49
  • 51. outcomes to expect • Usually evaluation in research-based digital libraries is summative and has the role of a ‘deliverable’. • What should we expect in return? – some positive and some negative results • Positive: – a set of meaningful findings to transform into recommendations (actions to be taken) – dependent on the scope of evaluation • Negative: – a set of non-meaningful findings that cannot be exploited – highly inconsistent and scarce – non-applicable and biased 51
  • 52. careful distinctions • Demographics and user-behavior-related data are not evaluation per se. • They assist our analysis and interpretation of the data, describing its status, but they are not in themselves evaluative. 52
  • 53. part two a formal description of the digital library evaluation domain
  • 54. by now • You have started viewing things in a coherent way, • Familiarizing with the idea of collaborating with other agents to perform better evaluation activities and • Forming the basis to judge potential avenues in your evaluation planning. • But can you do all these things better? • And how? 54
  • 55. ontologies as a means to an end • Formal models that help us – to understand a knowledge domain and thus the DL evaluation field – to build knowledge bases to compare evaluation instances – to assist evaluation initiatives planning • Ontologies use primitives such as: – classes (representing concepts, entities, etc.) – relationships (linking the concepts together) – functions (constraining the relationships in particular ways) – axioms (stating true facts) – instances (reflecting examples of reality) 55
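As a rough illustration of these primitives, the sketch below states two classes, one relationship and an instance using rdflib. The namespace and names are invented for the example and are not the tutorial's published OWL ontology:

```python
# Hedged sketch of ontology primitives (classes, a relationship, an instance)
# in rdflib; the namespace and terms are illustrative only.
from rdflib import Graph, Namespace, RDF, RDFS, OWL

EVAL = Namespace("http://example.org/dl-eval#")
g = Graph()
g.bind("eval", EVAL)

# classes (concepts)
g.add((EVAL.Activity, RDF.type, OWL.Class))
g.add((EVAL.Means, RDF.type, OWL.Class))

# a relationship (object property) linking the concepts
g.add((EVAL.isPerformedIn, RDF.type, OWL.ObjectProperty))
g.add((EVAL.isPerformedIn, RDFS.domain, EVAL.Activity))
g.add((EVAL.isPerformedIn, RDFS.range, EVAL.Means))

# an instance (an example of reality)
g.add((EVAL.analyze, RDF.type, EVAL.Activity))
g.add((EVAL.loggingStudies, RDF.type, EVAL.Means))
g.add((EVAL.analyze, EVAL.isPerformedIn, EVAL.loggingStudies))

print(g.serialize(format="turtle"))
```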
  • 56. a formal description of DL evaluation • An ontology to – perform useful comparisons – assist effective evaluation planning • Implemented in OWL with Protégé Ontology Editor 56
  • 57. the upper levels • Classes: Research Questions; Levels (content level, processing level, engineering level, interface level, individual level, institutional level, social level); Dimensions (effectiveness, performance measurement, service quality, technical excellence, outcomes assessment); Subjects; Objects; Characteristics; Goals (describe, document, design); Dimensions Type (formative, summative, iterative). • Relationships: isAffecting/isAffectedBy, isDecomposedTo, isCharacterizing/isCharacterizedBy, isOperating/isOperatedBy, isAimingAt, isFocusingOn, hasDimensionsType. 57
  • 58. the low levels • Classes: Instruments (devices, scales, software, statistics, narrative items, research artifacts); Findings; Activity (record, measure, analyze, compare, interpret, report, recommend); Means (comparison studies, expert studies, laboratory studies, field studies, logging studies, surveys); Means Types (qualitative, quantitative); Criteria; Criteria Categories (specific aims, standards, toolkits); Metrics (content initiated, system initiated, user initiated); Factors (cost, infrastructure, personnel, time). • Relationships: isUsedIn/isUsing, isSupporting/isSupportedBy, isReportedIn/isReporting, hasPerformed/isPerformedIn, hasMeansType, hasSelected/isSelectedIn, isDependingOn, isGrouped/isGrouping, isMeasuredBy/isMeasuring, isSubjectTo. 58
  • 59. connections between levels • The upper-level classes (Levels, Dimensions, Subjects, Objects, Research Questions) connect to the lower-level classes (Activity, Means, Findings, Metrics) through relationships such as isAppliedTo, hasConstituent/isConstituting, isAddressing and hasInitiatedFrom. 59
  • 60. use of the ontology [a] • We use ontology paths to express explicitly a process or a requirement, e.g.: Activities/analyze - isPerformedIn - Means/logging studies - hasMeansType - Means Type. 60
  • 61. use of the ontology [b] • A longer path: Level/content level - isAffectedBy - Dimensions/effectiveness - isFocusingOn - Objects/usage of content: usage of data - isOperatedBy - Subjects/human agents - isCharacterizedBy - Characteristics/human agents: age, count, discipline, experience. 61
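Continuing the illustrative rdflib sketch, a path like the one on slide 60 can be followed with a SPARQL query; the names again mirror the slide labels but are assumptions, not the published ontology:

```python
# Hedged sketch: following the path Activity -> isPerformedIn -> Means
# -> hasMeansType -> MeansType with SPARQL; names are illustrative only.
from rdflib import Graph, Namespace

EVAL = Namespace("http://example.org/dl-eval#")
g = Graph()
g.add((EVAL.analyze, EVAL.isPerformedIn, EVAL.loggingStudies))
g.add((EVAL.loggingStudies, EVAL.hasMeansType, EVAL.quantitative))

query = """
PREFIX eval: <http://example.org/dl-eval#>
SELECT ?means ?type WHERE {
  eval:analyze eval:isPerformedIn ?means .
  ?means eval:hasMeansType ?type .
}
"""
for means, mtype in g.query(query):
    print(means, mtype)  # .../loggingStudies .../quantitative
```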
  • 63. exercise • Based on your experience and/or your evaluation planning needs, use the ontology schema to map your own project and describe it. – If you do not have such experience, please use the case study outlined in Handout 5. – Use Handouts 3a and 3b as an example and Handout 2 to fill in the fields that you think are important to describe your evaluation. – Furthermore, report, if applicable, using Handout 4: • what is missing from your description • what is not expressed by the ontology 63
  • 65. summary • Evaluation must have a target – “...evaluation of a digital library need not involve the whole – it can concentrate on given components or functions and their specific objectives”. Saracevic, 2009 • Evaluation must have a plan and a roadmap – “An evaluation plan is essentially a contract between you ... and the other ‘stakeholders’ in the evaluation...”. Reeves et al., 2003 • Evaluation is dependent on the context – “…is a research process that aims to understand the meaning of some phenomenon situated in a context and the changes that take place as the phenomenon and the context interact”. Marchionini, 2000 65
  • 66. but why is it so difficult to evaluate? • Saracevic mentions: – DLs are complex – Evaluation is still premature – There is no strong interest – There is a lack of funding – Cultural differences – Quite cynical: who wants to be judged? 66
  • 67. can it be easier for you? • We hope this tutorial made it easier for you to address these challenges: – DLs are complex (you know it, but now you also know that DL evaluation is equally complex) – Evaluation is still premature (you got an idea of the field) – There is no strong interest (maybe you are strongly motivated to evaluate your DL) – There is a lack of funding (ok, there is no funding in this room for your initiative) – Cultural differences (maybe you are eager to communicate with other agents) – Who wants to be judged? (hopefully, you) 67
  • 69. essential reading [a] • Bertot, J. C., & McClure, C. R. (2003). Outcomes assessment in the networked environment: research questions, issues, considerations, and moving forward. Library Trends, 51(4), 590-613. • Blandford, A., Adams, A., Attfield, S., Buchanan, G., Gow, J., Makri, S., et al. (2008). The PRET A Rapporter framework: evaluating digital libraries from the perspective of information work. Information Processing & Management, 44(1), 4-21. • Fuhr, N., Tsakonas, G., Aalberg, T., Agosti, M., Hansen, P., Kapidakis, S., et al. (2007). Evaluation of digital libraries. International Journal on Digital Libraries, 8 (1), 21-38. • Gonçalves, M. A., Moreira, B. L., Fox, E. A., & Watson, L. T. (2007). “What is a good digital library?”: a quality model for digital libraries. Information Processing & Management, 43(5), 1416-1437. 69
  • 70. essential reading [b] • Hill, L. L., Carver, L., Larsgaard, M., Dolin, R., Smith, T. R., Frew, J., et al. (2000). Alexandria Digital Library: user evaluation studies and system design. Journal of the American Society for Information Science and Technology, 51(3), 246-259. • Marchionini, G. (2000). Evaluating digital libraries: a longitudinal and multifaceted view. Library Trends, 49(2), 4-33. • Nicholas, D., Huntington, P., & Watkinson, A. (2005). Scholarly journal usage: the results of deep log analysis. Journal of Documentation, 61(2), 248-289. • Powell, R. R. (2006). Evaluation research: an overview. Library Trends, 55(1), 102-120. • Reeves, T. C., Apedoe, X., & Woo, Y. H. (2003). Evaluating digital libraries: a user-friendly guide. University Corporation for Atmospheric Research. • Saracevic, T. (2000). Digital library evaluation: towards an evolution of concepts. Library Trends, 49(3), 350-369. 70
  • 71. essential reading [c] • Wilson, M. L., Schraefel, M., & White, R. W. (2009). Evaluating advanced search interfaces using established information-seeking models. Journal of the American Society for Information Science and Technology, 60(7), 1407-1422. • Xie, H. I. (2008). Users' evaluation of digital libraries (DLs): their uses, their criteria, and their assessment. Information Processing & Management, 44(3), 27. • Zhang, Y. (2010). Developing a holistic model for digital library evaluation. Journal of the American Society for Information Science, 61(1), 88-110. • Giannis Tsakonas and Christos Papatheodorou (eds). 2009. Evaluation of digital libraries: an insight to useful applications and methods. Oxford: Chandos Publishing. 71
  • 72. tutorial material • Slides, literature and the rest of the training material are available at: – http://dlib.ionio.gr/~gtsak/ecdl2010tutorial – http://bit.ly/9iUScR, link to the papers’ public collection in Mendeley 72
