Big and Small Web Data


Marieke Guy, Institutional Support Officer,
Digital Curation Centre, UKOLN, University of Bath, UK

Institutional Web Management Workshop 2012




                                                 UKOLN is supported by:



           This work is licensed under a Creative Commons Licence
           Attribution-ShareAlike 2.0


  1
Who Am I?

    • Have worked for UKOLN for over 12 years
    • Worked on a variety of projects:
      Subject portals project, IMPACT, Good APIs,
      JISC Observatory, cultural heritage work,
       digital preservation work, …etc
    • Remote worker, into amplified events
    • Co-chair of IWMW for a number of years

    • Now working for the Digital Curation Centre
    • Institutional Support Officer helping HEIs with their RDM
    • New to data….



2
The Digital Curation Centre

     • A consortium comprising units from the Universities of Bath
       (UKOLN), Edinburgh (DCC Centre) and Glasgow (HATII)
     • launched 1st March 2004 as a national centre for solving
       challenges in digital curation that could not be tackled by
       any single institution or discipline
     • Funded by JISC with additional HEFCE funding from 2011
       for the provision of support to national cloud services
     • Targeted institutional development
     • http://www.dcc.ac.uk/




3
Assessing Data Use




4
Data Management Tools




5
Advocacy and Training


                      • Informatics: disciplinary
                        metadata schema,
                        standards, formats,
                        identifiers, ontologies
                      • Storage: file-store, cloud,
                        data centres, funder policy
                      • Access: embargoes, FOI
                      • Policy: making the case



How to cite data
 6
Who Are You?

     • Are you part of a Web team?
     • Are you part of a MIS team?
     • Are you a researcher?

     • Do you know what data is?

     • Do you use structured data?
     • Do you manage data?




7
Today’s Workshop: A Data Journey!

     • Presentation: What is data anyway?
       Looking at current data trends and what it has to do with
       Web managers

     • Break out groups: What data do you deal with?
       Anything goes from personnel data to key information sets
       and Web stats…

     • Presentation/Show and Tell: Taster of tools that help with
       data (mining, citation, visualization, analytics, etc.)

     • Presentation: Case study - Data @ Southampton

     • Discussion and buzzword bingo
8
Today’s Resources

     • All URLs at:
       http://www.delicious.com/mariekeguy/iwmw12

     • All slides at:
       http://www.slideshare.net/MariekeGuy

     • Also on IWMW12 Web site




9
What is Data Anyway?

Image credits:
http://www.google.co.uk/imgres?q=illumina+bgi&hl=en&client=firefox-a&hs=Jl2&rls=org.mozilla:en-GB:official&biw=1366&bih
http://www.flickr.com/photos/thinkmulejunk/352387473/
http://www.flickr.com/photos/usfsregion5/4546851916//
http://www.flickr.com/photos/charleswelch/3597432481//
http://www.flickr.com/photos/wasp_barcode/4793484478/
10
A Data Definition

      • Datum is / data are (!!!):
         – Facts and statistics collected together for reference or
           analysis
         – Typically the results of measurements
         – Can be qualitative or quantitative
         – Unstructured or structured
         – Raw data, field data, experimental data
         – Data – information – knowledge
         – Data is the lowest level of abstraction
      • Even researchers don’t know what data is….




11
A Data Present

           “Data underpins our economy and
           our society - data about how
           much is being spent and where,
           data about how schools, hospitals
           and police are performing, data
           about where things are and data
           about the weather.”

           Tim Berners Lee, director of W3C.



12
Some Flavours of Data

      •   Big data
      •   DIY data
      •   Consumer data
      •   Activity data
      •   Crowd Sourced data
      •   Linked data/ Web of data / semantic Web
      •   Open data




13
Big Data




      “big data people obviously like alliteration – ‘volume,
      velocity, variety, value’ ‘speed, size, scope’” Andy
      Powell

      “Data that is too big to manage
      using ‘normal’ (database) tools.”
14
Big Data
                “I worry there won’t be enough people around
                to do the analysis”
                Chris Ponting, University of Oxford

                 “Raw image files for a single human genome
                 have been estimated at 28.8 terabytes, which
                 is approaching 30,000 gigabytes”

“The cost of sequencing DNA has taken a
nosedive...and is now dropping by 50% every 5 months”
“The 1000 Genomes Project generated more DNA sequence
data in its first 6 months than GenBank had accumulated in its
entire 21 year existence”

 “A single sequencer can now generate in a day what it took
 10 years to collect for the Human Genome Project”
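
The figures in these quotes can be sanity-checked with a little arithmetic. The sketch below is only a rough illustration: it assumes binary units (1 TB = 1024 GB) and a constant halving period, neither of which is stated in the quotes themselves.

```python
# Rough arithmetic behind two of the quotes above.
# Assumptions (not stated in the quotes): binary units, constant halving rate.

GB_PER_TB = 1024


def terabytes_to_gigabytes(tb: float) -> float:
    """Convert terabytes to gigabytes using binary units."""
    return tb * GB_PER_TB


def cost_fraction_after(months: float, halving_period_months: float = 5.0) -> float:
    """Fraction of the starting cost left after `months`, if cost halves every period."""
    return 0.5 ** (months / halving_period_months)


genome_gb = terabytes_to_gigabytes(28.8)
print(f"28.8 TB is about {genome_gb:,.0f} GB")
print(f"Cost after 12 months: {cost_fraction_after(12):.1%} of the starting cost")
```

Under these assumptions a 50% drop every 5 months leaves a little under a fifth of the original cost after one year.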
15
Big Data

      • 3 Vs: volume, velocity and variety
      • Could include scientific & research data, Web log data,
        RFID data, social data, search data, video, e-commerce
      • Likely to require different tools and practices from what
        ‘we are used to’
      • Technologies include massively parallel processing (MPP)
        databases, datamining grids, distributed file systems,
        distributed databases, cloud computing platforms and
        scalable storage systems
      • Example tools include Hadoop and NoSQL databases such as
        CouchDB
      • Issues regarding storage, speed of access, exponential
        growth, infrastructure, complexity



16
DIY Data
                                                Kyle Machulis




 “DIY” human physiology data
                http://www.technologyreview.com/biomedicine/37784/
17
Consumer Data




18
Consumer Data




              http://www.touchagency.com/free-twitter-infographic/
19
Consumer Data

      • 1 in every 9 people on Earth is on Facebook
      • 30 billion pieces of content are shared on Facebook each
        month
      • Google has been estimated to run over 1 million servers in
        data centers around the world
      • Walmart take data from 1 million customer transactions per
        hour
      • There are over 6 billion photos on Flickr
20
Activity Data


      • “Data about users’ actions and attention”
      • Access, attention and activity
      • Many systems in institutions store data about the actions of
        students, teachers and researchers
      • It’s good business
      • http://www.activitydata.org/
      • JISC Projects:
         – Recommender systems
         – Improving the student experience
         – Resource management
      • JISC Info kit – Business intelligence
      • Student retention
21
22
Crowd Sourced Data




“Crowd-sourced” astronomy



23
Open Data

      • “A piece of content or data is open if anyone is free to use,
        reuse, and redistribute it — subject only, at most, to the
        requirement to attribute and share-alike.” Open Knowledge
        Foundation
      • Why? Use of public money, advancement of science
      • Why not? Commercial and reputation reasons, cost of
        preparing data
      • “You can do all types of stuff with data” TBL
      • But tricky to open access to data (cost, preparation,
        capturing meaning, annotations, context etc.)
      • Data is more valuable when accessible
      • Open data on Web: CKAN, open.gov, infochimps,
        openstreetmap, dbpedia, freebase, numbrary, etc.

24
Linked Data

                                                           • Repurposing and
                                                             aggregating data
                                                             in machine
                                                             readable format
                                                           • Southampton
                                                           • data.open.ac.uk
                                                           • Lucero project
                                                           • Linkeduniversities.org
                                                           • XCRI
                                                           • Lincoln
                                                           • Data.gov.uk

 http://www.flickr.com/photos/reedsturtevant/4288406572/

25
The Key Data Issues

      • Scale and complexity – data deluge – volume, pace,
        infrastructure
      • Sensitivity of data
      • Openness – why aren’t people sharing?
      • Quality of data
      • Reputation – FOI, DPA, computer misuse
      • Management – Storage, incentive, costs & sustainability
      • Preservation – where is your data?
      • Funding for researchers
      • Analysis

      • Doing something useful with it…

26
Sensitive Data

      • DPA 1998
      – Sensitive Personal Data
        “Data regarding an individual’s race or ethnic origin,
        political opinion, religious beliefs, trade union membership,
        physical or mental health, sex life, criminal proceedings or
        convictions…”
      – Personal data
         • Relates to a living individual
         • The individual can be identified from those data and
            other information
         • Includes any expression of opinion about the individual
      • Data that may incriminate a person
      • Data a person prefers not to share with wider society

27
Openness
Choices are made according to context, with
degrees of openness reached according to:
• The kinds of data to be made available
• The stage in the research process
• The groups to whom data will be made
  available
• On what terms and conditions it will be
  provided

Default position of most:
• YES to protocols, software, analysis tools,
  methods and techniques
• NO to making research data content freely
  available to everyone
After all, where is the incentive?              Angus Whyte, RIN/NESTA, 2010
28
Reputation




29
Data Storage Challenges
      •    Scalable
      •    Cost-effective (rent on-demand)
      •    Secure (privacy and IPR)
      •    Robust and resilient
      •    Low entry barrier / ease-of-use
      •    Has data-handling / transfer /
           analysis capability

      What about Cloud services?




The case for cloud computing in genome informatics.
Lincoln D Stein, May 2010

 30
The Web Managers ask:



      “So what has all this got to do
              with me..?”




31
Break Out Groups

     What data do you deal
     with?

     •   Personnel data
     •   Admissions
     •   Timetables
     •   Curriculum
     •   key information sets
     •   Web stats…

     What do you do with this
     data?

     Could you do more? What?

                                http://sidspace.info/
32
Are the Web Managers still asking?



       “So what has all this got to do
               with me..?”




33
A Data Future

           “The ability to take data - to be
           able to understand it, to
           process it, to extract value
           from it, to visualise it, to
           communicate it – that’s going to
           be a hugely important skill in
           the next decades.”

           Hal Varian, Google’s chief economist.



34
Web Teams and Data

      • Data is relevant to those working with the Web at HEIs
        because:

      • Data will affect your IT infrastructure, if it doesn’t already
      • Data is becoming increasingly important for the REF and for
        funding so it will be increasingly important to your HEI
      • It is getting easier to ask for data

      • Structured data could make your life easier
      • The Web itself is becoming more structured
      • Data can show impact

      • It’s all about the data….
35
Web Teams and Data

      • Unstructured data accounts for more than 90% of the digital
        universe (2011 Digital Universe study)
      • Structured data on the rise for some time – deep web,
        annotation schemes, search data
      • In the past web pages have contained information, now is
        the time for them to contain data

      • Some key data areas Web teams need to think about:
         – Structure
         – Metrics
         – Patterns, data mining and analytics
         – Preservation (maybe one for another day?)


36
Web Data: Structure

      • Move toward a Web that’s more fluid, less fixed, and more
        easily accessed on a multitude of devices
      • futurefriend.ly’s Brad Frost: “get your content ready to go
        anywhere because it’s going to go everywhere.”
      • Karen McGrane: calls them “content blobs” – “we can
        embrace meaningful, modular chunks that are ready to
        travel”
      • Google Knowledge Graph: “currently contains more than
        500 million objects, as well as more than 3.5 billion facts
        about and relationships between these different objects”
      • Schema.org: “a collection of schemas, i.e., html tags, that
        webmasters can use to markup their pages in ways
        recognized by major search providers”


37
Preparing for Structure

      • There is a need for structured content in Web sites
      • ‘Future ready content’ - Sara Wachter-Boettcher
         – 1. Get Purposeful – why do users want this content?
         – 2. Get Micro – get granular, break content down
           (schema.org – microdata)
         – 3. Get Meaningful – considering the meaning of
           elements
         – 4. Get Organised – looking at your CMS
         – 5. Get Structured – DITA? XML? HTML5 (microdata)
      • ‘Create once, publish everywhere’ idea – mobile, APIs, etc.
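
As an illustration of the “Get Micro” and “Get Structured” steps, here is a minimal sketch of turning one granular content record into an HTML fragment marked up with schema.org microdata. The function name and the open day record are hypothetical; only the `itemscope`, `itemtype` and `itemprop` attributes and the schema.org `Event` type with its `name`, `startDate` and `description` properties come from the microdata and schema.org vocabularies.

```python
from html import escape


def event_microdata(name: str, start_date: str, description: str) -> str:
    """Render one event as an HTML fragment carrying schema.org microdata."""
    return (
        '<div itemscope itemtype="http://schema.org/Event">\n'
        f'  <span itemprop="name">{escape(name)}</span>\n'
        f'  <time itemprop="startDate" datetime="{escape(start_date)}">'
        f"{escape(start_date)}</time>\n"
        f'  <p itemprop="description">{escape(description)}</p>\n'
        "</div>"
    )


# Hypothetical record, as it might be stored once in a CMS and published anywhere.
fragment = event_microdata("Open Day", "2012-09-15", "Campus tours and subject talks")
print(fragment)
```

Because the record is stored as separate fields rather than as a block of page text, the same data could equally be sent to a mobile template or an API.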




38
Web Data: Metrics

      • Metrics – the new black? Kristen Ratan
      • “The more you know the more you realise you don’t know”
      • What should we be tracking? e.g. Figures opened,
        downloaded, links clicked, time spent on article page,
        supplemental info viewed, authors’ info viewed
      • Look at the pathways that info travels
      • Data can drive tenure and promotion, grants, reputation,
        discovery, prioritization, attention
      • Issues: Missed citation data, data sources that aren’t
        reliable, digital addresses change, usage doesn’t mean
        useful
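
A minimal sketch of what tracking these actions could look like once they are logged. The event log, article identifiers and action names below are all hypothetical; a real site would take them from its analytics tool or server logs.

```python
from collections import Counter

# Hypothetical event log: (article_id, action) pairs as a page might record them.
events = [
    ("article-1", "figure_opened"),
    ("article-1", "download"),
    ("article-2", "link_clicked"),
    ("article-1", "download"),
    ("article-2", "supplement_viewed"),
]


def tally_by_action(log):
    """Count how often each action occurs across all articles."""
    return Counter(action for _, action in log)


def tally_by_article(log):
    """Count how many tracked actions each article received."""
    return Counter(article for article, _ in log)


by_action = tally_by_action(events)
by_article = tally_by_article(events)
print(by_action.most_common())
print(by_article.most_common())
```

The counts say what was used, not whether it was useful, which is the last issue on the slide.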




39
Web Data: Patterns

          “In other words, we no longer
          need to speculate and
          hypothesise; we simply need to let
          machines lead us to the patterns,
          trends, and relationships in social,
          economic, political, and
          environmental relationships.”

          Mark Graham, Big Data blog, the Guardian.



40
Web Data: Analytics

      • Customers expect us to be leveraging their activity to
        benefit their user experience
      • “the process of developing actionable insights through
        problem definition and the application of statistical models
        and analysis against existing and/or simulated future data.”
        Adam Cooper, CETIS
      • Reporting and descriptive methods vs inferential and
        predictive methods
      • Data driven decisions? “human decisions supported by the
        use of good tools to provide us with data-derived insights”
      • Don’t “let the numbers speak for themselves” – data only
        one input to decision process
      • Data specialists and domain specialists work together
      • Need to ask the right questions
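
The contrast between descriptive and predictive methods can be sketched with a toy example. The weekly page view figures are invented, and the straight-line trend is just one simple choice of predictive model, not a recommendation.

```python
from statistics import mean

# Hypothetical weekly page views for one section of a site.
weekly_views = [120, 135, 150, 160, 178]


def describe(values):
    """Descriptive: summarise what has already happened."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}


def fit_line(values):
    """Ordinary least squares fit of values against week index 0..n-1."""
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


def predict_next(values):
    """Predictive: extrapolate the fitted trend one week ahead."""
    slope, intercept = fit_line(values)
    return slope * len(values) + intercept


summary = describe(weekly_views)
forecast = predict_next(weekly_views)
print(summary)
print(f"Forecast for next week: {forecast:.1f}")
```

The forecast is only as good as the assumption that the trend continues, which is why the numbers should inform a human decision rather than make it.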
41
Web Data: Learning Analytics

      • “The measurement, collection, analysis and reporting of
        data about learners and their contexts, for purposes of
        understanding and optimising learning and the
        environments in which it occurs.” 1st International
        Conference on Learning Analytics & Knowledge
      • Open University Learner Analytics Project
         – Looked at withdrawals - e.g. when students stop study
           before completion of a module towards a degree
         – Possible to map what points on paths of study
           withdrawals occur.
      • Other uses: personalisation, recommendation, research
        profiles, marketing and surveys, help desk, CRM, library
      • Looking at disabled students/accessibility – linking learner
        analytics and web metrics
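
A minimal sketch of mapping where withdrawals occur along a path of study. This is not the Open University project’s method; the record layout, student identifiers and week numbers are hypothetical and only illustrate the idea of counting withdrawals by point in a module.

```python
from collections import Counter

# Hypothetical records: (student_id, module, week_of_withdrawal),
# with None meaning the student completed the module.
records = [
    ("s1", "M101", 3),
    ("s2", "M101", None),
    ("s3", "M101", 3),
    ("s4", "M101", 9),
    ("s5", "M101", None),
]


def withdrawals_by_week(rows):
    """Count withdrawals at each week of the module."""
    return Counter(week for _, _, week in rows if week is not None)


def withdrawal_rate(rows):
    """Share of students who withdrew before completing."""
    withdrawn = sum(1 for _, _, week in rows if week is not None)
    return withdrawn / len(rows)


by_week = withdrawals_by_week(records)
rate = withdrawal_rate(records)
print(sorted(by_week.items()))
print(f"Withdrawal rate: {rate:.0%}")
```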
42
Protection of Freedoms Bill

      • The Protection of Freedoms Bill is a UK parliamentary bill
        introduced in February 2011
      • Has completed its readings – now passing through the House of
        Lords
      • 102 - amendments to FOIA - mandatory for public
        authorities to permit re-use of datasets when
        communicating them in response to a FOI request
      • Datasets are collections of information held in electronic
        form i.e. 'raw data' gathered or created in connection with
        the university's functions or 'services'
      • Government’s Innovation and Research Strategy for Growth
        - “a transformation in the accessibility of research and
        data”


43
Tools that Could Help




        http://www.flickr.com/photos/luc/5418037955/
44
Tools: Structure

      • Schema.org
      • Google Rich Snippets testing tool – tests microdata,
        microformats, RDFa
      • List of tools on Semanticweb.org




45
Tools: Metrics & Text Mining

      •   Google Analytics
      •   Elsevier
      •   total-impact
      •   altmetric.com




46
Tools: Analytics

      •   SNAPP: Social Networks Adapting Pedagogical Practice
      •   GLASS (Gradient’s Learning Analytics System)
      •   International Educational Data Mining society
      •   Learning Analytics and Knowledge Conference




47
Data Visualisations

      • Use your IT and your graphics design department
      • Make it interactive
      • Getting Awesome Results from Data
        Visualisation – Rich Kirk
      • Data visualisation strategy
         – Have a purpose
         – Have measurable KPIs vs purpose
         – Plan distribution in advance
         – Resource
         – Ensure visualisation matches purpose
      • Chart chooser (Gene Zelazny's Saying It With Charts)
      • Measurement: pageviews, buzz, links, key word ranking
      • “Tell a story with your data” – Ewan McIntosh at IDCC11
48
Data Visualisation Help

      • Great Web sites
         –   Ewan McIntosh
         –   Information is Beautiful
         –   Pinterest
         –   Guardian data blog
         –   Flowing data
         –   Infosthetics – information aesthetics – where form follows data
      • Great tools
         –   Manyeyes
         –   Chartsbin, icharts, Google chart tools – Google developer
         –   Google Fusion tables
         –   Tableau public
         –   Datamarket
         –   Colour Brewer



49
Visualisations: Google Maps




50
Data Case Study: Southampton

• Not big data but small data
• Got to be useful!!




       Chris Gutteridge - http://blogs.ecs.soton.ac.uk/data/



51
Southampton Data

     • Places: Buildings, Rooms, Campuses, Counties, Disabled Access
     • Organisation Structure
     • Products & Services: Coffee, Sandwiches, Library Services,
       Recycle Points
     • Points of Service: Coffee Shops, Swimming Pools, Libraries,
       Receptions
     • Teaching: Courses, Modules, Statistics, Student Satisfaction
     • Travel: Stations, Bus-Stops, Bus-Routes, Bus Times
     • Resources: EPrints, Videos, Learning Objects
     • People: Contact Information, Experts for the Media
     • Events: Open Days, University History
     • Jargon


52
Southampton Open Data




53
Southampton Uses…

     • Google Docs, Excel spreadsheets, RDF, triples
     • Grinder – github
     • Graphite – php library
     • Graphite (publishing RDF). Required skills:
        – RDF structure
        – RDF/XML
        – XSLT
     • Graphite (consuming RDF). Required skills:
        – RDF structure
        – PHP
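
“RDF structure” here means data expressed as subject, predicate, object triples. The sketch below shows that structure in plain Python; it is not Graphite (which is a PHP library) and not Southampton’s code. The `example.org` identifiers are hypothetical, while the `rdf:type` and `rdfs:label` URIs are the standard W3C ones.

```python
# Each RDF statement is a (subject, predicate, object) triple.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical identifiers under example.org; not real university URIs.
BUILDING = "http://example.org/building/32"
BUILDING_CLASS = "http://example.org/ns/Building"

# Objects are tagged as either a URI or a plain literal.
triples = [
    (BUILDING, RDF_TYPE, ("uri", BUILDING_CLASS)),
    (BUILDING, RDFS_LABEL, ("literal", "Building 32")),
]


def to_ntriples(statements):
    """Publish: serialise the triples as N-Triples, one statement per line."""
    lines = []
    for subject, predicate, (kind, value) in statements:
        if kind == "uri":
            obj = f"<{value}>"
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {obj} .")
    return "\n".join(lines)


def labels(statements):
    """Consume: map each subject to its rdfs:label."""
    return {s: v for s, p, (kind, v) in statements if p == RDFS_LABEL}


print(to_ntriples(triples))
print(labels(triples))
```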




54
Data Case Study: Aberdeen

         “I managed the Web and then
         inherited MIS. These two have now
         converged so that Web is using much
         better, structured data and
         standardising and consolidating
         sources. The MIS brings discipline to
         the Web – much needed if you ask
         me, anarchist though I am...”

         Mike McConnell, Head of Web Services, University of
         Aberdeen.
55
Student Attendance Data

      • Loughborough University’s Pedestal for Progression
      • Roehampton University’s fulCRM
      • Southampton Student Dashboard at the University of
        Southampton: tutees, directory info, whether coursework
        has been handed in, and attendance
      • University of Derby’s SETL (Student Engagement Traffic
        Lighting)
      • The ESCAPES (Enhancing Student Centred Administration for
        Placement ExperienceS) project at the University of
        Nottingham




56
Conclusions

     • At the moment it’s all about the data… (whether you like it or
       not!)
     • Be aware of what is happening with data at your institution –
       data repository, MIS, RIM, CRIS etc. Where do you
       sit in the picture?
     • Structure your Web data – it makes sense
     • You can start with ‘little data’…
     • Think about what strategic questions you want to ask
     • Be grounded – efficiency and effectiveness
     • Start from the user end - think about the uses and output
     • Follow up from the IT end – how can you automate processes?
     • What can you use your data for? Can you show impact/success?
     • How about telling a story with it?

57
Buzzword Bingo
     • cloud computing
     • Big data
     • Linked data
     • data wrangler
     • paradata
     • Data-Driven Decision making
     • data mining
     • data journalism
     • data scientist
     • data tsunami
     • knowledge discovery in data (KDD)
     • clustering
     • predictive analytics
58
What Data Can and Cannot Do

      • From Guardian Datablog, by Jonathan Gray

      •   Data is not a force unto itself.
      •   Data is not a perfect reflection of the world.
      •   Data does not speak for itself.
      •   Data is not power.
      •   Interpreting data is not easy.




59
Thanks!!


       “The data that is valuable to you is
       already passing through your hands”

       Doug Cutting, Chairman, Apache Software Foundation




60

More Related Content

What's hot

SC6 Workshop 1: From your data to data stories - BigDataEurope, SC6 Workshop
SC6 Workshop 1: From your data to data stories - BigDataEurope, SC6 WorkshopSC6 Workshop 1: From your data to data stories - BigDataEurope, SC6 Workshop
SC6 Workshop 1: From your data to data stories - BigDataEurope, SC6 WorkshopBigData_Europe
 
Advancing the National Digital Platform: Survey Findings
Advancing the National Digital Platform: Survey FindingsAdvancing the National Digital Platform: Survey Findings
Advancing the National Digital Platform: Survey FindingsOCLC
 
APLIC 2012: Discovering & Dealing with Data
APLIC 2012: Discovering & Dealing with DataAPLIC 2012: Discovering & Dealing with Data
APLIC 2012: Discovering & Dealing with DataHamilton Public Library
 
WWW2013 Tutorial: Linked Data & Education
WWW2013 Tutorial: Linked Data & EducationWWW2013 Tutorial: Linked Data & Education
WWW2013 Tutorial: Linked Data & EducationStefan Dietze
 
Online Learning and Linked Data: An Introduction
Online Learning and Linked Data: An IntroductionOnline Learning and Linked Data: An Introduction
Online Learning and Linked Data: An IntroductionEUCLID project
 
Research Data Management at Edinburgh: Effecting Culture Change
Research Data Management at Edinburgh: Effecting Culture ChangeResearch Data Management at Edinburgh: Effecting Culture Change
Research Data Management at Edinburgh: Effecting Culture ChangeEDINA, University of Edinburgh
 
University of Edinburgh RDM Training: MANTRA & beyond
University of Edinburgh RDM Training: MANTRA & beyondUniversity of Edinburgh RDM Training: MANTRA & beyond
University of Edinburgh RDM Training: MANTRA & beyondRobin Rice
 
Search, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving DataSearch, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving DataNattiya Kanhabua
 
Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...Designing and delivering an international MOOC on Research Data Management an...
Designing and delivering an international MOOC on Research Data Management an...Robin Rice
 
Benefits of Open Data and Policy Developments, perspectives from research ins...
Benefits of Open Data and Policy Developments, perspectives from research ins...Benefits of Open Data and Policy Developments, perspectives from research ins...
Benefits of Open Data and Policy Developments, perspectives from research ins...Academy of Science of South Africa (ASSAf)
 
MPhil Lecture of Data Vis for Presentation
MPhil Lecture of Data Vis for PresentationMPhil Lecture of Data Vis for Presentation
MPhil Lecture of Data Vis for PresentationShawn Day
 
Keystone summer school 2015 paolo-missier-provenance
Keystone summer school 2015 paolo-missier-provenanceKeystone summer school 2015 paolo-missier-provenance
Keystone summer school 2015 paolo-missier-provenancePaolo Missier
 
Do & don't of supporting Open Science
Do & don't of supporting Open ScienceDo & don't of supporting Open Science
Do & don't of supporting Open ScienceSarah Jones
 
New data sources for statistics: Experiences at Statistics Netherlands.
New data sources for statistics: Experiences at Statistics Netherlands.New data sources for statistics: Experiences at Statistics Netherlands.
New data sources for statistics: Experiences at Statistics Netherlands.Piet J.H. Daas
 
Web search-metrics-tutorial-www2010-section-1of7-introduction
Web search-metrics-tutorial-www2010-section-1of7-introductionWeb search-metrics-tutorial-www2010-section-1of7-introduction
Web search-metrics-tutorial-www2010-section-1of7-introductionAli Dasdan
 
Turning Data into Knowledge (KESW2014 Keynote)
Turning Data into Knowledge (KESW2014 Keynote)Turning Data into Knowledge (KESW2014 Keynote)
Turning Data into Knowledge (KESW2014 Keynote)Stefan Dietze
 

What's hot (20)

African Open Science Platform
African Open Science PlatformAfrican Open Science Platform
African Open Science Platform
 
Open Data in a Day - Introduction to Open Data
Open Data in a Day - Introduction to Open DataOpen Data in a Day - Introduction to Open Data
Open Data in a Day - Introduction to Open Data
 
SC6 Workshop 1: From your data to data stories - BigDataEurope, SC6 Workshop
SC6 Workshop 1: From your data to data stories - BigDataEurope, SC6 WorkshopSC6 Workshop 1: From your data to data stories - BigDataEurope, SC6 Workshop
SC6 Workshop 1: From your data to data stories - BigDataEurope, SC6 Workshop
 
Advancing the National Digital Platform: Survey Findings
Advancing the National Digital Platform: Survey FindingsAdvancing the National Digital Platform: Survey Findings
Advancing the National Digital Platform: Survey Findings
 
Research Data Management: Policy Development
Research Data Management: Policy DevelopmentResearch Data Management: Policy Development
Research Data Management: Policy Development
 
APLIC 2012: Discovering & Dealing with Data
APLIC 2012: Discovering & Dealing with DataAPLIC 2012: Discovering & Dealing with Data
APLIC 2012: Discovering & Dealing with Data
 
WWW2013 Tutorial: Linked Data & Education
WWW2013 Tutorial: Linked Data & EducationWWW2013 Tutorial: Linked Data & Education
WWW2013 Tutorial: Linked Data & Education
 
Online Learning and Linked Data: An Introduction
Online Learning and Linked Data: An IntroductionOnline Learning and Linked Data: An Introduction
Online Learning and Linked Data: An Introduction
 
Data education and skills initiatives
Data education and skills initiativesData education and skills initiatives
Data education and skills initiatives
 
Research Data Management at Edinburgh: Effecting Culture Change
Research Data Management at Edinburgh: Effecting Culture ChangeResearch Data Management at Edinburgh: Effecting Culture Change
Research Data Management at Edinburgh: Effecting Culture Change
 
University of Edinburgh RDM Training: MANTRA & beyond
University of Edinburgh RDM Training: MANTRA & beyondUniversity of Edinburgh RDM Training: MANTRA & beyond
University of Edinburgh RDM Training: MANTRA & beyond
 
Search, Exploration and Analytics of Evolving Data

Big and Small Web Data

  • 1. Big and Small Web Data
      Marieke Guy, Institutional Support Officer, Digital Curation Centre, UKOLN, University of Bath, UK
      Institutional Web Management Workshop 2012
      UKOLN is supported by:
      This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0
  • 2. Who Am I?
      • Have worked for UKOLN for over 12 years
      • Worked on a variety of projects: Subject portals project, IMPACT, Good APIs, JISC Observatory, cultural heritage work, digital preservation work, etc.
      • Remote worker, into amplified events
      • Co-chair of IWMW for a number of years
      • Now working for the Digital Curation Centre
      • Institutional Support Officer helping HEIs with their RDM
      • New to data…
  • 3. The Digital Curation Centre
      • A consortium comprising units from the Universities of Bath (UKOLN), Edinburgh (DCC Centre) and Glasgow (HATII)
      • Launched 1st March 2004 as a national centre for solving challenges in digital curation that could not be tackled by any single institution or discipline
      • Funded by JISC with additional HEFCE funding from 2011 for the provision of support to national cloud services
      • Targeted institutional development
      • http://www.dcc.ac.uk/
  • 6. Advocacy and Training
      • Informatics: disciplinary metadata schema, standards, formats, identifiers, ontologies
      • Storage: file-store, cloud, data centres, funder policy
      • Access: embargoes, FOI
      • Policy: making the case
      How to cite data
  • 7. Who Are You?
      • Are you part of a Web team?
      • Are you part of an MIS team?
      • Are you a researcher?
      • Do you know what data is?
      • Do you use structured data?
      • Do you manage data?
  • 8. Today’s Workshop: A Data Journey!
      • Presentation: What is data anyway? Looking at current data trends and what they have to do with Web managers
      • Break out groups: What data do you deal with? Anything goes, from personnel data to key information sets and Web stats…
      • Presentation/Show and Tell: Taster of tools that help with data (mining, citation, visualization, analytics, etc.)
      • Presentation: Case study - Data @ Southampton
      • Discussion and buzzword bingo
  • 9. Today’s Resources
      • All URLs at: http://www.delicious.com/mariekeguy/iwmw12
      • All slides at: http://www.slideshare.net/MariekeGuy
      • Also on the IWMW12 Web site
  • 10. What is Data Anyway?
      Image sources:
      http://www.google.co.uk/imgres?q=illumina+bgi&hl=en&client=firefox-a&hs=Jl2&rls=org.mozilla:en-GB:official&biw=1366&bih
      http://www.flickr.com/photos/thinkmulejunk/352387473/
      http://www.flickr.com/photos/usfsregion5/4546851916/
      http://www.flickr.com/photos/wasp_barcode/4793484478/
      http://www.flickr.com/photos/charleswelch/3597432481/
  • 11. A Data Definition
      • Datum is / data are (!!!):
          – Facts and statistics collected together for reference or analysis
          – Typically the results of measurements
          – Can be qualitative or quantitative
          – Unstructured or structured
          – Raw data, field data, experimental data
          – Data – information – knowledge
          – Data is the lowest level of abstraction
      • Even researchers don’t know what data is…
  • 12. A Data Present
      “Data underpins our economy and our society - data about how much is being spent and where, data about how schools, hospitals and police are performing, data about where things are and data about the weather.”
      Tim Berners-Lee, director of W3C
  • 13. Some Flavours of Data
      • Big data
      • DIY data
      • Consumer data
      • Activity data
      • Crowd-sourced data
      • Linked data / Web of data / Semantic Web
      • Open data
  • 14. Big Data
      “big data people obviously like alliteration: ‘volume, velocity, variety, value’, ‘speed, size, scope’”
      Andy Powell
      “Data that is too big to manage using ‘normal’ (database) tools.”
  • 15. Big Data
      “I worry there won’t be enough people around to do the analysis”
      Chris Ponting, University of Oxford
      “Raw image files for a single human genome have been estimated at 28.8 terabytes, which is approaching 30,000 gigabytes”
      “The cost of sequencing DNA has taken a nosedive...and is now dropping by 50% every 5 months”
      “The 1000 Genomes Project generated more DNA sequence data in its first 6 months than GenBank had accumulated in its entire 21 year existence”
      “A single sequencer can now generate in a day what it took 10 years to collect for the Human Genome Project”
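The figures quoted on this slide can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it assumes binary units (1 TB = 1,024 GB) for the genome figure and a perfectly steady halving every 5 months for the sequencing cost, and the function names are hypothetical.

```python
def terabytes_to_gigabytes(tb: float, binary: bool = True) -> float:
    """Convert terabytes to gigabytes (binary: 1 TB = 1024 GB; decimal: 1000 GB)."""
    return tb * (1024 if binary else 1000)


def relative_cost(months: float, halving_period: float = 5.0) -> float:
    """Fraction of the starting cost left after `months`, assuming a steady halving."""
    return 0.5 ** (months / halving_period)


if __name__ == "__main__":
    # Assumption: the "28.8 terabytes" figure is converted using binary units.
    print(f"28.8 TB = {terabytes_to_gigabytes(28.8):,.0f} GB")
    for months in (5, 12, 24):
        print(f"after {months} months: {relative_cost(months):.1%} of the starting cost")
```

Under those assumptions 28.8 TB comes to roughly 29,500 GB, which fits “approaching 30,000 gigabytes”, and a cost that halves every 5 months falls to about 19% of its starting value within a year.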
  • 16. Big Data
      • 3 Vs: volume, velocity and variety
      • Could include scientific & research data, Web logs, RFID data, social data, search data, video, e-commerce
      • Likely to require different tools and practices from what ‘we are used to’
      • Technologies include massively parallel processing (MPP) databases, data-mining grids, distributed file systems, distributed databases, cloud computing platforms and scalable storage systems
      • Example tools include Hadoop and NoSQL databases such as CouchDB
      • Issues regarding storage, speed of access, exponential growth, infrastructure, complexity
  • 17. DIY Data
      Kyle Machulis: “DIY” human physiology data
      http://www.technologyreview.com/biomedicine/37784/
  • 19. Consumer Data
      http://www.touchagency.com/free-twitter-infographic/
  • 20. Consumer Data
      • 1 in every 9 people on Earth is on Facebook
      • 30 billion pieces of content are shared on Facebook each month
      • Google has been estimated to run over 1 million servers in data centers around the world
      • Walmart takes data from 1 million customer transactions per hour
      • There are over 6 billion photos on Flickr
  • 21. Activity Data
      • “Data about users’ actions and attention”
      • Access, attention and activity
      • Many systems in institutions store data about the actions of students, teachers and researchers
      • It’s good business
      • http://www.activitydata.org/
      • JISC Projects:
          – Recommender systems
          – Improving the student experience
          – Resource management
      • JISC Info kit
          – Business intelligence
      • Student retention
  • 24. Open Data
      • “A piece of content or data is open if anyone is free to use, reuse, and redistribute it - subject only, at most, to the requirement to attribute and share-alike.” Open Knowledge Foundation
      • Why? Use of public money, advancement of science
      • Why not? Commercial and reputation reasons, cost of preparing data
      • “You can do all types of stuff with data” Tim Berners-Lee
      • But it is tricky to open up access to data (cost, preparation, capturing meaning, annotations, context, etc.)
      • Data is more valuable when accessible
      • Open data on the Web: CKAN, open.gov, infochimps, openstreetmap, dbpedia, freebase, numbrary, etc.
  • 25. Linked Data
      • Repurposing and aggregating data in machine readable format
      • Southampton
      • data.open.ac.uk
      • Lucero project
      • Linkeduniversities.org
      • XCRI
      • Lincoln
      • Data.gov.uk
      http://www.flickr.com/photos/reedsturtevant/4288406572/
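As a rough illustration of “repurposing and aggregating data in machine readable format”, the sketch below merges subject-predicate-object statements from two sources into a single lookup. It is a minimal sketch, not how any of the listed projects work: the URIs, property names and values are all invented, and a real linked data set would normally use RDF tooling rather than plain tuples.

```python
from collections import defaultdict

Triple = tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical statements published by two different sources.
SOURCE_A: list[Triple] = [
    ("http://example.org/building/1", "name", "Library"),
    ("http://example.org/building/1", "openingTime", "08:00"),
]
SOURCE_B: list[Triple] = [
    ("http://example.org/building/1", "latitude", "51.38"),
    ("http://example.org/building/2", "name", "Sports Hall"),
]


def aggregate(*sources: list[Triple]) -> dict[str, dict[str, set[str]]]:
    """Merge triples so each subject URI maps to everything said about it."""
    merged: dict[str, dict[str, set[str]]] = defaultdict(lambda: defaultdict(set))
    for source in sources:
        for subject, predicate, obj in source:
            merged[subject][predicate].add(obj)
    return merged


if __name__ == "__main__":
    for subject, properties in aggregate(SOURCE_A, SOURCE_B).items():
        print(subject)
        for predicate, values in sorted(properties.items()):
            print(f"  {predicate}: {sorted(values)}")
```

Because both sources use the same URI for the same thing, facts about it combine without any manual matching, which is the main design point of using shared identifiers.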
  • 26. The Key Data Issues
      • Scale and complexity – data deluge – volume, pace, infrastructure
      • Sensitivity of data
      • Openness – why aren’t people sharing?
      • Quality of data
      • Reputation – FOI, DPA, computer misuse
      • Management – storage, incentive, costs & sustainability
      • Preservation – where is your data?
      • Funding for researchers
      • Analysis
      • Doing something useful with it…
  • 27. Sensitive Data
      • DPA 1998
          – Sensitive Personal Data: “Data regarding an individual’s race or ethnic origin, political opinion, religious beliefs, trade union membership, physical or mental health, sex life, criminal proceedings or convictions…”
          – Personal data
              • Relates to a living individual
              • The individual can be identified from those data and other information
              • Includes any expression of opinion about the individual
      • Data that may incriminate a person
      • Data a person prefers not to share with wider society
  • 28. Openness
      Choices are made according to context, with degrees of openness reached according to:
      • The kinds of data to be made available
      • The stage in the research process
      • The groups to whom data will be made available
      • On what terms and conditions it will be provided
      Default position of most:
      • YES to protocols, software, analysis tools, methods and techniques
      • NO to making research data content freely available to everyone
      After all, where is the incentive?
      Angus Whyte, RIN/NESTA, 2010
  • 30. Data Storage Challenges • Scalable • Cost-effective (rent on-demand) • Secure (privacy and IPR) • Robust and resilient • Low entry barrier / ease-of-use • Has data-handling / transfer / analysis capability What about Cloud services? The case for cloud computing in genome informatics. Lincoln D Stein, May 2010 30
  • 31. The Web Managers ask: “So what has all this got to do with me..?” 31
  • 32. Break Out Groups What data do you deal with? • Personnel data • Admissions • Timetables • Curriculum • key information sets • Web stats… What do you do with this data? Could you do more? What? http://sidspace.info/ 32
  • 33. Are the Web Managers still asking? “So what has all this got to do with me..?” 33
  • 34. A Data Future “The ability to take data - to be able to understand it, to process it, to extract value from it, to visualise it, to communicate it – that’s going to be a hugely important skill in the next decades.” Hal Varian, Google’s chief economist. 34
  • 35. Web Teams and Data • Data is relevant to those working with the Web at HEIs because: • Data will affect your IT infrastructure, if it doesn’t already • Data is becoming increasingly important for the REF and for funding so it will be increasingly important to your HEI • It is getting easier to ask for data • Structured data could make your life easier • The Web itself is becoming more structured • Data can show impact • It’s all about the data…. 35
  • 36. Web Teams and Data • Unstructured data accounts for more than 90% of the digital universe (2011 Digital Universe study) • Structured data has been on the rise for some time – deep web, annotation schemes, search data • In the past web pages have contained information, now is the time for them to contain data • Some key data areas Web teams need to think about: – Structure – Metrics – Patterns, data mining and analytics – Preservation (maybe one for another day?) 36
  • 37. Web Data: Structure • Move toward a Web that’s more fluid, less fixed, and more easily accessed on a multitude of devices • futurefriend.ly’s Brad Frost, “get your content ready to go anywhere because it’s going to go everywhere.” • Karen McGrane: calls them “content blobs” – “we can embrace meaningful, modular chunks that are ready to travel” • Google Knowledge Graph: “currently contains more than 500 million objects, as well as more than 3.5 billion facts about and relationships between these different objects” • Schema.org: “a collection of schemas, i.e., html tags, that webmasters can use to markup their pages in ways recognized by major search providers” 37
  • 38. Preparing for Structure • There is a need for structured content in Web sites • ‘Future ready content’ - Sara Wachter-Boettcher – 1. Get Purposeful – why do users want this content? – 2. Get Micro – get granular, break content down (schema.org – microdata) – 3. Get Meaningful – considering the meaning of elements – 4. Get Organised – looking at your CMS – 5. Get Structured – DITA? XML? HTML5 (microdata) • ‘Create once, publish everywhere’ idea – mobile, APIs, etc. 38
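
The "create once, publish everywhere" idea on this slide can be sketched in a few lines of Python. This is only an illustrative sketch, not something shown in the talk: the event record, its field names and both rendering functions are hypothetical, and schema.org's Event type with its name, description and startDate properties is used simply as one plausible vocabulary. The point is that one structured chunk of content feeds both a web page (HTML with microdata) and an API (JSON).

```python
import json
from html import escape

# Hypothetical content chunk, stored once as structured fields.
# Field names are illustrative and not taken from any real CMS.
event = {
    "name": "Undergraduate Open Day",
    "description": "Tour the campus and meet staff & students.",
    "startDate": "2012-07-07",
}

def to_microdata(item):
    """Render the chunk as HTML carrying schema.org Event microdata."""
    return (
        '<div itemscope itemtype="https://schema.org/Event">'
        f'<h2 itemprop="name">{escape(item["name"])}</h2>'
        f'<p itemprop="description">{escape(item["description"])}</p>'
        f'<time itemprop="startDate" datetime="{escape(item["startDate"])}">'
        f'{escape(item["startDate"])}</time>'
        "</div>"
    )

def to_json(item):
    """Render the same chunk as JSON, e.g. for a mobile app or an API."""
    return json.dumps(item, sort_keys=True)

print(to_microdata(event))
print(to_json(event))
```

Because the content lives in discrete fields rather than in a page template, adding another output (an RSS item, say) would mean adding another small rendering function rather than re-authoring the content.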
  • 39. Web Data: Metrics • Metrics – the new black? Kristen Ratan • “The more you know the more you realise you don’t know” • What should we be tracking? e.g. Figures opened, downloaded, links clicked, time spent on article page, supplemental info viewed, authors’ info viewed • Look at the pathways that info travels • Data can drive tenure and promotion, grants, reputation, discovery, prioritization, attention • Issues: Missed citation data, data sources that aren’t reliable, digital addresses change, usage doesn’t mean useful 39
  • 40. Web Data: Patterns “In other words, we no longer need to speculate and hypothesise; we simply need to let machines lead us to the patterns, trends, and relationships in social, economic, political, and environmental relationships.” Mark Graham, Big Data blog, the Guardian. 40
  • 41. Web Data: Analytics • Customers expect us to be leveraging their activity to benefit their user experience • “the process of developing actionable insights through problem definition and the application of statistical models and analysis against existing and/or simulated future data.” Adam Cooper, CETIS • Reporting and descriptive methods vs inferential and predictive methods • Data driven decisions? “human decisions supported by the use of good tools to provide us with data-derived insights” • Don’t “let the numbers speak for themselves” – data only one input to decision process • Data specialists and domain specialists work together • Need to ask the right questions 41
  • 42. Web Data: Learning Analytics • “The measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimising learning and the environments in which it occurs.” 1st International Conference on Learning Analytics & Knowledge • Open University Learner Analytics Project – Looked at withdrawals - e.g. when students stop study before completion of a module towards a degree – Possible to map what points on paths of study withdrawals occur. • Other uses: personalisation, recommendation, research profiles, marketing and surveys, help desk, CRM, library • Looking at disabled students/accessibility – linking learner analytics and web metrics 42
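
As a rough illustration of mapping where on a path of study withdrawals occur, the Python sketch below counts withdrawals by study week. It is not the Open University project's method: the records, the module code and the week-of-withdrawal field are all made up for the example, and a real analysis would draw on institutional systems and far richer data.

```python
from collections import Counter

# Hypothetical records: (student id, module code, week of withdrawal).
# A week of None means the student completed the module.
records = [
    ("s1", "M101", 3),
    ("s2", "M101", None),
    ("s3", "M101", 3),
    ("s4", "M101", 9),
    ("s5", "M101", None),
]

def withdrawals_by_week(rows):
    """Count how many students withdrew in each study week."""
    return Counter(week for _, _, week in rows if week is not None)

def withdrawal_rate(rows):
    """Return the fraction of students who withdrew before completion."""
    if not rows:
        return 0.0
    withdrawn = sum(1 for _, _, week in rows if week is not None)
    return withdrawn / len(rows)

counts = withdrawals_by_week(records)
for week, n in sorted(counts.items()):
    print(f"week {week}: {n} withdrawal(s)")
print(f"overall rate: {withdrawal_rate(records):.0%}")
```

Even a simple count like this shows which weeks are worth a closer look. In keeping with the analytics slide, the numbers are one input to a human decision rather than an answer in themselves.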
  • 43. Protection of Freedoms Bill • The Protection of Freedoms Bill is a UK parliamentary bill introduced in February 2011 • Has completed its readings – now passing through the House of Lords • 102 - amendments to FOIA - mandatory for public authorities to permit re-use of datasets when communicating them in response to a FOI request • Datasets are collections of information held in electronic form i.e. 'raw data' gathered or created in connection with the university's functions or 'services' • Government’s Innovation and Research Strategy for Growth - “a transformation in the accessibility of research and data” 43
  • 44. Tools that Could Help http://www.flickr.com/photos/luc/5418037955/ 44
  • 45. Tools: Structure • Schema.org • Google Rich Snippets testing tool – tests microdata, microformats, RDFa • List of tools on Semanticweb.org 45
  • 46. Tools: Metrics & Text Mining • Google Analytics • Elsevier • total-impact • altmetric.com 46
  • 47. Tools: Analytics • SNAPP: Social Networks Adapting Pedagogical Practice • GLASS (Gradient’s Learning Analytics System) • International Educational Data Mining society • Learning Analytics and Knowledge Conference 47
  • 48. Data Visualisations • Use your IT and your graphics design department • Make it interactive • Getting Awesome Results from Data Visualisation – Rich Kirk • Data visualisation strategy – Have a purpose – Have measurable KPIs vs purpose – Plan distribution in advance – Resource – Ensure visualisation matches purpose • Chart chooser (Gene Zelazny's Saying It With Charts) • Measurement: pageviews, buzz, links, key word ranking • “Tell a story with your data” – Ewan McIntosh at IDCC11 48
  • 49. Data Visualisation Help • Great Web sites – Ewan McIntosh – Information is Beautiful – Pinterest – Guardian data blog – Flowing data – Infosthetics – information aesthetics – where form follows data • Great tools – Manyeyes – Chartsbin, icharts, Google chart tools – Google developer – Google Fusion tables – Tableau public – Datamarket – Colour Brewer 49
  • 51. Data Case Study: Southampton • Not big data but small data • Got to be useful!! Chris Gutteridge - http://blogs.ecs.soton.ac.uk/data/ 51
  • 52. Southampton Data • Places: Buildings, Rooms, Campuses, Counties, Disabled Access • Organisation Structure • Products & Services: Coffee, Sandwiches, Library Services, Recycle Points • Points of Service: Coffee Shops, Swimming Pools, Libraries, Receptions • Teaching: Courses, Modules, Statistics, Student Satisfaction • Travel: Stations, Bus-Stops, Bus-Routes, Bus Times • Resources: EPrints, Videos, Learning Objects • People: Contact Information, Experts for the Media • Events: Open Days, University History • Jargon 52
  • 54. Southampton Uses… • Google Docs, Excel spreadsheets, RDF, triples • Grinder – GitHub • Graphite – PHP library • Graphite (publishing RDF). Required skills: – RDF structure – RDF/XML – XSLT • Graphite (consuming RDF). Required skills: – RDF structure – PHP 54
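
For readers new to the "RDF structure" skill listed on this slide, the sketch below shows what a triple is: a subject, a predicate and an object, written out here in the N-Triples text format. It is an illustrative Python sketch only. It does not use Graphite (a PHP library) or reproduce Southampton's data, and the example.org URIs and the hasService property are hypothetical. Only the rdfs:label URI is a real, standard term.

```python
# Hypothetical namespace for the example; not a real dataset.
BASE = "http://example.org/"
# Standard RDF Schema property for a human-readable name.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"

def literal(value):
    """Quote a plain string literal, escaping the common special characters."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'

# Each triple is (subject, predicate, object).
triples = [
    (uri(BASE + "building/1"), uri(RDFS_LABEL), literal("Main Library")),
    (
        uri(BASE + "building/1"),
        uri(BASE + "vocab/hasService"),
        uri(BASE + "service/coffee-shop"),
    ),
]

def to_ntriples(rows):
    """Serialise triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in rows)

print(to_ntriples(triples))
```

A spreadsheet row such as "building 1, Main Library, has a coffee shop" becomes two such statements. Because each statement is self-describing, data from different sources can be combined without first agreeing on a table layout.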
  • 55. Data Case Study: Aberdeen “I managed the Web and then inherited MIS. These two have now converged so that Web is using much better, structured data and standardising and consolidating sources. The MIS brings discipline to the Web – much needed if you ask me, anarchist though I am...” Mike McConnell, Head of Web Services, University of Aberdeen. 55
  • 56. Student Attendance Data • Loughborough University’s Pedestal for Progression • Roehampton University’s fulCRM • Southampton Student Dashboard at the University of Southampton: tutees, directory info, whether coursework has been handed in, and attendance • University of Derby’s SETL (Student Engagement Traffic Lighting) • The ESCAPES (Enhancing Student Centred Administration for Placement ExperienceS) project at the University of Nottingham 56
  • 57. Conclusions • At the moment it’s all about the data… (whether you like it or not!) • Be aware of what is happening with data at your institution – data repository, MIS, RIM, CRIS, repository etc. Where do you sit in the picture? • Structure your Web data – it makes sense • You can start with ‘little data’… • Think about what strategic questions you want to ask • Be grounded – efficiency and effectiveness • Start from the user end - think about the uses and output • Follow up from the IT end – how can you automate processes? • What can you use your data for? Can you show impact/success? • How about telling a story with it? 57
  • 58. Buzzword Bingo • data wrangler • Linked data • cloud computing • Big data • para data • Data-Driven Decision making • data mining • data journalism • data scientist • data tsunami • knowledge discovery in data (KDD) • clustering • predictive analytics 58
  • 59. What Data Can and Cannot Do • From Guardian Datablog, by Jonathan Gray • Data is not a force unto itself. • Data is not a perfect reflection of the world. • Data does not speak for itself. • Data is not power. • Interpreting data is not easy. 59
  • 60. Thanks!! “The data that is valuable to you is already passing through your hands” Doug Cutting, Chairman, Apache Software Foundation 60

Editor's Notes

  1. Top right: gene sequencing machines at the Beijing Genomics Institute, one of the largest genomic institutes in the world, crunching out genomic data 24/7. Field telescope scanning the night sky, streaming in vast amounts of data. Sensor equipment to monitor air quality in the desert.
  2. Data is a plural word but is often used as a singular word: “The data is stored” vs “The data are stored”.
  3. CERN – huge teams to deal with data; the Large Hadron Collider produces around 15 petabytes of data annually. University of Bristol/Cloudant talked about 50 terabyte datasets.
  4. Data facts – 1. If you stacked a pile of CD-ROMs on top of one another until you’d reached the current global storage capacity for digital information – about 295 exabytes – it would stretch 80,000 km beyond the moon. 2. Every hour, enough information is consumed by internet traffic to fill 7 million DVDs. Side by side, they’d scale Mount Everest 95 times. 3. 247 billion e-mail messages are sent each day… up to 80% of them are spam. 4. By 2020, IT departments will be looking after 10 x more servers, 50 x more data and 75 x more files. Meanwhile, the number of IT administrators keeping track of all that data growth will increase by 1.5 times. 5. We can expect a 40-60 per cent projected annual growth in the volume of data generated, while media intensive sectors, including financial services, will see year on year data growth rates of over 120 per cent. 6. The world’s 500,000+ data centres are large enough to fill 5,955 football fields. 7. 75% of digital information is generated by individuals, whilst enterprises have liability for 80% of digital data at some point in its life. 8. There are nearly as many bits of information in the digital universe as there are stars in our actual universe. 9. Investment in digital enterprises has increased 50% since 2005. 10. There are 30 billion pieces of content shared on Facebook every day. 11. In 2010, 28% of the digital universe required some level of security… not all of it had the level of security it required… 12. People wishing each other Happy New Year drove a 500% surge in smartphone data within just one year, according to 3UK whose customers used a whopping 80 terabytes (TB) on the 31st December 2011, compared to just 14 TBs on the same day in 2010. http://www.kurtosys.com/blog/12-big-facts-about-big-data/
  5. Kyle Machulis – quantified self movement – body hacking – wrist band monitors – recording bodily functions and using information to improve your lifestyle
  6. Recommender systems: AEIOU – Searching in Welsh repositories; RISE – recommendations services – Open Uni; SALT – Library data – MIMAS; OpenURL activity data – EDINA. Improving the student experience: Exposing VLE activity data (EVAD) – Uni of Cambridge; Student Tracking And Retention (Next Generation): STAR-Trak: NG – Leeds Met; Library Impact Data Project (LIDP) – Huddersfield. Resource management: Exploiting Access Grid Activity Data (AGtivity) – Uni of Manchester
  7. Open science/citizen science – armchair astronomers. http://en.wikipedia.org/wiki/List_of_crowdsourcing_projects FamilySearch Indexing is a volunteer project which aims to create searchable digital indexes for scanned images of historical documents. The documents are drawn primarily from a collection of 2.4 million microfilms made of historical documents from 110 countries and principalities. Galaxy Zoo is a citizen science project that lets members of the public classify a million galaxies from the Sloan Digital Sky Survey. The project has led to numerous scientific papers and citizen scientist-led discoveries such as Hanny's Voorwerp. The Katrina PeopleFinder Project used crowdsourcing to collect data for lost persons. Over 4,000 people donated their time after Hurricane Katrina. It included 90,000 entries. Life in a Day is Kevin Macdonald's 95-minute documentary film comprising an arranged series of video clips selected from 80,000 clips (4500 hours) submitted to the YouTube video sharing website, the clips showing respective occurrences from around the world on a single day. The Open Dinosaur Project is a community research project to aggregate published measurements of ornithischian dinosaur limb bones for many different taxa in order to study the multiple evolutionary transitions from bipedality to quadrupedality in this group of dinosaurs. reCAPTCHA uses CAPTCHA to help digitize the text of books while protecting websites from bots attempting to access restricted areas. Humans are presented images of the book, and asked to provide the corresponding text. Twenty years of The New York Times have already been digitized. Secret London is composed mostly of Londoners who use the site to share suggestions and photos of London. Originally started as a Facebook Group in 2010 in response to a competition to win an internship at Saatchi & Saatchi, Secret London gained 150,000 members within 2 weeks.
  8. One of the most impressive linked data projects in UK higher education is the Southampton Open Data Service. This project is taking data sets, used by institutional administrators, and making them available in linked data formats. A number of applications have been built on the data including an interactive university map, a catering menu search function, university telephone directories, and apps making it easier for students to navigate open days. Successful universities and colleges of the future will have to build an infrastructure that turns them into reliable data hubs, able to analyse even very large and complex datasets internally and to pass on their insights – for free to students, for a fee to business. As a technology, linked data is still work in progress and JISC is working to develop its capabilities for further and higher education. Data are only a raw material and their present and future value depends on how we can use them. Find out whether you’re already using the technology. As it works behind the scenes people don’t always know where products are built around linked data. Find out what demand exists for which data within your institution and among partners, and which are your most valuable data. Cultivate an ethos of innovation – experiment with linked data in small-scale, inexpensive projects and in close contact with internal and end users of the data. Share and reuse these innovations. If you do try out linked data within your institution ask your IT team to demonstrate how the end user can access linked data. Find out more about how to publish linked data.
  9. Schema.org – search engine collaboration – Google, Bing, Yahoo, Yandex. Launched June 2011 – 300 classes, 261 properties
  10. http://www.alistapart.com/articles/future-ready-content/ The future is flexible, and we’re bending with it. From responsive web design to futurefriend.ly thinking, we’re moving quickly toward a web that’s more fluid, less fixed, and more easily accessed on a multitude of devices. As we embrace this shift, we need to relinquish control of our content as well, setting it free from the boundaries of a traditional webpage to flow as needed through varied displays and contexts. In the words of futurefriend.ly’s Brad Frost, “get your content ready to go anywhere because it’s going to go everywhere.” But don’t unlock the shackles just yet: our content is far from future-ready. When extracted from the carefully designed pages on which it lives today, most web content turns into undifferentiated text, its meaning lost as it spills into any container you give it. We can do better. Rather than accept these “content blobs,” as Karen McGrane calls them, we can embrace meaningful, modular chunks that are ready to travel. This is a content strategy problem, true. But listen up, designers, developers, and UXers: you’re not excused just yet. This job takes editorial, architectural, and technical knowledge. This is a project for all of us. Preparing for structure: Most conversations about structured content dive headfirst into the technical bits: XML, DITA, microdata, RDF. But structure isn’t just about metadata and markup; it’s what that metadata and markup mean. Before we start throwing around fancy acronyms, we need to get closer to the content itself, creating a framework for making smart decisions about its structure. Only then can we tackle technology in meaningful, useful ways. So hang on—this part’s important. 1. GET PURPOSEFUL: You’re already designing sites with both user and organizational goals in mind, right? Great. 
Now you need to translate those goals to a smaller scale, applying them to each type of content you have—like blog posts, articles, rotating features, or product descriptions. To do this, you’ll need to be able to answer questions like: How does this kind of content support the overall site goals? Why would a user want it? What is the organization accomplishing by publishing it? What does the organization want the user to do with it? Just as it’s critical to establish site goals before launching into design decisions, you have to know what each type of content is intended to accomplish before you can make decisions about how you need to treat it in different contexts. Otherwise, how can you ensure that content keeps doing its job as it flexes and twists to meet the needs of each device it’s displayed on? (Now, if you realize your content isn’t accomplishing anything, or you don’t know what kinds of content you’re dealing with, you’ve got a bigger problem on your hands. Before getting friendly with the future, go cozy up to your client or boss and figure out what matters.) 2. GET MICRO: All right, you know why the articles or recipes or limericks or whatever kinds of content you’re dealing with exist. Good, because now it’s time to get even more granular, breaking these content types down into their core elements. The specific elements you’ll need to consider will vary greatly depending on the type of content you’re working with, so start by identifying all the content chunks you can find in a given type of information. These could be things like titles, teasers, body content, ingredient lists, reviews, pull quotes, excerpts, images, videos, captions, related articles, bylines, directions, addresses, and many more. Take a recipe for asparagus, fingerling potato, and goat cheese pizza from the popular site Epicurious, for example. Recipes are a pretty common type of content, so you may think you’ve got this one figured out already: title, ingredients, directions. 
But look again, and you’ll see a whole universe of interconnected elements contributing to this single piece of content: Title, Publication Attribution, Publication Date, Byline, Yield, Teaser Description, Image, Ingredients, Preparation, Wine Pairings, Ratings, Reviews, Main Ingredients, Cuisine Type, Dietary Considerations, Related Recipe Collections. An information architect or content strategist sure comes in handy in determining these attributes, but everyone on the team needs to be fully engaged—because you’ll need these chunks to make major decisions about how content will respond to changes in device and display. 3. GET MEANINGFUL: Understanding which content chunks exist is just the start. Now you need to understand why each one matters to the whole—and how much it matters. This allows us to make decisions about how content is organized, prioritized, and displayed for different screen sizes, contexts, or purposes. You can begin to do this by considering: How does this element contribute to the content’s purpose? What meaning is lost if this element goes away? What relationships exist between this element and the others? If this were my project, I’d do some hefty research into organizational goals, current content use patterns, and user needs well before getting here. But, for example’s sake, we’ll work with assumptions. Since Epicurious is a publisher, let’s assume it wants to increase page views to bump advertising revenue. 
Since it’s a recipe site, let’s assume users are there to find something suitable to cook. This scenario could translate to a content-level goal like, “recipes should be compelling, specific, and connected—so users want to make them, can easily tell whether they meet their needs, and ultimately want to visit additional Epicurious content.” As you hold that goal up against these content elements, some interesting questions emerge: Removing all those related items may seem like an easy way to reduce clutter for small screen sizes, but will that decrease the number of total pages a user visits? If we make sidebar content push below main content as the screen size narrows, will users be frustrated at wading through ingredients to get to the recipe’s rating? What would happen to users’ interest in the recipe if we removed the image? Does a title, if displayed elsewhere without its teaser description, tell the user enough to be meaningful? These are difficult questions to answer. Wine pairings may be extremely compelling for the aspiring sommelier, and entirely unappealing for a teetotaler. Ingredients may be a critical first stop for someone with food allergies, but secondary to someone without. We may never be able to anticipate each user’s personal preferences, but the more we understand the relationships between information, the more the compromises inherent in any design decision will be clear—and the better prepared we are to make tough calls. For example, in many responsive designs, sidebars are immediately pushed beneath main content for smartphone-sized displays. But is this always the right answer? Here, ratings, reviews, and main ingredients give readers an at-a-glance means to evaluate the recipe, and pushing this information below the ingredient and preparation sections could make them all but useless. That’s the thing about adapting content to varied layouts: each case is different. 
One-size-fits-all rules about how content should react are unlikely to serve your many content types—which means they won’t serve your users’ needs or your business goals either. And as more devices and technologies emerge, you’ll need to develop new rules and make new compromises as well. Good thing is, we don’t need a crystal ball to start taking action. We can begin today simply by improving the ways our content is stored. 4. GET ORGANIZED: The future is sexy; content management systems are not. And yet, your CMS may well be what’s standing between your carefully considered content and its ability to travel. Think about the elements we’ve identified and the relationships and priorities that define them. Are the CMSes you’ve worked with ready for this level of content? If so, you’re in the minority. The rest of us have some work to do. One organization that’s taken great strides to future-ready its CMS is National Public Radio. Back in 2009, NPR launched a methodology it calls Create Once, Publish Everywhere. With COPE, each story is entered into a set of discrete fields within the CMS, then made available via an API to multiple platforms—such as the NPR website, device-specific applications for iPad and iPhone, the NPR music site, and local NPR affiliate stations’ sites. NPR’s CMS supports a variety of content elements, but only four are required: a title, short slug, longer description, and date line, says Zach Brand, the head of technology for NPR’s digital media. Additional attributes—like images, audio, or bylines—are all optional. Once in the CMS, the story is distributed via API and ultimately published using various combinations of elements determined by the needs of the platform on which it’s being published. If we want systems that can handle this kind of modular, fast-moving content, it’s time we get cozier with our CMSes—and the people who develop, integrate, and customize them. 
Armed with knowledge from your in-depth analysis, you now have the tools to embrace a strategic approach to content management, which will help you to: Ensure those focused on CMS features and capabilities understand your content and what it’s intended to accomplish. Explain the types of content you’ll need and what elements they require, much like NPR has defined the attributes of its stories. Understand your CMS’s possibilities and limitations, and collaborate on how to deal with them. Ease your technical team’s burden by providing them with thoughtful, specific direction to inform the CMS’s requirements. This groundwork will serve you well even if you’re just managing a basic website, but as you begin to share content across more devices and channels, it becomes critical. With a CMS that’s organized around modular, meaningful chunks of content, you’ll be ready to create rules for how that content should bend and shift—and have the systems in place to actually implement them. 5. GET STRUCTURED: There’s a reason this article didn’t begin with a primer on XML. Technology can’t help you make good decisions; it can only help you implement them. But content elements must eventually become code, so even if writing markup isn’t your job, we could all stand to get more comfortable with the tools out there to do it. Structured content isn’t new. Technical communicators have been pushing DITA (Darwin Information Typing Architecture) for years—and there’s nothing particularly futuristic about it. Based on XML, a markup language that gives content components an inherent meaning when displayed beyond their database, DITA authors and publishes technical information in content modules—small pieces of information designed for reuse and categorized according to topic. 
[1] Designed by IBM to manage the company’s own technical content, it’s most widely used for things like help documentation. Many technical communicators insist DITA should be the web’s standard structuring approach, but it’s never quite caught on. It’s also not the only way to do it. HTML5 now supports semantic markup through its microdata extension, which goes beyond traditional presentational tags and allows you to mark up content with standards-compliant, semantically rich HTML. [2] Of course, HTML5 itself is still a working draft, and it’s unclear whether microdata will gain widespread use—or offer enough specificity to suit our content. For example, late last year, the “time” element was removed in favor of the more generic “data.” There’s also Schema.org, a microdata-based approach launched in 2011 by Bing, Google, and Yahoo!. Designed to create a common language across search engines, Schema.org arranges microdata into taxonomies of content types that start broadly and branch into ever-more-specific elements. Critics, however, point out that Schema.org is a closed system: the search engines tell us which structures matter, rather than allowing content owners to define them. Many people are passionate about which of these approaches is best, and why everyone else is doing it wrong. I’m not one of them. Fact is, we may be a long way from a definitive markup method, and none of these currently supports all kinds of content, anyway. Use the one that makes the most sense for your project right now—and in fact, that could mean not even worrying about markup yet. Giving life to structure: What matters much more than markup is the work we put in to get there: the rules and relationships determined through analyzing content closely and caring for its message and purpose. After all, “semantic” connotes meaning—typically, the meaning of language. 
Whatever markup language you use, it’s not semantic unless it pushes meaning forward—which is why you can’t start with markup; you end with it. This, I think, is why structured content has often been written off as too technical and utilitarian for the mainstream web crowd: because we’ve left the editorial side, the experiential side—the part that lends content life—out of these conversations. This needs to stop. Future-ready content isn’t about becoming an XML expert or assuming microdata will solve your problems. It’s about seeing structures through the lens of meaning and storytelling, and building relationships across disciplines so that our databases reflect this richness and complexity. We don’t have all the answers, but we do have a clear place to start: with our content itself. As we break our content down, analyze its elements, and document the relationships that turn those elements into a meaningful whole, we can begin to create and manage content in a way that endures, wherever the future leads us. Technology will change. Standards will evolve. But the need for understanding our content—its purpose, meaning, structure, relationships, and value—will remain. When we can embrace this thinking, we will unshackle our content—confident it will live on, heart intact, as it travels into the great future unknown. References: [1] For an introduction to DITA from the tech-comms perspective, download the Rockley Group’s whitepaper, Preparing for DITA: What you need to know. [2] See Microdata: HTML5’s Best-Kept Secret on Web Monkey and Brian Cray’s HTML5 Microdata: Why isn’t anyone talking about it?
  11. http://www.guardian.co.uk/news/datablog/2012/mar/09/big-data-theory
  12. From Paul Miller: According to Brockmeier, the audience (of data scientists) apparently narrowly agreed that their arsenal of tools and algorithms trumped the knowledge and experience of the meteorologists, financiers, and retailers to whose domains data scientists are increasingly turning. Data scientists are an increasingly capable bunch, and the tools at their disposal sometimes appear almost magical in their capability to derive insight.
  13. From Paul Miller: According to Brockmeier, the audience (of data scientists) apparently narrowly agreed that their arsenal of tools and algorithms trumped the knowledge and experience of the meteorologists, financiers, and retailers to whose domains data scientists are increasingly turning. Data scientists are an increasingly capable bunch, and the tools at their disposal sometimes appear almost magical in their capability to derive insight.
  14. First 5 minutes - http://blogs.ecs.soton.ac.uk/data/