How can we mine, analyse and visualise the Social Web?
In this lecture, you will learn about mining Social Web data for analysis, including data preparation and gathering basic statistics on your data.
Lecture 5: Mining, Analysis and Visualisation
Marieke van Erp
This is the fourth lecture in the Social Web course at the VU University Amsterdam
Visit the website for more information: Social Web 2012
1. Social Web
Lecture 4
How can we MINE, ANALYSE and VISUALISE
the Social Web? (1)
Marieke van Erp
The Network Institute
VU University Amsterdam
2. Why?
• UGC (user-generated content) provides an enormous wealth of data
• insights into users’ daily lives
• insights into communities
• insights into trends
3. To whom it may concern
• Politicians
• Companies
• Governmental institutions
• You?
4.
5. The Age of Big Data
• 25 billion tweets on Twitter in 2010, by 175 million users
• 360 billion pieces of content on Facebook in 2010, by 600 million different users
• 35 hours of video uploaded to YouTube every minute
• 130 million photos uploaded to Flickr per month
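To put the figures on this slide in perspective, a rough back-of-the-envelope calculation helps. The sketch below uses only the numbers quoted above; the 30-day month is an assumption made for illustration.

```python
# Figures quoted on the slide (2010).
tweets = 25_000_000_000
twitter_users = 175_000_000
facebook_items = 360_000_000_000
facebook_users = 600_000_000
youtube_hours_per_minute = 35
flickr_photos_per_month = 130_000_000

# Average contributions per user over the year.
tweets_per_user = tweets / twitter_users
items_per_facebook_user = facebook_items / facebook_users

# Upload volume per day.
youtube_hours_per_day = youtube_hours_per_minute * 60 * 24
flickr_photos_per_day = flickr_photos_per_month / 30  # assumes a 30-day month

print(f"Tweets per Twitter user in 2010: {tweets_per_user:.0f}")
print(f"Items per Facebook user in 2010: {items_per_facebook_user:.0f}")
print(f"Hours of video uploaded to YouTube per day: {youtube_hours_per_day}")
print(f"Photos uploaded to Flickr per day: {flickr_photos_per_day:,.0f}")
```

These are averages only; activity on social platforms is typically very unevenly distributed across users.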
6. Questions to Ask
• Who uploads/talks? (age, gender, nationality, community)
• What are the trending topics?
• What else do these users like?
• Who are the most/least active users?
• etc.
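Two of the questions above (the most/least active users and the trending topics) can be answered with simple counting. The following is a minimal sketch, assuming posts are available as (user, text) pairs and that hashtags are an acceptable proxy for topics; the sample data and function names are hypothetical.

```python
from collections import Counter

# Hypothetical sample data: (user, text) pairs standing in for real posts.
posts = [
    ("alice", "Loving the #sunshine in #Amsterdam"),
    ("bob", "Rain again in #Amsterdam"),
    ("alice", "New lecture on the #socialweb"),
    ("carol", "#Amsterdam canals at night"),
    ("alice", "More #socialweb homework"),
]


def activity_per_user(posts):
    """Count how many posts each user wrote."""
    return Counter(user for user, _ in posts)


def hashtag_counts(posts):
    """Count hashtags (case-insensitive) as a rough proxy for topics."""
    counts = Counter()
    for _, text in posts:
        for token in text.split():
            if token.startswith("#") and len(token) > 1:
                counts[token.lower().rstrip(".,!?")] += 1
    return counts


activity = activity_per_user(posts)
most_active = activity.most_common(1)[0]
least_active = min(activity.items(), key=lambda kv: kv[1])
trending = hashtag_counts(posts).most_common(2)

print("Most active:", most_active)
print("Least active:", least_active)
print("Trending:", trending)
```

Answering "who uploads?" in terms of age, gender or nationality needs profile data, which this sketch does not cover.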
7. What do you prefer?
Image: http://www.co.olmsted.mn.us/prl/propertyrecords/RecordingDocuments/PublishingImages/forms.jpg
8.
9. The Rise of the Data Scientist
http://radar.oreilly.com/2010/06/what-is-data-science.html
10. The Rise of the Data Scientist
• Data Science enables the creation of data products
• Data products are applications that acquire their value from the data, and create more data as a result.
• Users are in a feedback loop: they constantly provide information about the products they use, which gets used in the data product.
12. Data Mining 101
Data mining is the exploration and analysis of large quantities of
data in order to discover valid, novel, potentially useful, and
ultimately understandable patterns in data.
(Inspired by George Tziralis’ FOSS Conf’09, John Elder IV’s
Salford Systems Data Mining Conf. and Toon Calders’ slides)
http://www.freefoto.com/images/33/12/33_12_7---Pebbles_web.j
20. Data mining algorithms
• Classification: Generalising a known structure and applying it to new data
• Association: Finding relationships between variables
• Clustering: Discovering groups and structures in data
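The three task types above can each be illustrated with a very small example. The sketch below uses a deliberately simple technique for each (1-nearest-neighbour for classification, rule confidence for association, one-dimensional k-means for clustering). These are stand-ins chosen for brevity, not the algorithms used in the lecture, and all data is hypothetical.

```python
def classify_1nn(labelled, value):
    """Classification: give a new value the label of its nearest known example."""
    nearest = min(labelled, key=lambda example: abs(example[0] - value))
    return nearest[1]


def confidence(baskets, a, b):
    """Association: fraction of baskets containing `a` that also contain `b`."""
    with_a = [basket for basket in baskets if a in basket]
    if not with_a:
        return 0.0
    return sum(1 for basket in with_a if b in basket) / len(with_a)


def kmeans_1d(values, centroids, iterations=10):
    """Clustering: group numbers around the nearest of the given centroids."""
    groups = [[] for _ in centroids]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in values:
            idx = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            groups[idx].append(v)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return groups


# Hypothetical data: posts per week with a known user type.
labelled_users = [(1, "casual"), (2, "casual"), (40, "heavy"), (55, "heavy")]
predicted = classify_1nn(labelled_users, 45)

# Hypothetical data: pages liked by four users.
likes = [{"jazz", "vinyl"}, {"jazz", "vinyl", "coffee"}, {"jazz", "coffee"}, {"football"}]
jazz_to_vinyl = confidence(likes, "jazz", "vinyl")

# Hypothetical data: posts per week without labels.
clusters = kmeans_1d([1, 2, 3, 40, 50, 60], centroids=[1, 60])

print(predicted, round(jazz_to_vinyl, 2), clusters)
```

In practice a library such as scikit-learn would be the usual choice; the plain-Python versions only show the idea behind each task.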
21. Mining in ‘LikeMiner’
• Filter users by interests
• Construct user graphs
• PageRank on graphs to mine representativeness
• Result: set of influential users
• Compare page topics to user interests to find pages most representative for topics
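The PageRank step mentioned above can be sketched with a plain power iteration. This is the textbook formulation, not LikeMiner's actual implementation; the user graph, the damping factor of 0.85 and the function name are assumptions made for illustration.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Textbook PageRank by power iteration.

    `graph` maps each user to the list of users they point to.
    Every user that is pointed to must also appear as a key.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Every node receives the same small "teleport" share.
        new_rank = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            targets = graph[u]
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
            else:
                # A user with no outgoing links spreads rank evenly over all users.
                share = damping * rank[u] / n
                for v in nodes:
                    new_rank[v] += share
        rank = new_rank
    return rank


# Hypothetical user graph: an edge u -> v means u interacts with v's content.
user_graph = {
    "ann": ["bea"],
    "bea": ["cy"],
    "cy": ["bea"],
    "dov": ["bea", "cy"],
}

ranks = pagerank(user_graph)
most_influential = max(ranks, key=ranks.get)
print(most_influential, {u: round(r, 3) for u, r in ranks.items()})
```

Users with many incoming links from well-ranked users end up with the highest scores, which is one way to read "representativeness" on the slide.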
27. Mining Social Web Data
Source: http://kunau.us/wp-content/uploads/2011/02/Screen-shot-2011-02-09-at-9.03.46-PM-w600-h900.png
28. Single Person
Source: http://infosthetics.com/archives/2011/12/all_the_information_facebook_knows_about_you.html
See also: http://www.youtube.com/watch?feature=player_embedded&v=kJvAUqs3Ofg
32. Final Assignment: Your SocWeb App
• Create a Social Web app with your group
• Use structured data, relationships between entities, data analysis, visualisation
• Write an individual research report on one of the main aspects of your app
Image source: http://blog.compete.com/wp-content/uploads/2012/03/Like.jpg
33. Hands-on Teaser
• Build your own recommender system 101
• Recommend pages on del.icio.us
• Recommend pages to your Facebook friends
Image source: http://www.flickr.com/photos/bionicteaching/1375254387/
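As a taste of what a "recommender system 101" could look like, here is a minimal sketch of user-based collaborative filtering over bookmarks. It assumes each user is represented by the set of pages they bookmarked and uses Jaccard similarity; the data, page names and function names are hypothetical, and the hands-on session may take a different approach.

```python
def jaccard(a, b):
    """Overlap between two sets: size of intersection divided by size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(bookmarks, user, top_n=3):
    """Suggest pages bookmarked by similar users that `user` has not saved yet."""
    mine = bookmarks[user]
    scores = {}
    for other, pages in bookmarks.items():
        if other == user:
            continue
        similarity = jaccard(mine, pages)
        if similarity == 0:
            continue  # nothing in common, ignore this user
        for page in pages - mine:
            scores[page] = scores.get(page, 0.0) + similarity
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [page for page, _ in ranked[:top_n]]


# Hypothetical bookmark sets, e.g. collected from a social bookmarking service.
bookmarks = {
    "you": {"graph-tutorial", "viz-gallery"},
    "friend_a": {"graph-tutorial", "viz-gallery", "pagerank-explained"},
    "friend_b": {"graph-tutorial", "clustering-intro"},
    "friend_c": {"cooking-blog"},
}

suggestions = recommend(bookmarks, "you")
print(suggestions)
```

Pages saved by the most similar users score highest, so a friend whose bookmarks overlap heavily with yours has more influence on the result than one with a single page in common.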