The document discusses structuring unstructured data on the web for machine understanding. It covers named entity recognition and linking to identify and disambiguate entities in text. Several challenges are presented, along with state-of-the-art systems and the NERD initiative, which combines multiple systems. Live topic generation from social media data is also discussed, including an example of tracking a political event on Twitter. The goal of creating a large entity graph from open web and social data is presented.
Learning with the Web. Structuring data to ease machine understanding
2. July 11th, 2013 Università di Torino, Italy 2/44
Google Knowledge Graph Viewer
3.
Google Knowledge Graph
4.
The Google Knowledge Graph bulk:
encyclopedic sources
5.
The Web community has highlighted the road,
but ...
6.
Vast wealth of unstructured data
“80% of data on the Web and on internal
corporate intranets is unstructured”
“Semantic Web and Information Extraction Workshop”, SWAIE
at RANLP2013
7.
The entire digital universe is going to
be part of the Web
“unstructured data will account for 90 percent of
all data created in the next decade”
IDC IVIEW, “Extracting Value from Chaos”, June 2011
8.
Structured means making those resources available to be easily processed by machines
9.
A Web of Linked Entities
http://wole2013.eurecom.fr
http://wole2012.eurecom.fr
➢ GGG (Global Giant Graph)
http://goo.gl/fH3h
➢ Nodes are Web entities
➢ Entities provide disambiguation pointers
➢ Entities can be univocally referred to (disambiguated)
➢ Entities as centroids for topic generation and understanding
10.
Chapter 1:
Named Entity Recognition (NER)
and
Named Entity Linking (NEL)
11.
I want to book a room in an hotel located in the
heart of Paris, just a stone’s throw from the
Eiffel Tower
Eric Charton, “Named Entity Detection and
Entity Linking in the Context of Semantic Web:
Exploring the ambiguity question”
12.
Part of Speech
Token  I    want  to  book  a   room  in  ..  Paris
POS    PRP  VBP   TO  VB    DT  NN    IN  ..  NNP
NER: What is Paris?
NEL: Which Paris are we talking about?
13.
What is Paris?
Type ambiguity
asteroid, location/city, film
14.
Entity recognition
Token  I    want  to  book  a   room  in  ..  Paris
POS    PRP  VBP   TO  VB    DT  NN    IN  ..  NNP
NER    O    O     O   O     O   O     O   ..  LOC
15.
NER: State of the art
➢ CRFs (Conditional Random Fields)
➢ FSM (Finite-State Machine)
➢ HMM (Hidden Markov Model)
➢ Gazetteers
➢ Wikipedia/DBpedia
➢ In-house dictionaries
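Of the techniques listed above, the gazetteer lookup is the simplest to illustrate. The following is a minimal sketch, not the method of any system named in this talk: the `GAZETTEER` dictionary, the `MAX_SPAN` limit and the `tag_tokens` function are hypothetical names introduced for illustration, and the label set (`LOC`, `O`) follows the entity recognition slide.

```python
# Hypothetical toy gazetteer: lower-cased surface form -> entity type.
GAZETTEER = {
    "paris": "LOC",
    "eiffel tower": "LOC",
}
MAX_SPAN = 3  # assumed upper bound on entry length, in tokens


def tag_tokens(tokens):
    """Label each token with a gazetteer type, or "O" when nothing matches."""
    labels = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Prefer the longest gazetteer entry starting at position i.
        for span in range(min(MAX_SPAN, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + span]).lower()
            if phrase in GAZETTEER:
                for j in range(i, i + span):
                    labels[j] = GAZETTEER[phrase]
                i += span
                matched = True
                break
        if not matched:
            i += 1
    return labels


tokens = "I want to book a room in Paris".split()
print(list(zip(tokens, tag_tokens(tokens))))
```

A lookup like this assigns one fixed type per surface form, so on its own it cannot resolve the type ambiguity shown earlier (asteroid, location/city, film).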
16.
Which Paris?
Name ambiguity
Paris, Kentucky; Paris, Maine; Paris, Tennessee; Paris, France; Paris, Ontario; Paris, Idaho
17.
Entity linking
Token  I    want  to  book  a   room  in  ..  Paris
POS    PRP  VBP   TO  VB    DT  NN    IN  ..  NNP
NER    O    O     O   O     O   O     O   ..  LOC
NEL    O    O     O   O     O   O     O   ..  http://en.wikipedia.org/wiki/Paris
18.
Ambiguity resolution: linking to an
external knowledge base
➢ Wikipedia/DBpedia
➢ Gigaword Corpus
➢ In-house dataset
➢ LOD dataset
➢ DBLP
➢ ACM
➢ BBC
➢ ...
19.
NEL: State of the art
➢ Clustering
➢ Vector Space Model (Cosine similarity or
Maximum Entropy) – it requires a priori
knowledge of the spotted entities
➢ Conditional probability – it requires a priori
knowledge of the spotted entities
➢ Dictionaries
➢ Wikipedia/DBpedia
➢ In-house dataset
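The vector space approach listed above can be sketched as follows. This is an illustration under stated assumptions, not the implementation of any system in this talk: the `CANDIDATES` descriptions are made-up stand-ins for knowledge-base text, and `bag_of_words`, `cosine` and `link` are hypothetical helper names. It also shows why the approach needs a priori knowledge of the spotted entities: each candidate must already come with a description to compare against.

```python
import math
from collections import Counter

# Hypothetical candidate descriptions; a real system would draw them from a
# knowledge base such as Wikipedia/DBpedia.
CANDIDATES = {
    "Paris, France": "capital city of France home of the Eiffel Tower",
    "Paris, Kentucky": "small city in Bourbon County Kentucky United States",
}


def bag_of_words(text):
    """Turn a text into a lower-cased term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def link(context, candidates):
    """Return the candidate whose description is closest to the context."""
    ctx = bag_of_words(context)
    return max(candidates,
               key=lambda name: cosine(ctx, bag_of_words(candidates[name])))


sentence = ("I want to book a room in a hotel located in the heart of Paris "
            "just a stone's throw from the Eiffel Tower")
print(link(sentence, CANDIDATES))
```

Raw term counts let function words weigh heavily in the score; a weighting scheme such as TF-IDF would be the usual refinement.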
20.
Processing natural language texts
➢ Several attempts from the Web community to
structure the large wealth of data available
➢ Numerous off-the-shelf systems (commercial and academic) that perform the NER+NEL chain
➢ AlchemyAPI
➢ DBpedia Spotlight
➢ Wikimeta
➢ TextRazor
➢ Stanford CRF
➢ ...
21.
The NERD initiative
http://nerd.eurecom.fr
22.
Combination of off-the-shelf systems
and properly trained CRFs
23.
The strength of this approach lies in the fact that
the supported off-the-shelf systems have access
to large knowledge bases of entities such as
DBpedia and Freebase, while CRFs are domain
specific
24.
Diversity
Extractor          Classification schema          Number of classes
AlchemyAPI         Alchemy                        324
DBpedia Spotlight  DBpedia, Freebase, Schema.org  320
Extractiv          Extractiv                      34
Lupedia            DBpedia, LinkedMDB             319
OpenCalais         OpenCalais                     95
Saplo              Saplo                          5
SemiTags           CoNLL-3                        4
Wikimeta           ESTER                          7
Yahoo!             Yahoo                          13
Zemanta            Freebase                       81
25.
NERD Ontology
NERD type Occurrence
Person 10
Organization 10
Country 6
Company 6
Location 6
Continent 5
City 5
RadioStation 5
Album 5
Product 5
... ...
The NERD ontology has been integrated into the NIF project, an EU FP7 effort in the
context of LOD2: Creating Knowledge out of Interlinked Data
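A shared ontology is useful because each extractor reports types from its own classification schema. The sketch below shows one way such an alignment could be applied in code. It is purely illustrative: the extractor names, their native labels and the `ALIGNMENT` table are invented, the fallback to a generic `Thing` type is an assumption, and only the target types (`City`, `Person`, `Location`, `Company`) come from the NERD type list above.

```python
# Hypothetical alignment table:
# (extractor name, extractor-specific type) -> NERD type.
# Extractor names and native labels are made up for illustration.
ALIGNMENT = {
    ("extractor_a", "Town"): "City",
    ("extractor_a", "Human"): "Person",
    ("extractor_b", "GeographicFeature"): "Location",
    ("extractor_b", "Corporation"): "Company",
}


def to_nerd_type(extractor, native_type, default="Thing"):
    """Map an extractor-specific type to a shared type.

    Unknown pairs fall back to a generic default (an assumption of this
    sketch, not something stated in the slides).
    """
    return ALIGNMENT.get((extractor, native_type), default)


print(to_nerd_type("extractor_a", "Town"))
print(to_nerd_type("extractor_b", "Volcano"))
```

Once every annotation carries a type from the same vocabulary, the outputs of different extractors can be compared and combined directly.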
26.
Learning with the Web
➢ FSM-core based
➢ combination of the NERD supported off-the-shelf
systems
➢ ML-core based
➢ combination of the NERD supported off-the-shelf systems and a CRF, properly trained with the given corpus
27.
Challenges and benchmark
28.
ETAPE 2012 - Entity Extraction
Challenge
➢ French transcripts of radio and video programs
➢ Challenge objective: entity typing
➢ Submitted system:
➢ FSM-core based
➢ Annotation priority given to the systems that have fine-grained classification schemes
➢ Ranked 7th/7
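The priority rule mentioned above (prefer extractors with fine-grained classification schemes) can be sketched as a conflict-resolution step. The slides do not describe how the submitted FSM resolves conflicts, so the greedy overlap handling below is an assumption, and the `Annotation` class, the extractor names and the class counts in `NUM_CLASSES` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: int      # character offset where the mention begins
    end: int        # character offset just past the mention
    label: str      # entity type assigned by the extractor
    extractor: str  # name of the extractor that produced it


# Hypothetical class counts; more classes means a finer-grained schema.
NUM_CLASSES = {"fine_extractor": 300, "coarse_extractor": 5}


def overlaps(a, b):
    """True when the two character spans share at least one position."""
    return a.start < b.end and b.start < a.end


def combine(annotations):
    """Greedily keep annotations from finer-grained extractors first,
    dropping any later annotation that overlaps one already kept."""
    ranked = sorted(annotations,
                    key=lambda a: NUM_CLASSES[a.extractor],
                    reverse=True)
    kept = []
    for ann in ranked:
        if not any(overlaps(ann, k) for k in kept):
            kept.append(ann)
    return sorted(kept, key=lambda a: a.start)


annotations = [
    Annotation(0, 5, "Location", "coarse_extractor"),
    Annotation(0, 5, "City", "fine_extractor"),
    Annotation(10, 15, "Person", "coarse_extractor"),
]
print([a.label for a in combine(annotations)])
```

Mentions found by only one extractor are kept unchanged, so coarse extractors still contribute wherever the fine-grained ones are silent.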
29.
#MSM'13 - Concept Extraction
Challenge
➢ English Twitter microposts
➢ Challenge objective: entity typing
➢ Submitted system:
➢ ML-core based: SVM
➢ Features = linguistic features (some of them are
capitalization, 3 chars of prefix and suffix, POS), output
of a CRF properly trained with the challenge training
dataset, outputs of the off-the-shelf systems
➢ Ranked 2nd/22
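The feature set listed above can be sketched as a per-token feature dictionary. This is a rough illustration, not the submitted system: the function name `token_features`, the feature keys and the extractor name `system_a` are hypothetical, and the POS tag, CRF label and extractor outputs are assumed to be computed elsewhere and passed in.

```python
def token_features(token, pos_tag, crf_label, extractor_labels):
    """Build a feature dictionary for one token.

    extractor_labels maps an extractor name to the label that extractor
    assigned to this token.
    """
    features = {
        "is_capitalized": token[:1].isupper(),
        "prefix3": token[:3],   # first 3 characters
        "suffix3": token[-3:],  # last 3 characters
        "pos": pos_tag,
        "crf": crf_label,       # output of the separately trained CRF
    }
    # One feature per off-the-shelf extractor output.
    for name, label in sorted(extractor_labels.items()):
        features["out_" + name] = label
    return features


print(token_features("Paris", "NNP", "LOC", {"system_a": "City"}))
```

Dictionaries of this shape can be turned into numeric vectors with a one-hot encoder (for example scikit-learn's `DictVectorizer`) before training a classifier such as an SVM.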
30.
CoNLL-2003
➢ English newswire corpus
➢ Benchmark objective: entity typing
➢ System:
➢ ML-core based: SVM and NB
➢ Features = linguistic features (some of them are capitalization, 3 chars of prefix, 3 chars of suffix, POS), output of a CRF properly trained with the challenge training dataset, output of the off-the-shelf systems
➢ Results: significantly outperformed all the off-the-shelf systems used as inputs and the Stanford CRF properly trained with the CoNLL-2003 training corpus
31.
TAC KBP 2011
➢ English newswire corpus
➢ Benchmark objective: entity linking
➢ System:
➢ FSM-core based
➢ Features: outputs of the off-the-shelf systems,
harmonized with the Gigaword corpus
➢ Status: ongoing
32.
NERD in action
http://nerd.eurecom.fr/annotation/247957
33.
Chapter 2:
Annotating streams of
heterogeneous data coming from
social platforms for topic
generation
34.
The Social Web is growing fast and is becoming crucially important for research and companies
35.
Social Web = Big Data
Gartner “3V” definition: Volume, Velocity, Variety
of microposts
36.
Microposts
➢ Short (~140 characters) and informal text
➢ Grammar-free text
➢ Slang
➢ Media items
➢ Picture
➢ Video
37.
Can we make sense out of the massive and
rapidly changing amount of information shared in
the Social Web?
38.
Live topic generation
http://youtu.be/8iRiwz7cDYY
39.
http://mediafinder.eurecom.fr
40.
Tracking and analyzing an event
➢ 1 week period
➢ We collected microposts with attached pictures
➢ We followed the 2013 Italian election
➢ We compared the results with the articles published in those days by well-known newspapers
http://youtu.be/jIMdnwMoWnk
41.
http://mediafinder.eurecom.fr/story/elezioni2013
42.
Outlook: an entity graph from the open and
Social Web
43.
Thanks for your time and attention
http://www.slideshare.net/giusepperizzo
44.
Do you have any questions?