Text Wandering exchanging experiences: crowdsourced based interactive awareness.
Presented at the Territoires innovants conference (Essaouira, Morocco, 28 February 2019)
1. Open Learning [0]
Text analysis Basics
[Innovation and Sustainability]
2019_04_11
Stefano Lariccia (Sapienza Università di Roma) stefano.lariccia@uniroma1.it
Fernando Martínez de Carnero Calzada (Sapienza University) fernando.martinez@uniroma1.it
UROMA
28/02/2019 Territoires Innovants: Essaouira 1
2. Creating a new protocol framework for the collection of data and
direct personal experiences.
Authors:
Stefano Lariccia (Sapienza University) stefano.lariccia@uniroma1.it
Fernando Martínez de Carnero Calzada (Sapienza University) fernando.martinez@uniroma1.it
Giovanni Toffoli (Link R&D) toffoli@linkroma.it
9. ● >>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
○ text1: Moby Dick by Herman Melville 1851
○ text2: Sense and Sensibility by Jane Austen 1811
○ text3: The Book of Genesis
○ text4: Inaugural Address Corpus
○ text5: Chat Corpus
○ text6: Monty Python and the Holy Grail
○ text7: Wall Street Journal
○ text8: Personals Corpus
○ text9: The Man Who Was Thursday by G . K . Chesterton 1908
>>>
Computing with Language: Texts and Words
10. ● Any time we want to find out about these texts, we just have to enter their names at the Python
prompt:
○ >>> text1
○ >>> text2
○ >>>
● Now that we can use the Python interpreter, and have some data to work with, we’re ready to get
started.
Computing with Language: Texts and Words
11. ● Searching Text: there are many ways to examine the context of a text apart from simply reading it.
A concordance view shows us every occurrence of a given word, together with some context.
● Here we look up the word monstrous in Moby Dick by entering text1 followed by a period, then the
term concordance, and then placing "monstrous" in parentheses:
○ >>> text1.concordance("monstrous")
Computing with Language: Texts and Words
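To make the idea of a concordance view concrete, here is a minimal sketch in plain Python that does not depend on NLTK. The function name `concordance`, the `width` parameter and the sample sentence are illustrative assumptions; NLTK's own method formats its output differently.

```python
def concordance(tokens, word, width=3):
    """Return every occurrence of `word` with up to `width` tokens of context on each side."""
    target = word.lower()
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


# Hypothetical sample text, split on whitespace (a stand-in for a real tokenizer).
sample = "a most monstrous size and a monstrous big whale".split()
for left, hit, right in concordance(sample, "monstrous"):
    print(f"{left:>20} [{hit}] {right}")
```

Each printed line shows one occurrence of the word in the middle, with its neighbours on either side.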
12. ● Let’s begin by finding out the length of a text from start to finish, in terms of the words and
punctuation symbols that appear. We use the term len to get the length of something, which we’ll
apply here to the book of Genesis:
>>> len(text3)
44764
>>>
● So Genesis has 44,764 words and punctuation symbols, or “tokens.”
● A token is the technical name for a sequence of characters—such as hairy, his, or :)—that we
want to treat as a group.
● When we count the number of tokens in a text, say, the phrase to be or not to be, we are counting
occurrences of these sequences. Thus, in our example phrase there are two occurrences of to,
two of be, and one each of or and not.
● But there are only four distinct vocabulary items in this phrase. How many distinct words does the
book of Genesis contain?
Computing with Language: Counting Vocabulary
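The difference between tokens and distinct vocabulary items can be checked on the example phrase itself. This is a small sketch in plain Python; splitting on whitespace is an assumption that stands in for a real tokenizer.

```python
from collections import Counter

phrase = "to be or not to be"
tokens = phrase.split()           # every occurrence counts as a token
counts = Counter(tokens)          # occurrences per distinct item
vocabulary = sorted(set(tokens))  # distinct vocabulary items

print("tokens:", len(tokens))
print("occurrences:", dict(counts))
print("distinct items:", len(vocabulary), vocabulary)
```

With the NLTK texts loaded, the same set-based idea applies to text3.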
14. ● Now, let’s calculate a measure of the lexical richness of the text. The next example shows us that
each word is used 16 times on average (in Python 2 we need to make sure Python uses floating-point
division; in Python 3 the / operator already does, so the import can be dropped):
>>> from __future__ import division
>>> len(text3) / len(set(text3))
16.050197203298673
>>>
● The __future__ module makes features of later Python versions available in the current version of Python.
● Now let’s calculate the variance of each of these texts:
○ Text1
○ Text2
○ Text3
○ Text4
○ Text5
○ Text6
○ Text7
○ Text8
○ Text9
Computing with Language: Counting Vocabulary
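The tokens-per-distinct-item ratio used above can be wrapped in a small function and applied to several texts in turn. This is a sketch under assumptions: the function name `lexical_richness` and the two sample texts are hypothetical, and Python 3 division is assumed.

```python
def lexical_richness(tokens):
    """Average number of times each distinct token is used (tokens / distinct items)."""
    if not tokens:
        raise ValueError("cannot compute lexical richness of an empty text")
    return len(tokens) / len(set(tokens))


# Hypothetical sample texts; whitespace splitting stands in for real tokenization.
samples = {
    "sample_a": "to be or not to be".split(),
    "sample_b": "the quick brown fox jumps over the lazy dog".split(),
}
for name, tokens in samples.items():
    print(name, round(lexical_richness(tokens), 3))
```

With the NLTK book data loaded, the same function should accept text1 to text9 directly, since those objects support len() and set() as shown in the example above.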
17. Innovation in creativity: Up2U, an Open Flexible ecosystem
(Screenshot of the Up2U gateway)
18. Innovation in creativity: Up2U, an Open Flexible ecosystem
CommonSpaces Social Learning Platform
19. Innovation in creativity: Up2U, an Open Flexible ecosystem
Learning Experiences Traceability in Up2U
(Learning Analytics, Learning Locker)
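Learning Locker is a Learning Record Store for xAPI statements, so a traced learning experience can be pictured as an actor-verb-object record. The sketch below only builds such a record as a Python dictionary. The helper name `make_statement`, the e-mail address and the activity URL are hypothetical, and how Up2U tools actually emit statements is not described in these slides.

```python
import json


def make_statement(learner_email, verb, activity_id, activity_name):
    """Build a minimal xAPI-style statement: who (actor) did what (verb) to which activity (object)."""
    return {
        "actor": {"objectType": "Agent", "mbox": f"mailto:{learner_email}"},
        "verb": {
            "id": f"http://adlnet.gov/expapi/verbs/{verb}",
            "display": {"en-US": verb},
        },
        "object": {
            "objectType": "Activity",
            "id": activity_id,  # hypothetical activity identifier
            "definition": {"name": {"en-US": activity_name}},
        },
    }


statement = make_statement(
    "student@example.org",          # placeholder learner
    "completed",
    "https://example.org/units/3",  # placeholder URL, not a real Up2U address
    "Language analysis and Critical Thinking",
)
print(json.dumps(statement, indent=2))
```

Sending the record to a store would need an authenticated HTTP request, which is left out here.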
22. Innovation in creativity: Up2U, an Open Flexible ecosystem
• Demo Learning Units preparation (part of Up2U Pedagogical WP5)
• Unit 1: New technologies, new learning methods
• Unit 2: Language Technologies and Learning Assessment
• Unit 3: Language analysis and Critical Thinking. How to increase your
ability to understand texts and documents
• Unit 4: GDPR (General Data Protection Regulation)
• Unit 5: Climate Change risk mitigation
• Unit 6: Education to environmental and seismic emergency and to ()
• Unit 7: Valorisation of Archaeological sites
• Unit 8: Landscape monitoring and navigation
23. 1. Social movements working to promote and bring out Open Source data initiatives are trying to
alert research centres, universities, governments and other institutions to the risk that people
will be alienated from the oil of the future: Big Data
2. What analogies are there between the alienation of data (information alienation) from local
communities and the alienation of commodities produced by workers? Is there any theoretical,
systematic comparison?
3. Developing countries are reluctant to accept limits on their growth in response to Climate Change alerts:
they argue that, since Western countries did all they wanted for centuries, they should not have to accept
limitations now in the name of the planet
24. 1. Why should Europe express a specifically European position?
2. Open Source movements are increasingly visible and authoritative
3. Linux case
4. Galileo case
5. Up2U case
6. How Up2U could support European and international schools in increasing
awareness of Climate Change problems and mitigation policies
7. A pan-European community of geographical monitoring and survey could be
enlarged to a wider community of countries (Mediterranean area, Middle East)
25. ● Why a citizen science?
Academic level
Alan Irwin (1995): collaboratively determine the objectives of the research.
Bonney (1996): collective investigations at the Cornell Ornithology Laboratory.
Institutional level
Holdren (2015): voluntary participation of the public in the scientific process.
● Causes of the shift in participation from the scientist to the experimenter
○ Management of business and financial models at the public level: users and quality control.
○ Semantic Web and Web 3.0: the user as a data source. From hunter to hunted.
○ The cult of storytelling, micro-stories and emotionality.
○ Interactive devices, smartphones and social networks.
○ Traceability and GPS.
○ Big data, processing capacity and interpretation algorithms.
○ Ethical and unethical uses, whether accepted or ignored. Data and data theft.
Cambridge Analytica as the limit case.
○ Profiles and behavioral patterns.
26. ● Areas of application of citizen science.
○ A digital revolution tailored to technology or a use of
technology tailored to a cultural model?
○ Transversality and interdisciplinarity.
○ Impact factors: social and educational.
○ Technologies and social networks as development
instruments.
○ Applications for tourist studies.
○ Ecotourism, food and wine tourism and cultural tourism:
sustainability, diversification and complementarity of the
offer.
○ Growing interest in financing research.
29. ● The Ten Principles of Citizen Science.
○ 1. Citizen science projects actively involve citizens in scientific endeavour that generates new
knowledge or understanding.
○ 2. Citizen science projects have a genuine science outcome.
○ 3. Both the professional scientists and the citizen scientists benefit from taking part.
○ 4. Citizen scientists may, if they wish, participate in multiple stages of the scientific process.
○ 5. Citizen scientists receive feedback from the project.
○ 6. Citizen science is considered a research approach like any other, with limitations and biases that
should be considered and controlled for.
○ 7. Citizen science project data and metadata are made publicly available and where possible, results
are published in an open-access format.
○ 8. Citizen scientists are acknowledged in project results and publications.
○ 9. Citizen science programmes are evaluated for their scientific output, data quality, participant
experience and wider societal or policy impact.
○ 10. The leaders of citizen science projects take into consideration legal and ethical issues
surrounding copyright, intellectual property, data-sharing agreements, confidentiality, attribution and
the environmental impact of any activities.
30. ● Tourist experience and experiential tourism.
○ Is a non-experiential tourism possible?
○ Experiential tourism as an emerging
collaborative form.
○ Gaps in tourism: demand that the
offer does not cover.
○ Educational instruments and appreciation of
the learning process.
31. [Marco Ramazzotti]
1. What kinds of tools can we put to work to automate the analysis of landscape
photographs and satellite images, in order to monitor trends of change
on our planet?
○ Climate Change Emergency: the lack of time can boost the new adaptation process
2. Neural Network and Machine Learning methods to analyse images (aerial
photogrammetry, drone images and satellite images)
○ Much progress has been made in image analysis and pattern recognition
○ Another big step forward can come from human-machine interaction
3. Mixed methods (collaboration between humans and automata)
○ Secondary school students are invited to collaborate in communities among themselves and
in communities with well-trained automata
32. 26/02/2019
• Mentoring Pilot organization: Italy
■ CYC2 Module 1: 1-2 weeks; CYC2 Module 2: 12 weeks, online
■ Distribution of Learning Paths 1, 2 and 3 (the last 2 need to be translated)
■ Schools engagement: GARR mailing the schools and regional districts
• Supporting Pilot organization: Greece
■ Schools mailing: expected engagement of 100 schools
■ Modeling a common collection of data
• Supporting Pilot organization: Lithuania
■ Schools engagement: active engagement of 80 schools
■ Modeling a common collection of data
• Supporting Pilot organization: Hungary
■ Schools engagement: expected engagement of 50 schools
Essaouira 28 January 2019
33. 26/02/2019
• Supporting Pilots, general overview:
• Supporting Pilot organization: Poland
■ Schools mailing: expected engagement of 100 schools
■ Modeling a common collection of data
• Supporting Pilot organization: Portugal
■ Schools mailing: expected engagement of 30 schools
• Supporting Pilot organization: Spain
■ Schools mailing: expected engagement of 30 schools
• Supporting Pilot organization: Switzerland
■ Schools recruitment: expected engagement of 5 schools
Essaouira Up2U January 2019
34. • Further piloting activities in the Up2U ecosystem
• “Light” integration of existing Jupyter notebooks based on
CommonSpaces, integrated for formal/informal learning in Up2U;
• University as a Hub is planning the first “flipped” class for Italian secondary
schools based on CommonSpaces. One pilot will start in the classroom in February
• Starting to plan the May meeting in Rome at SLERD 2019
• In Italy (to be integrated into WP7): negotiations have started with the Heritage
Ministry to co-author a pilot in Italian secondary schools about
“Emergency and seismic education and risk mitigation” and “Valorisation
of Archaeological sites”; the latter activity will culminate, during SLERD
2019, in a demonstration of the Up2U system used in Heritage studies.
January - February 2019 [3]
35. • Planned piloting activities in the Up2U ecosystem
• “Light” integration of existing Jupyter notebooks based on
CommonSpaces, integrated for formal/informal learning in Up2U;
• UaH (University as a Hub) is planning the first “flipped” class for Italian secondary schools based
on CommonSpaces. One pilot will start in the classroom in February / March
• Planning the May meeting in Rome at SLERD 2019
• Negotiation with the Italian Heritage Ministry to co-author a pilot in
Italian secondary schools about “Emergency and seismic education and
risk mitigation” and “Valorisation of Archaeological sites”
March - December 2019